0:00:06um i haven't unique challenges you i'm
0:00:09it in that case
0:00:10um and we can see i'm sad also
0:00:12but that's my reading
0:00:14or
0:00:15and then
0:00:15i was
0:00:16so
0:00:17i have to prevent this
0:00:18instead of
0:00:19fast
0:00:19the colours
0:00:21but
0:00:21to begin
0:00:22which
0:00:23a two
0:00:25some D V C R
0:00:27which mentioned simple code you know it's
0:00:32oh
0:00:35oh
0:00:37um this that is
0:00:38on that
0:00:42oh
0:00:43it could mean human speech
0:00:44so topic
0:00:46so what can we
0:00:47all copyrights
0:00:48good
0:00:49and use it
0:00:49in in
0:00:50two
0:00:51yeah i'm
0:00:52fine
0:00:53fig
0:00:53in the future you may be possible meaning i'm one sport
0:00:56that
0:00:57uh but
0:00:58equation
0:00:58just one
0:00:59yes
0:01:00well
0:01:01that are taken as you both
0:01:03just
0:01:03oh
0:01:04problem is
0:01:05okay well
0:01:05the big one speak
0:01:07search
0:01:08and then
0:01:10yep
0:01:10this problem
0:01:11cool
0:01:13and
0:01:13because then if you're still
0:01:15yeah
0:01:15two
0:01:17so we in this
0:01:18talk
0:01:18we evaluate
0:01:20how to secure
0:01:21the speaker verification systems uh
0:01:23okay
0:01:24synthetic speech
0:01:25yeah cool
0:01:26someone
0:01:27speech voices using just and sent
0:01:30but once and
0:01:31but we can call
0:01:32and a speaker's voice from ten sentences
0:01:37but this is a content
0:01:38my talk
0:01:39i'd like to talk about
0:01:40some
0:01:41yeah now introductions
0:01:42um i'm wrong
0:01:44and then we
0:01:45there's a lot
0:01:46recognition ideals
0:01:48the S U N
0:01:49then
0:01:50i we show some of its work
0:01:52which
0:01:53right then
0:01:55i cats
0:01:56this year
0:01:57and then i think
0:01:58panes of a speaker verification systems
0:02:01for by
0:02:02i think the system
0:02:03useful
0:02:04yeah
0:02:05and then i mean streak payment conditions
0:02:07um
0:02:08uh
0:02:09and then
0:02:10i wish also
0:02:11or some
0:02:12quiet
0:02:13to detect
0:02:14synthetic speech
0:02:16speaker verification
0:02:17using
0:02:17i mean so
0:02:18cool
0:02:19oh yeah yeah yeah
0:02:20and is that what it right
0:02:22and they are somewhat item i
0:02:28so do you know about that
0:02:31but it's not a you know
0:02:32no my kids will be
0:02:34because we can assist them
0:02:36how some
0:02:37you know
0:02:37but i know we used to
0:02:39tts systems that used to be
0:02:41and
0:02:42in the conventional we thought
0:02:44conventional scenarios
0:02:45pdf is then that's true
0:02:47ah
0:02:47it's great unit selection tts system is
0:02:50right
0:02:51so or what combos on technique
0:02:53unit selection if the just
0:02:54and peso
0:02:56equated with ones
0:02:57and he and then transform
0:02:59so someone's voice
0:03:00to target speaker with
0:03:02using your joint probability of gmm
0:03:05trained on all right
0:03:06right
0:03:07um
0:03:08three
0:03:09in any of that
0:03:10six
0:03:11well verification
0:03:12of course
0:03:13can be things i
0:03:14from only you can see but
0:03:16um also since uh
0:03:18oh
0:03:19with a fair fight
0:03:20five
0:03:21can be transformed into
0:03:23basically how good
0:03:24ha ha
0:03:25a voice
0:03:26you think this document
0:03:27but this
0:03:29combination
0:03:30oh
0:03:30problem
0:03:31speaker
0:03:31or something
0:03:32i think it's probably
0:03:33speaker verification system
0:03:36but
0:03:36all the is distance
0:03:38is it
0:03:39this
0:03:40yeah fictional
0:03:41our tts systems uh
0:03:43it can be
0:03:44speech synthesis
0:03:45cross speaker adaptation
0:03:47such as embedded uh
0:03:49it
0:03:50um
0:03:50this
0:03:51just a
0:03:52also
0:03:53what's the problem
0:03:54speaker verification because speaker adaptation can possible
0:03:57speaker independent agent
0:03:59which
0:04:00which
0:04:01which
0:04:02um
0:04:03which are cool but this was more distinctly dsp use
0:04:06into the target
0:04:08okay
0:04:08a voice using small amount of data
0:04:11and then
0:04:12uh that
0:04:13any i don't use for verification
0:04:15can
0:04:16insights from
0:04:16and update more
0:04:18so
0:04:19this
0:04:19justin also
0:04:21probably
0:04:22speaker recognition
0:04:23but
0:04:24we'll be the justice system
0:04:25it's more
0:04:27probably
0:04:28more
0:04:29interest
0:04:30it's combinations i needed
0:04:31i think
0:04:32basically
0:04:33that's
0:04:33and this problem was first
0:04:35reported by my scores
0:04:37and you have a go
0:04:39so why do we need
0:04:40why do we need
0:04:41this is used
0:04:43there are several times
0:04:44um
0:04:46the positive and its performance
0:04:48it is it and basically
0:04:50thanks
0:04:50the whole month of its fear
0:04:52it can be this way
0:04:53that's right
0:04:54quite different ways
0:04:56in power
0:04:56the quality of a ten base
0:04:58yeah it's no problem
0:05:00with
0:05:00well detection systems
0:05:02and
0:05:03and then
0:05:04well
0:05:04it
0:05:05it in disobedience
0:05:06of holding elections
0:05:08more specifically in basically field agent based
0:05:12it's
0:05:12same as human
0:05:15uh under
0:05:15speaker adaptation techniques
0:05:17what speech is a hot
0:05:19what
0:05:19maybe
0:05:20cochlea
0:05:21it
0:05:21well yes
0:05:22we can do
0:05:23speaker adaptation
0:05:24unsupervised
0:05:25but uh
0:05:26like
0:05:27is that
0:05:28which add up to a much past
0:05:31we got a job which is
0:05:33and also we need
0:05:34be able to use
0:05:35we can use
0:05:36when the us
0:05:37in part
0:05:38clean speech data
0:05:39uh
0:05:40fig
0:05:40adaptation data
0:05:43so
0:05:45taken together
0:05:46it is now possible
0:05:48automatically create how did it
0:05:49because tts voices from any at all
0:05:52what about it
0:05:54which
0:05:55i thought that
0:05:55right
0:05:56no
0:05:57which means
0:05:58what do by
0:05:59oh
0:06:00available ones of it
0:06:01can be used
0:06:02oh
0:06:02affecting
0:06:04speaker but it was just
0:06:07so i think should not you
0:06:08yes
0:06:09not you
0:06:10he
0:06:12that's fine
0:06:13speech data
0:06:13five a quite well
0:06:15well the gas
0:06:17well look at
0:06:19or texture
0:06:20like this
0:06:21you know we can record my speech
0:06:23i think
0:06:24my speech might be like
0:06:26right
0:06:26um anyway
0:06:27so
0:06:28we can
0:06:29why a speech one board
0:06:31all the cows
0:06:32well cast
0:06:32because jazz
0:06:33well it
0:06:35then using this
0:06:36well
0:06:36p2p does
0:06:37that that uh
0:06:39yeah about it can be a speech
0:06:40systems
0:06:41right
0:06:42the other ones
0:06:43then
0:06:44you think about
0:06:45what is
0:06:45but
0:06:46yeah
0:06:46right speech
0:06:47or verification
0:06:48useful
0:06:50because it gives
0:06:51it is
0:06:51and then
0:06:52we prepared
0:06:54accept samples
0:06:56um
0:06:56which
0:06:57much
0:06:58to the scenarios
0:06:59so we really is terrific speech from yeah
0:07:02does it have a year
0:07:03which
0:07:04but cats
0:07:05and also
0:07:06clean it up
0:07:08they can
0:07:08well i guess
0:07:10i go
0:07:10you know
0:07:12it's got
0:07:13and there
0:07:14um synthetic speech
0:07:15is
0:07:16you know
0:07:16i pray
0:07:17couple samples on this in six speech samples
0:07:20oh okay from a genocide that
0:07:22yeah
0:07:23it's put together and how
0:07:25yes
0:07:25so what's up with
0:07:26george bush
0:07:29yeah
0:07:42so he's adapted with this meeting we keep a T S
0:07:45one george bush
0:07:46or not
0:07:47and then
0:07:49clean it up
0:07:50fig
0:07:57yeah
0:07:59right
0:08:02can you identify how
0:08:04oh
0:08:05oh meeting people communicate
0:08:08yeah
0:08:09maybe
0:08:09and and of course
0:08:11yeah it's inside
0:08:12speech
0:08:13yeah i know
0:08:36so
0:08:36the
0:08:37with this
0:08:37just
0:08:38octaves
0:08:39have also
0:08:40but uh
0:08:42oh
0:08:45size
0:08:45so is this poses
0:08:47yeah times
0:08:47with
0:08:48fig
0:08:51so
0:08:52um
0:08:52yeah
0:08:53let's go back to
0:08:54sorry
0:08:56um
0:08:57but
0:08:59okay
0:09:02so
0:09:03i hope you understand that
0:09:05the security issues of this
0:09:07and then we use um
0:09:09explain
0:09:09i'm sure that is uh
0:09:10yeah i guess
0:09:11two thousand and
0:09:12that's
0:09:13so we
0:09:14we use it in this
0:09:15databases
0:09:16which
0:09:17ah
0:09:18i agree
0:09:19but
0:09:19speech
0:09:20uh why
0:09:21when you and john
0:09:22because
0:09:23and then we you really really simple speaker verification system is in place because
0:09:27um but we well i yeah i know
0:09:30this then
0:09:30but yeah so what standard gmm ubm
0:09:33and also you know gaussian
0:09:34but but if it's at the end
0:09:36which
0:09:37you know some people
0:09:38this
0:09:38yeah
0:09:39use
0:09:40right now
0:09:40um you know
0:09:41so the with
0:09:42score normalisation feature
0:09:44normalisation
0:09:45but when he is
0:09:47there's no significant device
0:09:49this from a point of views
0:09:51because in
0:09:52most cases
0:09:53the speaker verification system
0:09:55a tape
0:09:56green
0:09:56synthetic speech
0:09:57voice
0:09:58you know i think
0:10:00um
0:10:00so
0:10:01in the store
0:10:02i
0:10:02it was one indian
0:10:04you'd be in it
0:10:06but
0:10:06what we have used
0:10:08but
0:10:09um which are basically the same
0:10:12so this is the design
0:10:14previous
0:10:15so it's
0:10:16oh
0:10:17oh
0:10:17well
0:10:18what distributions one ten german speakers
0:10:21um
0:10:22this
0:10:22we do not
0:10:23sure
0:10:24school
0:10:24what human speech
0:10:26target
0:10:27because
0:10:28um
0:10:28the
0:10:29human
0:10:29that
0:10:30ha
0:10:30sure the human
0:10:32each
0:10:32well this was just
0:10:34and
0:10:35this is a
0:10:35synthetic speech about
0:10:37impostors
0:10:38it is not
0:10:39with a button
0:10:42uh
0:10:42you did
0:10:43and then this is a
0:10:44synthetic speech about
0:10:45oh i guess
0:10:47and
0:10:47green one
0:10:48really
0:10:50right
0:10:50these figures
0:10:51sure
0:10:52synthetic speech will again
0:10:55that you can see
0:10:56these qualities previews on
0:10:58for human size
0:10:59speech
0:11:00for both
0:11:01postures
0:11:02and and also
0:11:03what to do green
0:11:04i need
0:11:05i think
0:11:07yeah that was
0:11:08it can
0:11:10okay
0:11:11it's not
0:11:11and
0:11:12but was yes
0:11:12but the problem is you know pretty
0:11:14payment
0:11:15because number of speakers is
0:11:17yeah
0:11:18yeah
0:11:19too small
0:11:20and then
0:11:21the speech data use
0:11:22was a
0:11:23read speech tagged as
0:11:26but you know oh i think it's not you speech data
0:11:29why
0:11:30be you know
0:11:31it's assumed to be not a
0:11:33clean
0:11:35so
0:11:37in this
0:11:38cool
0:11:40in this new book
0:11:41so we use three hundred speakers
0:11:43included
0:11:43was the channel zero
0:11:45i would say to
0:11:46eight
0:11:46so what
0:11:47right
0:11:47they were
0:11:49this this
0:11:50oh
0:11:51much of it that some tts corpora because
0:11:53yes
0:11:54this is
0:11:55yeah i agree
0:11:56you know it's not perfect
0:11:58three
0:11:59you know and vitamin
0:12:00and stuff
0:12:02is
0:12:02because
0:12:03and p2p
0:12:04uh what is the point or something else
0:12:06i think snotty
0:12:08and also we therefore happiness to you
0:12:10formation on missile could detect
0:12:11synthetic
0:12:12speech
0:12:13because it cation systems
0:12:15what sample sample
0:12:16it with
0:12:17sup sup
0:12:18i thought it was a mess
0:12:20synthetic
0:12:20speech
0:12:21in speaker verification
0:12:23wow
0:12:24but again
0:12:25which is you
0:12:26speech becomes much better
0:12:28someone
0:12:28so we have a body
0:12:30dismissal
0:12:31obvious
0:12:32certainly
0:12:33um
0:12:33probably more
0:12:34this impostors
0:12:36a lot it's and it's
0:12:37hmmm
0:12:41um
0:12:42histology about your name
0:12:44ubm
0:12:45guns
0:12:46i think right
0:12:46but
0:12:47you
0:12:48uh the way you want it
0:12:50uh we use
0:12:51if the end of the
0:12:52the the stuff
0:12:53no energy on it
0:12:55data
0:12:56um we a bright future
0:12:57one thing
0:12:58right
0:12:59robustness
0:12:59proposed
0:13:00by then
0:13:01uh we had that is
0:13:03G and then you'd mark
0:13:04adaptation
0:13:06um
0:13:06in addition to
0:13:07what janet was we evaluate it
0:13:10yeah but you didn't system
0:13:12you'd be used for the whole process
0:13:14which
0:13:14we have a
0:13:15but
0:13:15because
0:13:16and uh
0:13:17okay right
0:13:17what
0:13:19right okay about
0:13:21right
0:13:22and
0:13:23which is
0:13:24level or more
0:13:26is that it
0:13:27right
0:13:28so
0:13:29probably
0:13:29this
0:13:30she's
0:13:30be
0:13:31and uh
0:13:33um
0:13:34this is the
0:13:35quite well but
0:13:36that over the whole
0:13:38i don't speak about it
0:13:39it's
0:13:40so quite because it's in this piece of this
0:13:42it's the complex it's
0:13:44that's that's
0:13:45in march possibility
0:13:46i really want in speech right
0:13:48but
0:13:49no
0:13:49speaking
0:13:50so we use
0:13:51this guy same technique uh
0:13:53it starts
0:13:54training
0:13:54average for some of this
0:13:56which is
0:13:57basically
0:13:57yeah
0:13:58i did
0:13:59ubm
0:14:00or
0:14:01speaker independent agenda
0:14:02so we use
0:14:04because of it i mean
0:14:05yes it is hot
0:14:06it is with some of the
0:14:08yeah we
0:14:09uh
0:14:09uh what is
0:14:10you think
0:14:11adidas
0:14:12functional like houdini regulations
0:14:14well you know pulse train and made it off or
0:14:17it's not about
0:14:18see in the data
0:14:20yeah
0:14:20small amount of because
0:14:21okay
0:14:22be
0:14:23then
0:14:24we generate
0:14:25acoustic on that
0:14:27such as
0:14:28but
0:14:28um uh so
0:14:29each duration so some
0:14:31noise
0:14:32for me
0:14:32citations from the side of it and then
0:14:35you mean maximum likelihood
0:14:36on occasion as well i
0:14:38proposed by
0:14:39with a ninety five
0:14:41for this taken out
0:14:42can you it's
0:14:42yeah
0:14:43how much someone says
0:14:45and then
0:14:46and then
0:14:47you think it is generated
0:14:48acoustic um it does
0:14:49we run
0:14:50and i would be
0:14:53with the whole
0:14:54right
0:14:55proposed by colour
0:14:59and then
0:15:00this is about patience
0:15:01so we can create
0:15:02new
0:15:04tts voice
0:15:05from
0:15:07um
0:15:08senior
0:15:09just
0:15:10that's from three minutes of speech data
0:15:12was
0:15:13if
0:15:14speech database
0:15:15a bit of more quickly becomes bit
0:15:17but
0:15:17minimum
0:15:19the meeting
0:15:20if
0:15:20where am i
0:15:21yeah
0:15:21i think i'm leery ha
0:15:24of them with this or that
0:15:26and this
0:15:27small
0:15:28sure
0:15:29at that
0:15:29individual speakers
0:15:31and then they
0:15:32well actually the
0:15:34a female speakers
0:15:35and in this
0:15:36remark
0:15:37sure the male speaker
0:15:38other people will
0:15:40uh as you can see
0:15:42this paper
0:15:42how about
0:15:43his point
0:15:45and also china
0:15:46and so on
0:15:47um
0:15:47you
0:15:48and that's it
0:15:50and sounds
0:15:51which one
0:15:54and that was my question
0:15:56how many voices available in
0:15:58mark
0:15:58can be
0:16:00who the speaker verification systems
0:16:06so again
0:16:07our scenario
0:16:08it's not building tts system
0:16:10on speaker verification databases
0:16:12it is no money you don't narrow band
0:16:14ooh
0:16:15go to the noise
0:16:16or maybe all five microphones
0:16:19oh what can i do
0:16:20is
0:16:20you know
0:16:21most of my nearest acquire speech because
0:16:24um you know we
0:16:26why you
0:16:27crises
0:16:28like this
0:16:29they adapt
0:16:30yeah
0:16:31fine
0:16:31so we use
0:16:33okay i think we we use
0:16:34also i don't know
0:16:35um
0:16:36data bases
0:16:37sort of this
0:16:39database
0:16:39yes
0:16:40um
0:16:41two hundred eighty four speakers
0:16:42uh we
0:16:43weeks
0:16:44once because
0:16:45fig
0:16:46can you got even
0:16:47uh we use
0:16:48and it's a speech
0:16:49and then we buy
0:16:51excited for it
0:16:53speaker but you in to see it
0:16:55if you see the old
0:16:56and that it
0:16:57it's for
0:16:58training data source
0:17:00tts
0:17:01um in the set they retrain
0:17:03average voice models
0:17:05or by speaker adaptation
0:17:07individual speakers
0:17:08we use
0:17:09she made it out was trained and data for the patient
0:17:12and that be it
0:17:13training data that's for speaker recognition systems
0:17:16um
0:17:17right
0:17:18any buzz about that one
0:17:19what is
0:17:20in a moment
0:17:21uh we have that
0:17:23yeah that's what
0:17:24but
0:17:25set see it has been as
0:17:27which have
0:17:28these accounts
0:17:30all speech data part
0:17:31but also
0:17:31that's because
0:17:32and this
0:17:33to be
0:17:35if that's true
0:17:36speech data
0:17:37just from
0:17:38useful cations
0:17:39um i did for a couple of samples
0:17:42um
0:17:43data from this
0:17:44yes
0:17:45trained on this was original data
0:17:49oh
0:17:57come on
0:17:58one
0:17:58this policy
0:18:01yeah
0:18:09yeah
0:18:24yeah
0:18:24so
0:18:25is this too long reverberation
0:18:27you you
0:18:28huh
0:18:28this thing
0:18:30is um
0:18:32yeah
0:18:32a big car
0:18:33they show yeah
0:18:34right
0:18:34ready to
0:18:35additionally the weight of a
0:18:37oh
0:18:37you must
0:18:38not
0:18:39um
0:18:40and of the you know the
0:18:42equal error rate
0:18:43it
0:18:44just
0:18:44the point five
0:18:45this is a
0:18:46false alarm probabilities and
0:18:48season
0:18:48diction
0:18:50um
0:18:51so
0:18:52we can
0:18:52see
0:18:53speaker verification
0:18:54for human speech
0:18:56so you don't know
0:18:57yeah
0:18:59but that's why you know we can say our speaker verification systems channel
0:19:04they are
0:19:04can't distinguish
0:19:06because yeah speakers part
0:19:07almost part
0:19:09and the
0:19:09this is that is that
0:19:11human
0:19:12speech
0:19:12but
0:19:13speech
0:19:14um
0:19:15if the score distributions
0:19:17uh
0:19:19similar to create
0:19:21i mean this is the human speech
0:19:23what are you
0:19:24fig
0:19:24because
0:19:25um
0:19:26this is the synthetic
0:19:27speech about
0:19:28target because
0:19:29um this is a human speech
0:19:31input just
0:19:32well this is
0:19:33synthetic
0:19:33speech but also
0:19:34just
0:19:36and
0:19:37the distribution
0:19:39all this
0:19:40was good
0:19:40this for distribution
0:19:42um no
0:19:43i don't know anymore
0:19:44but as you can
0:19:46these
0:19:46they uh
0:19:47significant or whatever
0:19:49in but
0:19:50in march
0:19:50claimant is
0:19:52where
0:19:52lies voice
0:19:53okay
0:19:54you know
0:19:54maybe the extreme um hum
0:19:57about
0:19:57ninety percent
0:19:59speech
0:19:59but
0:20:00it it
0:20:01so
0:20:03see
0:20:03much
0:20:05train
0:20:06uh two hundred
0:20:07sixty
0:20:08was
0:20:10oh
0:20:11fig
0:20:11two hundred six people
0:20:13was actually
0:20:15so someone is of course
0:20:17but despite
0:20:18excellent performance
0:20:19because the case was this thing which
0:20:21uh
0:20:21one
0:20:22why
0:20:22it out i
0:20:23all right
0:20:24the speaker i didn't
0:20:25speaker
0:20:26his eyes
0:20:27before
0:20:28speaker out of it
0:20:29it is because this is
0:20:31hi
0:20:32enough to allow the use
0:20:33right
0:20:34pause
0:20:35to do human
0:20:35right
0:20:36going on
0:20:38see what i keep up
0:20:40well
0:20:40what
0:20:41yeah
0:20:42um
0:20:43because they have significant overlap
0:20:45i just meant
0:20:46decision
0:20:47the shooting was
0:20:48one of my vision
0:20:50uh uh like
0:20:51the head
0:20:53so of course problem is how can we
0:20:55this
0:20:57yeah we are not
0:20:58all right
0:20:58right
0:20:59it was
0:20:59so yeah i
0:21:00yeah i just
0:21:01so we
0:21:02why
0:21:03yeah
0:21:03extra missile
0:21:04yeah the commission on it
0:21:06which uh
0:21:07nothing
0:21:08if i see them like we do
0:21:10what's your idea i propose but so what
0:21:13and also we use
0:21:13what is that what data rate
0:21:15um we can from the us
0:21:18curious
0:21:19you know
0:21:19oh no
0:21:21um
0:21:22a base
0:21:22and define
0:21:23right
0:21:24it's pretty
0:21:25both
0:21:26it just
0:21:27define sees the right kind of video thing
0:21:31this is the like
0:21:33right
0:21:33right on
0:21:34yeah
0:21:35um
0:21:36we
0:21:37of it
0:21:39but this is simple
0:21:40but he was
0:21:41useful
0:21:42do they
0:21:42synthetic
0:21:42speech
0:21:43because
0:21:44p2p anything from a challenge
0:21:49and how
0:21:50or was
0:21:50this project we switch out
0:21:52it's more of a spy
0:21:53i
0:21:55that's
0:21:56and also things expedia the unit selection
0:21:58and have
0:21:59john
0:22:00trajectories
0:22:01uh
0:22:01uh
0:22:02change from point
0:22:03which is that
0:22:04data is
0:22:05yeah i yeah
0:22:06i
0:22:07but
0:22:08and tedious and it can be a speeding is
0:22:11included
0:22:12some global time but is from all this
0:22:14by
0:22:15the
0:22:16with
0:22:16kind of
0:22:17for what some of them
0:22:18effect
0:22:19both
0:22:19project for the
0:22:20in fact
0:22:21this
0:22:22um this is that is that
0:22:24average five year
0:22:25we do right sure
0:22:26human speech
0:22:28i think it
0:22:29a few months
0:22:30um
0:22:31the same one
0:22:31well
0:22:32speech
0:22:34and it
0:22:35if if angel
0:22:36that's okay
0:22:37speech
0:22:37and you can be
0:22:39they have
0:22:40quite
0:22:40all brought up
0:22:41and therefore
0:22:43this measure
0:22:44no longer robust
0:22:45you know
0:22:46synthetic
0:22:47speech
0:22:48cool
0:22:48they yeah
0:22:49it ended
0:22:50this
0:22:51and
0:22:53uh
0:22:54because i
0:22:55yes uh well you know it
0:22:57because
0:22:58in speech patterns to use
0:23:00if the school or
0:23:02synthetic
0:23:02speech
0:23:03maybe
0:23:04okay
0:23:05fictions
0:23:06uh based in p2p humour
0:23:09speech
0:23:09so we sort
0:23:11it might be possible to save in p2p you
0:23:13synthetic speech
0:23:14yeah what they like
0:23:15it's all
0:23:16um
0:23:17we
0:23:18p2p up to a month
0:23:20yeah marcy
0:23:21it it E G
0:23:22okay i'm up for it
0:23:23yeah
0:23:24oh
0:23:24um evaluate
0:23:26well be right
0:23:27human speech
0:23:28um
0:23:29fig
0:23:29it
0:23:30um
0:23:31this is the weather right
0:23:32this is a
0:23:33yeah there are a
0:23:35as you can
0:23:36the
0:23:37we tested it
0:23:38synthetic speech
0:23:39was found to have
0:23:40data where there are a few
0:23:43or both
0:23:43grammar
0:23:45a few months while they're writing about
0:23:47involved in
0:23:47for the first six speech where they just say well
0:23:51in it means that
0:23:52if you go
0:23:52grammar
0:23:54yeah there are huge differences
0:23:56uh
0:23:57and then
0:23:58this
0:23:58it's too
0:23:59even for the adaptation data is
0:24:01just
0:24:02one me
0:24:02speech today
0:24:04so
0:24:04it is not i
0:24:06you yeah
0:24:07what they write
0:24:07is that
0:24:08fig
0:24:09fig
0:24:15um
0:24:15i to summarise my talk
0:24:17um
0:24:18this but
0:24:19the extent
0:24:20almost
0:24:20speaker verification
0:24:21yeah
0:24:22yeah
0:24:23speaker age and it
0:24:24because i didn't
0:24:26speech
0:24:26yeah
0:24:27i got that
0:24:27a channel
0:24:28yeah
0:24:28this
0:24:29something
0:24:30school it's tedious
0:24:31it's high enough of these
0:24:33inside was
0:24:34possible
0:24:34to the human right
0:24:36this thing brought it
0:24:37the speech data available
0:24:40i guess
0:24:41can be
0:24:42you import
0:24:43speaker verification
0:24:44this can
0:24:44in
0:24:45i don't know how many
0:24:47well i guess
0:24:47but
0:24:49or support because
0:24:51oh
0:24:51it is
0:24:52impostors
0:24:53okay
0:24:54fig
0:24:55yeah
0:24:55and then i'll mention a missile
0:24:57you think
0:24:59uh commissioning but yes i hear it
0:25:01or what they write
0:25:03fig
0:25:04fig
0:25:04what
0:25:05no
0:25:05moreover
0:25:06robust
0:25:07no
0:25:09but
0:25:10yeah but it is
0:25:10this you know security issues
0:25:12we
0:25:13and we like to do these
0:25:14this
0:25:15voice going
0:25:16speaker adaptation
0:25:17two
0:25:19for free or on the way
0:25:21right right
0:25:22provides a base
0:25:23what's going on
0:25:27well but you don't know why
0:25:29um
0:25:30so
0:25:30this technique
0:25:32um
0:25:33um
0:25:34we have about them and
0:25:35from all speakers
0:25:36what
0:25:37and
0:25:39so
0:25:39national
0:25:40in it is not his fantasies you please
0:25:43and uh i
0:25:44and you like to
0:25:45it's hard
0:25:47that's you you can
0:25:49because this technique
0:25:50can cool
0:25:52people's
0:25:52has
0:25:53yeah
0:25:54talking some T Vs
0:25:56cool
0:25:56sample
0:25:57and you want to use
0:25:59we have
0:26:00right
0:26:01be
0:26:01um just techniques
0:26:02can
0:26:03because
0:26:04welcome
0:26:04hoping
0:26:05someone
0:26:06that's just
0:26:06because voiced and use the voice
0:26:09um we can associate with
0:26:10they are embedded devices
0:26:13that's
0:26:14voice
0:26:14indication eight
0:26:17so
0:26:18yeah that was
0:26:18we need
0:26:19they do you future
0:26:21but it is
0:26:22the screen
0:26:23voice
0:26:24since it
0:26:24voice
0:26:25and he must
0:26:26oh
0:26:26oh
0:26:27that um this
0:26:31that's all
0:26:37right
0:26:38presentation
0:26:40uh
0:26:40we should
0:26:45oh
0:26:46so
0:26:49oh
0:26:50or
0:26:52four
0:26:53sure
0:26:57oh
0:26:58uh
0:27:01with
0:27:02oh you do
0:27:05oh
0:27:06right
0:27:07which
0:27:09hmmm
0:27:10so
0:27:11but
0:27:13oh
0:27:14replica guns working on speech transmission to
0:27:21oh
0:27:25your your
0:27:26yes
0:27:29which
0:27:31i see
0:27:32but i
0:27:34ninety percent of the voices box that accent
0:27:37so even speaker verification
0:27:39cranes
0:27:40to start with
0:27:41sure uh
0:27:42identical people
0:27:44well i think of puzzles
0:27:45we have to that
0:27:46the uh
0:27:48to speech
0:27:48one
0:27:50ooh
0:27:52mark
0:27:55oh
0:27:55four
0:27:56yeah
0:27:58oh
0:28:01hmmm
0:28:01oh
0:28:03for a moment
0:28:06we
0:28:08um
0:28:10or
0:28:13oh
0:28:14or
0:28:15but
0:28:18oh
0:28:19well when we were different circumstance
0:28:24yeah
0:28:28oh
0:28:28oh
0:28:30oh
0:28:31sorry
0:28:32sure
0:28:33true
0:28:34oh
0:28:36oh
0:28:37well if we
0:28:39yeah
0:28:41for the money
0:28:42would be to model
0:28:45i'm not like
0:28:47uh_huh drawn from from uh yeah that too
0:28:52well
0:28:52right
0:28:53one
0:28:54actually going on
0:28:57we we do
0:28:59um
0:29:00right
0:29:01perhaps a big challenge
0:29:04yeah i think that's that's the that's the crystal
0:29:06this
0:29:07synthetic speech maybe they can variables in doing that
0:29:10right
0:29:11hmmm
0:29:18uh_huh
0:29:20okay
0:29:20i am i
0:29:22we have some similar work and
0:29:25i
0:29:26some
0:29:27paper
0:29:27so there we also
0:29:29um
0:29:30right
0:29:30um
0:29:31yeah then fine yeah
0:29:32um see signs
0:29:34oh
0:29:34transform tonight
0:29:36basically to intermediate
0:29:37speech will be
0:29:38back to the speaker identification system
0:29:41so um
0:29:42also i don't know
0:29:44wall street journal
0:29:45in a nice
0:29:46and then you so we can
0:29:49and
0:29:50and you had to to to type
0:29:53and speaker identities instead
0:29:55right
0:29:55based on like
0:29:56ubm agenda like using that low level
0:29:59acoustic features
0:30:01in the other one is
0:30:03a novel speaker identification system
0:30:05and
0:30:06such as no phonetic
0:30:07that
0:30:08right
0:30:09so what we are used and
0:30:10that um
0:30:12and they generate
0:30:13generated
0:30:14the
0:30:14it is and
0:30:15and
0:30:17i think that now and a novel feature based speaker identification
0:30:21hmmm
0:30:22small
0:30:22one double
0:30:23hmmm
0:30:25well
0:30:26well whatever bottleneck and
0:30:28and you really to be selected by now
0:30:30generative
0:30:32i
0:30:32um but yeah
0:30:34looks like a high level
0:30:35yeah
0:30:36speaker I D's
0:30:37i didn't use instant
0:30:38it's not
0:30:40make
0:30:40it's
0:30:40not robust
0:30:42okay
0:30:42at low levels
0:30:44so like
0:30:44mm
0:30:46they stand for
0:30:47and just got in speech you
0:30:50yes
0:30:50right and then
0:30:51it looks like and
0:30:53hmmm
0:30:53there you can
0:30:55do you like
0:30:56a i mean
0:30:57i yeah
0:30:58this
0:30:58p2p
0:30:59a speech
0:31:00reason or
0:31:02no it's not
0:31:02yeah
0:31:03so probably
0:31:04um
0:31:06and that's what you have done
0:31:08yeah experiments also you see
0:31:10and at that time
0:31:11speaker verification system using
0:31:13try using a novel
0:31:15and features
0:31:16yeah
0:31:16yeah
0:31:17temporal features mike
0:31:18not only not long range and speed
0:31:21make some characteristics
0:31:23probably
0:31:23and now be
0:31:25more robust against that
0:31:26that generated
0:31:28speech
0:31:29so basically
0:31:30and
0:31:30so
0:31:31hmmm the speaker I D C
0:31:33that was
0:31:34transformation all three
0:31:36yeah
0:31:37can be too
0:31:39yeah
0:31:41we see each other
0:31:42and then something to do with this
0:31:44uh_huh
0:31:45D C
0:31:46so
0:31:47yeah we can probably also borrow
0:31:49yeah
0:31:50symphonies
0:31:51um speech since it's it's uh
0:31:53jenny generation you know
0:31:56yeah
0:31:56try to make
0:31:57speaker and
0:31:58so
0:31:59but
0:32:00yes and and and probably
0:32:01and now also on how expensive
0:32:03where is it is fine to use
0:32:04for the speech thing is is that you probably
0:32:07two
0:32:08okay normally
0:32:09and
0:32:10teachers are
0:32:12that's
0:32:12right
0:32:13sure
0:32:14yeah
0:32:14yeah it is
0:32:17yeah
0:32:18okay
0:32:18so no time i i just got my question
0:32:22uh no no no no no
0:32:24you you you you you you and uh yeah that should be used them on the same
0:32:31what would happen if you change the
0:32:34yeah
0:32:37so uh i
0:32:38questions um
0:32:39we use you know
0:32:41gmm ubm systems and svm
0:32:45with
0:32:45you know gaussian
0:32:47it's a contest
0:32:48um but we haven't ones that in a long time
0:32:51future
0:32:53yeah it's real time values
0:32:55um
0:32:57but
0:32:57um
0:33:00um
0:33:00but
0:33:02uh_huh
0:33:07we have a new features
0:33:09um we have one
0:33:10you huge
0:33:11which is we
0:33:11really
0:33:12one
0:33:13so i reassured that is uh
0:33:15right next
0:33:16i guess
0:33:17next
0:33:18with bonds and you
0:33:20yeah
0:33:21that's not a long time
0:33:22yeah
0:33:24right
0:33:26right
0:33:27yeah