0:00:07i am
0:00:08now going to talk about
0:00:11multiple background models
0:00:13for speaker verification
0:00:14and the author of this paper
0:00:17they are
0:00:18we found the assumption and the downhill
0:00:21and uh and a friend of which are down
0:00:23and the assumption
0:00:25my name is on the other hand i you would
0:00:27talk
0:00:28uh
0:00:28that
0:00:29this
0:00:30um
0:00:31the idea i mean these people are used to run into a single i think all for you
0:00:36and
0:00:37then
0:00:37yeah
0:00:38clean up the only
0:00:39no
0:00:40being
0:00:41yeah this is the one that is having
0:00:43cable
0:00:43first introduction
0:00:45in this introduction
0:00:46i will
0:00:47uh explain
0:00:48why we propose this idea and this is also our motivation
0:00:53second
0:00:54we will apply vocal tract length to speaker recognition
0:00:59and then to to uh idea
0:01:01when we do some
0:01:03and then
0:01:03and this
0:01:05we can fix
0:01:06parenthood
0:01:07that ah
0:01:08finally
0:01:09we can raise the ticket
0:01:10and the multiple background models are proposed
0:01:14and then
0:01:15no
0:01:15i have a cable components
0:01:19first
0:01:21gaussian mixture model
0:01:23gaussian mixture model universal background model
0:01:25is the basic speaker verification system
0:01:28is that way
0:01:30forty
0:01:30quality of that case
0:01:32the
0:01:32they all thought
0:01:33this term
0:01:34such as
0:01:35but i do not see it
0:01:37and the whole thing
0:01:37attribute projection
0:01:39is based on
0:01:40um
0:01:41uh
0:01:42but that but that the
0:01:43the
0:01:44at that
0:01:45gmm
0:01:46ubm as its basic
0:01:48structure
0:01:49and the
0:01:50the most important
0:01:52role in this
0:01:53in this
0:01:54basic
0:01:55structure is the
0:01:56ubm
0:01:57and we think a complete ubm is supposed to
0:02:02represent the speaker independent feature distribution
0:02:06and uh there are normal methods to guarantee the quality of ubm
0:02:10first
0:02:11which we are
0:02:12and misc data
0:02:14and the means we pay for all the data better
0:02:16two three but when it was so yeah
0:02:18second
0:02:19train
0:02:20gender
0:02:21or channel
0:02:22dependent
0:02:23ubm
0:02:25but uh
0:02:26there may be
0:02:27uh
0:02:28there
0:02:28there
0:02:29there are other approaches
0:02:32first that
0:02:33okay
0:02:34well okay yeah
0:02:35the speaker variability depends on many factors
0:02:40such that
0:02:41speech rate
0:02:42speech what we'll
0:02:43emotion
0:02:44vocal tract and so on
0:02:46but the major difference between speakers
0:02:48is due to the difference
0:02:50between their average vocal tract length
0:02:52so in speech recognition
0:02:54vocal tract length normalization
0:02:56is
0:02:57also used to obtain
0:02:59speaker independent features
0:03:04now here is like
0:03:06uh
0:03:07vtln uses the frequency warping function
0:03:10it's the crew though
0:03:11this is the original frequency and this is the
0:03:15warped frequency
0:03:16and this is what you that current
0:03:18we want
0:03:19and that is the warping factor
0:03:21that is what we want to
0:03:23we want to get
0:03:25but unfortunately
0:03:26there is no closed form expression for this so
0:03:32we use
0:03:32this
0:03:33this
0:03:34grid search
0:03:34to get
0:03:36okay yes
0:03:38and that this is what the speech
0:03:40what what the features
0:03:41is that what models
0:03:43and then the range of this value
0:03:45is from
0:03:47zero point
0:03:48eight
0:03:48eight to
0:03:49one point yeah
0:03:51well to waste that's that's
0:03:53zero
0:03:54point
0:03:54zero eight
0:03:55zero two
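The warping-factor search described on this slide can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the piecewise-linear shape of the warp, the 4000 Hz Nyquist frequency, the upper end of the range and the 0.02 step are all assumptions (the talk only clearly states the value 0.88), and `score_fn` is a hypothetical callback that would return the likelihood of the warped features under some reference model.

```python
def warp_frequency(freq_hz, alpha, nyquist_hz=4000.0, knee=0.875):
    """Piecewise-linear frequency warp (assumed form, not taken from the talk)."""
    # Below the break point the axis is scaled by alpha; above it a straight
    # segment brings the curve back so that Nyquist maps onto itself.
    break_hz = knee * nyquist_hz / max(1.0, alpha)
    if freq_hz <= break_hz:
        return alpha * freq_hz
    slope = (nyquist_hz - alpha * break_hz) / (nyquist_hz - break_hz)
    return alpha * break_hz + slope * (freq_hz - break_hz)


def candidate_factors(lo=0.88, hi=1.12, step=0.02):
    """Grid of candidate warping factors (range and step are assumptions)."""
    count = int(round((hi - lo) / step))
    return [round(lo + i * step, 2) for i in range(count + 1)]


def pick_warping_factor(score_fn, factors):
    """Grid search: keep the factor with the highest score.

    score_fn is a hypothetical callable mapping a warping factor to a
    likelihood-style score; no closed-form solution is needed.
    """
    return max(factors, key=score_fn)
```

With a real front end, `score_fn` would re-extract features with the given factor and score them; here it is left abstract on purpose.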
0:04:01now we look at
0:04:02the
0:04:03experimental setup of the paper
0:04:05the
0:04:05experiments were carried out
0:04:08on the
0:04:08sre two thousand six
0:04:10corpora in core test
0:04:11condition
0:04:12and in cross channel conditions
0:04:15the ubm training data were selected
0:04:17from nist
0:04:18sre two thousand
0:04:19four
0:04:20one side
0:04:21there are about
0:04:22sixties
0:04:23and what it and the sixteen afternoon
0:04:25and the
0:04:26sre two thousand and three and
0:04:28sre two thousand and two corpora
0:04:30about five hundred utterances
0:04:34notice the feature where you mean you
0:04:36and then a fifty
0:04:37cepstral mean subtraction feature warping delta
0:04:41acceleration
0:04:42and that right there
0:04:43i use the
0:04:44so we come back yeah
0:04:46the feature weights that if you choose dimension
0:04:48then hlda is used
0:04:51uh the final
0:04:52dimension of the feature you starting now
0:04:57oh
0:04:58this is the figure of the warping factor distribution
0:05:01we present this figure
0:05:03not to
0:05:04illustrate the difference between male and female
0:05:07we want to
0:05:09focus on the S
0:05:11it's not as bad
0:05:12uh
0:05:13the wedding is
0:05:15uh he's
0:05:16is that
0:05:17uh the value range from this
0:05:18if we used this value to it wide
0:05:21paper
0:05:22two three will be a
0:05:23may be
0:05:24maybe we can get that arnold yeah
0:05:29so
0:05:31that it has that much attention
0:05:33we divide the ubm training data into subsets
0:05:37according to the warping factor for example
0:05:41uh in the
0:05:42first subset
0:05:43the warping factor is
0:05:45zero point eight eight
0:05:47we have one hundred and the
0:05:49it's it's three utterances
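Splitting the UBM training data by warping factor, as described here, might look like the sketch below. The input format, a list of `(utterance_id, warping_factor)` pairs, and the function name are hypothetical; the talk does not say how the subsets were stored or whether neighbouring factors were merged.

```python
from collections import defaultdict


def group_by_warping_factor(utterances):
    """Bucket utterance ids by their estimated warping factor.

    utterances: iterable of (utterance_id, warping_factor) pairs
    (a hypothetical input format). Each bucket would then serve as
    the training set for one UBM.
    """
    groups = defaultdict(list)
    for utt_id, alpha in utterances:
        # Round so that 0.88 and 0.8800000001 fall into the same bucket.
        groups[round(alpha, 2)].append(utt_id)
    return dict(groups)
```

For example, `group_by_warping_factor([("a", 0.88), ("b", 1.0), ("c", 0.88)])` puts `a` and `c` in the 0.88 bucket and `b` in the 1.0 bucket.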
0:05:54now this is the
0:05:55whole structure of our proposed multiple background model
0:06:00uh i think that this structure is like
0:06:04gmm ubm
0:06:06uh
0:06:06the difference is that
0:06:09in gmm ubm
0:06:11there is only one ubm
0:06:13and then
0:06:14in this we have multiple ubms
0:06:17through
0:06:18map adaptation
0:06:20each ubm is adapted to generate
0:06:23a unique
0:06:24speaker model
0:06:26and the
0:06:27ubm and the speaker models form a pair
0:06:30in the test framework
0:06:32each pair is used
0:06:35the ubm and the speaker
0:06:37gmm to
0:06:39obtain
0:06:39the
0:06:40log likelihood
0:06:41ratio score
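Scoring a test utterance against every UBM and speaker-model pair can be sketched as below. This assumes diagonal-covariance GMMs stored as lists of `(weight, mean, variance)` tuples and speaker models that have already been MAP-adapted from their UBM; the data layout and function names are illustrative, not taken from the paper.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def avg_llr(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio of speaker model against its UBM."""
    total = sum(gmm_loglik(x, speaker_gmm) - gmm_loglik(x, ubm) for x in frames)
    return total / len(frames)


def score_vector(frames, pairs):
    """One log-likelihood ratio per (speaker_gmm, ubm) pair."""
    return [avg_llr(frames, spk, ubm) for spk, ubm in pairs]
```

The result is one score per pair, so a system with several UBMs yields a score vector rather than a single number for each trial.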
0:06:46that
0:06:46well that's what kinda stuff
0:06:48results
0:06:49baseline baseline performance
0:06:51and that
0:06:52uh
0:06:53with gender independent gmm ubm system
0:06:58uh
0:06:59the eer for the four test conditions are about ten percent
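For reference, the equal error rate quoted here is the operating point where the false acceptance and false rejection rates coincide. A minimal sketch of how it could be estimated from trial scores follows; the threshold sweep over observed scores is one simple choice among several, and the function name is illustrative.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where false acceptance and false rejection
    rates are closest, as the mean of the two rates.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer
```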
0:07:07and then
0:07:08wait while the
0:07:09and by the data
0:07:10you show and the
0:07:12with gender dependent
0:07:14ubm
0:07:19the results
0:07:20if
0:07:21if the gender all pool ubm an agenda of confusion unmatched
0:07:26then
0:07:27the performance
0:07:28i'm not be improved
0:07:30but
0:07:30if the cross gender confusion
0:07:33just contact dave
0:07:34and the days
0:07:35we can
0:07:36how were bad
0:07:37without
0:07:39now this is the vtl
0:07:42dependent ubm
0:07:44performance
0:07:44in this table we can see that
0:07:46for female condition
0:07:48ubm two gave
0:07:49the best result
0:07:52and for male condition ubm six
0:07:54gave the best result
0:08:02not have a good some performance
0:08:04comparing the ubm two results for female conditions
0:08:08and the ubm six results
0:08:10for male conditions with the baseline
0:08:12we can find that a
0:08:13ubm trained
0:08:15with a selected subset of the training data
0:08:18can obtain better performance
0:08:21than the ubm with
0:08:22all the training data
0:08:26now let's return to this figure again
0:08:28uh
0:08:30we can get or
0:08:31a lot of space
0:08:33but there is wise enough to have his if you will
0:08:40for a test utterance
0:08:41which we are hampered you also but
0:08:43okay
0:08:44racial or just connect the ends
0:08:47the ubms can obtain a score vector
0:08:50we can use
0:08:51a fusion method
0:08:52to obtain the final results
0:08:54when we talk about you and and the contributions of it all in singapore
0:08:59vocal
0:08:59and that's really
0:09:01uh powerful tools
0:09:02back to
0:09:04but here we just want to
0:09:06try some simple
0:09:08and the
0:09:09a simple
0:09:10simple
0:09:11uh
0:09:12methods
0:09:13first
0:09:15have a mismatch right
0:09:17uh we just
0:09:18you have a very
0:09:19but uh
0:09:20you look at this thing right
0:09:22the results
0:09:23it's not very good
0:09:27and then the
0:09:29maximum
0:09:30likelihood ratio method
0:09:32and we use
0:09:33the ubm
0:09:34which
0:09:35do not report is the max
0:09:37as the final score
0:09:39but
0:09:40the
0:09:40yeah we can discuss this also
0:09:45and then there is the minimum likelihood ratio method
0:09:48and it gives the best results
0:09:51model based remastered
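The simple fusion rules compared on these slides reduce a score vector to one final score. The sketch below covers the maximum and minimum rules mentioned in the talk; the mean rule is included as an assumption about what the first simple method was, and the function name is illustrative.

```python
def fuse_scores(scores, rule="min"):
    """Collapse a vector of likelihood-ratio scores into one final score.

    rule: "min" (reported in the talk as giving the best results),
          "max", or "mean" (the mean rule is an assumption here).
    """
    if not scores:
        raise ValueError("score vector is empty")
    if rule == "min":
        return min(scores)
    if rule == "max":
        return max(scores)
    if rule == "mean":
        return sum(scores) / len(scores)
    raise ValueError("unknown fusion rule: " + rule)
```

Under the minimum rule a trial is accepted only if every UBM and speaker-model pair scores it above the threshold, which is one way to read why it behaves conservatively toward impostors.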
0:10:01and then the question arises why the minimum
0:10:04likelihood ratio method gave the best result
0:10:07unfortunately
0:10:09we don't
0:10:09know the exact reason
0:10:11uh
0:10:13in
0:10:13intuitively that
0:10:14peak
0:10:15we we we we try to do uh
0:10:18ah
0:10:18combination
0:10:19in
0:10:20jokingly that speaker
0:10:22yeah and i'm actually
0:10:23for the and the the you yeah you would wear both
0:10:26increase if what match
0:10:28has utterance is in court
0:10:30and this is just which transmits meetings bother with no the reason
0:10:34uh
0:10:35we can make it
0:10:37uh
0:10:38the
0:10:39means and the standard
0:10:41every iteration of that
0:10:42that
0:10:43oh
0:10:43that we would rituals
0:10:45oh
0:10:45well as i tucson has six
0:10:47which each
0:10:48yeah
0:10:49and we plot a figure
0:10:51just like this
0:10:53uh i have to say that
0:10:55this figure is not
0:10:56the reason of this
0:10:58send
0:10:59of this intense
0:11:00uh
0:11:02we just want to know why
0:11:09now
0:11:09we come to the conclusions
0:11:11in this paper we first investigate
0:11:14the vtl as the criterion
0:11:17for ubm training data selection
0:11:20experiments
0:11:21show
0:11:22that the ubm trained with
0:11:24selected data is better than the ubm trained with all the data
0:11:29based on this finding
0:11:30we further propose a multiple background model system
0:11:34which
0:11:35utilizes multiple speaker gmm and ubm pairs
0:11:38for speaker recognition
0:11:41uh through minimum
0:11:43now we we shall feel with the proposed master
0:11:46and improve the performance
0:11:48i'm used to be
0:11:50but yeah
0:11:51you're right
0:11:51open questions
0:11:53why the minimum likelihood ratio method gave the best results
0:11:56it's just locally experience
0:11:58uh we will be posting the slow but
0:12:02the property is under investigation
0:12:04well techniques to improve the state of
0:12:08uh standard of that
0:12:09this term
0:12:10uh for example if you
0:12:12the yeah the system and the
0:12:15yeah system
0:12:16you are you yeah
0:12:17the performance
0:12:18yeah
0:12:19cool
0:12:20uh we know and the way people
0:12:22to the expression
0:12:23experiment
0:12:24how about the
0:12:26computational cost and this is another
0:12:29issue
0:12:31finally
0:12:32i just talking about
0:12:40you have plenty of time for questions
0:12:45there were so
0:12:59hmmm
0:13:00oh
0:13:01i'm sorry it wasn't clear to me exactly how you are choosing the
0:13:05you have multiple ubms and were selecting what
0:13:08went in each ubm
0:13:10you mean the
0:13:11two semester
0:13:13you have multiple ubms which are built on different datasets to uh address
0:13:18how do you decide what
0:13:20went
0:13:20in each one
0:13:22uh
0:13:23yeah
0:13:26it's cigarettes
0:13:27the warping factor is different
0:13:30and the
0:13:31we use the
0:13:32if the
0:13:34hmmm
0:13:35for example that if the
0:13:37uh warping factor is that the airport mine then
0:13:41this is the
0:13:42with some back this paper
0:13:44to train the ubm
0:13:46when there is a website
0:13:47well it does that and this and that it is
0:13:50yeah
0:13:51what during the enrolment and the test the
0:13:53and the
0:13:54the uh
0:13:55we can have
0:13:56the train data and test data are not selected
0:14:00they are just scored against
0:14:02all the
0:14:02uh
0:14:03ubm and the speaker gmm pairs
0:14:11so i'm just going to be a a synthesis using the all the user density and the
0:14:16just to use
0:14:17they were easy to combine this discourse is used
0:14:22two
0:14:22to school too
0:14:23if i noticed like usual as well
0:14:26you you
0:14:29remote questions
0:14:39uh
0:14:40hmmm
0:14:41uh you know where you use
0:14:43yeah
0:14:44did you
0:14:45for speaker
0:14:46no
0:14:47i mean you mean how many of them is
0:14:50maybe if you assume that you are using
0:14:54you you
0:14:55you did
0:14:56you
0:14:58yeah
0:15:00you know
0:15:01yes
0:15:02i know that's
0:15:03question
0:15:03yeah in fact there are really many mass transit to
0:15:07uh so get data and the
0:15:09i think of like yeah it's just a way of them
0:15:12which has
0:15:13just well then
0:15:14and they are
0:15:15ah
0:15:16true
0:15:16yeah me either
0:15:18right
0:15:19it would be
0:15:28and that's where one last question you
0:15:32yeah
0:15:32hello
0:15:33i just one question um
0:15:36you just use the vocal tract length normalisation to select
0:15:40the population for building the different background models
0:15:43and you can also use it
0:15:46to select
0:15:48appropriate cohorts for normalisation
0:15:50looking for speakers
0:15:52which uh
0:15:53more close to the
0:15:55actually the speaker
0:15:56how do you own it is pretty maybe comparing
0:15:58using different background models or different population for
0:16:02normalisation with
0:16:03we propose from it
0:16:07i can't
0:16:08i think
0:16:09it's a
0:16:09i'm i'm just
0:16:10the asking these uh
0:16:12it's do you uh do you have
0:16:14this is also the vocal tract length normalisation
0:16:17to select
0:16:18a cohort
0:16:19set
0:16:20of the speaker
0:16:21for normalisation of the score
0:16:24yeah
0:16:25you mean
0:16:27i don't know
0:16:28uh
0:16:29i don't know what you
0:16:30good
0:16:31right
0:16:31the same
0:16:32that would be
0:16:33probably
0:16:35okay yeah she's didn't uh uh we don't want to just the framing of will or will not too close
0:16:41and uh if you
0:16:42yeah
0:16:42very nice
0:16:43did you do that
0:16:44for for you
0:16:45would you only
0:16:46yeah so you remotes and we move the twos
0:16:49that's because figure