0:00:06 | uh so hi everyone um [inaudible] |
---|
0:00:10 | i'm from the university of uh |
---|
0:00:12 | [inaudible] |
---|
0:00:13 | and i'd like to talk about an approach |
---|
0:00:14 | to speaker clustering |
---|
0:00:16 | uh |
---|
0:00:17 | which is |
---|
0:00:17 | based mainly on mean shift |
---|
0:00:19 | so |
---|
0:00:20 | um |
---|
0:00:21 | say uh |
---|
0:00:23 | with some concepts from information geometry |
---|
0:00:25 | uh so |
---|
0:00:26 | that's how we try to deal with this uh transition somehow |
---|
0:00:30 | so um the outline |
---|
0:00:34 | i |
---|
0:00:38 | so |
---|
0:00:44 | we should make |
---|
0:00:45 | good |
---|
0:00:47 | so uh i'm gonna talk a little bit about non-parametric density estimation |
---|
0:00:50 | and then i'm going to present, based on this, the baseline mean shift |
---|
0:00:54 | and then explain why it requires adaptation |
---|
0:00:57 | uh in order to be compatible with our problem |
---|
0:01:00 | uh and i'm gonna show that in a bayesian setting |
---|
0:01:03 | i'm gonna show that it |
---|
0:01:04 | is nothing more than, say, a way to seek |
---|
0:01:06 | the modes of the posterior |
---|
0:01:08 | uh this includes |
---|
0:01:10 | how to define |
---|
0:01:11 | the divergences |
---|
0:01:12 | uh the proposed kernel |
---|
0:01:14 | and then i'll talk a little bit about exponential families in order to show that |
---|
0:01:18 | at least for this family of distributions there's |
---|
0:01:22 | i think no |
---|
0:01:23 | no heuristics involved |
---|
0:01:25 | okay |
---|
0:01:26 | so um |
---|
0:01:29 | so we have this |
---|
0:01:30 | well, mean shift basically is a non-parametric approach to clustering |
---|
0:01:34 | uh the number of clusters is not required to be known a priori |
---|
0:01:37 | which means that it fits well to the problem |
---|
0:01:40 | okay |
---|
0:01:40 | it's useful in most cases |
---|
0:01:42 | it should be considered rather as an alternative to hierarchical clustering |
---|
0:01:46 | all these approaches |
---|
0:01:47 | the main applications are |
---|
0:01:49 | yeah |
---|
0:01:49 | this |
---|
0:01:50 | basically image segmentation |
---|
0:01:51 | stuff like that |
---|
0:01:53 | but also object tracking |
---|
0:01:54 | recently |
---|
0:01:56 | and my my reference is |
---|
0:01:58 | the seminal paper of uh comaniciu and meer |
---|
0:02:02 | uh i recently checked |
---|
0:02:04 | something like |
---|
0:02:05 | the citations are something like |
---|
0:02:06 | three thousand |
---|
0:02:07 | so it's |
---|
0:02:08 | a seminal paper indeed |
---|
0:02:10 | uh some examples from the the paper: you have an image you want to segment |
---|
0:02:15 | based on the colours |
---|
0:02:16 | okay and you have this |
---|
0:02:17 | eh |
---|
0:02:18 | and that one parameter there, something, never mind |
---|
0:02:21 | uh |
---|
0:02:22 | your target is to define these clusters; you see that they are |
---|
0:02:26 | high density areas |
---|
0:02:27 | so |
---|
0:02:28 | that's the reason why we go non-parametric |
---|
0:02:31 | if you wanted |
---|
0:02:31 | to try to fit a parametric model |
---|
0:02:35 | it would be a disaster |
---|
0:02:36 | see this |
---|
0:02:37 | also |
---|
0:02:38 | something like this; you see the modes |
---|
0:02:40 | there are several modes |
---|
0:02:42 | if you wanted a gaussian distribution to describe that |
---|
0:02:46 | it would not work, whatever you tried to do |
---|
0:02:50 | so another example of decomposition: you have the original image here |
---|
0:02:54 | and |
---|
0:02:55 | uh this is varying the colour uh |
---|
0:02:59 | bandwidth, as i will explain, and this also is |
---|
0:03:02 | the spatial one |
---|
0:03:04 | how smooth you wanna be and |
---|
0:03:06 | in this way you |
---|
0:03:07 | have different levels of smoothing |
---|
0:03:09 | okay this is another example uh again |
---|
0:03:12 | from the paper |
---|
0:03:14 | very similar approach |
---|
0:03:15 | you want to extract the boundaries |
---|
0:03:17 | so what you do is a colour segmentation first and then you extract |
---|
0:03:21 | the boundaries; it's about the same here |
---|
0:03:23 | so |
---|
0:03:25 | the limitations now |
---|
0:03:26 | in order to see uh why it doesn't |
---|
0:03:29 | apply |
---|
0:03:30 | directly to our problem |
---|
0:03:32 | well, it acts on the space of observations |
---|
0:03:34 | the feature vectors |
---|
0:03:35 | with no parameterisation |
---|
0:03:37 | okay, whereas uh uh there are several uh clustering tasks |
---|
0:03:40 | like |
---|
0:03:41 | the one we have here |
---|
0:03:43 | where |
---|
0:03:43 | the natural entities |
---|
0:03:45 | that is, speech |
---|
0:03:45 | we have can only be described using |
---|
0:03:48 | parametric models |
---|
0:03:51 | so |
---|
0:03:52 | can we adapt it |
---|
0:03:53 | in order to be applicable to such problems |
---|
0:03:55 | here's an example of the problem, so |
---|
0:03:57 | so i have some photographs |
---|
0:04:00 | with different utterances |
---|
0:04:01 | and we want to cluster them |
---|
0:04:03 | unsupervised |
---|
0:04:04 | the same problem |
---|
0:04:06 | you want to have each described, say, by um a normal distribution |
---|
0:04:09 | each of the four |
---|
0:04:11 | and you want |
---|
0:04:12 | if you want to apply the mean shift algorithm in order to cluster them |
---|
0:04:17 | you would have to model the observation space with |
---|
0:04:20 | uh, it's not euclidean geometry; it's non-euclidean geometry, let's say |
---|
0:04:24 | but |
---|
0:04:25 | a space of distributions |
---|
0:04:26 | so |
---|
0:04:27 | the proposed method assumes an exponential family, uses |
---|
0:04:30 | a bayesian framework |
---|
0:04:32 | and adopts some concepts |
---|
0:04:33 | of information geometry |
---|
0:04:35 | and so on |
---|
0:04:39 | so |
---|
0:04:40 | the standard uh |
---|
0:04:41 | non-parametric density |
---|
0:04:42 | estimation |
---|
0:04:43 | you have some data, say X |
---|
0:04:44 | this X matrix |
---|
0:04:46 | and you use a parzen window in order |
---|
0:04:48 | to |
---|
0:04:48 | smooth |
---|
0:04:49 | the empirical distribution; you convolve it with a kernel |
---|
0:04:53 | let's say the kernel is rather smooth, let's say |
---|
0:04:57 | say that it's a gaussian |
---|
0:04:58 | and |
---|
0:04:59 | the only parameter here is |
---|
0:05:00 | the bandwidth |
---|
0:05:03 | h |
---|
0:05:04 | let's say, okay |
---|
0:05:06 | the parzen window, non-parametric, we covered that stuff before |
---|
0:05:09 | but |
---|
0:05:10 | what non-parametric basically means is that you let your parameters grow linearly with your data |
---|
0:05:15 | it doesn't mean that you don't have parameters, actually |
---|
0:05:17 | but your parameters are actually the data themselves |
---|
0:05:20 | and |
---|
0:05:20 | plus some more, say, stuff like that |
---|
0:05:23 | okay, and the bandwidths can be variable too |
---|
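the parzen-window estimate just described — an average of kernels centred on the observations, with the bandwidth as the only free parameter — can be sketched in a few lines. this is a generic 1-d illustration with a gaussian kernel; the function and variable names are illustrative, not from the talk:

```python
import numpy as np

def parzen_kde(x, data, h):
    """Parzen-window estimate at query points x for 1-D data:
    the average of Gaussian kernels of bandwidth h centred on the data."""
    x = np.atleast_1d(x).astype(float)[:, None]
    diffs = (x - data[None, :]) / h
    kernels = np.exp(-0.5 * diffs ** 2) / (h * np.sqrt(2.0 * np.pi))
    # "non-parametric": one kernel per observation, so the parameters
    # grow with the data -- the parameters are the data themselves
    return kernels.mean(axis=1)

data = np.array([-2.0, -1.9, -2.1, 1.9, 2.0, 2.1])
dens = parzen_kde(np.array([-2.0, 0.0, 2.0]), data, h=0.5)
```

the estimate is high near the two groups of observations and low in between, and it integrates to one, as a density should.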
0:05:26 | um the basic problems are mainly that you don't have enough data to estimate it |
---|
0:05:31 | properly |
---|
0:05:32 | okay, and as you have more and more uh |
---|
0:05:34 | uh |
---|
0:05:35 | dimensions |
---|
0:05:37 | um you require more and more data |
---|
0:05:39 | the curse of dimensionality and all that stuff |
---|
0:05:44 | but |
---|
0:05:44 | the point is |
---|
0:05:45 | do we actually need |
---|
0:05:46 | to to |
---|
0:05:47 | to estimate it robustly |
---|
0:05:49 | for each problem |
---|
0:05:51 | the density, that is, yeah |
---|
0:05:52 | and the answer is no; for example |
---|
0:05:54 | the clustering task |
---|
0:05:56 | we considered before |
---|
0:05:58 | what we require is |
---|
0:06:00 | a method |
---|
0:06:01 | to locate |
---|
0:06:02 | the modes |
---|
0:06:03 | of the density |
---|
0:06:04 | we don't need the density itself, only the modes |
---|
0:06:08 | and a method |
---|
0:06:09 | to assign each observation |
---|
0:06:11 | to the appropriate mode |
---|
0:06:12 | whatever this means |
---|
0:06:13 | so |
---|
0:06:14 | should we require to estimate |
---|
0:06:16 | robustly the density |
---|
0:06:17 | oh, not really |
---|
0:06:18 | if we can easily |
---|
0:06:20 | bypass |
---|
0:06:21 | this procedure |
---|
0:06:22 | and that's |
---|
0:06:23 | what mean shift does |
---|
0:06:24 | so recall the expression |
---|
0:06:27 | to find the modes |
---|
0:06:28 | simply differentiate |
---|
0:06:29 | with respect to x |
---|
0:06:30 | and set this to zero |
---|
0:06:33 | okay |
---|
0:06:34 | after some algebra, because this is a squared distance, you expand the square, so you have this, you know |
---|
0:06:39 | so |
---|
0:06:40 | we define, for it |
---|
0:06:41 | that |
---|
0:06:42 | the derivative of the kernel profile |
---|
0:06:44 | is this g |
---|
0:06:45 | and you have this simple form |
---|
0:06:47 | apart from the constants you have this and this; this is |
---|
0:06:50 | you can interpret it as |
---|
0:06:52 | uh say the estimated pdf using |
---|
0:06:55 | the derivative profile |
---|
0:06:57 | and the other is the mean shift vector |
---|
0:06:58 | the main the main uh result |
---|
0:07:00 | this |
---|
0:07:01 | what |
---|
0:07:02 | let's say it is, is that |
---|
0:07:03 | it's a weighted mean |
---|
0:07:05 | okay |
---|
0:07:07 | you have this |
---|
0:07:08 | uh say weighted average |
---|
0:07:10 | of all the points |
---|
0:07:12 | with respect to the derivative kernel |
---|
0:07:15 | and |
---|
0:07:16 | and and all you need to do is |
---|
0:07:18 | find where this vanishes |
---|
0:07:21 | if |
---|
0:07:21 | this is zero, it means that you are on a mode |
---|
0:07:25 | so you don't actually estimate the density |
---|
0:07:27 | you simply do this |
---|
0:07:28 | and the one implies the other |
---|
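the identity described here — the gradient of the kernel density estimate factors into a density term times the mean-shift vector, so the shift points towards higher density — can be checked numerically with a sketch like this (gaussian profile, 1-d; names are illustrative, not the talk's code):

```python
import numpy as np

def mean_shift_vector(x, data, h):
    """m(x): weighted mean of the data minus x, weights given by g,
    the (negated) derivative of the kernel profile; for a Gaussian
    profile, g is again a Gaussian."""
    g = np.exp(-0.5 * ((x - data) / h) ** 2)
    return (g * data).sum() / g.sum() - x

def kde(x, data, h):
    """Parzen estimate with the same Gaussian kernel."""
    return np.mean(np.exp(-0.5 * ((x - data) / h) ** 2)) / (h * np.sqrt(2.0 * np.pi))

data = np.array([0.0, 0.2, -0.1, 3.0])
x, h = 1.0, 0.8
m = mean_shift_vector(x, data, h)

# finite-difference gradient of the density estimate at x:
# it should point in the same direction as the mean-shift vector
eps = 1e-5
grad = (kde(x + eps, data, h) - kde(x - eps, data, h)) / (2 * eps)
```

at x = 1 most of the mass lies to the left, so both the shift and the gradient are negative: the shift climbs the density without ever evaluating it.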
0:07:30 | so the algorithm would look like this |
---|
0:07:32 | very intuitive |
---|
0:07:34 | this is the mean shift vector |
---|
0:07:36 | so for each of the observations |
---|
0:07:38 | and the order does not matter at all |
---|
0:07:40 | start, say with x_i; set it to, say, x zero |
---|
0:07:44 | calculate the mean shift vector |
---|
0:07:46 | like this |
---|
0:07:48 | okay |
---|
0:07:49 | and simply |
---|
0:07:50 | add it |
---|
0:07:50 | to that position |
---|
0:07:52 | until convergence; the convergence is proven, and this is very |
---|
0:07:56 | trivial; it's shown in the comaniciu and meer paper |
---|
0:07:58 | so that was it |
---|
0:07:59 | you do this iteration for all |
---|
0:08:01 | the observations, huh |
---|
0:08:03 | okay, and store for each observation here |
---|
0:08:06 | the convergence point |
---|
0:08:07 | so |
---|
0:08:09 | if |
---|
0:08:10 | two |
---|
0:08:11 | or more converge to the same mode |
---|
0:08:13 | they belong to the same cluster |
---|
0:08:15 | that's it; that was it |
---|
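putting the pieces together — iterate every observation to convergence, store the convergence points, and merge observations whose convergence points coincide — a minimal 1-d sketch might look like this (the tolerances and names are illustrative assumptions, not the talk's implementation):

```python
import numpy as np

def mean_shift_cluster(data, h, tol=1e-6, merge_tol=1e-2, max_iter=500):
    """Iterate every observation to its mode (the order is irrelevant),
    then merge observations whose convergence points coincide."""
    modes = []
    for x0 in data:
        x = x0
        for _ in range(max_iter):
            g = np.exp(-0.5 * ((x - data) / h) ** 2)
            x_new = (g * data).sum() / g.sum()      # x plus the mean-shift vector
            if abs(x_new - x) < tol:
                break
            x = x_new
        modes.append(x_new)

    labels, centers = -np.ones(len(data), dtype=int), []
    for i, m in enumerate(modes):
        for c, center in enumerate(centers):
            if abs(m - center) < merge_tol:         # same mode -> same cluster
                labels[i] = c
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels, centers

data = np.array([0.0, 0.1, -0.1, 5.0, 5.1, 4.9])
labels, centers = mean_shift_cluster(data, h=0.5)
```

note that the number of clusters is never specified: it falls out of how many distinct modes the iterations reach.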
0:08:16 | so |
---|
0:08:16 | if you want |
---|
0:08:17 | uh |
---|
0:08:18 | each dot here is an observation |
---|
0:08:20 | okay, the initial position |
---|
0:08:21 | you see the trajectory |
---|
0:08:22 | and |
---|
0:08:23 | uh |
---|
0:08:23 | those are |
---|
0:08:25 | for the red, the red class |
---|
0:08:28 | so that was that; that's um basically |
---|
0:08:30 | the main idea |
---|
0:08:31 | in the observation space |
---|
0:08:33 | okay |
---|
0:08:33 | and you can find clusters of arbitrary shapes |
---|
0:08:38 | see, we're clustering |
---|
0:08:41 | shapes that cannot be modelled with a gaussian |
---|
0:08:42 | and |
---|
0:08:42 | uh |
---|
0:08:44 | mean shift clusters |
---|
0:08:45 | each as one, i mean |
---|
0:08:46 | it's very very good |
---|
0:08:48 | so |
---|
0:08:50 | yeah |
---|
0:08:55 | so how how can we adapt this idea |
---|
0:08:57 | to be applicable to the space of distributions |
---|
0:09:00 | suppose the speaker clustering task: we have |
---|
0:09:02 | N utterances |
---|
0:09:04 | okay |
---|
0:09:05 | um |
---|
0:09:07 | or forget about the speaker clustering; say, more generally, we have N distributions |
---|
0:09:11 | parameterised by theta |
---|
0:09:13 | we should define a kernel |
---|
0:09:15 | that means a shape and a distance |
---|
0:09:16 | appropriate ones |
---|
0:09:18 | okay, and the pdf can be regarded |
---|
0:09:20 | as a posterior, if you like; in which sense |
---|
0:09:22 | in the sense that you consider |
---|
0:09:24 | the density of theta |
---|
0:09:26 | given |
---|
0:09:27 | your observations |
---|
0:09:28 | and this z, i would say |
---|
0:09:29 | what it determines |
---|
0:09:31 | uh, is simply the cluster indicators |
---|
0:09:33 | or |
---|
0:09:34 | your initial segmentation |
---|
0:09:35 | so if you have, consider the |
---|
0:09:37 | speaker clustering task in diarization or something more |
---|
0:09:39 | you apply first a |
---|
0:09:41 | segmentation; it might be uniform, might be |
---|
0:09:44 | speaker-based |
---|
0:09:45 | based on a speaker change detector |
---|
0:09:46 | and this is it |
---|
0:09:48 | okay |
---|
0:09:49 | so in this sense it is a posterior, if you like |
---|
0:09:52 | and here is an example |
---|
0:09:54 | suppose we have six |
---|
0:09:55 | utterances or segments |
---|
0:09:57 | okay |
---|
0:09:58 | and |
---|
0:09:59 | this |
---|
0:09:59 | yeah |
---|
0:10:00 | the kernels they define, and this is the overall posterior |
---|
0:10:02 | so if one applies the same idea and begins with this one here |
---|
0:10:06 | we will see that this one |
---|
0:10:08 | would be attracted |
---|
0:10:09 | only by itself |
---|
0:10:10 | so it would create |
---|
0:10:11 | a singleton, so |
---|
0:10:13 | okay, the same happens with this |
---|
0:10:16 | the other three |
---|
0:10:17 | uh, we're gonna have, they would be attracted together, we're gonna have |
---|
0:10:20 | a |
---|
0:10:20 | common cluster |
---|
0:10:22 | and the other, uh |
---|
0:10:23 | the the last one, again |
---|
0:10:25 | its own cluster |
---|
0:10:27 | so this converges again to the modes of the posterior, but it's |
---|
0:10:30 | exactly the same idea |
---|
0:10:33 | okay so it's |
---|
0:10:34 | like |
---|
0:10:35 | a higher level in the hierarchy, that's what |
---|
0:10:38 | that is |
---|
0:10:38 | uh, instead of the observations you have |
---|
0:10:42 | the parameters, but now |
---|
0:10:43 | you are in the space of the distributions |
---|
0:10:45 | and you have a posterior |
---|
0:10:47 | okay and uh |
---|
0:10:49 | the same way you express somehow the uncertainty, and smooth the results |
---|
0:10:53 | by using this kernel |
---|
0:10:54 | in the observation domain, which might be gaussian |
---|
0:10:57 | in the same way you have to |
---|
0:10:58 | you have to express your uncertainty |
---|
0:11:00 | about uh the estimation |
---|
0:11:02 | and uh, you see, like |
---|
0:11:05 | why why should we also consider |
---|
0:11:08 | the the the number |
---|
0:11:10 | the sample size of each class; suppose they had the same positions |
---|
0:11:14 | and suppose somehow |
---|
0:11:16 | uh, they were |
---|
0:11:17 | all, this corresponded to ten times |
---|
0:11:19 | the sample size |
---|
0:11:20 | as before |
---|
0:11:21 | well |
---|
0:11:22 | then probably, would all these classes |
---|
0:11:24 | be singletons |
---|
0:11:26 | i'm saying it would be a single cluster, because we expect |
---|
0:11:29 | that, as we have more data, right |
---|
0:11:32 | these three |
---|
0:11:33 | would be attracted |
---|
0:11:34 | more to each other, so we would have one cluster |
---|
0:11:38 | okay so |
---|
0:11:39 | there is certainly a dependence on the sample size; is it linear |
---|
0:11:42 | it's not |
---|
0:11:43 | i i |
---|
0:11:43 | i |
---|
0:11:44 | it's linear only |
---|
0:11:45 | if the models are correctly specified |
---|
0:11:48 | and in speaker diarization, and especially when using |
---|
0:11:51 | a single gaussian model, uh |
---|
0:11:52 | there's a severe model misspecification |
---|
0:11:56 | that |
---|
0:11:56 | you should consider, this |
---|
0:11:57 | actually |
---|
0:11:58 | and that's a problem |
---|
0:12:00 | um so |
---|
0:12:01 | let's define the kernel |
---|
0:12:03 | um |
---|
0:12:05 | let's see this derivation, this summation |
---|
0:12:07 | but uh we probably don't have time, right |
---|
0:12:09 | but |
---|
0:12:10 | consider this as a, parameterised by delta |
---|
0:12:13 | family of divergences, KL divergences, so |
---|
0:12:15 | uh, with delta a scale, and there's |
---|
0:12:18 | the other one |
---|
0:12:19 | where you have flipped the arguments |
---|
0:12:21 | yeah, that's the KL divergence |
---|
0:12:23 | okay |
---|
0:12:24 | uh, the one tends to the other as delta, at least |
---|
0:12:28 | and these are the two directed distances |
---|
0:12:30 | but you can also symmetrise these kinds of divergences by summing them |
---|
0:12:33 | or by taking their harmonic mean |
---|
0:12:35 | and |
---|
0:12:36 | we use this approach |
---|
0:12:37 | however, an interesting property is um |
---|
0:12:39 | uh, based on this, highly related |
---|
0:12:41 | okay, and recall that |
---|
0:12:43 | this holds no matter what the delta is |
---|
0:12:46 | which is uh |
---|
0:12:47 | this is the fisher information matrix, which is simply |
---|
0:12:50 | uh, the riemannian metric tensor |
---|
0:12:52 | if you, if you consider information geometry |
---|
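for concreteness, the two symmetrisations of the directed KL divergences mentioned here — summing them, or taking their harmonic mean — can be written down for univariate gaussians. this is an illustrative sketch, not the exact divergence family of the talk:

```python
import numpy as np

def kl_gauss(m0, s0, m1, s1):
    """KL( N(m0,s0^2) || N(m1,s1^2) ) for univariate Gaussians."""
    return np.log(s1 / s0) + (s0 ** 2 + (m0 - m1) ** 2) / (2 * s1 ** 2) - 0.5

def sym_kl_sum(p, q):
    """Symmetrise by summing the two directed divergences."""
    return kl_gauss(*p, *q) + kl_gauss(*q, *p)

def sym_kl_harmonic(p, q):
    """Symmetrise by the harmonic mean of the two directed
    divergences (assumes p != q, so both terms are nonzero)."""
    a, b = kl_gauss(*p, *q), kl_gauss(*q, *p)
    return 2 * a * b / (a + b)

p, q = (0.0, 1.0), (1.0, 2.0)
```

either symmetrisation gives a quantity that vanishes only for identical distributions and treats the two arguments equally, which is what the mean-shift kernel needs.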
0:12:57 | so, having defined the distances |
---|
0:12:59 | now, considering these divergences as distances, let's define the shape |
---|
0:13:02 | because |
---|
0:13:03 | we had a gaussian |
---|
0:13:04 | in the observation space |
---|
0:13:05 | it shouldn't be gaussian any longer |
---|
0:13:06 | well |
---|
0:13:07 | it's another parameterisation now, by alpha |
---|
0:13:10 | okay, if you if you consider only alphas that |
---|
0:13:13 | are equal to one |
---|
0:13:15 | then you have an exponential |
---|
0:13:16 | okay, and this should be considered |
---|
0:13:18 | as, uh, rather, a radon-nikodym |
---|
0:13:20 | derivative, the |
---|
0:13:22 | density with respect to the information element, and this information element is like a riemannian measure, if you want |
---|
0:13:28 | but anyway |
---|
0:13:29 | um, well |
---|
0:13:30 | so |
---|
0:13:30 | if you don't like |
---|
0:13:31 | uh that, i i i |
---|
0:13:33 | simply consider alpha equal to one |
---|
0:13:35 | and you have a kind of exponential form |
---|
0:13:38 | and, uh, alphas below one, as was explained, you recall, yesterday |
---|
0:13:41 | these are related |
---|
0:13:42 | uh, to |
---|
0:13:43 | the t-distribution |
---|
0:13:45 | if you want |
---|
0:13:45 | to consider |
---|
0:13:47 | uh, heavy-tailed |
---|
0:13:49 | priors |
---|
0:13:50 | okay, so there's a very nice interpretation of that |
---|
0:13:52 | it looks like so |
---|
0:13:54 | i'm not gonna |
---|
0:13:55 | analyse all this stuff |
---|
0:13:57 | but all this |
---|
0:13:58 | uh, we can see that, by minimising this cost function |
---|
0:14:01 | okay, here |
---|
0:14:03 | this term |
---|
0:14:04 | uh |
---|
0:14:05 | corresponds to how much you trust, uh, your measurement |
---|
0:14:09 | so |
---|
0:14:10 | it should be somehow linear, linear |
---|
0:14:12 | with the sample size |
---|
0:14:14 | whereas this term tells you |
---|
0:14:16 | how close |
---|
0:14:17 | you wanna be |
---|
0:14:18 | to a non-informative prior, the jeffreys prior |
---|
0:14:20 | basically |
---|
0:14:21 | which is simply the flat prior if you consider |
---|
0:14:24 | uh, the riemannian element |
---|
0:14:26 | i think that's it; so it's a minimisation of a cost function |
---|
0:14:29 | and here it is |
---|
0:14:29 | just, say, the average |
---|
0:14:31 | 'kay, and then |
---|
0:14:31 | yeah, that's |
---|
0:14:32 | uh, the divergence |
---|
0:14:34 | okay |
---|
0:14:35 | that's great, and |
---|
0:14:36 | all this |
---|
0:14:37 | uh, relates to the conjugate priors, all this stuff |
---|
0:14:40 | and we consider rho as um |
---|
0:14:43 | a parameter, a parameter of this additional family |
---|
0:14:46 | but it's, uh, fixed in the end |
---|
0:14:49 | so |
---|
0:14:50 | let's go back to the problem, having defined all this stuff |
---|
0:14:53 | so here's the posterior |
---|
0:14:55 | say we have capital-K segments |
---|
0:14:57 | okay |
---|
0:14:58 | this is simply the divergence, but, i don't know |
---|
0:15:00 | it can be either one |
---|
0:15:02 | and you have these |
---|
0:15:02 | suppose, uh |
---|
0:15:03 | alpha equal to one; let's consider only that stuff |
---|
0:15:06 | okay |
---|
0:15:06 | so this should be normalised, something like a weight |
---|
0:15:09 | and the n_k |
---|
0:15:10 | are simply the number of, say, frames of the k-th uh segment |
---|
0:15:15 | and then rho |
---|
0:15:16 | these lambdas are just, like, weights, so you may consider this density |
---|
0:15:19 | as a, as a mixture of uh distributions |
---|
0:15:22 | okay so the point is now |
---|
0:15:25 | recall that we differentiated with respect to x, and because |
---|
0:15:28 | of |
---|
0:15:29 | uh, using the squared distances |
---|
0:15:31 | we came up with this formula |
---|
0:15:32 | so the question is |
---|
0:15:34 | will we need heuristics, uh, when we differentiate this stuff |
---|
0:15:39 | or or or not |
---|
0:15:41 | uh, and and the point is that |
---|
0:15:43 | if you constrain yourself |
---|
0:15:45 | to the exponential family |
---|
0:15:47 | then there's no heuristic; for example, in the in the |
---|
0:15:52 | case of the normal distribution |
---|
0:15:54 | you have these uh, these natural parameters that correspond to this parameterisation |
---|
0:15:59 | the sufficient statistics, that |
---|
0:16:00 | interact naturally with the the |
---|
0:16:03 | the natural parameters |
---|
0:16:04 | and the base measure is constant |
---|
0:16:06 | and um, but i'm not gonna get into that because |
---|
0:16:09 | time is ticking away |
---|
0:16:11 | um |
---|
0:16:13 | so |
---|
0:16:15 | there is another, a dual |
---|
0:16:17 | uh |
---|
0:16:19 | uh, parameter space |
---|
0:16:20 | and that is the expectation parameters |
---|
0:16:22 | more close to what |
---|
0:16:24 | we usually |
---|
0:16:25 | uh, treat |
---|
0:16:27 | in this |
---|
0:16:27 | in in speaker clustering |
---|
0:16:29 | okay |
---|
0:16:30 | by exploiting |
---|
0:16:31 | the convexity |
---|
0:16:33 | of the log-partition function with respect |
---|
0:16:34 | to theta |
---|
0:16:35 | you end up, uh, with the expectation parameters |
---|
0:16:39 | okay, and |
---|
0:16:40 | by |
---|
0:16:41 | by re-differentiating this stuff |
---|
0:16:43 | with respect to theta |
---|
0:16:45 | you get |
---|
0:16:46 | the fisher information |
---|
0:16:47 | uh |
---|
0:16:48 | matrix, okay |
---|
0:16:50 | all this stuff, uh |
---|
0:16:51 | is well known from classical statistics |
---|
0:16:54 | and |
---|
0:16:55 | this is the potential; in fact, the two potential functions |
---|
0:16:58 | which show most clearly why there is |
---|
0:17:01 | ah |
---|
0:17:02 | this duality, this coupling between the expectation parameters |
---|
0:17:04 | and the natural parameters |
---|
0:17:06 | so if you want to, whatever you wanna do, whatever you want to, say, average |
---|
0:17:10 | the same way you average between points in the |
---|
0:17:13 | euclidean domain |
---|
0:17:15 | you may consider also |
---|
0:17:17 | the |
---|
0:17:17 | expectation parameterisation |
---|
0:17:19 | the expectation parameters |
---|
0:17:21 | or the natural parameters |
---|
0:17:22 | and it depends on you |
---|
0:17:23 | what you'd like to do |
---|
0:17:25 | okay, which geometry you want to use |
---|
0:17:27 | which you believe |
---|
0:17:29 | is more appropriate |
---|
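the duality just sketched can be made concrete for the univariate gaussian: write the log-partition in the natural parameters and check numerically that its gradient yields the expectation parameters E[x] and E[x²]. this is an illustrative check following standard exponential-family conventions, not necessarily the talk's notation:

```python
import numpy as np

def log_partition(t1, t2):
    """psi(theta) for N(mu, sigma^2) written in natural parameters
    theta = (mu/sigma^2, -1/(2 sigma^2)); valid for t2 < 0."""
    return -t1 ** 2 / (4 * t2) + 0.5 * np.log(-np.pi / t2)

mu, sigma = 1.5, 0.7
t1, t2 = mu / sigma ** 2, -1.0 / (2 * sigma ** 2)

# expectation parameters = gradient of the log-partition
# (central finite differences stand in for the analytic gradient)
eps = 1e-6
eta1 = (log_partition(t1 + eps, t2) - log_partition(t1 - eps, t2)) / (2 * eps)
eta2 = (log_partition(t1, t2 + eps) - log_partition(t1, t2 - eps)) / (2 * eps)
# eta1 should recover E[x] = mu, eta2 should recover E[x^2] = mu^2 + sigma^2
```

differentiating once more would give the covariance of the sufficient statistics, i.e. the fisher information matrix, exactly as stated above.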
0:17:31 | so |
---|
0:17:33 | the kullback-leibler: let's let's consider only the KL, i.e. the two extremes |
---|
0:17:38 | the one and zero |
---|
0:17:40 | delta equal to one or zero |
---|
0:17:41 | you have these |
---|
0:17:42 | that's, that's how these are defined |
---|
0:17:44 | but |
---|
0:17:45 | if you differentiate, you obtain something cleaner |
---|
0:17:47 | not only this |
---|
0:17:48 | but also this |
---|
0:17:49 | there's, there is one final issue i'd like to mention here, that's not mentioned in my paper |
---|
0:17:54 | you expect that |
---|
0:17:55 | by differentiating this, and |
---|
0:17:57 | uh, taking, like, the gradient |
---|
0:18:00 | you have a parameterisation with theta |
---|
0:18:03 | but |
---|
0:18:04 | since you don't use the natural gradient |
---|
0:18:07 | okay, that is, since you don't consider the curvature |
---|
0:18:10 | but this new space is curved |
---|
0:18:12 | you differentiate with respect to theta and then you switch parameterisation |
---|
0:18:17 | which is somehow weird |
---|
0:18:18 | but if you use the natural gradient |
---|
0:18:21 | which is simply defined as this |
---|
0:18:23 | with this tilde |
---|
0:18:25 | okay |
---|
0:18:26 | now if you differentiate with respect to theta |
---|
0:18:29 | you remain in the same parameterisation |
---|
0:18:31 | okay, there's no switching |
---|
0:18:33 | between the parameterisations |
---|
0:18:34 | so, you know, recall that in order to to make, uh |
---|
0:18:38 | to define a gradient rule |
---|
0:18:40 | you should |
---|
0:18:40 | respect the curvature of the space |
---|
0:18:42 | so you should |
---|
0:18:43 | work with the natural gradient |
---|
0:18:47 | so |
---|
0:18:48 | the final |
---|
0:18:50 | algorithm, what you apply, would look more or less like this pseudocode |
---|
0:18:54 | okay |
---|
0:18:54 | recalling the same again |
---|
0:18:56 | iteration |
---|
0:18:57 | start from, from each segment; the order doesn't matter |
---|
0:19:02 | uh, the order doesn't matter |
---|
0:19:04 | and your next estimate will be an average |
---|
0:19:07 | uh, for |
---|
0:19:09 | of all, of all the other segments |
---|
0:19:11 | in this parameterisation |
---|
0:19:12 | or in this parameterisation, the theta |
---|
0:19:15 | the natural one |
---|
0:19:16 | it depends on you |
---|
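a toy version of this iteration — averaging the other segments' parameters, weighted by a kernel on a symmetrised divergence — might look as follows. this is a hedged sketch under stated assumptions: each "segment" is a univariate gaussian, the averaging is done in the expectation parameters, the divergence is the symmetrised KL, and the kernel shape and the `beta` scale are placeholders for the ones defined in the talk:

```python
import numpy as np

def kl(p, q):
    """KL( N(m0,s0^2) || N(m1,s1^2) ), univariate."""
    (m0, s0), (m1, s1) = p, q
    return np.log(s1 / s0) + (s0 ** 2 + (m0 - m1) ** 2) / (2 * s1 ** 2) - 0.5

def sym_kl(p, q):
    return kl(p, q) + kl(q, p)

def to_eta(p):
    """(mu, sigma) -> expectation parameters (E[x], E[x^2])."""
    m, s = p
    return np.array([m, m ** 2 + s ** 2])

def from_eta(eta):
    """expectation parameters -> (mu, sigma)."""
    return (eta[0], np.sqrt(eta[1] - eta[0] ** 2))

def shift_once(p, segments, beta=1.0):
    """One mean-shift step in distribution space: kernel weights from
    the symmetrised KL, averaging done in the expectation parameters.
    beta (an inverse bandwidth) is an illustrative choice."""
    w = np.array([np.exp(-beta * sym_kl(p, q)) for q in segments])
    eta = sum(wi * to_eta(q) for wi, q in zip(w, segments)) / w.sum()
    return from_eta(eta)

# six "segments": two groups of three similar Gaussians
segments = [(0.0, 1.0), (0.1, 1.1), (-0.1, 0.9),
            (8.0, 1.0), (8.1, 1.1), (7.9, 0.9)]
p, q = segments[0], segments[3]
for _ in range(50):                  # iterate two starting segments
    p = shift_once(p, segments)
    q = shift_once(q, segments)
```

segments started in the first group converge near its common mode and those started in the second group near the other, reproducing the "two attracted groups" behaviour described for the six-segment example.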
0:19:20 | so |
---|
0:19:21 | suppose we have some segments |
---|
0:19:23 | short segments |
---|
0:19:25 | and the method finds sixteen clusters |
---|
0:19:27 | each blue uh dot |
---|
0:19:29 | represents |
---|
0:19:30 | actually the mean, the mean value |
---|
0:19:34 | okay, the mean value of these, say, of the first two mfcc |
---|
0:19:39 | dimensions |
---|
0:19:39 | okay, and see how the method |
---|
0:19:41 | uh, finds sixteen clusters; you also find |
---|
0:19:43 | some singleton clusters |
---|
0:19:45 | that are simply attracted, right |
---|
0:19:47 | by themselves, so |
---|
0:19:51 | so |
---|
0:19:52 | something final before going to the experiments |
---|
0:19:56 | um, uh |
---|
0:19:58 | all this i said before about the uh model misspecification, you know |
---|
0:20:02 | uh, you should somehow bias |
---|
0:20:04 | the results |
---|
0:20:05 | um, towards temporal continuity, because you know |
---|
0:20:09 | that a a dialogue, uh, doesn't have |
---|
0:20:12 | say, utterances |
---|
0:20:13 | of, say, three or four uh |
---|
0:20:16 | segments |
---|
0:20:17 | so this can be done by, uh, by applying a a |
---|
0:20:21 | dirichlet prior |
---|
0:20:22 | over a transition matrix |
---|
0:20:24 | to enforce somehow continuity |
---|
0:20:26 | so |
---|
0:20:27 | what we do |
---|
0:20:28 | we multiply |
---|
0:20:30 | okay |
---|
0:20:30 | by this uh distribution |
---|
0:20:34 | okay, so if you are |
---|
0:20:36 | standing on the k-th |
---|
0:20:38 | segment |
---|
0:20:39 | okay |
---|
0:20:39 | you want to emphasise your neighbours |
---|
0:20:42 | but not in a way that goes on and on |
---|
0:20:45 | okay, not in such a way, because if you do it that way |
---|
0:20:49 | the first and the last segment would, what, would end up in the same uh cluster |
---|
0:20:55 | never |
---|
0:20:55 | you want it more mild than that |
---|
0:20:58 | yeah, i guess |
---|
0:20:59 | i don't know, something weak |
---|
0:21:01 | so |
---|
0:21:02 | here's the database |
---|
0:21:05 | okay |
---|
0:21:07 | sure |
---|
0:21:07 | oh well |
---|
0:21:08 | you know, it's a standard database, broadcast |
---|
0:21:10 | news |
---|
0:21:12 | okay, and we compare it to the standard uh BIC hierarchical approach |
---|
0:21:16 | you won't find any, uh, super duper wow results |
---|
0:21:20 | uh |
---|
0:21:21 | the best |
---|
0:21:22 | uh configuration we found, look, was this |
---|
0:21:25 | we fixed, we fixed the uh lambda parameter using the development set, and then we have this |
---|
0:21:29 | result on the test set |
---|
0:21:31 | and the mean shift was better than BIC |
---|
0:21:33 | exactly |
---|
0:21:34 | uh, using only single gaussians with full covariance matrix |
---|
0:21:37 | no gmms at all |
---|
0:21:38 | and we see that the harmonic mean was the best |
---|
0:21:42 | uh |
---|
0:21:42 | and the other one |
---|
0:21:43 | rather close, but not quite |
---|
0:21:46 | if you used |
---|
0:21:47 | uh, hierarchical clustering |
---|
0:21:48 | with the same gaussians, that is |
---|
0:21:50 | it would be a tragedy |
---|
0:21:53 | this is because of the BIC |
---|
0:21:54 | criterion |
---|
0:21:55 | but you cannot use BIC, you know |
---|
0:21:58 | and that's a problem |
---|
0:21:59 | when you use this stuff |
---|
0:22:00 | so |
---|
0:22:02 | to finish |
---|
0:22:03 | we proposed |
---|
0:22:05 | an adaptation of an algorithm that acts on the observation space, to the space of distributions |
---|
0:22:10 | okay, we used |
---|
0:22:12 | some reasonable, i think, uh bayesian arguments in order to |
---|
0:22:16 | to make this transition |
---|
0:22:19 | okay, and i showed that, at least for the exponential families |
---|
0:22:22 | no heuristics are required, and stuff |
---|
0:22:24 | so |
---|
0:22:25 | uh, well, this is relevant, to be honest, um |
---|
0:22:27 | when you want to obtain a point estimate of your hidden variables |
---|
0:22:31 | for stuff like document clustering, or |
---|
0:22:34 | or if you wanna do diarization, uh |
---|
0:22:36 | you |
---|
0:22:37 | consider, uh, real-time apps |
---|
0:22:39 | but if you wanna do inference, you don't, you |
---|
0:22:41 | uh |
---|
0:22:42 | you don't use this clustering either |
---|
0:22:44 | okay, you use the dirichlet process |
---|
0:22:46 | uh |
---|
0:22:47 | you use um M C M C |
---|
0:22:50 | you use variational bayes |
---|
0:22:51 | okay |
---|
0:22:52 | so |
---|
0:22:53 | one final thing |
---|
0:22:54 | gmms |
---|
0:22:55 | certainly you can, if you consider the complete data, uh, if you consider |
---|
0:22:59 | uh, like, the complete data likelihood |
---|
0:23:02 | then the gmms belong to the exponential |
---|
0:23:03 | family |
---|
0:23:04 | yeah |
---|
0:23:05 | but you you need to have a correspondence, or you need to start |
---|
0:23:08 | with the ubm, with a common ubm |
---|
0:23:10 | why? because |
---|
0:23:11 | if you consider the the complete data likelihood |
---|
0:23:14 | you only |
---|
0:23:15 | consider the |
---|
0:23:16 | uh, okay, the divergences between |
---|
0:23:19 | corresponding |
---|
0:23:20 | uh, gaussians |
---|
0:23:22 | plus |
---|
0:23:23 | a term |
---|
0:23:24 | for the different weights; if you |
---|
0:23:26 | train, if you also allow training |
---|
0:23:29 | the the weights to be, to be |
---|
0:23:31 | uh, adapted also |
---|
0:23:32 | i'm sure you can use it for |
---|
0:23:34 | that |
---|
0:23:35 | even the original method |
---|
0:23:36 | however |
---|
0:23:37 | if you use |
---|
0:23:38 | i-vectors |
---|
0:23:40 | note that you have |
---|
0:23:41 | um, different segments |
---|
0:23:44 | so |
---|
0:23:45 | the variability of the estimate |
---|
0:23:46 | should also be part of the problem |
---|
0:23:49 | so somehow |
---|
0:23:50 | um |
---|
0:23:52 | play with the bandwidth; allow the bandwidth to |
---|
0:23:54 | depend on the sample size |
---|
0:23:56 | okay, to to to encode this |
---|
0:23:58 | uh, uncertainty |
---|
0:24:00 | in the, in this estimate |
---|
0:24:03 | thank you |
---|
0:24:10 | thanks, and sorry for |
---|
0:24:11 | rushing; we wanted to keep some |
---|
0:24:13 | time |
---|
0:24:14 | for some questions |
---|
0:24:15 | uh |
---|
0:24:16 | who has a comment |
---|
0:24:24 | [inaudible] |
---|
0:24:25 | okay |
---|
0:24:26 | yeah |
---|
0:24:26 | a quick |
---|
0:24:27 | just a simple question |
---|
0:24:29 | it's a |
---|
0:24:30 | it's a quick one |
---|
0:24:30 | uh |
---|
0:24:31 | you are optimising |
---|
0:24:33 | uh |
---|
0:24:33 | your cost function |
---|
0:24:35 | yeah |
---|
0:24:35 | so |
---|
0:24:36 | you're |
---|
0:24:37 | doing gradient descent |
---|
0:24:38 | so |
---|
0:24:39 | you can |
---|
0:24:40 | you have to compute the gradient |
---|
0:24:42 | but |
---|
0:24:43 | do i understand correctly that you're not able to compute the |
---|
0:24:47 | value |
---|
0:24:48 | of it during the optimisation |
---|
0:24:49 | oh |
---|
0:24:50 | uh |
---|
0:24:51 | i don't evaluate it; actually you can, this is |
---|
0:24:55 | it has an analytical form |
---|
0:24:56 | but it's not required |
---|
0:24:58 | it's based on the fact that |
---|
0:24:59 | you know it's uh |
---|
0:25:01 | it's a simple estimate |
---|
0:25:03 | what what what |
---|
0:25:04 | the idea behind mean shift, as i was saying, is |
---|
0:25:06 | that you don't actually need |
---|
0:25:07 | to estimate the overall pdf; you can bypass the problem by using the gradient |
---|
0:25:12 | that's the idea |
---|
0:25:13 | it's |
---|
0:25:14 | just a |
---|
0:25:15 | practical question, because i favour optimisations |
---|
0:25:18 | yeah |
---|
0:25:20 | with a clear objective function |
---|
0:25:22 | okay |
---|
0:25:23 | fair enough |
---|
0:25:24 | okay |
---|
0:25:25 | so |
---|
0:25:26 | that's fine |
---|
0:25:31 | well, any other comment |
---|
0:25:41 | a general question about the mean shift |
---|
0:25:44 | um |
---|
0:25:45 | if you start off |
---|
0:25:46 | start out by hypothesising ten clusters |
---|
0:25:50 | you will always get ten clusters, is that correct |
---|
0:25:53 | no, you don't, no; a smaller number of clusters can come out |
---|
0:25:57 | some |
---|
0:25:58 | i was just thinking, properly |
---|
0:26:00 | the number of clusters is completely inferred from the, from the algorithm |
---|
0:26:04 | you don't, you don't uh |
---|
0:26:07 | fix a predefined number of clusters |
---|
0:26:10 | it depends only on whether the points converge |
---|
0:26:13 | okay, to the, to the same value |
---|
0:26:15 | so |
---|
0:26:16 | you don't need to |
---|
0:26:18 | pick |
---|
0:26:18 | yeah |
---|
0:26:19 | you said that, um |
---|
0:26:22 | not having, not being able to incorporate |
---|
0:26:24 | BIC, of course |
---|
0:26:25 | first of all, you cannot have BIC, because |
---|
0:26:27 | BIC, BIC implies |
---|
0:26:29 | a a marginalisation with respect to the parameters |
---|
0:26:33 | okay, but isn't it that the number of clusters drops out |
---|
0:26:37 | naturally, hmmm, the |
---|
0:26:39 | clusters |
---|
0:26:40 | i mean |
---|
0:26:41 | not sure |
---|
0:26:43 | you know, you can |
---|
0:26:45 | find the correct number of clusters |
---|
0:26:47 | with that, without using uh BIC |
---|
0:26:49 | sorry |
---|
0:26:49 | okay |
---|
0:26:49 | exactly |
---|
0:26:51 | sure, sure |
---|
0:26:52 | i'm not using it; but, regarding the comparison |
---|
0:26:54 | it probably finds the correct number of clusters |
---|
0:26:59 | okay |
---|