0:00:15 | yeah |
---|
0:00:16 | thank you |
---|
0:00:17 | um |
---|
0:00:17 | and at least there is some audience left for the last talk of today |
---|
0:00:21 | and a |
---|
0:00:23 | the |
---|
0:00:24 | it is uh a bit different to the talks before |
---|
0:00:28 | um for the outline of my talk uh |
---|
0:00:33 | i to the book were that's uh uh i was taught to |
---|
0:00:36 | give a very short introduction to what blind source separation |
---|
0:00:39 | is |
---|
0:00:40 | what the problem is in the convolutive case namely the permutation ambiguity and how i'm solving it using |
---|
0:00:47 | a sparsity-based criterion |
---|
0:00:49 | so um |
---|
0:00:51 | and the usual case for blind source separation is when you have a cocktail party problem |
---|
0:00:56 | um we have some sources |
---|
0:00:59 | uh at this point i say we have |
---|
0:01:01 | speech sources two people talking |
---|
0:01:04 | and we would like to get |
---|
0:01:06 | the single components of that |
---|
0:01:08 | uh but what you get are some recordings which are just mixtures |
---|
0:01:13 | of these single components |
---|
0:01:15 | and um |
---|
0:01:18 | in this case |
---|
0:01:19 | what i'm |
---|
0:01:20 | looking at here uh we have the |
---|
0:01:22 | bigger problem of the |
---|
0:01:24 | uh mixture being a convolutive one |
---|
0:01:27 | as we have delays of the speech we have reflections and so on |
---|
0:01:30 | so uh |
---|
0:01:32 | the problem becomes more complicated |
---|
0:01:34 | and the mathematical formulation for this um we have |
---|
0:01:38 | some sources |
---|
0:01:39 | some mixing |
---|
0:01:40 | matrix |
---|
0:01:41 | at least for the instantaneous case |
---|
0:01:44 | it gives the measurements and what we want to do is to |
---|
0:01:48 | estimate a matrix uh a separating matrix so we get again the |
---|
0:01:52 | uh original |
---|
0:01:53 | signals |
---|
0:01:54 | uh for this we have methods like ICA |
---|
0:01:57 | so nothing new at this point |
---|
0:01:59 | uh what we have to um |
---|
0:02:01 | take into account uh we never know the order of the sources and we never know which energy the sources |
---|
0:02:09 | have |
---|
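As a compact restatement of the instantaneous model just described: the symbols below (s for the sources, x for the measurements, A for the mixing matrix, W for the separating matrix, P for a permutation matrix, D for a diagonal scaling matrix) are notation assumed here, not taken from the slides.

```latex
x(n) = A\, s(n), \qquad y(n) = W\, x(n) = P\, D\, s(n)
```

Even with perfect separation, P (the order of the sources) and D (their energy) remain unknown.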
0:02:10 | um |
---|
0:02:11 | in my work i use |
---|
0:02:13 | an algorithm based on the natural gradient |
---|
0:02:16 | uh as i think a few of you know |
---|
0:02:18 | oh |
---|
0:02:19 | for speech signals we need |
---|
0:02:22 | uh |
---|
0:02:23 | as always we need some |
---|
0:02:25 | probability density |
---|
0:02:27 | functions for the speech which we are considering here |
---|
0:02:30 | we can safely assume uh a Laplacian distribution |
---|
0:02:35 | so |
---|
0:02:36 | as i said we have |
---|
0:02:39 | uh not the simple case we have the convolutive mixture we have |
---|
0:02:42 | in this case |
---|
0:02:44 | uh |
---|
0:02:45 | different delays we have reflections and so on |
---|
0:02:48 | so we model this |
---|
0:02:49 | using uh a convolution |
---|
0:02:52 | and uh for |
---|
0:02:54 | real situations we have quite long filters |
---|
0:02:57 | two thousand four thousand taps or whatever |
---|
0:03:00 | um |
---|
0:03:02 | estimating these filters directly in time domain |
---|
0:03:06 | is |
---|
0:03:06 | hard |
---|
0:03:07 | possible but very hard |
---|
0:03:09 | so the usual way is to go to the |
---|
0:03:12 | time-frequency domain using the short-time fourier transform |
---|
0:03:15 | and now what we have is |
---|
0:03:18 | just again uh a multiplication in each frequency bin |
---|
0:03:21 | so uh we can just use the |
---|
0:03:25 | uh instantaneous ICA algorithm in each frequency bin independently |
---|
0:03:29 | which is again |
---|
0:03:31 | uh |
---|
0:03:32 | not a problem |
---|
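The step from the time domain to the time-frequency domain can be written out as follows. This is a sketch in assumed notation (H(l) for the mixing filter taps, L for the filter length, k for the frequency bin index, tau for the frame index); the approximation holds when the analysis window is long compared with the filters, and the last equality is the ideal case of perfect separation in that bin.

```latex
x(n) = \sum_{l=0}^{L-1} H(l)\, s(n-l)
\quad\longrightarrow\quad
X(\omega_k, \tau) \approx H(\omega_k)\, S(\omega_k, \tau)

Y(\omega_k, \tau) = W(\omega_k)\, X(\omega_k, \tau) = P(\omega_k)\, D(\omega_k)\, S(\omega_k, \tau)
```

Because each bin is solved independently, the permutation P and the scaling D may now differ from bin to bin.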
0:03:33 | but |
---|
0:03:34 | now |
---|
0:03:35 | we have |
---|
0:03:36 | the problem of |
---|
0:03:37 | uh the different |
---|
0:03:39 | permutations and scalings uh |
---|
0:03:42 | in the previous example |
---|
0:03:44 | we didn't need to think about that in this case we have to correct |
---|
0:03:48 | um |
---|
0:03:49 | the scaling |
---|
0:03:50 | uh there are some standard ways to solve it |
---|
0:03:53 | uh |
---|
0:03:54 | the typical case is the minimal |
---|
0:03:57 | distortion principle |
---|
0:03:58 | uh in which we |
---|
0:04:00 | multiply the |
---|
0:04:01 | unmixing matrix by uh |
---|
0:04:03 | the diagonal of its inverse |
---|
0:04:06 | uh what |
---|
0:04:07 | this |
---|
0:04:08 | does is that we |
---|
0:04:10 | accept the filtering and scaling done by the mixing system |
---|
0:04:14 | we do not know what it was |
---|
0:04:15 | but at least we do not |
---|
0:04:16 | add new distortion |
---|
0:04:17 | at this point |
---|
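The scaling correction mentioned above can be sketched for a single frequency bin. This is a minimal illustration of the usual formulation of the minimal distortion principle, W' = diag(W^-1) W, for the two-source case; the function name `minimal_distortion` and the example matrices are hypothetical, and the code is not taken from the talk.

```python
def minimal_distortion(W):
    """Rescale a 2x2 (possibly complex) unmixing matrix W to diag(inv(W)) @ W."""
    (a, b), (c, d) = W
    det = a * d - b * c
    # The inverse of a 2x2 matrix is (1/det) * [[d, -b], [-c, a]];
    # only its two diagonal entries are needed here.
    inv_diag = (d / det, a / det)
    return [[inv_diag[0] * a, inv_diag[0] * b],
            [inv_diag[1] * c, inv_diag[1] * d]]


# An arbitrary unmixing matrix and a copy whose rows carry a different,
# arbitrary complex scaling (the ambiguity left over by ICA).
W = [[1.0, 2.0], [3.0, 4.0]]
W_scaled = [[2.0 * v for v in W[0]], [0.5j * v for v in W[1]]]

fixed = minimal_distortion(W)
fixed_scaled = minimal_distortion(W_scaled)
print(fixed)
print(fixed_scaled)
```

Both calls return the same matrix up to rounding, which is the point: whatever row scaling ICA happened to produce in a bin is removed, and what remains is the filtering done by the mixing system itself.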
0:04:19 | um |
---|
0:04:19 | some newer methods |
---|
0:04:21 | uh presented |
---|
0:04:22 | last time are uh filter shortening and filter shaping |
---|
0:04:26 | but for these methods you need to |
---|
0:04:28 | well |
---|
0:04:28 | um |
---|
0:04:29 | solve the permutation problem first |
---|
0:04:32 | uh whereas this one |
---|
0:04:33 | uh you can solve in each frequency bin independently |
---|
0:04:37 | so |
---|
0:04:39 | we were talking about the permutation problem what what is so how can be |
---|
0:04:44 | uh well |
---|
0:04:45 | uh |
---|
0:04:46 | scrap |
---|
0:04:47 | in this case |
---|
0:04:48 | we have two |
---|
0:04:49 | short-time |
---|
0:04:50 | uh |
---|
0:04:51 | two spectrograms short-time fourier transforms |
---|
0:04:54 | of two signals |
---|
0:04:55 | where just |
---|
0:04:56 | when you exactly know |
---|
0:04:58 | these parts are swapped between the two |
---|
0:05:01 | uh |
---|
0:05:02 | spectrograms |
---|
0:05:04 | when you transform these signals |
---|
0:05:06 | back |
---|
0:05:06 | to time domain of course |
---|
0:05:08 | both signals appear in both channels |
---|
0:05:11 | so again you didn't uh |
---|
0:05:14 | separate them so you have to correct |
---|
0:05:17 | for this permutation and this can be |
---|
0:05:19 | uh in every frequency bin different |
---|
0:05:22 | and it usually becomes quite complicated |
---|
0:05:25 | uh usually there are two main approaches |
---|
0:05:29 | uh |
---|
0:05:30 | the |
---|
0:05:31 | a lot of papers at conferences |
---|
0:05:34 | concentrate on directivity patterns and directions of arrival |
---|
0:05:38 | uh the idea is |
---|
0:05:40 | when you have the unmixing matrices |
---|
0:05:42 | uh we can just |
---|
0:05:44 | uh |
---|
0:05:44 | calculate |
---|
0:05:45 | the directions which the sounds come from and assume |
---|
0:05:49 | uh that one direction is one source |
---|
0:05:52 | this works |
---|
0:05:53 | well |
---|
0:05:54 | as long as we have low reverberation |
---|
0:05:56 | but with high reverberation uh you can't |
---|
0:05:59 | um um |
---|
0:06:01 | pinpoint the source to a single direction in all frequencies together |
---|
0:06:05 | uh in this case here |
---|
0:06:07 | i i used the statistics of the separated signals |
---|
0:06:11 | um one |
---|
0:06:12 | trivial simple case is uh |
---|
0:06:15 | you just |
---|
0:06:16 | look |
---|
0:06:17 | at such a line and the neighbouring line let's say |
---|
0:06:20 | uh |
---|
0:06:20 | they have to look the same |
---|
0:06:22 | so |
---|
0:06:23 | here they are highly correlated |
---|
0:06:26 | um |
---|
0:06:28 | yeah this is true |
---|
0:06:29 | does this |
---|
0:06:31 | at least for |
---|
0:06:32 | when you are looking at very near neighbouring bins so we have here two direct neighbouring bins |
---|
0:06:37 | in blue and green and yeah okay they are highly correlated |
---|
0:06:41 | if you just |
---|
0:06:42 | go |
---|
0:06:43 | a few bins away |
---|
0:06:45 | yeah i wouldn't say |
---|
0:06:47 | these are correlated |
---|
0:06:49 | so the correlation method |
---|
0:06:50 | is not |
---|
0:06:51 | so robust |
---|
0:06:53 | but uh there have been extensions to make it |
---|
0:06:56 | uh a lot more robust |
---|
0:06:58 | oh okay so |
---|
0:06:59 | yeah |
---|
0:07:01 | at these um |
---|
0:07:02 | the correlation coefficients uh |
---|
0:07:05 | take the |
---|
0:07:06 | um |
---|
0:07:07 | envelope |
---|
0:07:08 | calculate the correlation |
---|
0:07:09 | and decide |
---|
0:07:10 | the permutation |
---|
0:07:12 | depending on all four possible correlations |
---|
0:07:16 | and then |
---|
0:07:16 | and uh using is uh |
---|
0:07:18 | uh are you can just use a this this way |
---|
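The neighbouring-bin decision just described can be sketched as follows. This is a minimal illustration for two channels, assuming the magnitude envelopes of each bin are already available as plain lists; the names `correlation`, `neighbour_is_swapped` and the toy envelopes are hypothetical, and summing the two matching coefficients is only one possible way of combining the four.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def neighbour_is_swapped(bin_a, bin_b):
    """bin_a and bin_b each hold (envelope of channel 1, envelope of channel 2).

    All four correlations between the two bins are computed; the bin is
    declared swapped when the crossed pairing correlates better.
    """
    straight = correlation(bin_a[0], bin_b[0]) + correlation(bin_a[1], bin_b[1])
    crossed = correlation(bin_a[0], bin_b[1]) + correlation(bin_a[1], bin_b[0])
    return crossed > straight


# Toy envelopes over six frames: the two sources are active at different times.
bin_a = ([1.0, 0.0, 0.0, 2.0, 0.0, 1.0], [0.0, 3.0, 1.0, 0.0, 2.0, 0.0])
bin_b_same = ([0.9, 0.1, 0.0, 2.2, 0.0, 1.1], [0.0, 2.5, 1.2, 0.0, 1.8, 0.1])
bin_b_swapped = (bin_b_same[1], bin_b_same[0])

print(neighbour_is_swapped(bin_a, bin_b_same))
print(neighbour_is_swapped(bin_a, bin_b_swapped))
```

The toy data is deliberately easy; the decision becomes unreliable once the envelopes of the two bins stop resembling each other, which is the case for distant bins.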
0:07:21 | as i already said this isn't very robust you have |
---|
0:07:24 | to make it |
---|
0:07:25 | a because of the |
---|
0:07:27 | yeah when comparing more distant bins |
---|
0:07:29 | a |
---|
0:07:30 | you just get it wrong |
---|
0:07:32 | uh and then |
---|
0:07:33 | so um |
---|
0:07:35 | and |
---|
0:07:36 | a few years ago uh the dyadic sorting scheme has been proposed |
---|
0:07:42 | where you don't compare |
---|
0:07:44 | single bins |
---|
0:07:45 | uh yeah |
---|
0:07:46 | but whole blocks of bins |
---|
0:07:48 | so it looks like this |
---|
0:07:50 | you compare |
---|
0:07:51 | at the first stage you compare one bin with another |
---|
0:07:53 | zero with one |
---|
0:07:55 | and calculate a |
---|
0:07:56 | correlation coefficient and you get |
---|
0:07:59 | the permutation and take the next two bins and so on |
---|
0:08:02 | so in this case you have neighbouring bins and you can assume okay the |
---|
0:08:07 | assumption of highly correlated bins |
---|
0:08:09 | is met |
---|
0:08:10 | in the next step |
---|
0:08:12 | you take |
---|
0:08:13 | these two correctly aligned bins |
---|
0:08:15 | take the next two and calculate now |
---|
0:08:18 | uh these four correlations so actually what you get |
---|
0:08:21 | here are four coefficients |
---|
0:08:23 | and we have to decide |
---|
0:08:24 | which one to take to decide uh which permutation do we take |
---|
0:08:29 | the biggest one |
---|
0:08:30 | the mean |
---|
0:08:31 | the lowest one or whatever |
---|
0:08:33 | four |
---|
0:08:34 | is not a problem |
---|
0:08:35 | here you get already sixteen and in the next |
---|
0:08:38 | yeah we get sixty four and so on |
---|
0:08:41 | so it becomes even harder |
---|
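The block-wise merging of the dyadic sorting scheme can be sketched independently of the criterion used to compare two blocks. This is a minimal two-channel illustration; `dyadic_align`, the callback signature `is_swapped(left, right, flips)` and the oracle used in the demo are assumptions made for this sketch and are not from the talk.

```python
def dyadic_align(num_bins, is_swapped):
    """Align permutations by merging blocks of 1, 2, 4, ... bins.

    is_swapped(left, right, flips) must say whether the block `right`
    is permuted relative to the block `left`, given the flips applied so far.
    Returns one boolean per bin: True means "swap the two channels here".
    """
    flips = [False] * num_bins
    size = 1
    while size < num_bins:
        for start in range(0, num_bins, 2 * size):
            middle = min(start + size, num_bins)
            end = min(start + 2 * size, num_bins)
            left, right = range(start, middle), range(middle, end)
            if len(right) > 0 and is_swapped(left, right, flips):
                # Flip the whole right block so it matches the left block.
                for k in right:
                    flips[k] = not flips[k]
        size *= 2
    return flips


# Demo with an oracle that knows the true (normally unknown) permutation per bin.
truth = [False, True, True, False, True, False, False, True]


def oracle(left, right, flips):
    state_left = truth[left[0]] ^ flips[left[0]]
    state_right = truth[right[0]] ^ flips[right[0]]
    return state_left != state_right


flips = dyadic_align(len(truth), oracle)
remaining = {t ^ f for t, f in zip(truth, flips)}
print(flips, remaining)
```

In practice the oracle is replaced by a decision computed from the data. With correlations, that decision has to be distilled from 4, then 16, then 64 coefficients as the blocks grow, which is the difficulty described above.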
0:08:43 | um |
---|
0:08:44 | a simple example for this |
---|
0:08:46 | um |
---|
0:08:46 | when we just plot |
---|
0:08:48 | for the the situation but for a frequency |
---|
0:08:51 | bins |
---|
0:08:52 | um |
---|
0:08:52 | the coefficients yeah |
---|
0:08:55 | um |
---|
0:08:56 | for all frequency bins so |
---|
0:08:58 | in the first stage you would just take the correlation coefficients |
---|
0:09:02 | directly |
---|
0:09:03 | uh on the first of their i don't know |
---|
0:09:05 | uh |
---|
0:09:06 | and a |
---|
0:09:07 | uh okay when you look at this |
---|
0:09:10 | it's |
---|
0:09:10 | looks like |
---|
0:09:11 | just go to uh |
---|
0:09:12 | well |
---|
0:09:13 | it just one here and here |
---|
0:09:15 | hardly |
---|
0:09:16 | so when you are going |
---|
0:09:17 | to the next steps |
---|
0:09:19 | so let's say |
---|
0:09:20 | you compare |
---|
0:09:21 | the block |
---|
0:09:23 | from five hundred to eight hundred to the block from eight hundred to one thousand |
---|
0:09:28 | one hundred or whatever |
---|
0:09:29 | you compare all the coefficients which are in this square |
---|
0:09:34 | so we have a lot of coefficients which are correct |
---|
0:09:37 | and a lot of coefficients which are |
---|
0:09:38 | not |
---|
0:09:39 | and so on in this case here |
---|
0:09:42 | K |
---|
0:09:44 | as we work |
---|
0:09:44 | here are not |
---|
0:09:46 | but in the next steps you compare these coefficients |
---|
0:09:49 | a K just me still worked as might a stable |
---|
0:09:52 | but this case here |
---|
0:09:54 | if a lot |
---|
0:09:55 | one computations |
---|
0:09:56 | which is a lot of |
---|
0:09:57 | indicators of our limitations which |
---|
0:10:00 | in a right and |
---|
0:10:01 | one conditions so |
---|
0:10:03 | usually the dyadic sorting scheme |
---|
0:10:06 | is better but still |
---|
0:10:08 | fails |
---|
0:10:10 | but |
---|
0:10:10 | so and signal |
---|
0:10:13 | um |
---|
0:10:15 | now i want to |
---|
0:10:16 | um |
---|
0:10:18 | present a new approach |
---|
0:10:20 | uh the first |
---|
0:10:21 | uh observation you can make is |
---|
0:10:24 | when you just take |
---|
0:10:26 | speech signals |
---|
0:10:27 | speech signals are sparse |
---|
0:10:29 | and um |
---|
0:10:32 | a mixture of two signals which are independent |
---|
0:10:35 | is less sparse |
---|
0:10:37 | and |
---|
0:10:38 | you can extend this |
---|
0:10:41 | even if the signals are band-limited signals |
---|
0:10:44 | as long as they are independent |
---|
0:10:47 | the mixture is less sparse |
---|
0:10:50 | and this is exactly what we have in the permutation problem we have two band-limited signals and want to |
---|
0:10:55 | look which permutation do we have |
---|
0:10:57 | so the wrong permutation will be |
---|
0:11:00 | uh |
---|
0:11:01 | less sparse |
---|
0:11:03 | uh you have here an example of this |
---|
0:11:06 | uh just |
---|
0:11:07 | a plain speech signal |
---|
0:11:09 | where nothing |
---|
0:11:10 | happened yet |
---|
0:11:12 | and in this case |
---|
0:11:13 | i just |
---|
0:11:14 | moved the |
---|
0:11:16 | higher part uh of the signal so the |
---|
0:11:19 | upper |
---|
0:11:19 | half of the signal |
---|
0:11:21 | to the other so we have a permutation |
---|
0:11:23 | at the lower |
---|
0:11:25 | level of the dyadic sorting scheme |
---|
0:11:28 | and when we compare these |
---|
0:11:29 | we have here a lot of |
---|
0:11:31 | zeros or almost zeros |
---|
0:11:33 | and when you look here we have |
---|
0:11:36 | clearly a signal which is less sparse |
---|
0:11:39 | and uh |
---|
0:11:41 | this is exactly what we need to uh |
---|
0:11:45 | formulate |
---|
0:11:45 | the new criterion |
---|
0:11:48 | we want the signal to be as sparse as possible |
---|
0:11:51 | uh as the measure of sparsity um |
---|
0:11:54 | for this we take uh |
---|
0:11:57 | the lp norm |
---|
0:11:59 | uh |
---|
0:12:01 | in my case i usually take something like zero point one |
---|
0:12:05 | for p |
---|
0:12:06 | but it's not that |
---|
0:12:08 | important you can vary it |
---|
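The sparsity measure and the observation behind it can be illustrated with a few numbers. This is a minimal sketch with p = 0.1 as mentioned above; the helper name `lp_measure` and the toy signals are hypothetical, and the 1/p root of the lp norm is left out because it does not change which of two signals is sparser.

```python
def lp_measure(x, p=0.1):
    """Sum of |x[n]|^p over all samples; smaller values mean a sparser signal."""
    return sum(abs(v) ** p for v in x)


# Two toy "sources" that are mostly zero, and their sum.
s1 = [0.0, 0.0, 3.0, 0.0, 0.0, 0.0, -2.0, 0.0]
s2 = [0.0, 1.0, 0.0, 0.0, -4.0, 0.0, 0.0, 0.0]
mixture = [a + b for a, b in zip(s1, s2)]

print(lp_measure(s1), lp_measure(s2), lp_measure(mixture))
```

For small p the measure behaves almost like a count of the non-zero samples, so the sum of the two signals scores clearly higher (less sparse) than either signal alone.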
0:12:10 | um okay so |
---|
0:12:12 | uh the idea is now |
---|
0:12:14 | as with the correlation coefficient |
---|
0:12:17 | we take |
---|
0:12:18 | our signal |
---|
0:12:20 | calculate |
---|
0:12:22 | now not the correlation between two signals |
---|
0:12:24 | but the sparsity of a sum of two signals |
---|
0:12:28 | and |
---|
0:12:29 | take again |
---|
0:12:30 | the four coefficients |
---|
0:12:31 | every one against each other |
---|
0:12:34 | and you get one |
---|
0:12:35 | um |
---|
0:12:37 | yeah |
---|
0:12:38 | coefficient which can decide which permutation |
---|
0:12:41 | the important thing about this |
---|
0:12:42 | is now |
---|
0:12:43 | we don't take the |
---|
0:12:45 | coefficients in the time-frequency domain but we transform |
---|
0:12:49 | these |
---|
0:12:50 | point process |
---|
0:12:51 | coefficients |
---|
0:12:52 | to a |
---|
0:12:53 | time domain signal |
---|
0:12:54 | where we can apply |
---|
0:12:56 | the lp norm |
---|
0:12:58 | uh |
---|
0:13:00 | using this |
---|
0:13:02 | even if we take |
---|
0:13:04 | let's say a hundred frequency bins from K to S |
---|
0:13:07 | we still again apply the lp norm and get |
---|
0:13:09 | just one coefficient |
---|
0:13:11 | for the whole sorting scheme |
---|
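The decision for one pair of blocks can be sketched end to end on toy data. This is a minimal single-frame illustration, not the implementation from the talk: it assumes two channels, a plain DFT of one 32-sample frame instead of a full short-time Fourier transform, and two sources that are single spikes; all names (`dft`, `idft`, `join_blocks`, `high_block_is_swapped`) are hypothetical.

```python
import cmath


def dft(x):
    """Plain O(N^2) discrete Fourier transform of one frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT; returns complex samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def lp_measure(x, p=0.1):
    """Sum of |x[n]|^p; smaller means sparser (the 1/p root is omitted)."""
    return sum(abs(v) ** p for v in x)


def join_blocks(spec_low, low_bins, spec_high, high_bins):
    """Take low_bins from one spectrum and high_bins from another, then go to the time domain."""
    joined = [0j] * len(spec_low)
    for k in low_bins:
        joined[k] = spec_low[k]
    for k in high_bins:
        joined[k] = spec_high[k]
    return idft(joined)


def high_block_is_swapped(Y1, Y2, low_bins, high_bins, p=0.1):
    """One coefficient per candidate: is the crossed pairing sparser in time?"""
    straight = lp_measure(join_blocks(Y1, low_bins, Y1, high_bins), p)
    crossed = lp_measure(join_blocks(Y1, low_bins, Y2, high_bins), p)
    return crossed < straight


# Toy data: each source is one spike in a 32-sample frame.
N = 32
s1 = [0.0] * N
s1[5] = 1.0
s2 = [0.0] * N
s2[20] = 1.0
S1, S2 = dft(s1), dft(s2)

# Bins k and N-k belong together for real signals, so blocks are chosen symmetrically.
low_bins = [k for k in range(N) if min(k, N - k) < 8]
high_bins = [k for k in range(N) if min(k, N - k) >= 8]

# Simulate a permutation in the high block only.
Y1_swapped = [S1[k] if k in low_bins else S2[k] for k in range(N)]
Y2_swapped = [S2[k] if k in low_bins else S1[k] for k in range(N)]

print(high_block_is_swapped(S1, S2, low_bins, high_bins))
print(high_block_is_swapped(Y1_swapped, Y2_swapped, low_bins, high_bins))
```

However many bins the two blocks contain, each candidate pairing yields a single number, so there is no question of how to combine many per-bin coefficients.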
0:13:14 | so when we now do the |
---|
0:13:16 | dyadic sorting |
---|
0:13:17 | so we have again here a |
---|
0:13:20 | frequency |
---|
0:13:20 | just one single frequency bin transformed to the time domain |
---|
0:13:24 | here again one |
---|
0:13:25 | we apply the lp norm |
---|
0:13:27 | uh |
---|
0:13:28 | and here again |
---|
0:13:29 | and |
---|
0:13:29 | at this point it is uh |
---|
0:13:31 | different |
---|
0:13:32 | now we transform |
---|
0:13:34 | two frequency bins to the time domain |
---|
0:13:38 | and calculate again one coefficient and so on |
---|
0:13:41 | so at this point you don't |
---|
0:13:43 | have the problem of |
---|
0:13:44 | which coefficients of these |
---|
0:13:46 | let's say thousands or whatever |
---|
0:13:49 | do you take uh but you have always just one coefficient |
---|
0:13:53 | and |
---|
0:13:54 | due to the |
---|
0:13:55 | different |
---|
0:13:56 | criterion |
---|
0:13:58 | uh a a it's it's much more robust |
---|
0:14:01 | mostly |
---|
0:14:02 | um |
---|
0:14:04 | i have |
---|
0:14:05 | done some |
---|
0:14:06 | simulations |
---|
0:14:08 | um |
---|
0:14:09 | so first set |
---|
0:14:10 | uh uh data set this does a for the set up |
---|
0:14:12 | use |
---|
0:14:12 | go |
---|
0:14:14 | T |
---|
0:14:15 | um |
---|
0:14:16 | so on so about last they can set from five years ago so |
---|
0:14:20 | um |
---|
0:14:21 | we have |
---|
0:14:22 | a separate |
---|
0:14:23 | this this state set that uh is |
---|
0:14:25 | the lot uh somehow |
---|
0:14:27 | it's a reverberant |
---|
0:14:29 | recordings of some speech but the reverberation is |
---|
0:14:32 | quite low |
---|
0:14:33 | you can when you hear of to is that has that you can see yeah it's |
---|
0:14:36 | government art |
---|
0:14:38 | derivations like |
---|
0:14:39 | this this case |
---|
0:14:41 | the direction of arrival approach |
---|
0:14:43 | is |
---|
0:14:44 | very good |
---|
0:14:45 | um |
---|
0:14:47 | it |
---|
0:14:48 | it works because of the low reverberation |
---|
0:14:51 | the proposed method is |
---|
0:14:53 | not as good |
---|
0:14:54 | almost |
---|
0:14:56 | but uh when you look closely why |
---|
0:14:59 | it is performing |
---|
0:15:00 | not that good it's because |
---|
0:15:02 | at the very low stage where we compare just one single frequency bin |
---|
0:15:07 | uh |
---|
0:15:07 | yeah uh |
---|
0:15:08 | some permutations happen to be incorrect |
---|
0:15:11 | and uh |
---|
0:15:12 | so |
---|
0:15:13 | perhaps |
---|
0:15:15 | uh |
---|
0:15:15 | should it this to get so that a bit more |
---|
0:15:17 | if |
---|
0:15:19 | uh is |
---|
0:15:19 | assumption of |
---|
0:15:21 | sparsity and |
---|
0:15:22 | solves |
---|
0:15:23 | a a one pass cygnus is of this is correct |
---|
0:15:27 | and um |
---|
0:15:29 | but |
---|
0:15:29 | when you are going to a data set which uh has high reverberation |
---|
0:15:34 | uh |
---|
0:15:35 | overall you get |
---|
0:15:37 | less |
---|
0:15:38 | uh suppression performance |
---|
0:15:41 | the DOA approach |
---|
0:15:42 | is |
---|
0:15:43 | because with this set up you don't |
---|
0:15:46 | have the uh |
---|
0:15:48 | the signal coming from one direction because |
---|
0:15:50 | of the reverberation |
---|
0:15:52 | but |
---|
0:15:53 | with the new approach we again get almost the performance of the non-blind algorithm |
---|
0:15:58 | uh because in this case um |
---|
0:16:01 | it doesn't |
---|
0:16:02 | matter which direction the signal comes from as long as we |
---|
0:16:05 | are able to separate it |
---|
0:16:07 | in every frequency bin |
---|
0:16:09 | and um um |
---|
0:16:11 | so it's not always |
---|
0:16:13 | matching the non-blind case |
---|
0:16:15 | but it's |
---|
0:16:15 | more robust |
---|
0:16:16 | compared to the |
---|
0:16:18 | results of the DOA approach |
---|
0:16:20 | so to conclude |
---|
0:16:22 | um |
---|
0:16:23 | the convolutive blind source separation |
---|
0:16:25 | can be solved in the sorry time-frequency domain |
---|
0:16:29 | uh you have to solve the scaling and permutation |
---|
0:16:32 | and |
---|
0:16:33 | now we presented a new algorithm based on sparsity |
---|
0:16:37 | in the time domain |
---|
0:16:38 | not as usual in the time-frequency domain |
---|
0:16:43 | and with higher reverberation we have usually better |
---|
0:16:46 | separation performance than the |
---|
0:16:48 | direction of arrival approach |
---|
0:17:09 | uh |
---|
0:17:11 | so |
---|
0:17:15 | yeah let's a hard a set up it's like seven and a half set and for this i used five |
---|
0:17:20 | seconds |
---|
0:17:22 | i am saying |
---|
0:17:23 | if |
---|
0:17:24 | there is uh enough signal to do ICA in each frequency bin |
---|
0:17:28 | then there would be enough signal to do the |
---|
0:17:31 | lp norm |
---|