0:00:15 | Yeah, thank you, Mister Chairman. |
---|
0:00:18 | The topic of my talk is the relation between independent component analysis and MMSE. |
---|
0:00:24 | It is joint work with my PhD supervisor, Professor Bin Yang. |
---|
0:00:30 | A commonly considered case for independent component analysis |
---|
0:00:34 | is the demixing of a linear noiseless mixture, |
---|
0:00:37 | and in that case the ideal demixing matrix W |
---|
0:00:41 | is the inverse of the mixing matrix A. |
---|
0:00:45 | However, here we want to consider linear noisy mixtures, |
---|
0:00:49 | and the noise changes the ICA solution, so it is no longer |
---|
0:00:53 | the inverse mixing matrix. |
---|
0:00:55 | This can be modelled by the equation here: W ICA |
---|
0:00:59 | is equal to A inverse plus a deviation |
---|
0:01:02 | Delta W, |
---|
0:01:04 | or we can approximate this |
---|
0:01:06 | for small noise as A inverse plus sigma squared times W bar. |
---|
0:01:11 | Prior work on noisy ICA mainly consists of methods to compensate the bias Delta W; |
---|
0:01:19 | they modify the cost function or the update equation of ICA. |
---|
0:01:23 | However, they require knowledge about the noise. |
---|
0:01:27 | We are interested in the ICA solution |
---|
0:01:29 | for the noisy case without any bias correction |
---|
0:01:34 | because we have made the observation that indeed ICA behaves |
---|
0:01:37 | quite similarly |
---|
0:01:39 | to MMSE. |
---|
0:01:40 | Our goal is to find this matrix W bar |
---|
0:01:45 | in that equation, |
---|
0:01:46 | and by this we want to explore the relation between ICA and MMSE theoretically. |
---|
0:01:53 | A quick overview of my talk: I will start with the signal model and the assumptions. |
---|
0:01:58 | Then we will look at |
---|
0:02:00 | three different solutions for the demixing task, namely the |
---|
0:02:03 | inverse solution and the MMSE solution, which are two non-blind methods. |
---|
0:02:07 | Then we will look at the ICA solution, which is of course a |
---|
0:02:11 | blind approach. |
---|
0:02:13 | In the results section we will then see that |
---|
0:02:15 | ICA can indeed achieve an MSE close to the MMSE. |
---|
0:02:23 | The mixing and demixing process can be summarized by these two equations; they are |
---|
0:02:28 | probably well known to all of you. |
---|
0:02:31 | X is the vector of |
---|
0:02:33 | mixture signals, which are linear combinations of the source signals S |
---|
0:02:39 | through a square mixing matrix |
---|
0:02:40 | A, |
---|
0:02:41 | which is N by N, |
---|
0:02:42 | and we have some additive noise V. |
---|
0:02:47 | The demixed signals Y are obtained by a linear transform W applied to the mixture signals X. |
---|
0:02:54 | The goal of the demixing is of course to get |
---|
0:02:57 | the demixed signals Y |
---|
0:02:58 | as similar as possible to |
---|
0:03:00 | the original signals S. |
---|
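The signal model just described can be sketched in a few lines of NumPy. Everything concrete below (the mixing matrix, noise level, source distribution, and sample count) is an illustrative assumption, not taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 2, 10000                          # 2 sources, 10000 samples (illustrative)

# Independent, zero-mean, unit-variance, non-Gaussian sources S
S = rng.laplace(scale=1/np.sqrt(2), size=(N, L))

A = np.array([[1.0, 0.5],
              [0.3, 1.0]])               # square, invertible mixing matrix
sigma = 0.1
V = sigma * rng.standard_normal((N, L))  # additive noise

X = A @ S + V                            # mixing:   X = A S + V
W = np.linalg.inv(A)                     # ideal demixing for the noiseless case
Y = W @ X                                # demixing: Y = W X  (= S + A^{-1} V here)
```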
0:03:05 | We make a couple of assumptions. First, the mixing process should be invertible, |
---|
0:03:09 | so this means A inverse should exist. |
---|
0:03:13 | The original signals are assumed to be independent with a non-Gaussian PDF q_i |
---|
0:03:18 | with mean zero and variance one. |
---|
0:03:21 | Furthermore, we assume that the |
---|
0:03:23 | PDF q_i is three times continuously differentiable |
---|
0:03:27 | and that all required expectations exist. |
---|
0:03:31 | Regarding the noise, we assume that it is zero-mean |
---|
0:03:34 | with a covariance matrix sigma squared times R_V, where sigma squared denotes the average |
---|
0:03:39 | variance of |
---|
0:03:40 | V, and R_V is a normalized covariance matrix. |
---|
0:03:44 | The PDF of |
---|
0:03:46 | the noise can be arbitrary but symmetric, |
---|
0:03:50 | and this means |
---|
0:03:52 | that all odd-order moments of the noise are equal to zero. |
---|
0:03:56 | Last, we assume that the original sources S and the noise V are independent. |
---|
0:04:04 | Here is the first non-blind solution for the demixing task, the inverse solution: |
---|
0:04:11 | W inverse is equal to A inverse. |
---|
0:04:14 | It has the property that it achieves a perfect demixing in the noiseless case. |
---|
0:04:20 | However, if there is noise, there is the danger of noise amplification, and this is |
---|
0:04:24 | especially serious if |
---|
0:04:25 | the mixing matrix A is close to singular. |
---|
0:04:29 | And of course it is only possible if you know |
---|
0:04:32 | A in advance or can somehow estimate it, |
---|
0:04:35 | so it is a non-blind method. |
---|
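The noise amplification of the inverse solution can be made concrete: the demixing error is A inverse times V, so its power blows up as A approaches singularity. The two matrices below are hypothetical examples chosen only to contrast conditioning:

```python
import numpy as np

sigma2 = 0.01
R_V = np.eye(2)                     # normalized noise covariance (white noise here)

def output_noise_power(A):
    """Total variance of the demixed noise A^{-1} V: sigma2 * tr(A^{-1} R_V A^{-T})."""
    A_inv = np.linalg.inv(A)
    return sigma2 * np.trace(A_inv @ R_V @ A_inv.T)

A_good = np.array([[1.0, 0.10], [0.10, 1.0]])   # well-conditioned mixing matrix
A_bad  = np.array([[1.0, 0.99], [0.99, 1.0]])   # close to singular

# The near-singular mixture amplifies the noise by orders of magnitude.
```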
0:04:39 | The second non-blind method is the MMSE solution, |
---|
0:04:43 | which is the matrix W which minimizes |
---|
0:04:46 | the MSE. |
---|
0:04:48 | The solution is given in this equation here, |
---|
0:04:51 | and we can approximate it |
---|
0:04:52 | in terms of sigma squared as in the last line. |
---|
0:04:56 | The properties are, again, that it is identical to the inverse solution if there is no noise, so we can |
---|
0:05:01 | achieve a perfect |
---|
0:05:02 | demixing if there is no noise. |
---|
0:05:04 | However, |
---|
0:05:05 | we need |
---|
0:05:06 | to know the mixing matrix A and |
---|
0:05:08 | properties of the |
---|
0:05:10 | noise, |
---|
0:05:11 | or we need to be able to estimate the second-order moments between S and X. |
---|
0:05:15 | So again, it is a non-blind method. |
---|
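For unit-variance sources and white noise of variance sigma squared, the MMSE demixer has the closed form W = A transpose times (A A transpose + sigma squared I) inverse. A hypothetical numerical check (the mixing matrix and noise level are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
N, L = 2, 200000
A = np.array([[1.0, 0.9], [0.2, 1.0]])
sigma2 = 0.1

S = rng.laplace(scale=1/np.sqrt(2), size=(N, L))   # unit-variance sources
V = np.sqrt(sigma2) * rng.standard_normal((N, L))
X = A @ S + V

# W_mmse = E[S X^T] E[X X^T]^{-1}; with R_SS = I and white noise this becomes:
W_mmse = A.T @ np.linalg.inv(A @ A.T + sigma2 * np.eye(N))
W_inv = np.linalg.inv(A)

mse = lambda W: np.mean((W @ X - S) ** 2)
# mse(W_mmse) <= mse(W_inv): the MMSE demixer trades a small bias for less noise.
```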
0:05:20 | Now we come to the blind approach, the ICA solution. |
---|
0:05:24 | The idea of ICA is of course to make |
---|
0:05:28 | the demixed signals Y statistically independent, |
---|
0:05:31 | since we assume that the original signals are statistically independent. |
---|
0:05:35 | We can define a desired distribution of the demixed signals, q of |
---|
0:05:39 | Y, |
---|
0:05:40 | to be the product of the marginal densities |
---|
0:05:44 | of the original sources, q_i. |
---|
0:05:47 | Then we can define a cost function, namely the Kullback-Leibler divergence |
---|
0:05:51 | between the actual PDF of |
---|
0:05:54 | the demixed signals Y |
---|
0:05:56 | and the desired PDF |
---|
0:05:58 | q of Y. |
---|
0:06:01 | The |
---|
0:06:02 | formula for the Kullback-Leibler divergence is given here. |
---|
0:06:05 | We just want to note that it is equal to zero if the two |
---|
0:06:08 | PDFs |
---|
0:06:09 | p and q are identical, |
---|
0:06:11 | and it is larger than zero if they are different. |
---|
0:06:15 | Hence we can solve the demixing task by minimizing this cost function using stochastic |
---|
0:06:20 | gradient descent, |
---|
0:06:21 | and the update equations are given here. |
---|
0:06:25 | The update Delta W |
---|
0:06:28 | depends on W inverse transpose |
---|
0:06:31 | and on a so-called correlation matrix, |
---|
0:06:34 | and the function phi_i here |
---|
0:06:36 | is the negative |
---|
0:06:38 | derivative of the log-PDF of the original source. |
---|
0:06:45 | At convergence, of course, |
---|
0:06:48 | the update Delta W is equal to zero, and this is equivalent to saying |
---|
0:06:53 | that |
---|
0:06:55 | the correlation matrix |
---|
0:06:57 | phi of Y times Y transpose |
---|
0:07:00 | is equal to the identity matrix. |
---|
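The convergence condition E[phi(Y) Y transpose] = I can be checked numerically. For a unit-variance Laplacian source the score function is phi(y) = sqrt(2) sign(y); this particular choice of source PDF is my assumption for the sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# phi is the negative derivative of the log-PDF (the "score function").
# For a unit-variance Laplacian, q(s) = exp(-sqrt(2)|s|)/sqrt(2),
# hence phi(y) = sqrt(2) * sign(y).
phi = lambda y: np.sqrt(2) * np.sign(y)

# Two independent unit-variance Laplacian signals, i.e. perfectly demixed outputs:
Y = rng.laplace(scale=1/np.sqrt(2), size=(2, 500000))

C = (phi(Y) @ Y.T) / Y.shape[1]   # sample estimate of E[phi(Y) Y^T]
# C is close to the 2x2 identity matrix, as the convergence condition requires.
```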
0:07:04 | The properties of the ICA solution are that |
---|
0:07:07 | it is equal to the inverse solution if there is no noise, |
---|
0:07:12 | but the big difference is that we do not need to know anything about A or S. |
---|
0:07:17 | So it is a fully blind demixing. |
---|
0:07:19 | The only thing that we require is that we know the |
---|
0:07:22 | PDF of the original sources, |
---|
0:07:24 | and the original sources must be non-Gaussian. |
---|
0:07:28 | If all the PDFs are different, |
---|
0:07:31 | then there is no permutation ambiguity, |
---|
0:07:34 | and if you know the PDFs perfectly, |
---|
0:07:37 | then there is also no scaling ambiguity; |
---|
0:07:39 | a scaling ambiguity only remains if the PDF is |
---|
0:07:42 | estimated. |
---|
0:07:45 | Now we come to the main result of the paper. |
---|
0:07:49 | We can show, by a Taylor series expansion of the nonlinear function phi_i, |
---|
0:07:54 | that the ICA solution is given |
---|
0:07:57 | by this equation, |
---|
0:07:59 | where R_V tilde is a transformed |
---|
0:08:02 | correlation matrix of the noise, |
---|
0:08:05 | and M is a scaling matrix |
---|
0:08:07 | which depends on the PDF of the original sources |
---|
0:08:12 | through the parameters kappa_i and rho_i, |
---|
0:08:15 | which are given here. |
---|
0:08:18 | We just want to note that kappa_i is a measure of non-Gaussianity: |
---|
0:08:21 | it is equal to one if and only if S_i is Gaussian, and |
---|
0:08:25 | in all other cases it is larger than one. |
---|
0:08:28 | For comparison, we have |
---|
0:08:30 | written down here |
---|
0:08:31 | the |
---|
0:08:32 | MMSE solution, and if you compare this equation with the one at the top here, |
---|
0:08:37 | you can see that they are indeed quite similar, |
---|
0:08:39 | except for the scaling matrix M |
---|
0:08:42 | here. |
---|
0:08:45 | If M is approximately a matrix |
---|
0:08:48 | with all elements equal to one, |
---|
0:08:50 | then we can conclude that the ICA solution |
---|
0:08:53 | is close to the MMSE solution, and we can also show that in that case |
---|
0:08:57 | the two MSEs |
---|
0:08:59 | of the ICA solution and the MMSE solution are quite similar. |
---|
0:09:04 | The elements of the scaling matrix M are determined by the PDF |
---|
0:09:09 | q of S of the sources, |
---|
0:09:11 | and to make any further conclusions we will assume |
---|
0:09:13 | a certain family of PDFs, |
---|
0:09:16 | namely the generalized Gaussian distribution. |
---|
0:09:18 | The PDF |
---|
0:09:20 | is given here, |
---|
0:09:21 | where Gamma is the Gamma function and beta is the shape parameter, which controls the shape of the distribution. |
---|
0:09:27 | For example, for beta equal to two we obtain the Gaussian distribution, for beta equal to one |
---|
0:09:32 | the Laplacian distribution, and |
---|
0:09:34 | if you let |
---|
0:09:35 | beta go to infinity we get the uniform distribution. |
---|
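A small sketch of the unit-variance generalized Gaussian density, using the standard parameterization with scale alpha chosen so that the variance equals one (the parameterization itself is an assumption, since the slide's formula is not reproduced here):

```python
import math

def ggd_pdf(s, beta):
    """Unit-variance generalized Gaussian PDF with shape parameter beta."""
    # Scale chosen so that the variance alpha^2 * Gamma(3/beta)/Gamma(1/beta) is 1.
    alpha = math.sqrt(math.gamma(1/beta) / math.gamma(3/beta))
    return beta / (2 * alpha * math.gamma(1/beta)) * math.exp(-(abs(s)/alpha)**beta)

# beta = 2 recovers the standard Gaussian density, beta = 1 the Laplacian.
gauss = lambda s: math.exp(-s*s/2) / math.sqrt(2*math.pi)
```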
0:09:40 | If we fix the variance to one, then we obtain that rho is equal to beta minus one, |
---|
0:09:49 | and the other parameter kappa and the elements of the scaling matrix M |
---|
0:09:53 | are given in the plot here and in the table. |
---|
0:09:57 | The diagonal elements M_ii |
---|
0:09:59 | are exactly equal to kappa divided by two, |
---|
0:10:03 | and the off-diagonal elements |
---|
0:10:05 | M_ij are between zero point five and one. |
---|
0:10:10 | But maybe more interesting than these parameters is the question |
---|
0:10:13 | of how good ICA can be and how close |
---|
0:10:16 | we can get to the MMSE estimator. |
---|
0:10:20 | For this we make an example: we consider two GGD sources with the same shape parameter |
---|
0:10:26 | beta. |
---|
0:10:27 | The mixing matrix |
---|
0:10:28 | is given here, and we assume Gaussian noise with |
---|
0:10:31 | identity covariance matrix. |
---|
0:10:33 | We have studied the relative MSE, |
---|
0:10:36 | that is, the MSE of the ICA solution |
---|
0:10:39 | divided by the MSE of the MMSE estimator. |
---|
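The relative MSE used in these plots can be estimated by simulation. The setup below (mixing matrix, Laplacian sources, deliberately strong noise) is an illustrative stand-in for the talk's experiment, and only the two non-blind demixers are compared:

```python
import numpy as np

rng = np.random.default_rng(3)
N, L = 2, 100000
A = np.array([[1.0, 0.8], [0.4, 1.0]])
sigma2 = 0.5                                        # deliberately strong noise

S = rng.laplace(scale=1/np.sqrt(2), size=(N, L))    # unit-variance sources
V = np.sqrt(sigma2) * rng.standard_normal((N, L))
X = A @ S + V

W_mmse = A.T @ np.linalg.inv(A @ A.T + sigma2 * np.eye(N))
mse = lambda W: np.mean((W @ X - S) ** 2)

# Relative MSE as in the plots: MSE of a demixer divided by the MMSE's MSE.
rel_mse_inverse = mse(np.linalg.inv(A)) / mse(W_mmse)
# At low SNR the inverse solution is clearly worse, so rel_mse_inverse >> 1.
```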
0:10:44 | As you can see from the plot on the right-hand side, |
---|
0:10:48 | the relative MSE of the ICA solution is close to one for a large range of the shape parameter |
---|
0:10:54 | beta. |
---|
0:10:55 | It is |
---|
0:10:56 | less than one point zero six, so only six percent worse than the MMSE estimator. |
---|
0:11:03 | For reference, we have also calculated the relative MSE of the inverse solution |
---|
0:11:08 | for the two SNRs of ten dB and |
---|
0:11:10 | twenty dB. |
---|
0:11:11 | You can see that |
---|
0:11:14 | the blind approach ICA outperforms the inverse solution, |
---|
0:11:18 | which is a non-blind method, |
---|
0:11:20 | and this also holds for a large range of |
---|
0:11:22 | values of the shape parameter beta. |
---|
0:11:27 | Up to now we have |
---|
0:11:30 | considered only the theoretical results, which are valid |
---|
0:11:35 | only |
---|
0:11:36 | for an infinite amount of data, since we have evaluated all the expectations exactly. |
---|
0:11:41 | But in practice you never have an infinite amount of data, |
---|
0:11:44 | so now we want to look at an actual Kullback-Leibler-divergence-based ICA algorithm with |
---|
0:11:50 | a finite amount of data. |
---|
0:11:53 | In practice, one really does not use the standard gradient |
---|
0:11:57 | but instead the natural gradient, because it has better convergence properties, and the update |
---|
0:12:02 | equation is |
---|
0:12:03 | shown here. |
---|
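A minimal batch version of the natural-gradient update Delta W = eta (I - E[phi(Y) Y transpose]) W, for a noiseless two-source toy problem with Laplacian sources; the step size, iteration count, and mixing matrix are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
N, L = 2, 20000

S = rng.laplace(scale=1/np.sqrt(2), size=(N, L))   # unit-variance Laplacian sources
A = np.array([[1.0, 0.6], [0.3, 1.0]])
X = A @ S                                          # noiseless mixture for simplicity

phi = lambda y: np.sqrt(2) * np.sign(y)            # score function for Laplacian PDF

# Natural-gradient ICA: W <- W + eta * (I - E[phi(Y) Y^T]) W
W = np.eye(N)
eta = 0.05
for _ in range(500):
    Y = W @ X
    W += eta * (np.eye(N) - (phi(Y) @ Y.T) / L) @ W

G = W @ A   # global system; should approach an identity up to permutation/sign
```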
0:12:05 | Since we are now using |
---|
0:12:07 | a finite amount of data, |
---|
0:12:09 | of course not only the bias of the ICA solution from the MMSE solution is important, |
---|
0:12:14 | but also the covariance of |
---|
0:12:16 | the estimation contributes to the MSE. |
---|
0:12:20 | And |
---|
0:12:20 | since we assume |
---|
0:12:23 | two identically distributed sources, ICA suffers from the permutation ambiguity, |
---|
0:12:29 | so we need to resolve this before we can calculate the MSE. |
---|
0:12:32 | Lastly, |
---|
0:12:34 | the scaling of the ICA components |
---|
0:12:37 | is slightly different from the scaling of the MMSE solution, |
---|
0:12:39 | so we also compensate for this before we calculate the MSE value. |
---|
0:12:46 | Here on the left plot |
---|
0:12:48 | we show the MSE for Laplacian distributed signals |
---|
0:12:51 | for different SNRs and different sample sizes L. |
---|
0:12:55 | The black |
---|
0:12:57 | solid line is the MMSE estimator, |
---|
0:13:01 | and the coloured lines are the actual performance of the ICA algorithm. |
---|
0:13:05 | As you can see, |
---|
0:13:07 | for a large enough sample size |
---|
0:13:09 | we can get |
---|
0:13:10 | quite close to the MMSE |
---|
0:13:13 | estimator, so we can achieve a very good MSE performance |
---|
0:13:19 | even for |
---|
0:13:20 | a low SNR. |
---|
0:13:21 | We can also see that ICA outperforms the inverse solution; this is shown on the right-hand side, where |
---|
0:13:26 | we plot the relative MSE. |
---|
0:13:28 | The line with the |
---|
0:13:31 | triangles pointing down, |
---|
0:13:33 | sorry, the triangles pointing up, corresponds to |
---|
0:13:36 | the inverse solution here. |
---|
0:13:37 | For low SNR, its relative MSE |
---|
0:13:39 | increases quite dramatically, |
---|
0:13:41 | whereas the ICA solution still yields a reasonable |
---|
0:13:44 | MSE. |
---|
0:13:47 | The point where the ICA solution and the inverse solution cross |
---|
0:13:51 | depends on the mixing matrix and the sample size. |
---|
0:13:56 | One last point that I want to mention: we have also plotted the theoretical ICA solution, |
---|
0:14:01 | with the downward triangles here, |
---|
0:14:03 | and you can see that |
---|
0:14:05 | it matches quite well |
---|
0:14:07 | the performance of the actual ICA algorithm, |
---|
0:14:11 | except for the very low SNR of zero dB. |
---|
0:14:14 | This is |
---|
0:14:15 | because we have made a small-noise assumption in the derivation, |
---|
0:14:19 | since we have only considered |
---|
0:14:22 | terms up to order sigma squared in our Taylor series. |
---|
0:14:28 | We also want to study the influence of the shape parameter beta on the performance. |
---|
0:14:33 | Here we plot the relative MSE of the ICA solution |
---|
0:14:36 | for different |
---|
0:14:37 | SNRs, |
---|
0:14:39 | namely ten dB, twenty dB and thirty dB. |
---|
0:14:41 | The general trend is that the more non-Gaussian the sources, |
---|
0:14:45 | so the closer beta is to zero point five or |
---|
0:14:50 | the larger the value of beta, |
---|
0:14:53 | the lower the relative MSE is, |
---|
0:14:55 | except for this case here |
---|
0:14:59 | for the SNR of ten dB. |
---|
0:15:02 | One last point: one might wonder why the relative MSE increases |
---|
0:15:08 | for increasing SNR, so if you go from ten dB SNR to thirty dB SNR, why |
---|
0:15:14 | does the relative MSE increase? |
---|
0:15:16 | This can be explained by the fact that |
---|
0:15:19 | the MSE for |
---|
0:15:22 | ICA, for the noiseless case, |
---|
0:15:24 | is not close to zero |
---|
0:15:27 | but is lower-bounded by the Cramér-Rao bound, |
---|
0:15:30 | which depends on |
---|
0:15:33 | kappa and rho, |
---|
0:15:35 | and hence the relative MSE increases for increasing SNR. |
---|
0:15:41 | So, to summarise: |
---|
0:15:42 | in this paper we have derived the ICA solution and the MSE for the noisy case. |
---|
0:15:47 | We have seen that there exists a relation between ICA and MMSE |
---|
0:15:51 | which depends on the PDF of the original sources. |
---|
0:15:55 | Often the ICA solution, which is of course a blind approach, |
---|
0:15:59 | is close to the MMSE solution, |
---|
0:16:02 | and we want to state that the relation also exists when the nonlinearity |
---|
0:16:07 | phi_i |
---|
0:16:08 | does not match the true PDF. |
---|
0:16:11 | We have seen in the simulation results |
---|
0:16:13 | that we can in practice achieve an MSE close to the MMSE estimator with |
---|
0:16:19 | an |
---|
0:16:21 | ICA algorithm based on the Kullback-Leibler divergence. |
---|
0:16:25 | We have also seen that not only the bias of the ICA solution is important, |
---|
0:16:29 | but also the covariance |
---|
0:16:31 | of the estimation contributes to the performance. |
---|
0:16:35 | To |
---|
0:16:35 | sum up everything, I want to state that |
---|
0:16:38 | blind demixing by ICA is in many cases similar to non-blind demixing based on MMSE. |
---|
0:16:44 | So, thank you for your attention, and if you have questions, |
---|
0:16:46 | please feel free to ask. |
---|
0:17:20 | Yes, |
---|
0:17:21 | this is assuming |
---|
0:17:24 | that there is no time dependence. |
---|
0:17:39 | Well, of course it depends. For example, if you assume the |
---|
0:17:44 | wrong type of distribution, if you assume it is |
---|
0:17:46 | sub-Gaussian and the source is super-Gaussian, |
---|
0:17:49 | then of course ICA does not work, |
---|
0:17:51 | so you have to assume the correct type. |
---|
0:17:56 | And it depends on the amount of mismatch: |
---|
0:17:58 | if the mismatch is |
---|
0:17:59 | reasonably small, then it is still a good approach. |
---|
0:18:19 | Yes, it could be; that is an idea, and I have mentioned |
---|
0:18:23 | in the paper |
---|
0:18:25 | that you could use this derivation to derive the right phi function, |
---|
0:18:32 | which could yield a lower MSE |
---|
0:18:36 | by giving a scaling matrix M |
---|
0:18:39 | with elements close to one. |
---|
0:18:42 | But the problem is that this |
---|
0:18:43 | obviously depends on the SNR, so again you would need to know the noise. |
---|