0:00:15 Yeah, thank you, Mister Chairman. The topic of my talk is the relation between independent component analysis (ICA) and MMSE. It is joint work with my PhD supervisor, Professor Bin Yang.
0:00:30 A commonly considered case for independent component analysis is the demixing of linear noiseless mixtures. In that case, the ideal demixing matrix W is the inverse of the mixing matrix A.
0:00:45 Here, however, we want to consider linear noisy mixtures. The noise changes the ICA solution, so it is no longer the inverse mixing matrix. This can be modelled by the equation here: W_ICA is equal to A^-1 plus a deviation W_delta, or we can approximate this for small noise as A^-1 plus sigma^2 times W_bar.
0:01:11 Prior work on noisy ICA mainly consists of methods to compensate the bias W_delta; they modify the cost function or the update equation of ICA. However, they require knowledge about the noise. We, in contrast, are interested in the ICA solution for the noisy case without any bias correction,
0:01:34 because we have made the observation that ICA indeed behaves quite similarly to MMSE. Our goal is to find the matrix W_bar in that equation, and by this we want to explore the relation between ICA and MMSE theoretically.
0:01:53 A quick overview of my talk: I will start with the signal model and the assumptions. Then we will look at three different solutions for the demixing task: the inverse solution and the MMSE solution, which are two non-blind methods, and then the ICA solution, which is of course a blind approach. In the results section we will then see that ICA can indeed achieve an MSE close to the MMSE.
0:02:23 The mixing and demixing processes can be described by these two equations; they are probably well known to all of you. X is the vector of mixture signals, which are linear combinations of the source signals S through a square mixing matrix A, which is N by N, and we have some additive noise V. The demixed signals Y are obtained by a linear transform W applied to the mixture signals X. The goal of the demixing is of course to get the demixed signals Y as similar as possible to the original signals S.
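As a quick illustration of this signal model, here is a minimal sketch of the two equations x = A s + v and y = W x. The concrete matrix, noise level, and source distribution are my own illustrative choices, not the talk's:

```python
import numpy as np

rng = np.random.default_rng(0)

n, L = 2, 10000                                    # n sources, L samples
# Unit-variance, non-Gaussian (Laplacian) sources, as assumed in the talk.
S = rng.laplace(scale=1 / np.sqrt(2), size=(n, L))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                         # square, invertible mixing matrix
sigma = 0.1
V = sigma * rng.standard_normal((n, L))            # additive noise

X = A @ S + V                                      # mixing:   x = A s + v
W = np.linalg.inv(A)                               # one possible demixing matrix
Y = W @ X                                          # demixing: y = W x
```

With this choice of W, the demixed signals are exactly Y = S + A^-1 V, i.e. the sources plus a transformed noise term.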
0:03:05 We make a couple of assumptions. First, the mixing process should be invertible, so A^-1 should exist. The original signals are assumed to be independent with non-Gaussian PDFs q_i, with mean zero and variance one. Furthermore, we assume that the PDFs q_i are three times continuously differentiable and that all required expectations exist. For the noise, we assume that it is zero-mean with covariance matrix sigma^2 times R_V, where sigma^2 denotes the average variance of V and R_V is a normalized covariance matrix. The PDF of the noise can be arbitrary but symmetric; this means that all odd-order moments of the noise are equal to zero. Last, we assume that the original sources S and the noise V are independent.
0:04:04 Here is the first non-blind solution for the demixing task: the inverse solution, W_inv = A^-1. It has the property that it achieves a perfect demixing in the noiseless case. However, if there is noise, there is the danger of noise amplification, and this is especially serious if the mixing matrix A is close to singular. And of course it is only possible if you know A in advance or can somehow estimate it, so it is a non-blind method.
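The noise-amplification effect is easy to demonstrate numerically. In this sketch (the near-singular matrix is my own choice), the inverse solution produces the output y = s + A^-1 v, so white noise power is blown up by the average squared entry size of A^-1:

```python
import numpy as np

# A nearly singular mixing matrix: its inverse has very large entries.
A = np.array([[1.0, 1.000],
              [1.0, 1.001]])
W_inv = np.linalg.inv(A)

# For y = s + A^-1 v with white noise v, the total output noise power is
# scaled by the squared Frobenius norm of A^-1; we report it per channel.
gain = np.linalg.norm(W_inv, 'fro') ** 2 / A.shape[0]
print(f"average noise power amplification: {gain:.1f}x")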
0:04:39 The second non-blind method is the MMSE solution, which is the matrix W that minimizes the MSE. The solution is given in this equation here, and we can approximate it in terms of sigma^2 as in the last line. The properties are again that it is identical to the inverse solution if there is no noise, so we can achieve a perfect demixing in the noiseless case. However, we need to know the mixing matrix A and the properties of the noise, or we need to be able to estimate the second-order moments between S and X. So again it is a non-blind method.
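Under the stated assumptions (unit-variance independent sources, noise with covariance sigma^2 R_V and independent of s), the MMSE demixer can be written from second-order moments as W = E[s x^T] (E[x x^T])^-1 = A^T (A A^T + sigma^2 R_V)^-1. A hedged sketch of this formula (function name and test matrix are my own):

```python
import numpy as np

def mmse_demixer(A, sigma2, R_v):
    """W minimizing E||W x - s||^2 for x = A s + v,
    assuming E[s s^T] = I and E[v v^T] = sigma2 * R_v."""
    R_sx = A.T                       # E[s x^T] = E[s s^T] A^T = A^T
    R_xx = A @ A.T + sigma2 * R_v    # E[x x^T]
    return R_sx @ np.linalg.inv(R_xx)

A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
# With no noise, the MMSE solution coincides with the inverse solution.
W = mmse_demixer(A, sigma2=0.0, R_v=np.eye(2))
print(np.allclose(W, np.linalg.inv(A)))  # True
```

For sigma2 > 0 the solution moves away from A^-1, trading some demixing accuracy for noise suppression.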
0:05:20 Now we come to the blind approach, the ICA solution. The idea of ICA is of course to make the demixed signals Y statistically independent, since we assume that the original signals are statistically independent. We can define a desired distribution q(y) of the demixed signals to be the product of the marginal densities q_i of the original sources. Then we can define a cost function, namely the Kullback-Leibler divergence between the actual PDF p of the demixed signals Y and the desired PDF q. The formula for the Kullback-Leibler divergence is given here; we just want to note that it is equal to zero if the two PDFs p and q are identical, and it is larger than zero if they are different.
0:06:15 Hence, we can solve the demixing task by minimizing this cost function using stochastic gradient descent; the update equations are given here. The update of W depends on W inverse transpose and on the correlation matrix of phi(y) and y, where the function phi_i is the negative derivative of the log-PDF of the original source. At convergence, the update of W is of course equal to zero, and this is equivalent to saying that the correlation matrix E[phi(y) y^T] is equal to the identity matrix.
0:07:04 The properties of the ICA solution are that it is equal to the inverse solution if there is no noise, but the big difference is that we do not need to know anything about A or S, so it is applicable to blind demixing. The only thing that we require is that we know the PDFs of the original sources, and the original sources must be non-Gaussian. If all the PDFs are different, then there is no permutation ambiguity, and if you know the PDFs perfectly, then there is also no scaling ambiguity; a scaling ambiguity only remains if the PDFs are estimated.
0:07:45 Now we come to the main theorem of the paper. We can show, by a Taylor series expansion of the nonlinear functions phi_i, that the ICA solution is given by this equation, where R_V tilde is a transformed correlation matrix of the noise, and N is a scaling matrix which depends on the PDFs of the original sources through the parameters kappa_i and rho_i given here. We just want to note that kappa_i is a measure of non-Gaussianity: it is equal to one if and only if s_i is Gaussian, and in all other cases it is larger than one.
0:08:28 For comparison, we have written down here the MMSE solution, and if you compare this equation with the one at the top, you can see that they are indeed quite similar, except for the scaling matrix N here. If N is approximately a matrix with all elements equal to one, then we can conclude that the ICA solution is close to the MMSE solution, and we can also show that in that case the two MSEs of the ICA solution and the MMSE solution are quite similar.
0:09:04 The elements of the scaling matrix N are determined by the PDF q_S of the sources. To make any further conclusions, we will assume a certain family of PDFs, namely the generalized Gaussian distribution. The PDF is given here, where Gamma is the Gamma function and beta is the shape parameter which controls the shape of the distribution. For example, for beta equal to two we obtain the Gaussian distribution, for beta equal to one the Laplacian distribution, and if we let beta go to infinity we get the uniform distribution.
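The unit-variance generalized Gaussian density described here can be written down directly; this is a sketch of the density with my own normalization via the Gamma function (the exact parametrization in the paper may differ), checking the Gaussian special case beta = 2:

```python
import math

def gg_pdf(s, beta):
    """Unit-variance generalized Gaussian PDF with shape parameter beta."""
    # Scale alpha chosen so that the variance equals one.
    alpha = math.sqrt(math.gamma(1.0 / beta) / math.gamma(3.0 / beta))
    c = beta / (2.0 * alpha * math.gamma(1.0 / beta))  # normalizing constant
    return c * math.exp(-((abs(s) / alpha) ** beta))

# beta = 2 reduces to the standard normal density.
print(abs(gg_pdf(0.7, 2.0) - math.exp(-0.49 / 2) / math.sqrt(2 * math.pi)) < 1e-12)  # True
```

For beta = 1 the same formula gives the unit-variance Laplacian, and as beta grows the density flattens toward the uniform distribution, matching the special cases mentioned in the talk.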
0:09:40 If we fix the variance to one, then we obtain rho equal to beta minus one, and the other parameter kappa and the elements of the scaling matrix N are given in the plot and the table here. The diagonal elements N_ii are exactly equal to kappa divided by two, and the off-diagonal elements N_ij are between 0.5 and 1.
0:10:10 But maybe more interesting than these parameters is the question of what MSE ICA can achieve and how close it can get to the MMSE estimator. For this we consider an example with two generalized Gaussian sources with the same shape parameter beta. The mixing matrix is given here, and we assume Gaussian noise with identity covariance matrix. We have studied the relative MSE, that is, the MSE of the ICA solution divided by the MSE of the MMSE estimator. As you can see from the plot on the right-hand side, the relative MSE of the ICA solution is close to one for a large range of the shape parameter beta: it is less than 1.06, so only six percent worse than the MMSE estimator. For reference, we have also calculated the relative MSE of the inverse solution for the two SNRs of 10 dB and 20 dB.
0:11:11 You can see that the blind approach, ICA, even outperforms the inverse solution, which is a non-blind method, and this also holds for a large range of values of the shape parameter beta.
0:11:27 Up to now we have considered only the theoretical results, which are valid only for an infinite amount of data, since we have evaluated all the expectations exactly. But in practice you never have an infinite amount of data, so now we want to look at an actual Kullback-Leibler-divergence-based ICA algorithm with a finite amount of data. In practice one usually does not use the standard gradient but instead the natural gradient, because it has better convergence properties; the update equation is given here.
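The natural-gradient update mentioned here, delta-W = mu (I - phi(y) y^T) W, can be sketched as follows. The Laplacian score phi(y) = sign(y), the step size, the iteration count, and the data are my own illustrative choices, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(1)
n, L = 2, 20000
S = rng.laplace(scale=1 / np.sqrt(2), size=(n, L))  # unit-variance Laplacian sources
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ S + 0.05 * rng.standard_normal((n, L))      # noisy mixtures

phi = np.sign                # score function -d/dy log q(y) for the Laplacian PDF
W = np.eye(n)
mu = 0.01                    # step size
for _ in range(2000):
    Y = W @ X
    C = phi(Y) @ Y.T / L                  # sample estimate of E[phi(y) y^T]
    W = W + mu * (np.eye(n) - C) @ W      # natural-gradient step

# At convergence, E[phi(y) y^T] should be close to the identity matrix.
print(np.round(C, 2))
```

The stopping condition mirrors the convergence statement from the talk: the update vanishes exactly when the correlation matrix E[phi(y) y^T] equals the identity.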
0:12:05 Since we are now using a finite amount of data, not only the bias of the ICA solution from the MMSE solution is important; the covariance of the estimation also contributes to the MSE. Furthermore, since we assume two identically distributed sources, ICA suffers from the permutation ambiguity, so we need to resolve this before we can calculate the MSE. And last, the scaling of the ICA components is slightly different from the scaling of the MMSE solution, so we also compensate for this before we calculate the MSE values.
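Resolving the permutation and scale before computing the MSE can be done, for instance, by matching each reference source to the ICA output it correlates with most strongly and applying a least-squares gain; this greedy matching is my own simplification of the compensation step, not necessarily the paper's procedure:

```python
import numpy as np

def align(Y, S):
    """Permute and rescale the rows of Y so each best matches a row of S."""
    n = S.shape[0]
    corr = np.abs(S @ Y.T)                    # |cross-correlations|, n x n
    Y_aligned = np.zeros_like(S)
    used = set()
    for i in range(n):
        j = max((k for k in range(n) if k not in used),
                key=lambda k: corr[i, k])     # greedy: best unused match
        used.add(j)
        g = (S[i] @ Y[j]) / (Y[j] @ Y[j])     # least-squares scaling factor
        Y_aligned[i] = g * Y[j]
    return Y_aligned

# Example: Y is S with rows swapped and rescaled; align() must undo both.
rng = np.random.default_rng(2)
S = rng.standard_normal((2, 1000))
Y = np.array([-3.0 * S[1], 0.5 * S[0]])
print(np.allclose(align(Y, S), S))  # True
```

Only after this alignment is the MSE between the aligned ICA outputs and the sources comparable to the MSE of the MMSE solution.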
0:12:46 Here, in the left plot, we show the MSE for Laplacian-distributed signals, for different SNRs and different sample sizes L. The black lines show the MMSE estimator, and the coloured lines show the actual performance of the ICA algorithm. As you can see, for a large enough sample size we can get quite close to the MMSE estimator, so we can achieve a very good MSE performance even for a low SNR.
0:13:21 We can also see that ICA outperforms the inverse solution; this is shown on the right-hand side, where we plot the relative MSE. The line marked with triangles here corresponds to the inverse solution: for low SNR it increases quite dramatically, whereas the ICA solution still yields a reasonable MSE. The point where the ICA solution and the inverse solution cross depends on the mixing matrix and the sample size.
0:13:56 One last point that I want to mention: we have also plotted the theoretical ICA solution, with the downward-triangle markers here, and you can see that it matches quite well with the performance of the actual ICA algorithm, except for the very low SNR of 0 dB. This is because we have made a small-noise assumption in the derivation, since we have only considered terms up to order sigma squared in the Taylor series.
0:14:28 We also want to study the influence of the shape parameter beta on the performance. Here we plot the relative MSE of the ICA solution for different SNRs: 10 dB, 20 dB, and 30 dB. The general trend is that the more non-Gaussian the sources, so the closer beta is to 0.5 or the larger beta is, the lower the relative MSE, except for this case here at the SNR of 10 dB.
0:15:02 One might also wonder why the relative MSE increases for increasing SNR: if you go from 10 dB SNR to 30 dB SNR, why does the relative MSE increase? This can be explained by the fact that the MSE of ICA in the noiseless case is not close to zero but is lower-bounded by the Cramer-Rao bound, which depends on kappa and rho. Hence the relative MSE increases for increasing SNR.
0:15:41 So, to summarise: in this paper we have derived the ICA solution and the MSE for the noisy case. We have seen that there exists a relation between ICA and MMSE which depends on the PDFs of the original sources, and that the ICA solution, which is of course a blind approach, is often close to the MMSE solution. We also want to state that the relation still exists when the nonlinearity phi_i does not match the true PDF. We have seen in the simulation results that we can in practice achieve an MSE close to the MMSE estimator with an ICA algorithm based on the Kullback-Leibler divergence, and that not only the bias of the ICA solution is important; the covariance of the estimation also determines part of the performance. To sum up everything, I want to state: blind demixing by ICA is in many cases similar to non-blind demixing based on MMSE. Thank you for your attention; I am happy to take questions.
0:17:20 Yeah, we are assuming that there is no time dependence.
0:17:39 It of course depends: if you assume, for example, the wrong type of distribution, say you assume the source is sub-Gaussian and it is in fact super-Gaussian, then of course ICA does not work. So you do need to assume the correct type. Beyond that, it depends on the amount of mismatch: if the mismatch is reasonably small, it is still a good approach.
0:18:19 Yeah, it could be. Indeed, we have mentioned in the paper that you could use this derivation to derive a better phi function, which could yield a lower MSE by producing a matrix N whose elements are close to one. But the problem is that this obviously depends on the SNR, so again you would need to know the noise level.