0:00:15 | oh uh i |
---|
0:00:17 | i mentioned uh this work is uh |
---|
0:00:19 | i one for and the initial a trained that for a more quickly |
---|
0:00:24 | on the the a project of do you and at least that's not C G |
---|
0:00:30 | or or i or one |
---|
0:00:32 | i know i had the opportunity to work for uh |
---|
0:00:35 | some uh is search for those |
---|
0:00:36 | sparse it |
---|
0:00:37 | i is is that uh |
---|
0:00:39 | results of out |
---|
0:00:40 | right |
---|
0:00:42 | what I would like to |
---|
0:00:43 | present is an evaluation of noise power spectral density |
---|
0:00:47 | estimation algorithms |
---|
0:00:49 | in adverse environments |
---|
0:00:52 | this is the outline of my talk |
---|
0:00:55 | I give the motivation of the work |
---|
0:00:57 | and then I give you an overview of the algorithms that we consider in the |
---|
0:01:02 | framework |
---|
0:01:04 | then the evaluation measures that we use |
---|
0:01:07 | and at the |
---|
0:01:08 | end of my talk |
---|
0:01:10 | there will be experimental results and |
---|
0:01:14 | conclusions |
---|
0:01:16 | as you know, noise power spectral density estimation, or noise power |
---|
0:01:21 | estimation, is a crucial part of speech enhancement in the frequency domain |
---|
0:01:27 | and many new algorithms have been introduced in this area |
---|
0:01:33 | but unfortunately there is no comprehensive |
---|
0:01:38 | uh |
---|
0:01:39 | framework for evaluating |
---|
0:01:41 | uh |
---|
0:01:42 | the performance of noise power estimators |
---|
0:01:45 | therefore |
---|
0:01:47 | the aims of the framework that we |
---|
0:01:50 | looked for were |
---|
0:01:51 | to present the |
---|
0:01:53 | performance of some older and recent noise power estimators |
---|
0:01:58 | and |
---|
0:02:00 | we |
---|
0:02:01 | uh |
---|
0:02:02 | furthermore add a new measure to |
---|
0:02:05 | do a more comprehensive evaluation of the performance |
---|
0:02:11 | we consider eight algorithms in our framework: the minimum statistics noise power estimator |
---|
0:02:18 | proposed by Rainer Martin in 2001, which is one of the state-of-the-art |
---|
0:02:23 | algorithms |
---|
0:02:25 | the second algorithm is minima |
---|
0:02:28 | controlled recursive averaging, or |
---|
0:02:31 | MCRA |
---|
0:02:32 | which is also a state-of-the-art algorithm in this area |
---|
0:02:36 | Cohen, 2002 |
---|
0:02:38 | the algorithms belonging to the MCRA category are |
---|
0:02:42 | the improved version of this algorithm |
---|
0:02:45 | IMCRA, and EMCRA, which is enhanced |
---|
0:02:49 | minima controlled recursive averaging |
---|
0:02:52 | and MCRA-MAP |
---|
0:02:54 | we consider the subspace noise tracking approach, or SNT, which was proposed by Hendriks in 2008 |
---|
0:03:02 | and two algorithms based on minimum mean square error estimation |
---|
0:03:07 | which are available in two |
---|
0:03:09 | different approaches |
---|
0:03:11 | MMSE-Yu and MMSE-Hendriks, 2010 |
---|
0:03:17 | for the evaluation, two key issues were taken into account |
---|
0:03:23 | the first |
---|
0:03:24 | issue is that |
---|
0:03:26 | we want the evaluation to be independent of the speech enhancement system |
---|
0:03:32 | the reason is that we want to separate the effects of any speech enhancement system |
---|
0:03:38 | from the performance of the noise power estimator |
---|
0:03:41 | we just focus on the |
---|
0:03:43 | estimation error of the noise |
---|
0:03:45 | tracker |
---|
0:03:47 | the second issue is that we need |
---|
0:03:50 | to have a |
---|
0:03:51 | suitable reference noise for our evaluation |
---|
0:03:55 | and we have |
---|
0:03:57 | three |
---|
0:03:59 | reasons for our consideration |
---|
0:04:01 | first, during speech activity |
---|
0:04:04 | the instantaneous noise power is not available |
---|
0:04:08 | uh and noise |
---|
0:04:10 | and most |
---|
0:04:10 | noise power estimation approaches |
---|
0:04:14 | also, sorry, noise reduction approaches require a smoothed |
---|
0:04:17 | version of the noise |
---|
0:04:18 | estimate |
---|
0:04:20 | and the last one is that we want to |
---|
0:04:23 | reduce the impact of random fluctuations in the |
---|
0:04:28 | original noise periodogram |
---|
0:04:30 | therefore |
---|
0:04:32 | uh |
---|
0:04:32 | we consider the |
---|
0:04:35 | reference noise, not the original |
---|
0:04:37 | the reference noise, which is the smoothed one, not |
---|
0:04:40 | the instantaneous one |
---|
0:04:42 | so |
---|
0:04:44 | uh |
---|
0:04:45 | the first evaluation measure that we consider is |
---|
0:04:48 | the mean estimation error, which is the most commonly |
---|
0:04:51 | used |
---|
0:04:53 | the estimation |
---|
0:04:54 | uh |
---|
0:04:55 | evaluation measure |
---|
0:04:56 | and is defined as the average distance between the |
---|
0:05:01 | reference noise |
---|
0:05:03 | and the |
---|
0:05:04 | estimated one |
---|
0:05:05 | if you consider the logarithm of the noise power that I have shown here |
---|
0:05:12 | the reference noise power |
---|
0:05:14 | uh |
---|
0:05:16 | and the estimated power, denoted with the hat |
---|
0:05:20 | you see the operation, the |
---|
0:05:23 | absolute log ratio, over all frequency bins and frames |
---|
0:05:28 | where capital K is the number of frequency bins and a capital R i |
---|
0:05:33 | is the number of frames |
---|
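The mean estimation error described above can be sketched as follows. This is a minimal illustration and not the speaker's implementation: it assumes the distance is the absolute value of the log ratio expressed in dB (10 times the base-10 logarithm), averaged over all frames and frequency bins. The function name `mean_log_error_db` and the list-of-frames layout are hypothetical choices made for this example.

```python
import math


def mean_log_error_db(reference, estimate):
    """Average absolute log ratio, in dB, between two noise power maps.

    Both arguments are lists of frames; each frame is a list of positive
    noise power values, one per frequency bin (assumed layout).
    """
    total = 0.0
    count = 0
    for ref_frame, est_frame in zip(reference, estimate):
        for ref_power, est_power in zip(ref_frame, est_frame):
            # Absolute log ratio in dB for one time-frequency point.
            total += abs(10.0 * math.log10(ref_power / est_power))
            count += 1
    # Normalise by the number of time-frequency points (frames times bins).
    return total / count
```

With this definition, an estimate that is ten times too large in one frame and ten times too small in another yields the same 10 dB error as a constant 10 dB bias, which is why a second measure of fluctuation is useful alongside it.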
0:05:35 | is is that you did not model uh |
---|
0:05:38 | evaluation measure |
---|
0:05:40 | and the proposed one is the estimation error variance |
---|
0:05:43 | which is |
---|
0:05:45 | proposed in this framework |
---|
0:05:47 | and in fact, if you imagine the |
---|
0:05:51 | ratio between |
---|
0:05:52 | the reference noise power |
---|
0:05:55 | the reference noise power and the estimated noise power |
---|
0:05:58 | over all frequency bins and frame indices as a matrix |
---|
0:06:03 | we divide this matrix into some sub-units and |
---|
0:06:07 | we estimate the variance of each |
---|
0:06:10 | sub-unit, then the computed variance values are |
---|
0:06:13 | averaged over the number of sub-units |
---|
0:06:17 | but |
---|
0:06:18 | and capital N |
---|
0:06:20 | is the number of sub-units |
---|
0:06:22 | in a row of the matrix and M is the number of sub-units in a column |
---|
0:06:28 | this variance operator |
---|
0:06:30 | is the operator which computes the variance of the sub-unit in the n-th row and m-th column |
---|
0:06:36 | that I show on the next slide |
---|
0:06:39 | uh |
---|
0:06:39 | here we show, for example, the |
---|
0:06:42 | do you i think the sub-unit |
---|
0:06:44 | the number of frequency bins of this sub-unit is |
---|
0:06:47 | uh |
---|
0:06:48 | case so on |
---|
0:06:49 | the number of frames in this |
---|
0:06:51 | i so |
---|
0:06:53 | so for this sub-unit we |
---|
0:06:56 | compute the variance, which is the |
---|
0:07:00 | equation to estimate the variance |
---|
0:07:03 | and here |
---|
0:07:04 | for example mu |
---|
0:07:06 | n m is the |
---|
0:07:08 | mean of the |
---|
0:07:10 | uh you know of the |
---|
0:07:11 | expected values |
---|
0:07:13 | for the sub-unit in the n-th row |
---|
0:07:16 | in our experiments we consider the number of |
---|
0:07:20 | frames for each sub-unit to be fifteen |
---|
0:07:24 | and the number of frequency bins ten |
---|
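The sub-unit procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the speaker's code: the ratio is taken as a log ratio in dB, the sub-units are non-overlapping blocks of 15 frames by 10 frequency bins (the sizes given in the talk), incomplete blocks at the edges are dropped, and the population variance is used inside each block. The function name and data layout are hypothetical.

```python
import math


def estimation_error_variance(reference, estimate,
                              frames_per_unit=15, bins_per_unit=10):
    """Average, over sub-units, of the variance of the log ratio in dB.

    Inputs are lists of frames; each frame is a list of positive noise
    power values, one per frequency bin (assumed layout).
    """
    # Log-ratio matrix between reference and estimated noise power.
    ratio = [[10.0 * math.log10(r / e) for r, e in zip(ref_f, est_f)]
             for ref_f, est_f in zip(reference, estimate)]
    num_frames = len(ratio)
    num_bins = len(ratio[0]) if ratio else 0

    variances = []
    # Walk over complete, non-overlapping sub-units only.
    for f0 in range(0, num_frames - frames_per_unit + 1, frames_per_unit):
        for b0 in range(0, num_bins - bins_per_unit + 1, bins_per_unit):
            values = [ratio[f][b]
                      for f in range(f0, f0 + frames_per_unit)
                      for b in range(b0, b0 + bins_per_unit)]
            mean = sum(values) / len(values)
            variances.append(
                sum((v - mean) ** 2 for v in values) / len(values))

    if not variances:
        raise ValueError("input is smaller than one sub-unit")
    # Average the per-sub-unit variances.
    return sum(variances) / len(variances)
```

Under these assumptions a constant bias gives a variance near zero, while an estimate that alternates between too high and too low gives a large variance even when its mean estimation error is the same.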
0:07:28 | yeah |
---|
0:07:28 | yeah |
---|
0:07:30 | I present to you the |
---|
0:07:32 | experimental settings of the algorithms |
---|
0:07:35 | as I mentioned before |
---|
0:07:37 | eight algorithms |
---|
0:07:39 | were implemented |
---|
0:07:42 | the sampling frequency of all signals is 8 kHz |
---|
0:07:46 | the window length as well as the FFT length |
---|
0:07:50 | is 256 samples, or 32 milliseconds |
---|
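The framing numbers above can be checked with a few lines of arithmetic. The constant names are hypothetical; the frequency spacing and the one-sided bin count are not stated in the talk and follow from standard properties of a 256-point transform of a real signal.

```python
SAMPLE_RATE_HZ = 8000   # sampling frequency stated in the talk
FRAME_LENGTH = 256      # window length and transform length in samples

# Duration of one analysis frame in milliseconds.
frame_duration_ms = 1000.0 * FRAME_LENGTH / SAMPLE_RATE_HZ

# Spacing between adjacent frequency bins in Hz.
bin_spacing_hz = SAMPLE_RATE_HZ / FRAME_LENGTH

# Number of non-redundant bins for a real-valued signal (0 Hz to Nyquist).
num_unique_bins = FRAME_LENGTH // 2 + 1

print(frame_duration_ms, bin_spacing_hz, num_unique_bins)
```

So a 256-sample frame at 8 kHz lasts 32 ms, matching the figure given in the talk.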
0:07:55 | uh |
---|
0:07:56 | we consider |
---|
0:07:58 | two sources of clean speech signals, taken from the TIMIT database |
---|
0:08:03 | one male speech and one female speech |
---|
0:08:06 | each of with the duration of two you |
---|
0:08:10 | created by concatenating the |
---|
0:08:11 | short segments |
---|
0:08:13 | of, for example, |
---|
0:08:15 | six different speakers |
---|
0:08:18 | and for simulating the |
---|
0:08:21 | uh |
---|
0:08:22 | adverse |
---|
0:08:23 | acoustic environments we consider seven different types of noise |
---|
0:08:28 | taken from the Sound Ideas database |
---|
0:08:31 | this simple less and the |
---|
0:08:33 | uh |
---|
0:08:35 | uh |
---|
0:08:37 | the first one is white Gaussian noise; to simulate non-stationary white Gaussian noise |
---|
0:08:43 | we consider |
---|
0:08:44 | uh |
---|
0:08:45 | bank i of noise |
---|
0:08:47 | and sinusoidally modulated white Gaussian noise |
---|
0:08:50 | babble noise, car noise, and traffic 1 and traffic 2 noise |
---|
0:08:54 | the traffic 2 noise mainly contains |
---|
0:08:58 | horn |
---|
0:08:59 | sounds |
---|
0:09:01 | and the difference between traffic 1 and traffic 2 is that the traffic 2 noise |
---|
0:09:05 | is more structured and harmonic |
---|
0:09:10 | the range of input SNRs is from -5 dB to 20 dB with a step size of |
---|
0:09:15 | 5 dB |
---|
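One common way to obtain a given input SNR is to scale the noise before adding it to the clean speech. The sketch below illustrates that idea for the SNR grid mentioned above. It assumes the SNR is defined over the whole signal as a ratio of mean powers; the talk does not say how the mixing was done, and the function names are hypothetical.

```python
import math

# Input SNR grid from the talk: -5 dB to 20 dB in steps of 5 dB.
INPUT_SNRS_DB = list(range(-5, 25, 5))


def mean_power(signal):
    """Mean squared value of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def scale_noise_to_snr(speech, noise, snr_db):
    """Return the noise scaled so that speech power over noise power
    equals the requested SNR in dB (global SNR, an assumption)."""
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [gain * n for n in noise]


def mix(speech, noise, snr_db):
    """Noisy signal: clean speech plus the scaled noise."""
    scaled = scale_noise_to_snr(speech, noise, snr_db)
    return [s + n for s, n in zip(speech, scaled)]
```

The grid contains six values, which agrees with the six SNR levels shown in the result plots later in the talk.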
0:09:17 | uh |
---|
0:09:18 | as I said |
---|
0:09:20 | before, the reference noise is |
---|
0:09:23 | important for our evaluation |
---|
0:09:25 | finally we decided to |
---|
0:09:28 | consider the recursively temporally smoothed version of the noise periodogram |
---|
0:09:33 | with a |
---|
0:09:35 | uh |
---|
0:09:35 | smoothing factor of 0.9 |
---|
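A recursively smoothed periodogram of this kind is usually computed with a first-order recursion per frequency bin. The sketch below assumes that form, with the smoothing factor of 0.9 from the talk, and assumes the first frame is used as the initial value; the talk does not state the initialisation, and the function name is hypothetical.

```python
def smooth_periodogram(periodogram, alpha=0.9):
    """First-order recursive smoothing over time, per frequency bin.

    periodogram is a list of frames; each frame is a list of noise
    periodogram values (squared magnitudes), one per frequency bin.
    """
    smoothed = []
    previous = None
    for frame in periodogram:
        if previous is None:
            # Assumed initialisation: start from the first frame.
            current = list(frame)
        else:
            # Weighted sum of the previous smoothed value and the new frame.
            current = [alpha * p + (1.0 - alpha) * x
                       for p, x in zip(previous, frame)]
        smoothed.append(current)
        previous = current
    return smoothed
```

A factor of 0.9 means each new frame contributes 10 percent to the reference, so frame-to-frame fluctuations of the raw periodogram are strongly attenuated while slower changes in noise level are still followed.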
0:09:40 | here I show the periodogram of the noise power |
---|
0:09:44 | which is averaged over |
---|
0:09:47 | frequency bins |
---|
0:09:49 | and is |
---|
0:09:50 | plotted over the frame indices |
---|
0:09:53 | here you see the |
---|
0:09:57 | noise power of white Gaussian noise, modulated white Gaussian noise |
---|
0:10:02 | uh |
---|
0:10:02 | on to the |
---|
0:10:03 | traffic |
---|
0:10:04 | noise |
---|
0:10:05 | and you see the |
---|
0:10:06 | stationarity of the noise |
---|
0:10:09 | and uh are also a stationary |
---|
0:10:10 | someone |
---|
0:10:17 | yeah |
---|
0:10:17 | here are the results of our evaluation |
---|
0:10:21 | in terms of |
---|
0:10:23 | mean estimation error |
---|
0:10:26 | and estimation error variance for white Gaussian noise |
---|
0:10:31 | it is depicted |
---|
0:10:33 | for all eight algorithms |
---|
0:10:35 | subspace noise tracking, MMSE-Hendriks |
---|
0:10:38 | minimum statistics, and |
---|
0:10:41 | MMSE-Yu |
---|
0:10:43 | and it is |
---|
0:10:45 | depicted for |
---|
0:10:48 | six levels of signal-to-noise ratio |
---|
0:10:52 | by different colours |
---|
0:10:54 | and you see that for |
---|
0:10:57 | low signal-to-noise ratios most of the algorithms perform more or less the same |
---|
0:11:02 | but by increasing the signal-to-noise ratio |
---|
0:11:05 | uh |
---|
0:11:06 | some of the algorithms |
---|
0:11:08 | seem to be more susceptible |
---|
0:11:11 | and here |
---|
0:11:14 | I show the results for |
---|
0:11:15 | non-stationary white Gaussian noise, the sinusoidally modulated one |
---|
0:11:19 | and you see some of the algorithms are not robust in tracking the noise power |
---|
0:11:25 | but the others |
---|
0:11:27 | like subspace noise tracking |
---|
0:11:30 | MMSE-Hendriks and MMSE-Yu |
---|
0:11:33 | at low signal-to-noise ratios seem to be robust |
---|
0:11:38 | and fast in tracking the noise power |
---|
0:11:40 | what uh |
---|
0:11:42 | the um |
---|
0:11:44 | in terms of the |
---|
0:11:46 | estimation error variance you also see |
---|
0:11:48 | the same result in the ranking of the algorithms |
---|
0:11:53 | more |
---|
0:11:55 | here are the |
---|
0:11:56 | results for babble noise |
---|
0:11:59 | that's |
---|
0:12:00 | uh |
---|
0:12:01 | you presents here |
---|
0:12:05 | and here is the plot for traffic 2 noise |
---|
0:12:08 | we actually selected this noise for the subspace noise tracking algorithm |
---|
0:12:14 | in this algorithm one of the assumptions is that |
---|
0:12:20 | uh |
---|
0:12:21 | the noise |
---|
0:12:22 | shouldn't be |
---|
0:12:24 | uh |
---|
0:12:25 | structured or harmonic, because |
---|
0:12:27 | uh |
---|
0:12:28 | the |
---|
0:12:29 | uh |
---|
0:12:30 | uh this |
---|
0:12:31 | harmonic signals |
---|
0:12:33 | uh can be calm it |
---|
0:12:34 | the extra we uh low rank model |
---|
0:12:37 | and can be confused with the speech signal in the signal subspace |
---|
0:12:41 | of course there are some modifications to this |
---|
0:12:44 | algorithm that are introduced in the paper |
---|
0:12:47 | but for the |
---|
0:12:49 | algorithm without modification you see |
---|
0:12:53 | the amount of the mean estimation error |
---|
0:12:56 | which is |
---|
0:12:57 | worse than the |
---|
0:12:59 | mm as |
---|
0:13:01 | so |
---|
0:13:05 | yeah a is the |
---|
0:13:07 | actually I showed here only some of the results of our evaluation |
---|
0:13:11 | because of the |
---|
0:13:13 | limited time |
---|
0:13:15 | uh |
---|
0:13:16 | and one of the important points that we can take from the evaluation is that |
---|
0:13:21 | the estimation error variance gives us additional insight for the evaluation |
---|
0:13:28 | of the performance |
---|
0:13:30 | because using the estimation error variance |
---|
0:13:33 | we can measure the amount of fluctuations in the noise power |
---|
0:13:37 | uh |
---|
0:13:38 | in the estimated noise power |
---|
0:13:41 | for example, the performance of two methods may be very close in terms of |
---|
0:13:47 | mean estimation error |
---|
0:13:50 | while having the estimation error variance we can |
---|
0:13:53 | get a more comprehensive view of the performance, and I demonstrate my claim |
---|
0:14:00 | by |
---|
0:14:00 | showing an example |
---|
0:14:02 | uh for example can see if you made speech signal to by uh sinusoidally modulated noise what question noise |
---|
0:14:10 | at twenty db signal to a issue |
---|
0:14:13 | in this figure are uh the |
---|
0:14:15 | you look is the speak in a speech for power |
---|
0:14:19 | uh at the green a curve is the estimated noise at to live quite in with them |
---|
0:14:25 | and the red curve is the estimated noise by the |
---|
0:14:29 | MCRA-MAP algorithm, and |
---|
0:14:32 | the reference noise is |
---|
0:14:33 | uh |
---|
0:14:35 | the black curve |
---|
0:14:36 | uh |
---|
0:14:37 | you see the different behavior of the algorithms in tracking the noise power |
---|
0:14:42 | for example |
---|
0:14:45 | the MCRA-MAP algorithm |
---|
0:14:46 | denoted by the red curve |
---|
0:14:48 | has an |
---|
0:14:50 | underestimation of the noise power |
---|
0:14:52 | uh |
---|
0:14:53 | with no |
---|
0:14:54 | fast |
---|
0:14:55 | reaction in its tracking of the noise |
---|
0:14:58 | while the |
---|
0:15:01 | other algorithm gives |
---|
0:15:04 | some |
---|
0:15:05 | overestimation of the noise power with some |
---|
0:15:07 | uh |
---|
0:15:08 | fluctuations |
---|
0:15:10 | and these fluctuations are actually |
---|
0:15:13 | uh |
---|
0:15:15 | following or tracking the |
---|
0:15:17 | speech component |
---|
0:15:20 | so it is important for us to predict |
---|
0:15:23 | whether the error is related to underestimation, overestimation, or fluctuations |
---|
0:15:29 | here, by using the |
---|
0:15:32 | mean estimation error, you see |
---|
0:15:34 | uh that the |
---|
0:15:39 | uh you see that the |
---|
0:15:41 | estimated value of the mean estimation error is |
---|
0:15:44 | very close |
---|
0:15:49 | and also the performance of the other algorithm is somewhat better than |
---|
0:15:54 | uh |
---|
0:15:56 | so |
---|
0:15:56 | that of MCRA-MAP |
---|
0:15:59 | but in terms of the estimation error variance you see that the |
---|
0:16:04 | MCRA-MAP algorithm gives a lower |
---|
0:16:07 | uh |
---|
0:16:08 | variance |
---|
0:16:09 | and |
---|
0:16:10 | this shows the |
---|
0:16:11 | preferability of |
---|
0:16:14 | MCRA-MAP, because it |
---|
0:16:15 | meant |
---|
0:16:16 | gives fewer fluctuations, or |
---|
0:16:21 | is smoother in |
---|
0:16:22 | the tracking of noise power |
---|
0:16:27 | here |
---|
0:16:27 | I |
---|
0:16:29 | end my presentation by giving some conclusions of the framework |
---|
0:16:33 | the first |
---|
0:16:35 | conclusion is that some noise power estimators are susceptible |
---|
0:16:39 | and sensitive to the increase |
---|
0:16:42 | of the |
---|
0:16:42 | signal-to-noise ratio |
---|
0:16:44 | and for some of them you see robustness with respect to the signal-to-noise ratio, but for |
---|
0:16:51 | others not |
---|
0:16:52 | another conclusion is that by having the |
---|
0:16:57 | estimation error variance we can |
---|
0:17:00 | get better |
---|
0:17:01 | uh |
---|
0:17:02 | insight |
---|
0:17:03 | towards |
---|
0:17:04 | comparing the noise power estimators |
---|
0:17:07 | and in fact |
---|
0:17:10 | these fluctuations may |
---|
0:17:12 | produce |
---|
0:17:13 | some musical noise at the end of the speech enhancement, in the enhanced speech |
---|
0:17:19 | so it is important to |
---|
0:17:20 | predict the amount of |
---|
0:17:22 | these fluctuations |
---|
0:17:23 | and a further conclusion is that for non-stationary noise types |
---|
0:17:29 | uh |
---|
0:17:30 | few algorithms can give us |
---|
0:17:33 | the |
---|
0:17:33 | fast tracking of noise power |
---|
0:17:35 | and according to our experiments we found that the |
---|
0:17:39 | MMSE-Hendriks algorithm is the |
---|
0:17:43 | most robust one |
---|
0:17:44 | and |
---|
0:17:45 | it can |
---|
0:17:46 | guarantee that |
---|
0:17:48 | the enhanced speech at the end of speech enhancement gives us |
---|
0:17:52 | more improvement in signal-to-noise ratio |
---|
0:17:55 | but we cannot claim that this |
---|
0:17:58 | noise power estimator will give us better intelligibility |
---|
0:18:02 | it should be tested |
---|
0:18:03 | and another the uh |
---|
0:18:05 | well |
---|
0:18:07 | i Q and |
---|
0:18:08 | yeah i |
---|
0:18:13 | so at the end we actually know |
---|
0:18:15 | which algorithms |
---|
0:18:17 | we should use in which |
---|
0:18:19 | situations |
---|
0:18:20 | but |
---|
0:18:20 | there might be |
---|
0:18:22 | one question |
---|
0:18:23 | what about complexity, could you comment on that |
---|
0:18:27 | you know |
---|
0:18:28 | if we |
---|
0:18:30 | consider the algorithms which track |
---|
0:18:33 | better in terms of mean estimation error |
---|
0:18:36 | uh uh uh |
---|
0:18:38 | MMSE-Hendriks is not complex |
---|
0:18:40 | I mean, in comparison to subspace noise tracking it |
---|
0:18:45 | performs fast |
---|
0:18:47 | okay, thank you |
---|
0:18:49 | the same question |
---|
0:18:50 | for a question |
---|
0:18:54 | john |
---|
0:18:57 | to to this just to your just time tracking or inside one of asked this question your |
---|
0:19:02 | in the babble noise it looked as though the power wasn't changing as much in one of your |
---|
0:19:09 | plots there |
---|
0:19:10 | it looked a little bit more constant than I thought it might be |
---|
0:19:13 | are you actually using babble, or are you using a large crowd noise |
---|
0:19:18 | uh large scale oh parts are you know you sure back but we normally you think and you can actually |
---|
0:19:23 | hear individual speakers or individual words, so is it distinguishable, or |
---|
0:19:29 | is it more broadband, more |
---|
0:19:32 | pinkish maybe |
---|
0:19:34 | right in this figure for example yeah like the |
---|
0:19:38 | not |
---|
0:19:38 | that that's what i'm looking at "'cause" you have a couple of spots were kind of shot you know |
---|
0:19:44 | okay, no further questions, thank you once more |
---|