0:00:15 | i |
---|
0:00:15 | but |
---|
0:00:18 | thank you |
---|
0:00:19 | and |
---|
0:00:20 | good morning the work has been carried out in the |
---|
0:00:25 | department of electrical and computer engineering at the university of patras in greece |
---|
0:00:29 | by eleftheria georganti professor mourjopoulos and myself |
---|
0:00:33 | and the work is on the binaural extension of |
---|
0:00:37 | single-channel spectral subtraction |
---|
0:00:39 | dereverberation algorithms |
---|
0:00:42 | dereverberation has been a challenging research issue for at least four decades |
---|
0:00:47 | and |
---|
0:00:48 | now dereverberation techniques are applied either as a standalone process |
---|
0:00:53 | in order to enhance the reverberant signal's quality |
---|
0:00:56 | or even increase the reverberant speech intelligibility |
---|
0:01:01 | or as preprocessing steps before several other signal processing algorithms and applications in order to increase their performance |
---|
0:01:10 | and when one is developing binaural dereverberation algorithms |
---|
0:01:15 | one should also take into account some constraints |
---|
0:01:18 | that are imposed by the binaural aspect of the auditory system |
---|
0:01:23 | so as we all know when the sound arrives at the left and the right ear channel |
---|
0:01:29 | here |
---|
0:01:29 | of the listener |
---|
0:01:30 | it arrives with a relative delay and a relative level difference |
---|
0:01:36 | and these so-called binaural cues are important for the localization of |
---|
0:01:40 | sound in the sound space |
---|
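Editor's sketch of the two binaural cues mentioned above, the interaural time difference (relative delay) and the interaural level difference. The function names, the broadband energy-ratio definition of the level difference, and the cross-correlation search for the delay are illustrative assumptions, not something stated in the talk.

```python
import math

def ild_db(left, right):
    """Interaural level difference in dB (positive when the left channel is louder)."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10(e_left / e_right)

def itd_samples(left, right, max_lag):
    """Lag in samples that maximises the cross-correlation.

    A positive value means the right channel lags behind the left one.
    """
    def xcorr(lag):
        return sum(left[n] * right[n + lag]
                   for n in range(len(left))
                   if 0 <= n + lag < len(right))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

# Toy example: the right channel is the left one delayed by 3 samples and halved.
left = [0.0] * 32
left[5] = 1.0
left[6] = -0.5
right = [0.0] * 3 + [0.5 * x for x in left[:-3]]

print(itd_samples(left, right, 8))
print(round(ild_db(left, right), 2))
```

Any processing that changes these two quantities between the input and the output pair risks shifting the perceived source position.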
0:01:42 | and these should definitely be preserved by binaural signal processing in general |
---|
0:01:48 | and more specifically by binaural dereverberation algorithms |
---|
0:01:53 | on the other hand binaural dereverberation |
---|
0:01:55 | has very appealing applications |
---|
0:01:58 | it can be applied in hearing aids |
---|
0:02:00 | in binaural telephony in hands-free devices |
---|
0:02:03 | in most of the code |
---|
0:02:04 | a telecommunications |
---|
0:02:07 | so recently we have proposed in our lab some single-channel dereverberation algorithms |
---|
0:02:14 | we have proposed a framework for improving existing single-channel spectral subtraction dereverberation algorithms |
---|
0:02:21 | we have also presented a novel method of high computational complexity that gives |
---|
0:02:29 | perceptually good results |
---|
0:02:31 | which is based on perceptual reverberation modeling |
---|
0:02:34 | and also a fast semi-blind dereverberation method |
---|
0:02:38 | which is based on a handclap recording |
---|
0:02:40 | which targets |
---|
0:02:41 | speech applications |
---|
0:02:43 | so |
---|
0:02:43 | the straightforward step for us was to extend |
---|
0:02:47 | such techniques to the binaural context |
---|
0:02:51 | and |
---|
0:02:53 | the most obvious thing to do was to extend the spectral subtraction dereverberation which is |
---|
0:02:59 | a technique of low computational complexity when compared to more sophisticated |
---|
0:03:04 | and what that the remote pro |
---|
0:03:07 | so the specific aims of this work |
---|
0:03:10 | are to propose a single framework for the extension of single-channel spectral subtraction dereverberation algorithms |
---|
0:03:18 | to introduce an efficient way |
---|
0:03:20 | to prevent overestimation errors |
---|
0:03:23 | and also to evaluate the proposed framework with several state-of-the-art spectral subtraction dereverberation techniques |
---|
0:03:32 | um |
---|
0:03:32 | spectral subtraction was originally proposed for denoising applications |
---|
0:03:37 | but recently it has been applied for the suppression of late reverberation |
---|
0:03:42 | we all know in room acoustics that after the direct sound |
---|
0:03:46 | the early reflections arrive these are discrete echoes that come from the close surfaces and produce spectral |
---|
0:03:52 | degradation that is perceived as colouration |
---|
0:03:55 | in the diffuse field the late reverberation arrives |
---|
0:03:58 | which has decaying noise-like characteristics |
---|
0:04:02 | and is perceived as the well-known reverberant tails of the signal |
---|
0:04:07 | so in the late reverberation suppression context spectral subtraction |
---|
0:04:11 | gives the anechoic estimation by simply subtracting from the reverberant signal an |
---|
0:04:17 | estimation of the late reverberation |
---|
0:04:20 | and for most late reverberation suppression methods that work in this way the issue is |
---|
0:04:24 | how to estimate exactly this late reverberation spectrum or power spectrum depending on the method |
---|
0:04:31 | and let's look at some state-of-the-art methods |
---|
0:04:34 | yeah |
---|
0:04:35 | the methods proposed by wu and wang and by furuya and kataoka from now on we will refer to them |
---|
0:04:40 | as WW and FK |
---|
0:04:42 | are making some assumptions on the reverberant signal's statistics |
---|
0:04:46 | while the well-known dereverberation technique from lebart et al |
---|
0:04:50 | from now on we refer to this as LB |
---|
0:04:53 | is based on an assumption on the reverberation characteristics |
---|
0:04:57 | keep in mind that |
---|
0:04:59 | we can easily express the subtraction |
---|
0:05:03 | principle as a gain multiplication |
---|
0:05:06 | in the frequency domain by deriving the appropriate gain |
---|
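Editor's sketch of the equivalence just mentioned, assuming the power-spectrum variant of spectral subtraction. Subtracting an estimate of the late reverberant power from the reverberant power in each frequency bin gives the same result as multiplying the reverberant magnitude by a gain G = sqrt(1 - R/Y). The function name and the optional `floor` parameter are illustrative assumptions, and each of the methods discussed in the talk derives its own gain.

```python
import math

def subtraction_gain(reverberant_power, late_power, floor=0.0):
    """Per-bin gain equivalent to power spectral subtraction.

    reverberant_power: |Y|^2 for each frequency bin of one frame
    late_power:        estimated late reverberation power |R|^2 per bin
    floor:             lower bound on the gain (hypothetical parameter)
    """
    gains = []
    for y, r in zip(reverberant_power, late_power):
        g_squared = 1.0 - r / y if y > 0.0 else 0.0
        # A negative value means the late reverberation was overestimated.
        gains.append(max(math.sqrt(max(g_squared, 0.0)), floor))
    return gains

y_pow = [4.0, 1.0, 2.0]
r_pow = [1.0, 2.0, 0.0]
g = subtraction_gain(y_pow, r_pow)
print(g)
```

In the first bin the gain squared times the reverberant power equals 4.0 minus 1.0, which is the plain subtraction result. In the second bin the estimate exceeds the observed power, so the gain is clamped instead of becoming imaginary.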
0:05:11 | so the |
---|
0:05:11 | straightforward approach in the binaural context would be to implement this late reverberation suppression technique independently |
---|
0:05:20 | for the left and the right channel |
---|
0:05:23 | but it has been shown that such bilateral signal processing will destroy the binaural cues and |
---|
0:05:29 | it will distort the localization in the produced signal |
---|
0:05:34 | and in the bibliography |
---|
0:05:37 | jeub et al have proposed |
---|
0:05:40 | a spectral subtraction extension which is based on the delay-and-sum beamformer |
---|
0:05:45 | uh |
---|
0:05:46 | by beamforming that is by actually adding the left and the right channels and synchronizing them |
---|
0:05:51 | um |
---|
0:05:52 | it produces a reference signal it then makes the late reverberation estimation on this signal and then it applies spectral |
---|
0:05:59 | subtraction independently |
---|
0:06:01 | in the left and the right ear |
---|
0:06:03 | and so the binaural cues are preserved |
---|
0:06:08 | in this work |
---|
0:06:09 | we make an extra assumption the relative delay between the two ears actually |
---|
0:06:16 | depends on the width of the human head and |
---|
0:06:18 | it can be assumed that it will be smaller than the typical analysis windows |
---|
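Editor's back-of-the-envelope check of this assumption. All numbers below (head width, speed of sound, window length) are assumed typical values and do not come from the talk. The straight-line path is also a simplification, since sound travels around the head and the real maximum delay is somewhat larger.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees Celsius

def max_interaural_delay_s(head_width_m):
    """Upper bound on the interaural delay for a straight path across the head."""
    return head_width_m / SPEED_OF_SOUND_M_S

head_width_m = 0.18   # assumed typical adult head width
window_s = 0.020      # assumed analysis window length of 20 ms

delay_s = max_interaural_delay_s(head_width_m)
fraction = delay_s / window_s
print(delay_s, fraction)
```

Under these assumptions the delay is about half a millisecond, a few percent of the window, so both ear signals in a frame cover essentially the same portion of the source.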
0:06:23 | so for this work we omit the delay-and-sum beamformer stage |
---|
0:06:28 | and we propose a binaural extension of single-channel spectral subtraction dereverberation |
---|
0:06:35 | based on bilateral gain adaptation |
---|
0:06:38 | here we see the signal flow of the proposed approach |
---|
0:06:41 | uh |
---|
0:06:42 | separately from the left and the right reverberant frames we make two different estimations |
---|
0:06:49 | in order to derive the bilateral gains |
---|
0:06:52 | then these gains are combined |
---|
0:06:54 | with a chosen gain adaptation scheme |
---|
0:06:57 | in order to give us the binaural gain |
---|
0:06:59 | then |
---|
0:07:00 | a gain magnitude regularization scheme that prevents overestimation errors which we introduce here is applied |
---|
0:07:06 | in order to give us a constrained binaural gain |
---|
0:07:09 | which is separately and independently |
---|
0:07:12 | applied to the left and the right frame |
---|
0:07:16 | for the gain adaptation in this work we chose to use the following strategies |
---|
0:07:21 | by taking the max gain in each frequency bin we achieve moderate suppression and fewer processing artifacts |
---|
0:07:28 | taking the average gain is a compromise between the reverberation reduction and the processing artifacts |
---|
0:07:34 | while the minimum gain gives significant reverberation suppression but |
---|
0:07:38 | it can easily introduce artifacts |
---|
0:07:41 | so the selection of the gain adaptation scheme is made according to the application scenario |
---|
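Editor's sketch of the three gain adaptation strategies described above. The left and right gains are combined per frequency bin into one binaural gain, which is then applied to both channels. The function names and the list-based frame representation are illustrative assumptions.

```python
def binaural_gain(gain_left, gain_right, strategy="avg"):
    """Combine per-bin left and right gains into a single binaural gain."""
    combine = {
        "max": max,                          # least suppression, fewest artifacts
        "avg": lambda a, b: 0.5 * (a + b),   # compromise
        "min": min,                          # most suppression, most artifact-prone
    }[strategy]
    return [combine(gl, gr) for gl, gr in zip(gain_left, gain_right)]

def apply_gain(magnitudes, gain):
    """Scale the magnitude of each frequency bin by the corresponding gain."""
    return [g * m for g, m in zip(gain, magnitudes)]

g_left = [0.2, 0.8]
g_right = [0.6, 0.4]
g_bin = binaural_gain(g_left, g_right, "avg")

left_mag = [2.0, 1.0]
right_mag = [1.0, 0.5]
out_left = apply_gain(left_mag, g_bin)
out_right = apply_gain(right_mag, g_bin)
print(g_bin, out_left, out_right)
```

Because the same real-valued gain multiplies both channels, the per-bin level ratio between left and right is unchanged, and a real gain leaves the phase (and with it the relative delay) untouched.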
0:07:47 | however |
---|
0:07:48 | these blind methods are prone to introducing signal artifacts and in order to prevent such |
---|
0:07:55 | overestimation artifacts |
---|
0:07:57 | um |
---|
0:07:58 | we have |
---|
0:07:59 | introduced here a gain magnitude regularization step |
---|
0:08:04 | which is implemented |
---|
0:08:06 | with a low signal-to-reverberation ratio detector |
---|
0:08:09 | the assumption here is that |
---|
0:08:13 | musical noise or other overestimation artifacts |
---|
0:08:18 | are more likely to be present in low signal-to-reverberation ratio frames |
---|
0:08:24 | and this gain magnitude regularization scheme |
---|
0:08:28 | depends on a regularization threshold theta |
---|
0:08:32 | and |
---|
0:08:33 | on a regularization ratio r |
---|
0:08:35 | these are user-defined parameters that can be adjusted |
---|
0:08:39 | in order to control the suppression rate |
---|
0:08:42 | so |
---|
0:08:44 | properly adjusting these parameters can compensate for overestimation errors |
---|
0:08:49 | and prevent musical noise |
---|
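Editor's sketch of a gain regularization step with a threshold and a ratio. The talk does not spell out the formula, so the rule below is an assumption made for illustration: in frames flagged as low signal-to-reverberation ratio, gains above the threshold `theta` are compressed toward it by the ratio `r`. The detector itself is not modeled; its decision is passed in as a boolean. The paper should be consulted for the authors' actual definition.

```python
def regularize_gain(gain, low_srr_frame, theta=0.2, r=4.0):
    """Compress gain values above theta in low-SRR frames (assumed rule).

    gain:          per-bin gains of one frame, each between 0 and 1
    low_srr_frame: True if a (hypothetical) detector flagged the frame
    theta:         threshold above which the compression applies
    r:             compression ratio, larger values compress more strongly
    """
    if not low_srr_frame:
        return list(gain)
    return [(g - theta) / r + theta if g > theta else g for g in gain]

frame_gain = [0.1, 0.2, 0.6, 1.0]
constrained = regularize_gain(frame_gain, True, theta=0.2, r=4.0)
untouched = regularize_gain(frame_gain, False)
print(constrained, untouched)
```

Under this assumed rule, isolated high-gain bins in a mostly suppressed frame, which are a typical source of musical noise, are pulled down toward `theta`, while frames that are not flagged pass through unchanged.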
0:08:52 | to further explain the use of these parameters |
---|
0:08:56 | these are typical spectral gain functions |
---|
0:09:00 | and now by keeping theta at zero point two and r equal to |
---|
0:09:05 | four and r equal to eight we can see how the gain functions |
---|
0:09:10 | change |
---|
0:09:12 | and |
---|
0:09:12 | by keeping r constant we can change theta |
---|
0:09:17 | to zero point four and zero point six |
---|
0:09:20 | so |
---|
0:09:21 | from here we can see that theta can be used for the coarse |
---|
0:09:26 | control of the suppression rate |
---|
0:09:28 | while the parameter r can be used for fine-tuning the method |
---|
0:09:34 | let's present some results |
---|
0:09:36 | these results |
---|
0:09:39 | are made with measured |
---|
0:09:44 | impulse responses |
---|
0:09:45 | these specific ones are taken from the aachen database |
---|
0:09:50 | in the stairway with a reverberation time of |
---|
0:09:54 | zero point seven seconds approximately |
---|
0:09:56 | in order to evaluate the results |
---|
0:09:58 | we used two metrics the signal-to-reverberation |
---|
0:10:01 | ratio difference when compared to |
---|
0:10:04 | the reverberant signal |
---|
0:10:06 | so positive differences denote |
---|
0:10:09 | more significant reduction |
---|
0:10:11 | and also the PESQ difference when compared to the reverberant signal |
---|
0:10:17 | which relates more to the perceptual |
---|
0:10:19 | quality of the final result |
---|
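Editor's sketch of a signal-to-reverberation ratio difference. The talk does not define the metric, so this assumes a simple whole-signal definition with a time-aligned clean reference; published evaluations often use a segmental, frame-averaged variant instead. The function names are illustrative.

```python
import math

def srr_db(clean, signal):
    """Energy of the clean reference over the energy of the residual, in dB."""
    num = sum(s * s for s in clean)
    den = sum((s - x) ** 2 for s, x in zip(clean, signal))
    return 10.0 * math.log10(num / den)

def srr_difference_db(clean, reverberant, processed):
    """Positive values mean the processed signal is closer to the clean one."""
    return srr_db(clean, processed) - srr_db(clean, reverberant)

clean = [1.0, 0.0, 0.0, 0.0]
reverberant = [1.0, 0.5, 0.25, 0.125]
processed = [1.0, 0.25, 0.125, 0.0625]   # reverberant tail halved in amplitude

delta = srr_difference_db(clean, reverberant, processed)
print(round(delta, 2))
```

Halving the amplitude of the tail divides the residual energy by four, which corresponds to an improvement of about 6 dB in this toy case.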
0:10:22 | we implemented |
---|
0:10:24 | these |
---|
0:10:25 | three binaural gain adaptation strategies |
---|
0:10:28 | as well as the delay-and-sum beamformer in three state-of-the-art spectral subtraction dereverberation algorithms |
---|
0:10:34 | LB WW and FK |
---|
0:10:37 | and as we can see |
---|
0:10:38 | all of the tested techniques significantly reduce reverberation |
---|
0:10:43 | as we expected the min gain adaptation scheme reduces more reverberation while the max gain less |
---|
0:10:49 | and when seeing the |
---|
0:10:51 | PESQ difference which makes more sense from a perceptual point of view we can see that |
---|
0:10:56 | the WW method with the min gain technique |
---|
0:10:59 | gives slightly better results |
---|
0:11:03 | these results are taken in the cafeteria from the oldenburg database |
---|
0:11:07 | and |
---|
0:11:09 | this cafeteria has a |
---|
0:11:11 | high reverberation time of one point three seconds |
---|
0:11:15 | and |
---|
0:11:17 | as we can see the reverberation reduction here is |
---|
0:11:23 | smaller |
---|
0:11:24 | and it seems that |
---|
0:11:26 | such techniques in such reverberant conditions |
---|
0:11:30 | enhance the final signals |
---|
0:11:33 | but on the other hand the enhancement is less than in the previous case |
---|
0:11:39 | again the WW technique achieved better results |
---|
0:11:45 | in terms of |
---|
0:11:46 | SRR and PESQ |
---|
0:11:49 | and the best results were observed for the average gain adaptation scheme |
---|
0:11:57 | so in order to present some further evaluation we conducted |
---|
0:12:02 | a subjective evaluation test |
---|
0:12:04 | this test was based on the ITU-T |
---|
0:12:08 | P.835 recommendation |
---|
0:12:12 | and |
---|
0:12:13 | seventeen test subjects participated in the test |
---|
0:12:16 | we made pilot tests in order to |
---|
0:12:20 | choose the best adaptation |
---|
0:12:22 | scheme for each technique so for the LB and WW techniques |
---|
0:12:28 | the average gain adaptation was chosen while for the FK technique the min gain adaptation |
---|
0:12:33 | was chosen |
---|
0:12:35 | and the test subjects were asked |
---|
0:12:37 | to rate the speech naturalness |
---|
0:12:40 | the reverberation intrusiveness and the overall quality of |
---|
0:12:44 | the speech signals |
---|
0:12:47 | on a MOS scale from zero to five |
---|
0:12:51 | so from these results |
---|
0:12:54 | we can see that the test subjects |
---|
0:12:57 | rated the dereverberated signals |
---|
0:13:00 | as less natural in all cases |
---|
0:13:03 | however |
---|
0:13:04 | we notice a significant reverberation reduction |
---|
0:13:09 | and also |
---|
0:13:10 | at least the LB and WW techniques preserve the overall signal quality |
---|
0:13:18 | and unfortunately we need |
---|
0:13:20 | um |
---|
0:13:21 | headphones in order to give you some demos |
---|
0:13:23 | but if anyone is interested |
---|
0:13:26 | the demos are available on the website of our group |
---|
0:13:30 | the website is also written in the paper |
---|
0:13:36 | so to sum up |
---|
0:13:38 | and |
---|
0:13:39 | we have introduced a framework for binaural spectral subtraction dereverberation |
---|
0:13:43 | which is based on bilateral gain adaptation |
---|
0:13:47 | the gain magnitude regularization scheme that we introduced can reduce the overestimation errors |
---|
0:13:52 | and prevent |
---|
0:13:56 | some |
---|
0:13:58 | processing degradations |
---|
0:14:02 | the selection of the adaptation scheme and the GMR parameters |
---|
0:14:05 | can be made according to the application scenario |
---|
0:14:09 | and there is also significant reverberation reduction |
---|
0:14:12 | while the overall speech quality and the binaural cues |
---|
0:14:17 | can be preserved |
---|
0:14:19 | however |
---|
0:14:20 | we noticed some loss of speech naturalness |
---|
0:14:24 | so for us this indicates the need for native binaural models |
---|
0:14:28 | models that take into account the binaural properties of the auditory system |
---|
0:14:33 | and |
---|
0:14:33 | this is what we are working on right now |
---|
0:14:37 | thank you very much |
---|
0:14:42 | okay um we have time for a few questions |
---|
0:14:46 | for the questions can you just use the microphone over there |
---|
0:14:52 | any questions from the audience |
---|
0:14:59 | any questions |
---|
0:15:01 | okay maybe i'll just start |
---|
0:15:03 | okay how do you |
---|
0:15:05 | comment on the accuracy of this binaural |
---|
0:15:09 | cues preservation |
---|
0:15:12 | this is a big problem because actually |
---|
0:15:15 | a perceptual test |
---|
0:15:17 | that can exactly |
---|
0:15:22 | report on this |
---|
0:15:24 | needs a really controlled |
---|
0:15:29 | environment |
---|
0:15:30 | and so it was really difficult to do so actually |
---|
0:15:34 | i think that i'm not aware of any test for dereverberation algorithms that |
---|
0:15:42 | can exactly predict this |
---|
0:15:44 | binaural cues preservation |
---|
0:15:47 | this is left for further investigation |
---|
0:15:51 | so you have not done any subjective test on this |
---|
0:15:53 | on this no |
---|
0:15:59 | other questions |
---|
0:16:03 | you know we best you really |
---|
0:16:08 | uh |
---|
0:16:08 | another question is how do you determine these parameters |
---|
0:16:12 | you know the GMR |
---|
0:16:14 | or |
---|
0:16:14 | uh |
---|
0:16:15 | these parameters |
---|
0:16:16 | actually depend on the frame length |
---|
0:16:19 | and on the reverberation time on how distorted the signal is |
---|
0:16:24 | and we give some ranges for the parameters in the paper |
---|
0:16:29 | so |
---|
0:16:31 | actually there are |
---|
0:16:33 | different ranges |
---|
0:16:35 | for each sampling frequency and each frame length |
---|
0:16:38 | that the user can set in order to get the optimal results |
---|
0:16:41 | so for your experiments or for your simulations |
---|
0:16:45 | you know we made pilot tests |
---|
0:16:47 | to tune the parameters |
---|
0:16:49 | for these |
---|
0:16:50 | for different environments yes |
---|
0:16:58 | any questions |
---|
0:16:59 | so if not let's thank the speakers again |
---|