0:00:13 | thank you |
---|
0:00:14 | so i make a bit like and uh |
---|
0:00:16 | and i will present our paper called converting five point one audio recordings to B-format for directional |
---|
0:00:22 | audio coding reproduction |
---|
0:00:25 | right long a let's score |
---|
0:00:27 | so here is the outline of the presentation so i would |
---|
0:00:31 | like to begin with an introduction to the topic |
---|
0:00:34 | and then give you some background information about this and then the actual topic |
---|
0:00:39 | and i will end with some conclusions |
---|
0:00:43 | so yeah |
---|
0:00:44 | several parametric spatial audio coding methods |
---|
0:00:47 | for example MPEG surround or parametric stereo |
---|
0:00:51 | it's that a |
---|
0:00:52 | but |
---|
0:00:53 | and they are really good but the problem is that they assume a certain kind of reproduction system for example stereo |
---|
0:00:57 | or five point one |
---|
0:00:59 | so it would be perhaps nice if we could have some kind of |
---|
0:01:03 | uh any kind of input to the |
---|
0:01:05 | system and it would also reproduce the sound using any |
---|
0:01:09 | reproduction system for example any loudspeaker layout or for example headphones |
---|
0:01:16 | so what we have been developing in our university is this thing called directional audio coding |
---|
0:01:21 | or DirAC |
---|
0:01:23 | and |
---|
0:01:24 | this thing uses B-format as input |
---|
0:01:27 | and you can use any loudspeaker layout |
---|
0:01:30 | for reproduction and also headphones |
---|
0:01:32 | so we think that DirAC would be perhaps a suitable candidate for this kind of generic |
---|
0:01:38 | audio format |
---|
0:01:40 | and in this study |
---|
0:01:41 | i was |
---|
0:01:42 | thinking about |
---|
0:01:43 | for example five point one which is really common nowadays for |
---|
0:01:47 | multichannel reproduction so i wanted to |
---|
0:01:49 | uh |
---|
0:01:50 | study and i have been studying |
---|
0:01:53 | could we convert these five point one signals to B-format |
---|
0:01:57 | and reproduce them with directional audio coding without any artifacts |
---|
0:02:03 | and then |
---|
0:02:03 | in this use case for example |
---|
0:02:06 | we try to |
---|
0:02:07 | play five point one and get a certain kind of reproduction of the sound |
---|
0:02:12 | and get the same thing |
---|
0:02:13 | through this conversion and the uh directional audio coding reproduction |
---|
0:02:20 | so |
---|
0:02:21 | first i will tell you something about this |
---|
0:02:23 | DirAC technology so the basic idea is |
---|
0:02:27 | that we don't try to reproduce the sound field physically correctly |
---|
0:02:31 | instead we try to reproduce it in a way that it would be perceptually correct so how we perceive the |
---|
0:02:37 | sound would be the same |
---|
0:02:38 | as in the original space |
---|
0:02:40 | and that |
---|
0:02:41 | possible uses for this technology would be for example teleconferencing |
---|
0:02:46 | and also spatial sound reproduction |
---|
0:02:50 | and as i said |
---|
0:02:51 | the input to this |
---|
0:02:53 | uh |
---|
0:02:53 | processing is the B-format |
---|
0:02:55 | stream |
---|
0:02:56 | so it contains the omnidirectional signal W |
---|
0:03:00 | and the dipoles X Y and Z |
---|
0:03:02 | which are |
---|
0:03:03 | directed along the |
---|
0:03:05 | corresponding cartesian coordinate axes |
---|
0:03:10 | so what we |
---|
0:03:12 | do is that we process the sound in the time-frequency domain |
---|
0:03:16 | and why we do this is that we try to mimic the temporal and also the frequency resolution of |
---|
0:03:22 | the ear |
---|
0:03:23 | so we try to do it with the same resolution as our |
---|
0:03:27 | hearing system works |
---|
0:03:30 | and |
---|
0:03:31 | next let's think about what should we analyze from the sound |
---|
0:03:35 | so we do the analysis for each time-frequency tile |
---|
0:03:39 | things without L |
---|
0:03:41 | uh |
---|
0:03:42 | matched to the resolution |
---|
0:03:44 | and uh |
---|
0:03:45 | first what we do is we analyze the direction of sound so simply just where is it coming from |
---|
0:03:49 | left right up down |
---|
0:03:51 | and so on |
---|
0:03:52 | and then we have this another parameter which is called diffuseness |
---|
0:03:57 | which uh |
---|
0:03:59 | means that |
---|
0:04:00 | is the sound coming only from a certain direction so is it like a plane wave |
---|
0:04:04 | or is it completely diffuse coming from all directions like for example reverberation would be |
---|
0:04:09 | or is it something in between |
---|
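To make the diffuseness parameter concrete, here is a minimal sketch of one common way to estimate it from a block of B-format samples. The talk does not give the equation, so everything here is an assumption: the intensity-over-energy form of the estimate, the 1/sqrt(2) scaling on W, the use of plain block averages instead of a real time-frequency analysis, and the hypothetical function name `diffuseness`.

```python
import math

def diffuseness(w, x, y, z=None):
    """Estimate diffuseness in [0, 1] from one block of B-format samples.

    Assumed estimate: psi = 1 - sqrt(2) * ||<w * v>|| / <w^2 + |v|^2 / 2>,
    where v = (x, y, z) and <.> is the average over the block.
    """
    n = len(w)
    if z is None:
        z = [0.0] * n  # horizontal-only B-format
    # Block-averaged products of the pressure (W) and velocity (X, Y, Z) signals.
    ix = sum(wi * xi for wi, xi in zip(w, x)) / n
    iy = sum(wi * yi for wi, yi in zip(w, y)) / n
    iz = sum(wi * zi for wi, zi in zip(w, z)) / n
    # Block-averaged energy term.
    energy = sum(
        wi * wi + 0.5 * (xi * xi + yi * yi + zi * zi)
        for wi, xi, yi, zi in zip(w, x, y, z)
    ) / n
    if energy == 0.0:
        return 0.0  # undefined for silence; returning 0 is an arbitrary choice
    psi = 1.0 - math.sqrt(2.0) * math.sqrt(ix * ix + iy * iy + iz * iz) / energy
    return min(1.0, max(0.0, psi))
```

Under these assumptions a single plane wave gives a value close to zero, and signals whose averaged intensity cancels give a value of one.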
0:04:14 | and here is the synthesis part of the processing and the whole block diagram you can see |
---|
0:04:20 | so |
---|
0:04:22 | as you can see here |
---|
0:04:23 | we start by the time-frequency decomposition for example using a short-time fourier transform |
---|
0:04:29 | and then we do the processing separately for each frequency band |
---|
0:04:33 | so we get the diffuseness and direction analysis |
---|
0:04:36 | results from here |
---|
0:04:38 | and what we do to the audio is that we create these kind of virtual microphones |
---|
0:04:41 | which can be for example |
---|
0:04:44 | uh cardioids pointing towards the loudspeakers so this is really similar to first order ambisonics decoding |
---|
0:04:50 | at this stage |
---|
0:04:51 | but the trick what we do is that we divide the sound into two different |
---|
0:04:55 | streams the non-diffuse stream |
---|
0:04:57 | and the diffuse stream |
---|
0:05:00 | and we reproduce them in a different way |
---|
0:05:02 | and for the non-diffuse stream we assume that it |
---|
0:05:05 | basically consists of the direct sound so we try to |
---|
0:05:08 | reproduce it as a point-like source using for example amplitude panning or in the three D case |
---|
0:05:13 | vector base |
---|
0:05:14 | amplitude panning |
---|
0:05:16 | and |
---|
0:05:17 | for the diffuse stream we assume that it's surrounding sound for example reverberation so what we want |
---|
0:05:22 | to get is a kind of |
---|
0:05:24 | enveloping perception so what we do is use for example decorrelation to get it |
---|
0:05:29 | surrounding |
---|
0:05:30 | and in the end we sum these two streams and also all the frequency bands and we get the |
---|
0:05:34 | uh |
---|
0:05:35 | loudspeaker signals |
---|
0:05:39 | and |
---|
0:05:40 | using this method which has been studied for a couple of years you can get |
---|
0:05:44 | quite nice audio quality for real recordings with a B-format microphone |
---|
0:05:50 | so let's go next to the actual topic what we were studying in this |
---|
0:05:56 | work |
---|
0:05:57 | so how to really convert this kind of |
---|
0:06:00 | five point one audio recordings to B-format so at first it seems really simple so |
---|
0:06:06 | what we simply can do is that we perform this kind of virtual recording |
---|
0:06:10 | so we position the loudspeakers according to the ITU standard and then we just assume that we are in an |
---|
0:06:17 | anechoic space |
---|
0:06:18 | and we perform a kind of virtual recording with the B-format microphone |
---|
0:06:22 | it is really simple we can use simple cosine and sine functions for this |
---|
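The virtual recording with cosine and sine functions can be sketched as follows. This is only an illustration under assumptions the talk does not state: the ITU-R BS.775 azimuths of 0, plus or minus 30 and plus or minus 110 degrees, a horizontal-only B-format (W, X, Y), the traditional 1/sqrt(2) scaling on W, and an ignored LFE channel. The names `ITU_AZIMUTHS` and `encode_virtual_b_format` are hypothetical.

```python
import math

# Assumed ITU-R BS.775 azimuths in degrees (positive = left); LFE is ignored.
ITU_AZIMUTHS = {"C": 0.0, "L": 30.0, "R": -30.0, "Ls": 110.0, "Rs": -110.0}

def encode_virtual_b_format(channels, azimuths=ITU_AZIMUTHS):
    """Encode loudspeaker signals into horizontal B-format (W, X, Y).

    channels: dict mapping channel name -> list of samples (equal lengths).
    Each loudspeaker is treated as a plane-wave source in an anechoic space.
    """
    n = len(next(iter(channels.values())))
    w = [0.0] * n
    x = [0.0] * n
    y = [0.0] * n
    for name, samples in channels.items():
        az = math.radians(azimuths[name])
        gx, gy = math.cos(az), math.sin(az)
        for i, s in enumerate(samples):
            w[i] += s / math.sqrt(2.0)  # omnidirectional, assumed 1/sqrt(2) scaling
            x[i] += s * gx              # front-back dipole
            y[i] += s * gy              # left-right dipole
    return w, x, y
```

Passing a different `azimuths` mapping gives the same encoding for any other virtual loudspeaker placement.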
0:06:28 | but |
---|
0:06:29 | however there were some problems that we found |
---|
0:06:32 | so if you think about how |
---|
0:06:34 | humans |
---|
0:06:35 | perceive this kind of reproduction if you have some |
---|
0:06:37 | diffuse or reverberant sound in the five point one case |
---|
0:06:41 | we perceive it as diffuse and surrounding |
---|
0:06:44 | but if you think about how |
---|
0:06:46 | the DirAC |
---|
0:06:47 | analysis |
---|
0:06:48 | works in this case |
---|
0:06:50 | as i said before |
---|
0:06:52 | the diffuseness is one if we have |
---|
0:06:54 | sound coming evenly from all directions and also this is |
---|
0:06:58 | i think |
---|
0:07:00 | also true if you think about what is a diffuse field |
---|
0:07:03 | but in this case we have most of the loudspeakers in the front |
---|
0:07:07 | so we don't have equal energy coming from all directions and so |
---|
0:07:11 | as a result we have a lower diffuseness than one |
---|
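The effect of the front-heavy layout can be checked with a short calculation. This sketch assumes equal-power, mutually uncorrelated loudspeaker signals, the intensity-over-energy diffuseness estimate with a 1/sqrt(2)-scaled W, ITU azimuths of 0, plus or minus 30 and plus or minus 110 degrees, and an even layout with 72 degree spacing. Under those assumptions the expected diffuseness reduces to one minus the length of the summed loudspeaker unit vectors divided by the number of loudspeakers. The function name is hypothetical and the talk does not present this derivation.

```python
import math

def expected_diffuseness(azimuths_deg):
    """Expected diffuseness for equal-power, uncorrelated loudspeaker signals.

    Under the stated assumptions this is 1 - ||sum of unit vectors|| / N.
    """
    sx = sum(math.cos(math.radians(a)) for a in azimuths_deg)
    sy = sum(math.sin(math.radians(a)) for a in azimuths_deg)
    return 1.0 - math.hypot(sx, sy) / len(azimuths_deg)

itu_layout = [0, 30, -30, 110, -110]    # assumed ITU-R BS.775 azimuths
even_layout = [0, 72, -72, 144, -144]   # assumed evenly spaced layout

psi_itu = expected_diffuseness(itu_layout)
psi_even = expected_diffuseness(even_layout)
```

With these numbers `psi_itu` comes out at roughly 0.59 while `psi_even` is one, which matches the observation that the standard layout yields a diffuseness below one for fully reverberant material.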
0:07:15 | and what this causes is that we get a bias of the sound towards the center |
---|
0:07:20 | loudspeaker |
---|
0:07:21 | and it's not perceived to be completely diffuse |
---|
0:07:24 | and surrounding anymore |
---|
0:07:26 | so this is something of course that we don't want |
---|
0:07:30 | then we got the second idea how to fix this problem which is that we position the loudspeakers evenly around the |
---|
0:07:37 | B-format microphone and |
---|
0:07:39 | this |
---|
0:07:39 | was |
---|
0:07:40 | working quite nicely so now you get the diffuse sound coming from all directions evenly |
---|
0:07:46 | and you get the correct diffuseness as well |
---|
0:07:50 | and also you get quite nice audio |
---|
0:07:52 | quality but the problem is of course that |
---|
0:07:55 | now the loudspeakers are |
---|
0:07:57 | at the wrong places because if you think about it you should have a |
---|
0:08:01 | sound source at for example minus thirty degrees but now we have a virtual sound source here |
---|
0:08:06 | so we get a kind of bias in the reproduction |
---|
0:08:09 | which we of course can |
---|
0:08:11 | compensate on the reproduction side if we assume that we have loudspeakers at those |
---|
0:08:15 | directions |
---|
0:08:17 | but we might also have real B-format recordings |
---|
0:08:20 | which we might perhaps want to mix with these uh five point one recordings and in this case you |
---|
0:08:25 | would have wrong directions in either |
---|
0:08:28 | case |
---|
0:08:29 | so we need something more |
---|
0:08:32 | elaborate to get this |
---|
0:08:33 | functioning all the time |
---|
0:08:36 | and then we have a |
---|
0:08:37 | i will present this |
---|
0:08:38 | method that we propose |
---|
0:08:41 | so what we do is that for the directional |
---|
0:08:44 | sound for example |
---|
0:08:46 | point sources or the direct sound of the |
---|
0:08:49 | uh |
---|
0:08:50 | sound |
---|
0:08:51 | we use the standard layout so for that part we get the directions right and that is |
---|
0:08:56 | of course important then |
---|
0:08:58 | and if we have some reverberant sound then we use the even layout |
---|
0:09:02 | to get the diffuseness correct |
---|
0:09:05 | and this is really |
---|
0:09:06 | nice in a way because both the directions and the diffuseness are correct when they are the most |
---|
0:09:11 | important |
---|
0:09:13 | so let's look at the block diagram |
---|
0:09:15 | so |
---|
0:09:16 | we do the analysis |
---|
0:09:18 | in |
---|
0:09:19 | the time-frequency domain so we start by the time-frequency decomposition |
---|
0:09:24 | and what we do is we compute these |
---|
0:09:27 | virtual B-format signals using the even layout and the standard loudspeaker layout |
---|
0:09:35 | and what we do is we simply mix these two streams according to the diffuseness |
---|
0:09:40 | of the sound |
---|
0:09:41 | of the sound and we do that |
---|
0:09:43 | analysis using the even loudspeaker layout |
---|
0:09:46 | and we can simply |
---|
0:09:48 | cross-fade between these two streams |
---|
0:09:52 | to get the mixed |
---|
0:09:53 | the |
---|
0:09:55 | B-format stream |
---|
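A minimal sketch of the cross-fade between the two virtual B-format streams is given here. It assumes a linear fade weighted by the diffuseness and a single diffuseness value per frame, whereas the talk describes the processing per time-frequency tile and does not specify the fade law. The function name `crossfade_b_format` is hypothetical.

```python
def crossfade_b_format(standard, even, psi):
    """Blend two B-format frames channel by channel.

    standard, even: tuples of equal-length sample lists, e.g. (W, X, Y),
    encoded with the standard and with the even loudspeaker layout.
    psi: diffuseness in [0, 1], analysed from the even-layout stream.
    """
    if not 0.0 <= psi <= 1.0:
        raise ValueError("psi must be in [0, 1]")
    # Low diffuseness favours the standard layout (correct directions),
    # high diffuseness favours the even layout (correct diffuseness).
    return tuple(
        [(1.0 - psi) * s + psi * e for s, e in zip(ch_std, ch_even)]
        for ch_std, ch_even in zip(standard, even)
    )
```

With `psi` at zero the output equals the standard-layout stream, and with `psi` at one it equals the even-layout stream.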
0:09:57 | and we have seen that this would be quite close to |
---|
0:09:59 | what it should be then |
---|
0:10:00 | next what we do is we analyze the diffuseness again |
---|
0:10:04 | to check if it is correct now |
---|
0:10:07 | and we |
---|
0:10:07 | see from here that |
---|
0:10:10 | if we think about this |
---|
0:10:12 | red line which is the diffuseness computed with the |
---|
0:10:15 | even loudspeaker layout we get the |
---|
0:10:17 | correct diffuseness |
---|
0:10:18 | whereas with the standard layout we would get |
---|
0:10:21 | too low diffuseness |
---|
0:10:22 | so we can see that if we mix |
---|
0:10:24 | these two |
---|
0:10:26 | streams |
---|
0:10:26 | we get |
---|
0:10:28 | correct diffuseness if it's really high or really low but in between we get a kind |
---|
0:10:32 | of a little bit of underestimation |
---|
0:10:35 | so we need to modify it further |
---|
0:10:39 | so let's look at this diffuseness equation once more to see |
---|
0:10:42 | how it could be modified to get the correct diffuseness |
---|
0:10:46 | so we can see here that this is the uh |
---|
0:10:48 | pressure and this is the particle velocity |
---|
0:10:51 | so we can see that if these are quite equal |
---|
0:10:55 | the whole term here is |
---|
0:10:56 | quite high and you get that the diffuseness is really low |
---|
0:11:00 | and so |
---|
0:11:01 | if the |
---|
0:11:02 | particle velocity |
---|
0:11:04 | well basically the dipole signals and the omnidirectional signal are equal we get a low diffuseness here |
---|
0:11:09 | and on the contrary if these two |
---|
0:11:12 | signals are |
---|
0:11:14 | i |
---|
0:11:14 | different we get a |
---|
0:11:16 | really low uh |
---|
0:11:17 | value for this term so we get high diffuseness |
---|
0:11:21 | so we can simply modify the diffuseness by |
---|
0:11:24 | adjusting the ratio between the omnidirectional signal and the dipole signals of the B-format |
---|
0:11:29 | how this is actually done |
---|
0:11:31 | you need a bunch of equations which i will not give you here but you can read them from the paper |
---|
0:11:35 | if you are interested |
---|
0:11:38 | so let's go back to this |
---|
0:11:39 | block diagram so we can see that we compare the diffuseness values and we get |
---|
0:11:44 | this kind of |
---|
0:11:45 | uh coefficients for adjusting the |
---|
0:11:47 | diffuseness |
---|
0:11:49 | and in the next stage we simply |
---|
0:11:51 | multiply the omnidirectional signal and the dipole |
---|
0:11:55 | signals so we get the final |
---|
0:11:58 | B-format stream |
---|
0:12:01 | so we can use it to uh reproduce the sound with DirAC and we also |
---|
0:12:06 | can mix it with other real B-format recordings for example |
---|
0:12:11 | yeah we didn't do |
---|
0:12:12 | formal listening tests yet but at least according to informal tests we didn't |
---|
0:12:17 | we didn't find it to cause any degradation of audio quality |
---|
0:12:25 | so |
---|
0:12:25 | as a conclusion |
---|
0:12:27 | directional audio coding or DirAC |
---|
0:12:29 | is a perceptually motivated method to reproduce spatial sound |
---|
0:12:34 | and it has been used before mainly with B-format microphones as input |
---|
0:12:39 | and so in this paper the idea was could we also use for example five point one as |
---|
0:12:44 | input |
---|
0:12:45 | to this |
---|
0:12:46 | uh reproduction method |
---|
0:12:48 | in order to have for example this kind of generic audio format |
---|
0:12:51 | and we |
---|
0:12:53 | showed first two simple matrixing-based methods |
---|
0:12:57 | but they had some problems both of them |
---|
0:13:00 | and the proposed method |
---|
0:13:02 | in there we analyze the virtual sound field generated by the five point one audio |
---|
0:13:07 | and we do the analysis in the time-frequency domain |
---|
0:13:11 | uh |
---|
0:13:13 | and by using this we were able to do this kind of B-format conversion |
---|
0:13:17 | and we did not find it to |
---|
0:13:19 | uh degrade the audio quality |
---|
0:13:22 | and also we |
---|
0:13:23 | can uh |
---|
0:13:24 | mix the resulting B-format signals with other B-format |
---|
0:13:28 | signals |
---|
0:13:30 | yeah some acknowledgements and |
---|
0:13:32 | i would like to thank you |
---|
0:13:41 | yeah |
---|
0:13:46 | i |
---|
0:13:55 | i |
---|
0:14:00 | hmmm |
---|
0:14:00 | i |
---|
0:14:01 | i |
---|
0:14:05 | i |
---|
0:14:06 | more |
---|
0:14:08 | yeah |
---|
0:14:08 | yeah |
---|
0:14:09 | i |
---|
0:14:11 | this |
---|
0:14:12 | physically they are not at all the same but perceptually they are the same at least according to |
---|
0:14:17 | our informal tests so if you look at the waveforms it's completely different but it sounds the |
---|
0:14:21 | same and that's the uh idea that it should sound the same |
---|
0:14:26 | uh_huh hmmm |
---|
0:14:28 | oh |
---|
0:14:30 | well why we want to do this is that for example we would also like to reproduce it using |
---|
0:14:34 | just stereo or seven point one or any |
---|
0:14:36 | loudspeaker layout or also headphones so that's why we want to get also the B-format |
---|
0:14:41 | uh from this five point one |
---|
0:14:48 | right |
---|
0:14:49 | i |
---|
0:14:51 | oh |
---|
0:14:53 | oh |
---|
0:14:54 | no that |
---|
0:14:55 | directional audio coding and |
---|
0:14:57 | also |
---|
0:14:58 | and also the reverse and labeled |
---|
0:15:00 | on a a bit twice |
---|
0:15:02 | same so it's only perceptually the same |
---|
0:15:10 | i |
---|
0:15:13 | well |
---|
0:15:14 | oh |
---|
0:15:14 | i |
---|
0:15:16 | oh |
---|
0:15:17 | is |
---|
0:15:19 | yeah i |
---|
0:15:20 | one |
---|
0:15:21 | i |
---|
0:15:22 | wow |
---|
0:15:23 | oh |
---|
0:15:24 | i |
---|
0:15:24 | you |
---|
0:15:25 | uh sweet spot of sweet spot |
---|
0:15:27 | is used for |
---|
0:15:29 | no i i |
---|
0:15:30 | it doesn't make a big difference at least we didn't find any |
---|
0:15:35 | problems but of course if you go away from the sweet spot you are more likely to |
---|
0:15:39 | get problems but we mainly did the listening tests |
---|
0:15:42 | in |
---|
0:15:42 | the sweet spot |
---|
0:15:44 | but there was really noise |
---|
0:15:51 | uh_huh uh_huh uh_huh |
---|
0:15:55 | hi |
---|
0:15:56 | i |
---|
0:15:57 | okay |
---|
0:15:59 | and |
---|
0:16:00 | this |
---|
0:16:01 | oh |
---|
0:16:02 | i |
---|
0:16:04 | i |
---|
0:16:04 | hmmm |
---|
0:16:06 | no uh we really did i don't |
---|
0:16:09 | see |
---|
0:16:10 | of course we could |
---|
0:16:11 | because we don't wanna at any room at that basically because we want go present but |
---|
0:16:16 | yeah of course |
---|
0:16:18 | yeah |
---|
0:16:19 | i don't know how do we really |
---|
0:16:20 | get it in it but yeah |
---|
0:16:27 | i |
---|
0:16:37 | uh_huh |
---|
0:16:40 | i did it in a |
---|
0:16:41 | room which is uh according to the ITU standard so it's a listening room |
---|
0:16:45 | there |
---|
0:16:46 | quite nice and we did some really brief tests also in an anechoic room and there we didn't find |
---|
0:16:51 | any problems either but |
---|
0:16:53 | the main part of the tests was done in the listening room |
---|
0:17:00 | okay |
---|
0:17:02 | know |
---|
0:17:03 | i |
---|
0:17:04 | yeah |
---|