0:00:13 | "'kay" |
---|
0:00:13 | so um my name is julie it can and uh i will present to you |
---|
0:00:18 | uh what we have done for the hasr submission |
---|
0:00:22 | and uh |
---|
0:00:23 | and also the analysis we have done after the submission and what we have done |
---|
0:00:29 | because of this the proposition of uh |
---|
0:00:31 | questions |
---|
0:00:32 | so this work is than we have |
---|
0:00:34 | and nicholas as of where |
---|
0:00:36 | uh us not of that all is a seven arsed |
---|
0:00:38 | and uh we come from france |
---|
0:00:42 | okay |
---|
0:00:43 | so |
---|
0:00:43 | uh the goal of hasr uh |
---|
0:00:46 | is to |
---|
0:00:47 | analyse how human experts can uh |
---|
0:00:51 | uh make use of automatic speaker recognition technologies |
---|
0:00:54 | and uh how we can have a mix of the the both communities so |
---|
0:01:00 | uh it was a very great experience the first time and so |
---|
0:01:04 | uh for for us it was |
---|
0:01:07 | just uh the first experience and try to do something |
---|
0:01:10 | for this submission |
---|
0:01:12 | and uh |
---|
0:01:14 | the task was |
---|
0:01:15 | uh a very uh classical verification task we have two point five minutes |
---|
0:01:21 | uh of uh samples for each speaker they come from sre ten and they were very difficult trials |
---|
0:01:28 | and these trials were |
---|
0:01:30 | uh chosen by uh nist |
---|
0:01:33 | and uh the the choice of the trials |
---|
0:01:37 | was done from the results of a particular system |
---|
0:01:40 | and uh there were two sets |
---|
0:01:43 | hasr one |
---|
0:01:44 | and hasr two and so we participated in hasr two because the the the samples |
---|
0:01:51 | the the trials |
---|
0:01:53 | uh the trials of hasr two include the hasr one so |
---|
0:01:58 | we did all the task |
---|
0:02:00 | so what |
---|
0:02:02 | we proposed was very simple because uh E a is more a computer science uh |
---|
0:02:09 | uh laboratory and uh the league to so |
---|
0:02:13 | uh we just took three native french listeners |
---|
0:02:17 | and |
---|
0:02:18 | uh two females and one male |
---|
0:02:22 | and we allowed them to examine spectrograms and uh to change to band-pass |
---|
0:02:28 | uh filtered signals so |
---|
0:02:31 | they can do what they want |
---|
0:02:33 | and uh they have to decide if it was |
---|
0:02:37 | the same speaker was speaking or |
---|
0:02:39 | it was |
---|
0:02:40 | two different speakers |
---|
0:02:42 | and give a uh a confidence score and uh if they gave a zero that means that they |
---|
0:02:48 | are not confident in their decision and if they gave |
---|
0:02:51 | five |
---|
0:02:52 | that means that they are very confident and so |
---|
0:02:55 | that for the submission |
---|
0:02:57 | uh to um |
---|
0:02:58 | nist |
---|
0:02:59 | uh we chose to use the majority voting i mean because |
---|
0:03:03 | three people so it's easy to have a majority so |
---|
0:03:07 | uh we we we chose to use this indication for the the choice of the decision |
---|
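[Editor's sketch] The majority vote over three listeners described above could look like the following. This is a minimal illustration, not the speakers' actual code: the function name `majority_decision` and the labels "same" / "different" are assumptions introduced here.

```python
from collections import Counter


def majority_decision(votes):
    """Return the label chosen by most listeners.

    `votes` holds one "same" / "different" label per listener
    (hypothetical labels, chosen for this sketch). With an odd number
    of listeners and two labels, a tie cannot occur.
    """
    if len(votes) % 2 == 0:
        raise ValueError("use an odd number of listeners to avoid ties")
    label, _count = Counter(votes).most_common(1)[0]
    return label


# Three listeners, two of whom hear the same speaker.
print(majority_decision(["same", "different", "same"]))
```

With three listeners and a binary decision, at least two listeners always agree, which is why the talk calls the majority "easy" to obtain.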
0:03:15 | and uh for the score uh we estimated it uh we tried to we chose |
---|
0:03:20 | to do a mapping |
---|
0:03:22 | between the |
---|
0:03:23 | human decision and uh the score we have with our gmm system |
---|
0:03:29 | uh so as to compare |
---|
0:03:30 | uh the in try to find version |
---|
0:03:34 | score uh after |
---|
0:03:35 | and to compare |
---|
0:03:38 | uh if you have questions on the mapping i can answer but i think that it's |
---|
0:03:42 | more interesting to to look at uh other results |
---|
0:03:46 | and see other things |
---|
0:03:47 | so |
---|
0:03:48 | um a |
---|
0:03:49 | the fact is that nist provided us very long um samples so |
---|
0:03:56 | because we decided to do a perceptive test and |
---|
0:03:59 | to uh listen to lots of |
---|
0:04:02 | the the samples |
---|
0:04:04 | uh we decided to |
---|
0:04:06 | change |
---|
0:04:06 | a little bit the things and not to give to the listeners |
---|
0:04:10 | all the two minutes |
---|
0:04:12 | of uh |
---|
0:04:13 | speech for each speaker |
---|
0:04:15 | so um it's |
---|
0:04:17 | and so we we decided to to cut the signal and to select |
---|
0:04:22 | the the part with the most energy and uh where we we are sure that there |
---|
0:04:27 | is a lot of speech |
---|
0:04:28 | and the |
---|
0:04:29 | maybe a lot of information |
---|
0:04:32 | so uh and |
---|
0:04:33 | so we selected um short things around six seconds for each sample |
---|
0:04:40 | and because you have a lot of uh perceptive tests and |
---|
0:04:45 | um |
---|
0:04:47 | usually in psychology they use this kind of duration so that's why we we chose |
---|
0:04:53 | this kind |
---|
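[Editor's sketch] One simple way to pick the highest-energy excerpt of about six seconds, as described above, is a sliding window over the squared samples. The talk does not say how the selection was implemented; the function `highest_energy_window` and its parameters are hypothetical.

```python
def highest_energy_window(samples, sample_rate, duration_s=6.0):
    """Return (start, end) sample indices of the window with the most energy.

    Energy is the sum of squared sample values. This is only one
    plausible reading of "the part with the most energy".
    """
    win = int(duration_s * sample_rate)
    if win <= 0:
        raise ValueError("window must contain at least one sample")
    if len(samples) <= win:
        return 0, len(samples)
    energy = sum(x * x for x in samples[:win])
    best_energy, best_start = energy, 0
    for start in range(1, len(samples) - win + 1):
        # Slide by one sample: add the newest sample, drop the oldest.
        energy += samples[start + win - 1] ** 2 - samples[start - 1] ** 2
        if energy > best_energy:
            best_energy, best_start = energy, start
    return best_start, best_start + win


# Toy signal at 1 Hz: silence, a burst of five loud samples, silence.
toy = [0.0] * 10 + [1.0] * 5 + [0.0] * 10
print(highest_energy_window(toy, sample_rate=1, duration_s=5))  # (10, 15)
```

In practice a speech activity detector might be preferable, since high energy can also come from noise; the sketch only mirrors the energy criterion mentioned in the talk.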
0:04:55 | and uh so i have some examples so |
---|
0:04:58 | that you you can see |
---|
0:05:01 | and hear what |
---|
0:05:02 | i am talking about |
---|
0:05:04 | because the idea was to have a beep between each sample to know that we are changing |
---|
0:05:11 | uh of sample yeah that there is a change of them |
---|
0:05:15 | so that's my first example |
---|
0:05:17 | [audio example plays] |
---|
0:05:42 | so |
---|
0:05:43 | same speaker or different speakers |
---|
0:05:46 | what do you think |
---|
0:05:50 | okay |
---|
0:05:51 | and it's not the same |
---|
0:05:54 | it's always the same thing i mean we we chose the |
---|
0:05:57 | or difficulty so |
---|
0:05:59 | yeah it's not the same but um yeah you have different samples and you can |
---|
0:06:04 | uh have |
---|
0:06:05 | you you |
---|
0:06:06 | not memorise because you don't have enough to memorise but with two minutes it's exactly the same thing we can't |
---|
0:06:11 | have in memory a voice of two minutes so |
---|
0:06:14 | it's something you can compare |
---|
0:06:16 | very quickly and try to to take a decision and |
---|
0:06:20 | the consequence of this kind of |
---|
0:06:22 | the |
---|
0:06:23 | stimuli is that uh our listeners take a decision very quickly |
---|
0:06:28 | um |
---|
0:06:29 | in |
---|
0:06:30 | around thirty seconds they they take the decision so |
---|
0:06:34 | that's why we we |
---|
0:06:36 | we chose this |
---|
0:06:37 | okay so i come back here |
---|
0:06:45 | so yeah and so they |
---|
0:06:47 | they can uh use uh L |
---|
0:06:49 | and they can yeah or or try to listen but |
---|
0:06:53 | usually they they just take the decision very quickly |
---|
0:06:58 | these are the results yeah you will have the other sites and |
---|
0:07:01 | i i think that |
---|
0:07:02 | what |
---|
0:07:03 | is very interesting is that |
---|
0:07:05 | uh our |
---|
0:07:06 | automatic system is |
---|
0:07:08 | better than the the decision taken |
---|
0:07:11 | by human |
---|
0:07:14 | and |
---|
0:07:15 | um our first question was why is it |
---|
0:07:17 | so because |
---|
0:07:19 | the the question of the human performance is a very important thing so |
---|
0:07:23 | uh a a first thing that we can have for a good information to know if we can |
---|
0:07:29 | have confidence in the decision of human |
---|
0:07:32 | is to to see if they agree and uh what happens when they agree did they |
---|
0:07:38 | take |
---|
0:07:39 | the |
---|
0:07:40 | good decision or uh are they wrong |
---|
0:07:43 | so |
---|
0:07:44 | you can see that |
---|
0:07:46 | here |
---|
0:07:48 | yeah |
---|
0:07:48 | we can't |
---|
0:07:49 | no if they agree it's not |
---|
0:07:53 | uh a good uh indication of the fact that |
---|
0:07:57 | uh we can have confidence in their decision because |
---|
0:08:01 | they |
---|
0:08:02 | do um |
---|
0:08:03 | yeah here you have |
---|
0:08:05 | this this is the the good |
---|
0:08:07 | um the good answer okay |
---|
0:08:09 | the correct answer and here are the the trials and so you can see that |
---|
0:08:15 | here |
---|
0:08:16 | it seems that they take the good the correct decision but here |
---|
0:08:20 | on |
---|
0:08:20 | uh when they are working on target trials |
---|
0:08:24 | uh |
---|
0:08:26 | we can't know if it's good or not because here you have exactly the same proportion |
---|
0:08:31 | and and |
---|
0:08:33 | the confidence score they gave |
---|
0:08:35 | uh is not a a good indication too so you can't trust the people when they say |
---|
0:08:41 | okay i'm sure that it is the same that's not a good indication to say okay we can |
---|
0:08:46 | trust them |
---|
0:08:47 | so |
---|
0:08:48 | that's a problem |
---|
0:08:52 | and |
---|
0:08:53 | for the um |
---|
0:08:55 | the |
---|
0:08:56 | the um yeah we we have some discussion on the protocol of hasr |
---|
0:09:02 | because the first thing is that our listeners |
---|
0:09:05 | uh had the the feeling that it was more an evaluation |
---|
0:09:09 | to know if they can compensate the channels |
---|
0:09:13 | than to evaluate the proximity of the voices of two speakers so but it's because it's just |
---|
0:09:19 | not what they |
---|
0:09:22 | uh uh usually do so |
---|
0:09:24 | yeah it was |
---|
0:09:25 | difficult |
---|
0:09:26 | um |
---|
0:09:27 | yeah we it's thin only it's not it's |
---|
0:09:31 | for our submission it's more a perceptive test than an acoustic analysis because they |
---|
0:09:37 | don't use they just filter when they have very different channels and something like this but they don't use |
---|
0:09:44 | some |
---|
0:09:45 | part of the signal to know uh where is the end to take their decision it is just perceptive |
---|
0:09:51 | things |
---|
0:09:52 | and for the limitation of the protocol the question is |
---|
0:09:55 | do we have enough to to do statistics and to know exactly what happens |
---|
0:10:01 | and |
---|
0:10:01 | what is very important is that we can't |
---|
0:10:04 | randomized |
---|
0:10:05 | the the trials that means that yeah all the speaker |
---|
0:10:09 | um hear at the same time |
---|
0:10:12 | the same trials and it's |
---|
0:10:14 | clear that |
---|
0:10:15 | you don't have the same attention when you start |
---|
0:10:18 | and when it is the |
---|
0:10:20 | hundredth trial you are listening to so that's important and in psychology we always do that |
---|
0:10:27 | that's to to |
---|
0:10:28 | to have the |
---|
0:10:30 | to randomise the the the stimuli |
---|
0:10:31 | so |
---|
0:10:33 | that's |
---|
0:10:33 | after that we had a lot of questions about this submission our first question was okay |
---|
0:10:39 | um what is the influence of the number of speakers because we have only three speakers so what happens |
---|
0:10:45 | if we increase the the number of speaker |
---|
0:10:47 | um and uh what is the difference between experienced and inexperienced listeners if we have |
---|
0:10:55 | sort of expert |
---|
0:10:57 | and what is the complementarity between the human and the system decision |
---|
0:11:01 | because |
---|
0:11:02 | uh we just said made the decision of you men |
---|
0:11:05 | so we changed a little bit the the protocol from hasr |
---|
0:11:08 | uh we have more listeners |
---|
0:11:11 | search inexperienced and ten experienced listeners |
---|
0:11:14 | we randomised because we have all the trials so we randomised the trials |
---|
0:11:19 | and we balanced the number of non-target and target |
---|
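[Editor's sketch] The two protocol changes just mentioned, balancing target and non-target trials and randomising their order for each listener, could be done as below. The function `build_trial_list`, the per-listener seed and the tuple layout of a trial are assumptions made for illustration, not details given in the talk.

```python
import random


def build_trial_list(target_trials, nontarget_trials, seed):
    """Return a balanced, shuffled trial list for one listener.

    The same number of target and non-target trials is drawn, then the
    order is shuffled. A distinct seed per listener (an assumption of
    this sketch) gives each listener a different presentation order.
    """
    n = min(len(target_trials), len(nontarget_trials))
    rng = random.Random(seed)
    chosen = rng.sample(target_trials, n) + rng.sample(nontarget_trials, n)
    rng.shuffle(chosen)
    return chosen


# Hypothetical trial identifiers: ("target" | "nontarget", index).
targets = [("target", i) for i in range(5)]
nontargets = [("nontarget", i) for i in range(8)]
order = build_trial_list(targets, nontargets, seed=1)
print(len(order), sum(1 for kind, _ in order if kind == "target"))  # 10 5
```

Randomising per listener spreads fatigue and attention effects across trials instead of letting them fall on the same trials for everyone.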
0:11:22 | uh because uh the first time the idea of this these an are are there okay i have to |
---|
0:11:28 | to it's |
---|
0:11:29 | so yeah it's a balanced so i will take |
---|
0:11:31 | them |
---|
0:11:32 | the |
---|
0:11:33 | zero point five of |
---|
0:11:35 | um my natural priori is result |
---|
0:11:37 | and so and |
---|
0:11:38 | uh we only uh allowed them to to listen |
---|
0:11:43 | once |
---|
0:11:44 | to the trials and not to repeat the trials again and |
---|
0:11:47 | and so |
---|
0:11:48 | what are the results of it |
---|
0:11:51 | we have only four inexperienced listeners that perform |
---|
0:11:56 | uh above chance level so if you take |
---|
0:11:59 | uh |
---|
0:11:59 | a occur on and here take it exactly the same thing for the majority of the listeners |
---|
0:12:04 | but |
---|
0:12:05 | what |
---|
0:12:06 | is very interesting for us is that you have a very large gap of performance according to the the trials |
---|
0:12:13 | you have some trials where |
---|
0:12:15 | ninety percent of the listeners are are |
---|
0:12:18 | correct are right give the good the correct answer |
---|
0:12:21 | and for all other trials you are only strip or or and of the listeners that gave uh the good |
---|
0:12:27 | answer so |
---|
0:12:29 | we don't find uh a difference between the male and the female trials it's |
---|
0:12:33 | exactly the same thing |
---|
0:12:35 | and |
---|
0:12:35 | we have uh different behaviours uh we have |
---|
0:12:39 | some listeners who say oh always yes yes yes it's the same it's the same and others that |
---|
0:12:44 | are always saying no it's not the same it's not the same |
---|
0:12:47 | so we have that uh from the the listeners |
---|
0:12:50 | and we find a correlation between the performance |
---|
0:12:54 | and the in level of the the listeners because here it's for |
---|
0:12:58 | and not not of uh in people so |
---|
0:13:02 | yeah we find that |
---|
0:13:03 | so |
---|
0:13:04 | the last question was the complementarity between the human and the system and |
---|
0:13:09 | that's |
---|
0:13:10 | what we find is that um for non-target trials |
---|
0:13:14 | the gmm has |
---|
0:13:16 | uh |
---|
0:13:17 | a lot of correct answers and it's the only correct answer for the gmm and not for |
---|
0:13:22 | the human |
---|
0:13:23 | but |
---|
0:13:23 | it's the contrary for human we have a lot of uh a a big proportion oh |
---|
0:13:28 | sorry |
---|
0:13:30 | we have a a big proportion here |
---|
0:13:32 | uh of |
---|
0:13:34 | correct answers only for the human so maybe we can find a complementarity |
---|
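[Editor's sketch] The complementarity analysis described above amounts to counting, trial by trial, who was right: both, only the human, only the system, or neither. The function `complementarity` and the boolean encoding (True for a target trial or a "same speaker" decision) are assumptions of this sketch, and the data in the example are invented for illustration only.

```python
def complementarity(truth, human, system):
    """Count trials by who answered correctly.

    `truth`, `human` and `system` are equal-length lists of booleans:
    True means "same speaker" (target), False means "different".
    """
    counts = {"both": 0, "human_only": 0, "system_only": 0, "neither": 0}
    for t, h, s in zip(truth, human, system):
        human_ok, system_ok = (h == t), (s == t)
        if human_ok and system_ok:
            counts["both"] += 1
        elif human_ok:
            counts["human_only"] += 1
        elif system_ok:
            counts["system_only"] += 1
        else:
            counts["neither"] += 1
    return counts


# Invented toy data, not results from the talk.
truth = [True, True, False, False]
human = [True, True, True, False]
system = [False, True, False, True]
print(complementarity(truth, human, system))
```

Large "human_only" and "system_only" counts on different kinds of trials are what would suggest the two sources of decisions can be usefully combined.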
0:13:40 | and uh |
---|
0:13:43 | yeah um and the not yeah |
---|
0:13:45 | that's so |
---|
0:13:46 | and |
---|
0:13:47 | after we have our uh experienced so ten |
---|
0:13:50 | experienced listeners |
---|
0:13:52 | and we don't find a difference in the performance for the inexperienced and the experienced listeners yeah |
---|
0:14:00 | it's exactly the same thing you have |
---|
0:14:02 | the th |
---|
0:14:04 | so for the suggestions and the the future work it's |
---|
0:14:08 | more questions than other things |
---|
0:14:11 | yes |
---|
0:14:12 | because the first |
---|
0:14:13 | question is how the human can help the system so |
---|
0:14:16 | um maybe uh we have to examine the trials |
---|
0:14:20 | with the scores that are near the threshold of the system because we observed in the complementarity that |
---|
0:14:26 | that is that |
---|
0:14:27 | it is the |
---|
0:14:30 | the trials |
---|
0:14:31 | where uh human is right and uh system is wrong so |
---|
0:14:36 | maybe it's |
---|
0:14:37 | something we can do |
---|
0:14:38 | and |
---|
0:14:39 | yeah the second question is okay |
---|
0:14:42 | i have some trials which are very easy and others very difficult for human what are the differences |
---|
0:14:48 | between |
---|
0:14:49 | those trials |
---|
0:14:50 | and |
---|
0:14:51 | it's clear that it's important to replicate this kind of experiment |
---|
0:14:55 | we have not to have listen at that i'm sure that joe or |
---|
0:14:58 | the next paper will answer to this question |
---|
0:15:01 | thank you |
---|
0:15:27 | you described performance with inexperienced and experienced listeners |
---|
0:15:31 | uh how how someone of experience to image bruce was but the question |
---|
0:15:35 | an experienced listener is a phonetician but |
---|
0:15:40 | who doesn't |
---|
0:15:41 | the fire and they |
---|
0:15:43 | don't work on forensics of course they aren't uh interested in the the the speaker they |
---|
0:15:50 | they work on the language and uh so they |
---|
0:15:53 | they are not eight uh everyday day novels |
---|
0:15:56 | they are very yeah they they |
---|
0:16:00 | but it's not uh experts of forensics or something because |
---|
0:16:03 | in france we don't have |
---|
0:16:04 | this kind of people |
---|
0:16:05 | i |
---|
0:16:06 | yeah |
---|
0:16:16 | so this a lot of people saying just always a all makes me fine |
---|
0:16:21 | where it would be possible to how real human judge |
---|
0:16:24 | just give the mumbled string as might come from from list model |
---|
0:16:28 | and do post facto dollar bleep each room thing to of the pressure your bit of the best to perform |
---|
0:16:35 | to do do that but it we make them but that the problem is that for the first study |
---|
0:16:39 | we have done with the three people the three people |
---|
0:16:42 | um we don't have a correlation between uh correct answer and |
---|
0:16:47 | a confidence score |
---|
0:16:48 | which is good so we can't use |
---|
0:16:50 | the car the |
---|
0:16:52 | you see you we can't |
---|
0:16:53 | trust the the the the the the listener it's not because they say i i'm sure |
---|
0:16:58 | of my decision that it means that |
---|
0:17:03 | it signifies that |
---|
0:17:04 | they are right |
---|
0:17:06 | so |
---|
0:17:07 | it's difficult to have a calibration |
---|
0:17:09 | and to use this confidence score |
---|