0:00:06 | my name is uh as you get can |
---|
0:00:08 | and i will present to you |
---|
0:00:10 | the work we |
---|
0:00:12 | have been doing |
---|
0:00:14 | in L D A |
---|
0:00:16 | which is entitled intra-speaker variability effects |
---|
0:00:19 | on speaker verification |
---|
0:00:22 | over the last decade |
---|
0:00:24 | the performance of speaker verification systems |
---|
0:00:28 | uh |
---|
0:00:29 | is uh |
---|
0:00:30 | very very |
---|
0:00:31 | uh |
---|
0:00:32 | the performance |
---|
0:00:34 | has reached a good level |
---|
0:00:36 | and uh |
---|
0:00:37 | so |
---|
0:00:38 | this permits |
---|
0:00:39 | a lot of practical applications |
---|
0:00:42 | like |
---|
0:00:43 | in industry or in forensic applications |
---|
0:00:46 | and all this |
---|
0:00:49 | performance is always driven by average error rates |
---|
0:00:53 | and uh |
---|
0:00:54 | uh |
---|
0:00:55 | we don't have a lot |
---|
0:00:58 | of studies |
---|
0:00:59 | on the |
---|
0:01:01 | explanation |
---|
0:01:03 | of |
---|
0:01:03 | the performance variability |
---|
0:01:05 | hmmm |
---|
0:01:05 | and on the errors |
---|
0:01:07 | uh |
---|
0:01:08 | we have one important study, which is that of Doddington et al. |
---|
0:01:13 | who explain the performance variability according to the speaker profile |
---|
0:01:18 | it is well known that |
---|
0:01:20 | according to the length |
---|
0:01:21 | of the training and testing data, the performance variability is very important |
---|
0:01:29 | and uh it was proposed |
---|
0:01:30 | two |
---|
0:01:31 | to use the different information contained |
---|
0:01:34 | in the |
---|
0:01:35 | two |
---|
0:01:36 | to use the interview |
---|
0:01:38 | showings that |
---|
0:01:39 | uh |
---|
0:01:40 | there is a difference in performance |
---|
0:01:42 | uh according to |
---|
0:01:43 | this one i mean |
---|
0:01:46 | to do a um |
---|
0:01:48 | our question |
---|
0:01:49 | uh we only work |
---|
0:01:50 | on |
---|
0:01:52 | the training data |
---|
0:01:53 | for |
---|
0:01:54 | one speaker |
---|
0:01:55 | the question is |
---|
0:01:57 | we have several excerpts |
---|
0:01:59 | for the same speaker |
---|
0:02:01 | so |
---|
0:02:02 | what is the variability |
---|
0:02:04 | due to the signal sample used to train |
---|
0:02:08 | the speaker model |
---|
0:02:10 | and the other question is |
---|
0:02:12 | what kind of information may explain |
---|
0:02:15 | this difference of performance |
---|
0:02:17 | and we propose to study the number of selected frames, the phonemic distribution |
---|
0:02:23 | and the intra-phonemic acoustic differences |
---|
0:02:28 | okay |
---|
0:02:29 | we use the ALIZE/SpkDet system, which is a UBM-GMM approach |
---|
0:02:36 | approach |
---|
0:02:37 | uh with uh that in fact one of these these |
---|
0:02:40 | and uh we use the |
---|
0:02:42 | the the C the this used and uh used for |
---|
0:02:46 | the news that several complaints and uh |
---|
0:02:49 | but we don't do a score normalisation |
---|
0:02:52 | the global idea is to do |
---|
0:02:56 | a lot |
---|
0:02:56 | of |
---|
0:02:58 | trials |
---|
0:02:59 | for the different training samples we have |
---|
0:03:02 | for |
---|
0:03:03 | a speaker |
---|
0:03:04 | and we select |
---|
0:03:05 | the best |
---|
0:03:06 | training |
---|
0:03:07 | excerpt |
---|
0:03:08 | and the worst training excerpt |
---|
0:03:10 | for each |
---|
0:03:10 | speaker |
---|
0:03:11 | um |
---|
0:03:12 | the best |
---|
0:03:13 | training excerpt |
---|
0:03:14 | is calculated |
---|
0:03:17 | by |
---|
0:03:19 | minimising |
---|
0:03:20 | minimising the |
---|
0:03:21 | the percentage of |
---|
0:03:23 | false |
---|
0:03:23 | acceptance |
---|
0:03:24 | and |
---|
0:03:25 | false rejection |
---|
0:03:27 | and it's the same thing for the worst: we maximise |
---|
0:03:29 | the |
---|
0:03:29 | percentage of |
---|
0:03:30 | false acceptance |
---|
0:03:31 | and |
---|
0:03:33 | false |
---|
0:03:33 | rejection |
---|
0:03:35 | so we have |
---|
0:03:36 | two sets |
---|
0:03:38 | from this selection |
---|
0:03:40 | one |
---|
0:03:40 | uh |
---|
0:03:41 | named min and the other named |
---|
0:03:45 | uh |
---|
0:03:46 | max |
---|
0:03:47 | and a random one |
---|
0:03:48 | yeah |
---|
0:03:48 | we do different |
---|
0:03:50 | experiments |
---|
0:03:51 | uh we |
---|
0:03:52 | there are exactly the same speakers |
---|
0:03:54 | exactly the same testing excerpts |
---|
0:03:57 | but we change the training excerpt |
---|
0:03:59 | for |
---|
0:04:00 | each set |
---|
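Editor's sketch of the selection step described above, in Python. It assumes an error figure (for example the summed percentage of false acceptances and false rejections) has already been computed for every candidate training excerpt of every speaker. The function name `select_excerpts`, the dictionary layout and the `seed` parameter are hypothetical and are not taken from the talk.

```python
import random


def select_excerpts(error_rates, seed=0):
    """Pick the min, max and random training excerpt for each speaker.

    error_rates maps speaker -> {excerpt_id: error}, where error is an
    assumed per-excerpt figure such as false acceptance + false rejection.
    """
    rng = random.Random(seed)  # fixed seed so the random set is reproducible
    selection = {}
    for speaker, per_excerpt in error_rates.items():
        ids = sorted(per_excerpt)  # stable order, also used to break ties
        selection[speaker] = {
            "min": min(ids, key=lambda e: per_excerpt[e]),  # best excerpt
            "max": max(ids, key=lambda e: per_excerpt[e]),  # worst excerpt
            "random": rng.choice(ids),                      # baseline set
        }
    return selection


if __name__ == "__main__":
    rates = {"spk1": {"a": 0.10, "b": 0.02, "c": 0.30}}
    print(select_excerpts(rates))
```

The testing excerpts stay fixed across the sets; only the training excerpt chosen for each speaker changes.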
0:04:03 | we do these experiments on two corpora |
---|
0:04:07 | the first is |
---|
0:04:09 | based on NIST |
---|
0:04:11 | two thousand eight |
---|
0:04:12 | with telephone conversational speech |
---|
0:04:15 | with a length of |
---|
0:04:18 | two minutes |
---|
0:04:20 | uh |
---|
0:04:21 | for each |
---|
0:04:22 | sample |
---|
0:04:23 | and to maximise |
---|
0:04:24 | the number of training excerpts for each speaker |
---|
0:04:27 | we do leave-one-out |
---|
0:04:29 | and uh |
---|
0:04:31 | with this process we have |
---|
0:04:35 | be doing this |
---|
0:04:36 | one hundred seventy one speakers for whom we have |
---|
0:04:40 | three to twenty models |
---|
0:04:42 | but |
---|
0:04:44 | and uh |
---|
0:04:45 | the other corpus we used is BREF one hundred |
---|
0:04:49 | twenty |
---|
0:04:50 | which is a |
---|
0:04:51 | studio-recorded |
---|
0:04:54 | corpus, a database |
---|
0:04:56 | recorded with exactly the same microphone |
---|
0:04:58 | and it is read speech |
---|
0:05:01 | from newspapers and |
---|
0:05:02 | is it |
---|
0:05:03 | all the speakers are native French |
---|
0:05:06 | and we have |
---|
0:05:08 | more |
---|
0:05:08 | females than males |
---|
0:05:10 | and uh |
---|
0:05:11 | for each |
---|
0:05:12 | speaker |
---|
0:05:13 | we have |
---|
0:05:14 | training and |
---|
0:05:15 | testing excerpts |
---|
0:05:17 | and uh it is uh |
---|
0:05:19 | the the the the |
---|
0:05:21 | we concatenate |
---|
0:05:22 | some sentences |
---|
0:05:24 | to have more than |
---|
0:05:26 | twenty seconds |
---|
0:05:27 | of selected frames |
---|
0:05:29 | per excerpt |
---|
0:05:33 | here we analyse the variability due to the training |
---|
0:05:40 | excerpt |
---|
0:05:41 | we see that |
---|
0:05:42 | the uh |
---|
0:05:44 | the equal error rate |
---|
0:05:45 | range |
---|
0:05:46 | is |
---|
0:05:47 | four point one percent to |
---|
0:05:50 | twenty one |
---|
0:05:51 | point nine percent |
---|
0:05:52 | for the NIST data, and for BREF |
---|
0:05:54 | it ranges |
---|
0:05:55 | from |
---|
0:05:56 | one percent to |
---|
0:05:58 | thirty three percent |
---|
0:06:00 | we have done a random |
---|
0:06:03 | uh |
---|
0:06:04 | set |
---|
0:06:04 | and the |
---|
0:06:05 | the mean |
---|
0:06:06 | here |
---|
0:06:06 | it's uh |
---|
0:06:08 | with the BREF |
---|
0:06:10 | is |
---|
0:06:11 | the mean of |
---|
0:06:14 | different random sets |
---|
0:06:18 | it is a very |
---|
0:06:20 | very important gap |
---|
0:06:21 | according to the |
---|
0:06:23 | so |
---|
0:06:24 | training |
---|
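Editor's sketch: the equal error rate quoted above is the operating point where the false acceptance rate equals the false rejection rate. The Python function below is a simple threshold sweep over raw scores, under the assumption that a higher score means "more likely the target speaker". The function name and the toy scores are illustrative only and are not the scoring tool used in the talk.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # false rejection: target trials scored below the threshold
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # false acceptance: impostor trials scored at or above the threshold
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    targets = [2.0, 3.0, 4.0]
    impostors = [0.0, 1.0, 2.5]
    print(equal_error_rate(targets, impostors))
```

Running the same function once per training set (min, max, random) against the same test trials would give one EER per set, which is the comparison made in the talk.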
0:06:27 | and the |
---|
0:06:28 | now the important thing is |
---|
0:06:30 | to explain the variability |
---|
0:06:32 | and the question is what kind of |
---|
0:06:34 | information |
---|
0:06:35 | so |
---|
0:06:37 | for the number of selected frames |
---|
0:06:39 | it's possible to do that |
---|
0:06:40 | with NIST |
---|
0:06:41 | and BREF but for |
---|
0:06:43 | phonemic distribution and for phonemic acoustic differences |
---|
0:06:46 | we use |
---|
0:06:47 | only BREF |
---|
0:06:48 | because it is |
---|
0:06:50 | easier |
---|
0:06:51 | to |
---|
0:06:52 | obtain this type of information |
---|
0:06:56 | and uh |
---|
0:06:57 | for NIST we have a significant effect |
---|
0:07:00 | of the number of frames |
---|
0:07:02 | but it is something that is controlled in BREF one hundred |
---|
0:07:06 | twenty so it is an irrelevant factor |
---|
0:07:09 | as an explanation of the difference of performance |
---|
0:07:12 | but |
---|
0:07:13 | the other factors are important |
---|
0:07:15 | because we have |
---|
0:07:16 | a more important gap |
---|
0:07:17 | in this |
---|
0:07:18 | in BREF one hundred twenty |
---|
0:07:21 | and it cannot be explained by |
---|
0:07:23 | the number of |
---|
0:07:24 | uh |
---|
0:07:24 | frames |
---|
0:07:27 | so for the phonetic |
---|
0:07:30 | content |
---|
0:07:31 | first we do a forced alignment |
---|
0:07:33 | also i mean and i |
---|
0:07:36 | five |
---|
0:07:36 | where the spirits about |
---|
0:07:37 | and we correct |
---|
0:07:39 | this alignment |
---|
0:07:41 | manually |
---|
0:07:42 | and to analyse the phonemic content |
---|
0:07:45 | we |
---|
0:07:46 | just |
---|
0:07:47 | as a first step |
---|
0:07:48 | count the number of selected frames for each phoneme |
---|
0:07:52 | we do a MANOVA |
---|
0:07:54 | with a between-subjects factor, which is the set, and the dependent variables |
---|
0:07:58 | are the numbers of selected frames |
---|
0:08:01 | and |
---|
0:08:02 | we see that |
---|
0:08:04 | there is |
---|
0:08:04 | virtually no |
---|
0:08:06 | difference |
---|
0:08:06 | in phonemic |
---|
0:08:07 | content |
---|
0:08:09 | between |
---|
0:08:09 | here |
---|
0:08:10 | for female speakers |
---|
0:08:12 | between the min, the max |
---|
0:08:14 | and the random |
---|
0:08:16 | and there is only |
---|
0:08:18 | one |
---|
0:08:18 | phoneme |
---|
0:08:19 | which is uh |
---|
0:08:21 | which is relevant |
---|
0:08:22 | and for the males |
---|
0:08:24 | it was the same thing |
---|
0:08:25 | so it's not sufficient to explain the gap |
---|
0:08:29 | of performance |
---|
0:08:32 | for the intra-phonemic information |
---|
0:08:34 | we use the acoustic features |
---|
0:08:38 | for each phoneme |
---|
0:08:39 | and uh |
---|
0:08:40 | it's exactly the same setting with a MANOVA |
---|
0:08:44 | we have a between-subjects factor, the set |
---|
0:08:47 | and the dependent variables |
---|
0:08:49 | are the LFCC, the delta and the delta-delta |
---|
0:08:52 | yeah |
---|
0:08:53 | we have an |
---|
0:08:54 | important significant difference |
---|
0:08:56 | for LFCC, for all the phonemes |
---|
0:08:59 | and for delta |
---|
0:09:01 | there is an |
---|
0:09:02 | important |
---|
0:09:02 | uh |
---|
0:09:04 | difference |
---|
0:09:05 | for |
---|
0:09:06 | um |
---|
0:09:06 | a large majority of phonemes |
---|
0:09:08 | and mainly |
---|
0:09:10 | stops |
---|
0:09:10 | and several vowels |
---|
0:09:12 | but we don't find a difference |
---|
0:09:14 | for delta-delta |
---|
0:09:16 | and uh this is uh |
---|
0:09:18 | this type of analysis |
---|
0:09:20 | um |
---|
0:09:22 | is challenging and proves that the intra-phonemic |
---|
0:09:25 | acoustic differences have |
---|
0:09:27 | to be accounted for |
---|
0:09:29 | from |
---|
0:09:31 | so when the training excerpt changes |
---|
0:09:35 | we have large performance differences |
---|
0:09:39 | which may not be explained by the number |
---|
0:09:41 | of selected frames |
---|
0:09:42 | or it is a possible factor |
---|
0:09:44 | but not a sufficient factor |
---|
0:09:46 | and the phonemic distribution cannot |
---|
0:09:49 | explain exactly |
---|
0:09:51 | this gap |
---|
0:09:52 | there is an investigation needed |
---|
0:09:54 | to determine the influence |
---|
0:09:56 | of intra-phonemic |
---|
0:09:58 | acoustic differences |
---|
0:10:00 | and uh |
---|
0:10:01 | the question is to draw the link |
---|
0:10:04 | between these |
---|
0:10:06 | acoustic |
---|
0:10:07 | intra-phonemic acoustic differences |
---|
0:10:09 | and higher |
---|
0:10:11 | level |
---|
0:10:12 | for |
---|
0:10:13 | phonemic |
---|
0:10:14 | information |
---|
0:10:15 | and uh |
---|
0:10:16 | work |
---|
0:10:16 | there are new results |
---|
0:10:19 | since |
---|
0:10:20 | the submission of the paper |
---|
0:10:22 | and we see that |
---|
0:10:24 | the intensity is higher |
---|
0:10:26 | for min |
---|
0:10:27 | than max |
---|
0:10:28 | but |
---|
0:10:28 | it is the |
---|
0:10:32 | it's significant but if you take the mean |
---|
0:10:35 | of |
---|
0:10:36 | the intensity it is |
---|
0:10:37 | a very small |
---|
0:10:38 | difference |
---|
0:10:40 | there is no difference |
---|
0:10:41 | for uh |
---|
0:10:43 | fundamental |
---|
0:10:44 | so the pitch |
---|
0:10:45 | and you can see the formants; here we don't have a difference |
---|
0:10:50 | and uh |
---|
0:10:50 | we analysed the dispersion of the vowels, and there is no difference |
---|
0:10:55 | for uh |
---|
0:10:56 | this type of |
---|
0:10:57 | information |
---|
0:10:58 | and uh |
---|
0:10:59 | it is the same thing for the spectrum |
---|
0:11:01 | um so uh |
---|
0:11:02 | right |
---|
0:11:02 | of the |
---|
0:11:03 | fig |
---|
0:11:05 | so for the future work |
---|
0:11:07 | uh |
---|
0:11:08 | the question is |
---|
0:11:11 | that the variability may not be only |
---|
0:11:14 | the result |
---|
0:11:15 | of the signal samples |
---|
0:11:16 | and uh |
---|
0:11:17 | maybe the system itself |
---|
0:11:19 | is a problem |
---|
0:11:21 | and uh |
---|
0:11:22 | now we are working on the link between the LLR |
---|
0:11:27 | by frame |
---|
0:11:28 | and |
---|
0:11:28 | the phonemic |
---|
0:11:29 | description |
---|
0:11:30 | to understand |
---|
0:11:31 | what exactly are the |
---|
0:11:33 | good or bad frames and |
---|
0:11:34 | whether |
---|
0:11:35 | there is or not a link |
---|
0:11:36 | with phonetic information |
---|
0:11:40 | thank you |
---|
0:11:50 | question |
---|
0:12:06 | uh |
---|
0:12:07 | i entered |
---|
0:12:08 | and there's two you said that |
---|
0:12:09 | there was no |
---|
0:12:11 | significant difference between the snr |
---|
0:12:15 | yeah |
---|
0:12:15 | oh do |
---|
0:12:16 | by |
---|
0:12:17 | training try out some good three trials |
---|
0:12:20 | yeah that is another difference for |
---|
0:12:22 | there is a difference on the acoustics, for the LFCC |
---|
0:12:27 | we have |
---|
0:12:28 | a significant difference for all the phonemes |
---|
0:12:30 | but |
---|
0:12:31 | if we want to find the link |
---|
0:12:35 | with higher-level |
---|
0:12:36 | features |
---|
0:12:38 | we don't find |
---|
0:12:39 | anything so |
---|
0:12:40 | the question is uh |
---|
0:12:41 | oh |
---|
0:12:42 | that |
---|
0:12:42 | we haven't |
---|
0:12:43 | found it |
---|
0:12:44 | with the description |
---|
0:12:46 | the the the description the |
---|
0:12:50 | the features |
---|
0:12:51 | usually used |
---|
0:12:52 | in phonetic science |
---|
0:12:54 | to describe |
---|
0:12:55 | speech |
---|
0:12:56 | actually we haven't found |
---|
0:12:58 | the link between |
---|
0:12:59 | the LFCC |
---|
0:13:00 | and uh |
---|
0:13:02 | and the the the recognition |
---|
0:13:03 | and uh |
---|
0:13:04 | the |
---|
0:13:06 | phonetic |
---|
0:13:07 | uh information in the we don't |
---|
0:13:09 | we don't know |
---|
0:13:11 | uh |
---|
0:13:12 | uh well |
---|
0:13:13 | why |
---|
0:13:13 | we have this type of gap |
---|
0:13:16 | and uh |
---|
0:13:17 | and uh we don't have an explanation |
---|
0:13:19 | actually |
---|
0:13:20 | uh by by the acoustic and the phonetic |
---|
0:13:24 | uh analysis |
---|
0:13:26 | so if you just take your means |
---|
0:13:28 | trials we don't we we selection |
---|
0:13:31 | train |
---|
0:13:32 | turned out |
---|
0:13:33 | and the mean high snr don't know with an hour |
---|
0:13:37 | so don't see a difference in performance |
---|
0:13:40 | sorry |
---|
0:13:41 | you take on your knees trials |
---|
0:13:42 | no no no we we still i mean |
---|
0:13:45 | but eventually you could do yeah yeah |
---|
0:13:48 | yeah we we did something like that in there is to be difference in performance |
---|
0:13:53 | i mean is what you would expect |
---|
0:13:54 | but yes in our training data should be yeah |
---|
0:13:57 | worse performance |
---|
0:13:58 | buttons |
---|
0:13:59 | you |
---|
0:14:00 | not |
---|
0:14:01 | not a break |
---|
0:14:02 | you rattle basically for exactly the the same |
---|
0:14:06 | but |
---|
0:14:07 | maybe there is not so much but you be the the |
---|
0:14:11 | nice |
---|
0:14:13 | 'cause |
---|
0:14:13 | maybe the breath they that there is not so much |
---|
0:14:16 | but maybe it's an hour |
---|
0:14:19 | no |
---|
0:14:20 | very |
---|
0:14:21 | that um |
---|
0:14:22 | the variability |
---|
0:14:23 | about the uh |
---|
0:14:25 | the position for example, there is no variability, right |
---|
0:14:28 | okay |
---|
0:14:29 | no it is exactly the same microphone exactly |
---|
0:14:32 | all the people are recorded |
---|
0:14:35 | uh oh no |
---|
0:14:36 | on the same day and uh it's |
---|
0:14:38 | there is no variability of the session |
---|
0:14:40 | the only variability |
---|
0:14:45 | is on the speaker |
---|
0:14:47 | so when we have only the information about the speaker |
---|
0:14:51 | we can have |
---|
0:14:52 | a variation like |
---|
0:14:54 | this |
---|
0:14:55 | between one |
---|
0:14:56 | and |
---|
0:14:56 | thirty three percent |
---|
0:14:58 | i think what everybody |
---|
0:15:00 | so |
---|
0:15:01 | it's |
---|
0:15:01 | very |
---|
0:15:03 | and then the question is |
---|
0:15:05 | how to explain that because |
---|
0:15:07 | if we can |
---|
0:15:08 | if we can have an explanation |
---|
0:15:11 | we can give |
---|
0:15:11 | a confidence score |
---|
0:15:13 | or something like this |
---|
0:15:15 | that |
---|
0:15:15 | can say that |
---|
0:15:16 | uh okay |
---|
0:15:17 | uh |
---|
0:15:18 | i i know |
---|
0:15:19 | the |
---|
0:15:20 | the the training and i know the the testing |
---|
0:15:24 | detecting the testing sample |
---|
0:15:26 | and uh i can say i can say |
---|
0:15:28 | oh okay for this |
---|
0:15:30 | i i can't |
---|
0:15:31 | i have a good score |
---|
0:15:32 | and i don't have confidence |
---|
0:15:34 | with |
---|
0:15:35 | this data |
---|
0:15:36 | but with other data i can have |
---|
0:15:38 | uh |
---|
0:15:39 | a good |
---|
0:15:39 | score, well computed |
---|
0:15:41 | and it is |
---|
0:15:43 | it is the objective |
---|
0:15:44 | of |
---|
0:15:45 | this kind of study |
---|
0:15:46 | it's a good |
---|
0:15:55 | but |
---|
0:16:05 | what |
---|
0:16:07 | sure |
---|
0:16:08 | hmmm |
---|
0:16:08 | what |
---|
0:16:10 | uh_huh |
---|
0:16:11 | oh |
---|
0:16:12 | some |
---|
0:16:14 | from |
---|
0:16:16 | hmmm |
---|
0:16:16 | hmmm |
---|
0:16:17 | uh_huh |
---|
0:16:19 | um |
---|
0:16:22 | yeah and it's uh yeah |
---|
0:16:24 | the |
---|
0:16:25 | actually boring problem anyway |
---|
0:16:27 | any information we |
---|
0:16:28 | just |
---|
0:16:29 | use |
---|
0:16:29 | the LFCC, the delta and the delta-delta |
---|
0:16:32 | and it was |
---|
0:16:33 | to check that |
---|
0:16:34 | there is |
---|
0:16:36 | a difference |
---|
0:16:37 | because at the beginning we didn't understand; the question now is the link |
---|
0:16:41 | between |
---|
0:16:42 | the phonetic information |
---|
0:16:44 | and |
---|
0:16:45 | these |
---|
0:16:46 | LFCC |
---|
0:16:47 | which are used because |
---|
0:16:49 | we know that |
---|
0:16:50 | in the LFCC and delta we have information |
---|
0:16:53 | but |
---|
0:16:53 | we haven't |
---|
0:16:54 | yeah |
---|
0:16:55 | found |
---|
0:16:55 | a link between |
---|
0:16:57 | the |
---|
0:16:58 | LFCC |
---|
0:16:58 | and the delta |
---|
0:16:59 | and |
---|
0:17:00 | this |
---|
0:17:00 | the |
---|
0:17:01 | higher-level |
---|
0:17:02 | phonemic information |
---|
0:17:04 | actually i am working on |
---|
0:17:07 | the coarticulation information |
---|
0:17:09 | and uh |
---|
0:17:10 | the |
---|
0:17:11 | uh |
---|
0:17:11 | the first experiments i did |
---|
0:17:14 | were only with the |
---|
0:17:16 | triphones |
---|
0:17:17 | and analysing |
---|
0:17:18 | the distribution of the triphones |
---|
0:17:20 | and i don't |
---|
0:17:21 | find |
---|
0:17:21 | a difference |
---|
0:17:22 | so actually i am measuring the locus |
---|
0:17:26 | to see whether with the locus we have here |
---|
0:17:30 | in high school |
---|
0:17:31 | that with raucous we have |
---|
0:17:34 | yeah you use the you know |
---|
0:17:36 | uh not use |
---|
0:17:37 | is um |
---|
0:17:38 | you take the value of the formants |
---|
0:17:41 | of the second formant |
---|
0:17:43 | at uh |
---|
0:17:44 | then purred |
---|
0:17:44 | and |
---|
0:17:45 | or the beginning of the vowel |
---|
0:17:47 | and at fifty percent of the vowel and you |
---|
0:17:51 | you |
---|
0:17:52 | you analyse the variation |
---|
0:17:54 | between uh |
---|
0:17:55 | the two values |
---|
0:17:57 | and uh |
---|
0:17:58 | normally if there is a lot of coarticulation |
---|
0:18:01 | and so the the people |
---|
0:18:03 | uh we you and you have a |
---|
0:18:06 | linear regression |
---|
0:18:07 | of the value according to the |
---|
0:18:09 | other value but if |
---|
0:18:11 | there is no coarticulation |
---|
0:18:13 | uh you have something that is very |
---|
0:18:16 | and uh |
---|
0:18:17 | two |
---|
0:18:18 | yeah |
---|
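Editor's sketch of the locus measurement described in this answer: the second formant (F2) at the beginning of the vowel is regressed on F2 at fifty percent of the vowel. In the phonetics literature this fit is usually called a locus equation, and a slope close to one is conventionally read as strong coarticulation while a slope close to zero is read as little coarticulation. That reading is an editorial assumption, since the transcript is cut off at this point. The function name and the toy formant values are hypothetical.

```python
def locus_equation(f2_midpoint, f2_onset):
    """Least-squares fit of f2_onset = slope * f2_midpoint + intercept.

    Both arguments are equal-length lists of F2 values in Hz, one pair
    per consonant-vowel token.
    """
    n = len(f2_midpoint)
    mean_x = sum(f2_midpoint) / n
    mean_y = sum(f2_onset) / n
    # sums of squares and cross-products around the means
    sxx = sum((x - mean_x) ** 2 for x in f2_midpoint)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(f2_midpoint, f2_onset))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    midpoint = [1000.0, 1500.0, 2000.0]  # toy F2 values at 50% of the vowel
    onset = [1200.0, 1600.0, 2000.0]     # toy F2 values at vowel onset
    print(locus_equation(midpoint, onset))
```

Fitting one such line per speaker and per training set would give a slope that can be compared between the min and max sets.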
0:18:23 | uh_huh |
---|
0:18:25 | first |
---|
0:18:25 | fig |
---|
0:18:28 | oh |
---|
0:18:28 | well |
---|
0:18:30 | you yeah |
---|
0:18:34 | oh good |
---|
0:18:36 | or or |
---|
0:18:37 | uh |
---|
0:18:39 | oh |
---|
0:18:39 | for those |
---|
0:18:41 | yeah our |
---|
0:18:42 | uh |
---|
0:18:43 | okay |
---|
0:18:44 | the more you |
---|
0:18:46 | or or |
---|
0:18:49 | the |
---|
0:18:52 | yes yeah |
---|
0:18:56 | it's a it's a good question |
---|
0:18:57 | um |
---|
0:18:58 | yeah you have uh the score |
---|
0:19:00 | the last call |
---|
0:19:01 | four |
---|
0:19:03 | um i is the speaker that on the twenty eight |
---|
0:19:07 | the |
---|
0:19:07 | it is there is |
---|
0:19:09 | a difference |
---|
0:19:09 | according to the normalisation |
---|
0:19:12 | but it is |
---|
0:19:13 | not comparable |
---|
0:19:14 | with the difference |
---|
0:19:15 | we have |
---|
0:19:16 | without normalisation |
---|
0:19:18 | between the |
---|
0:19:19 | the |
---|
0:19:19 | when we select |
---|
0:19:20 | you said to yeah |
---|
0:19:24 | yeah |
---|
0:19:25 | that no we we are trying we are training the |
---|
0:19:29 | the normalisation |
---|
0:19:30 | it is something that we have to do |
---|
0:19:33 | but the problem is, when we have |
---|
0:19:35 | a database like BREF |
---|
0:19:37 | it's very difficult because |
---|
0:19:39 | we don't have |
---|
0:19:40 | a lot of |
---|
0:19:43 | a lot of data to be able to have a good world model |
---|
0:19:47 | and to have |
---|
0:19:49 | uh |
---|
0:19:50 | all |
---|
0:19:51 | different sets for training and testing |
---|
0:19:53 | we don't have a lot of |
---|
0:19:55 | data so it's very difficult to do |
---|
0:19:58 | the normalisation |
---|
0:19:59 | if we want |
---|
0:20:00 | to have a lot of |
---|
0:20:02 | different |
---|
0:20:03 | training |
---|
0:20:05 | excerpts |
---|
0:20:09 | oh |
---|
0:20:10 | or or what |
---|
0:20:11 | two |
---|
0:20:12 | maybe more to each source model one quarter sometimes you can point to |
---|
0:20:18 | oh |
---|
0:20:19 | um |
---|
0:20:20 | for the concatenation, it is a randomised |
---|
0:20:25 | concatenation |
---|
0:20:26 | we are sure that there are |
---|
0:20:28 | never |
---|
0:20:28 | the same |
---|
0:20:29 | samples |
---|
0:20:30 | for testing and training |
---|
0:20:32 | but |
---|
0:20:33 | uh |
---|
0:20:34 | uh it |
---|
0:20:34 | so we we don't |
---|
0:20:36 | combine that actually |
---|
0:20:37 | um |
---|
0:20:38 | for example if your question is |
---|
0:20:40 | whether we have tried to train |
---|
0:20:43 | right |
---|
0:20:43 | to um |
---|
0:20:45 | to use the best |
---|
0:20:47 | and concatenate the bad |
---|
0:20:50 | to have a better |
---|
0:20:52 | model, we haven't |
---|
0:20:53 | uh |
---|
0:20:54 | tried |
---|
0:20:55 | this |
---|
0:20:56 | type of combination |
---|
0:20:57 | a small country |
---|
0:20:59 | you have some recordings of each speaker |
---|
0:21:02 | point |
---|
0:21:02 | time |
---|
0:21:05 | between three and twenty |
---|
0:21:08 | recording yeah |
---|
0:21:09 | and each recording |
---|
0:21:10 | some |
---|
0:21:10 | some some some |
---|
0:21:11 | point in time |
---|
0:21:13 | and |
---|
0:21:15 | according to teach |
---|
0:21:16 | yeah |
---|
0:21:17 | okay |
---|
0:21:18 | strong |
---|
0:21:19 | combining multiple recordings to a more |
---|
0:21:22 | no yeah |
---|
0:21:23 | we we have done um |
---|
0:21:25 | with um |
---|
0:21:26 | to to to have a |
---|
0:21:28 | um |
---|
0:21:29 | samples |
---|
0:21:29 | with |
---|
0:21:30 | for |
---|
0:21:30 | two minutes |
---|
0:21:31 | i mean it's and how |
---|
0:21:32 | uh |
---|
0:21:34 | um |
---|
0:21:35 | of selected frames |
---|
0:21:36 | and we |
---|
0:21:38 | we |
---|
0:21:39 | we do the same thing |
---|
0:21:42 | select the best and the worst with |
---|
0:21:45 | a longer |
---|
0:21:46 | uh |
---|
0:21:47 | signal |
---|
0:21:47 | and the |
---|
0:21:48 | the the results |
---|
0:21:50 | are |
---|
0:21:51 | this one is that |
---|
0:21:52 | uh the there is |
---|
0:21:53 | let's uh that's also that's why the the curve |
---|
0:21:56 | is that not |
---|
0:21:57 | so |
---|
0:21:58 | so good |
---|
0:21:59 | but uh we have |
---|
0:22:00 | the |
---|
0:22:01 | the set not |
---|
0:22:02 | uh the same yeah |
---|
0:22:03 | that's |
---|
0:22:04 | a gap which is important |
---|
0:22:06 | and uh |
---|
0:22:07 | here the equal error rate is less than |
---|
0:22:10 | one percent |
---|
0:22:11 | and here it is five percent and we have |
---|
0:22:15 | a lot of frames selected |
---|
0:22:16 | yeah |
---|
0:22:17 | which shows more |
---|
0:22:19 | combination of so yeah things from yeah point sometimes or |
---|
0:22:25 | between no no no |
---|
0:22:28 | no |
---|
0:22:29 | no |
---|
0:22:29 | it's uh |
---|
0:22:31 | now because ah it is uh it is |
---|
0:22:34 | yes there there is a it is exactly the same testing for |
---|
0:22:38 | for this curve |
---|
0:22:39 | and this curve |
---|
0:22:41 | so it is possible to compare |
---|
0:22:43 | the |
---|
0:22:44 | that's why the |
---|
0:22:45 | posted to |
---|
0:22:47 | i don't know |
---|
0:22:51 | from |
---|
0:22:52 | sessions which |
---|
0:22:53 | you just |
---|
0:22:54 | no i have no information about it |
---|
0:22:57 | because |
---|
0:22:58 | because the |
---|
0:22:59 | all the samples |
---|
0:23:00 | are |
---|
0:23:01 | recorded in the same |
---|
0:23:03 | with the same microphone and exactly |
---|
0:23:06 | the same day so |
---|
0:23:07 | there is no inter-session variability |
---|
0:23:12 | there is only |
---|
0:23:13 | intra-speaker variability |
---|
0:23:16 | it is controlled that |
---|
0:23:17 | the speaker hon that's a |
---|
0:23:19 | the |
---|
0:23:20 | the one i want to find an optional |
---|
0:23:25 | for example for half an hour or two |
---|
0:23:30 | open or something |
---|
0:23:32 | yes |
---|
0:23:33 | yeah |
---|
0:23:46 | oh |
---|
0:23:46 | oh |
---|
0:23:47 | right |
---|
0:23:48 | hmmm |
---|
0:23:54 | right |
---|
0:23:54 | hmmm |
---|