0:00:06 | i |
---|
0:00:07 | you start by one percent |
---|
0:00:10 | this evaluation this was and therefore we made penis paying two years ago |
---|
0:00:17 | this is the outline of the presentation um |
---|
0:00:20 | i |
---|
0:00:21 | talk about the context and motivation then i will describe the test conditions |
---|
0:00:26 | which are very much like those of the nist |
---|
0:00:30 | uh evaluations |
---|
0:00:31 | we took uh the nist evaluations as |
---|
0:00:35 | um an example and |
---|
0:00:37 | i understand that in fact |
---|
0:00:39 | so we can be uh |
---|
0:00:41 | maybe uh |
---|
0:00:42 | and ninety percent off |
---|
0:00:44 | the condition |
---|
0:00:45 | then uh i will describe |
---|
0:00:47 | uh basically |
---|
0:00:49 | we thought as possible |
---|
0:00:51 | uh the results and then give some conclusions |
---|
0:00:57 | well uh |
---|
0:00:58 | this uh evaluation was uh supported by the spanish thematic network on speech technology to spain |
---|
0:01:04 | uh it was uh part of the fifth biennial workshop on speech technology which was |
---|
0:01:09 | held in bilbao in november |
---|
0:01:12 | two thousand eight |
---|
0:01:14 | and in that uh workshop um |
---|
0:01:17 | there were two other evaluations on speech translation |
---|
0:01:22 | and |
---|
0:01:23 | speech synthesis |
---|
0:01:24 | apart from |
---|
0:01:25 | the language recognition evaluation |
---|
0:01:27 | i don't you know |
---|
0:01:29 | and there was another motivation which was that uh our group was interested in developing |
---|
0:01:36 | language recognition technology for uh |
---|
0:01:39 | spoken document retrieval applications |
---|
0:01:43 | well what you see here are the points we have |
---|
0:01:47 | we had in mind |
---|
0:01:48 | when designing the evaluation |
---|
0:01:50 | uh |
---|
0:01:51 | well to promote collaboration between research groups |
---|
0:01:54 | in spain and also portugal |
---|
0:01:57 | uh secondly uh |
---|
0:01:59 | to provide |
---|
0:02:00 | a speech database |
---|
0:02:01 | specifically designed |
---|
0:02:03 | to |
---|
0:02:04 | uh perform language recognition on the languages of spain |
---|
0:02:08 | there are four languages in spain |
---|
0:02:10 | not everybody knows that |
---|
0:02:12 | yeah there are |
---|
0:02:13 | four official languages spoken in spain |
---|
0:02:16 | and |
---|
0:02:18 | then another motivation was uh to measure the accuracy |
---|
0:02:22 | that the state of the art systems |
---|
0:02:24 | could attain for this particular application |
---|
0:02:28 | because these languages uh |
---|
0:02:30 | yeah |
---|
0:02:31 | have been evolving jointly in spain |
---|
0:02:35 | so uh maybe this task |
---|
0:02:37 | could be more challenging than |
---|
0:02:39 | we could expect |
---|
0:02:41 | and finally uh to measure |
---|
0:02:44 | this was a diffuse your uh motivation |
---|
0:02:47 | maybe for some you |
---|
0:02:49 | to measure the performance of systems developed |
---|
0:02:52 | on a limited |
---|
0:02:52 | amount |
---|
0:02:53 | of data |
---|
0:02:56 | well uh the language |
---|
0:02:58 | detection task was defined |
---|
0:02:59 | the same way as for nist |
---|
0:03:01 | i won't describe it since it is |
---|
0:03:03 | the same |
---|
0:03:04 | and has already been described |
---|
0:03:06 | they were |
---|
0:03:08 | uh |
---|
0:03:11 | yeah that's |
---|
0:03:12 | and this can be assumes |
---|
0:03:13 | simple |
---|
0:03:14 | uh |
---|
0:03:15 | uh |
---|
0:03:16 | what is the |
---|
0:03:17 | described here this this is like |
---|
0:03:19 | uh the condition on system development which is |
---|
0:03:22 | a special one |
---|
0:03:23 | and we differentiate |
---|
0:03:26 | yeah between uh |
---|
0:03:28 | systems developed uh in free conditions |
---|
0:03:31 | using any available materials |
---|
0:03:34 | and systems uh developed |
---|
0:03:36 | using only |
---|
0:03:38 | the data that we provide |
---|
0:03:40 | okay |
---|
0:03:40 | that was very special for this evaluation |
---|
0:03:43 | we were interested in putting all the teams |
---|
0:03:46 | at the same point |
---|
0:03:47 | to develop their systems |
---|
0:03:49 | and then to evaluate |
---|
0:03:50 | what they could |
---|
0:03:51 | do |
---|
0:03:52 | starting from the |
---|
0:03:53 | okay |
---|
0:03:54 | well |
---|
0:03:55 | regarding the set of trials we defined as for nist the closed set |
---|
0:03:59 | uh condition and the open set condition |
---|
0:04:03 | uh we also defined |
---|
0:04:05 | three kinds of segments |
---|
0:04:06 | of uh thirty second ten second and three second segments |
---|
0:04:11 | um we uh defined we used |
---|
0:04:13 | the same performance measures |
---|
0:04:15 | uh also uh defined by nist |
---|
0:04:19 | you have |
---|
0:04:20 | uh seen |
---|
0:04:22 | these measures in |
---|
0:04:23 | the previous presentation |
---|
0:04:26 | average cost |
---|
0:04:27 | we also use |
---|
0:04:28 | the C L L R |
---|
0:04:30 | and finally the det curves to |
---|
0:04:32 | give uh a qualitative |
---|
0:04:34 | uh evaluation |
---|
0:04:35 | of systems |
---|
0:04:36 | we uh we defined the same priors and costs |
---|
0:04:40 | of the last |
---|
0:04:41 | to understand what we |
---|
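The average cost measure mentioned here can be sketched in a few lines. This is only an illustration under stated assumptions: it follows the NIST 2007 LRE form of the average cost (miss and false alarm costs of 1, target prior 0.5, out-of-set prior 0.2 in the open-set case), which the speaker says was reused but does not spell out. The function name, argument names, and the flat 10% error rates in the example are hypothetical.

```python
def average_cost(p_miss, p_fa, p_fa_oos=None,
                 c_miss=1.0, c_fa=1.0, p_target=0.5, p_oos=0.0):
    """Average detection cost over target languages (assumed NIST 2007 LRE form).

    p_miss[t]        miss rate for target language t
    p_fa[(t, o)]     false alarm rate for target t on segments of target o
    p_fa_oos[t]      false alarm rate for target t on out-of-set segments
    """
    targets = sorted(p_miss)
    n = len(targets)
    # Prior shared evenly among the other (non-target) known languages.
    p_non_target = (1.0 - p_target - p_oos) / (n - 1)
    total = 0.0
    for t in targets:
        cost = c_miss * p_target * p_miss[t]
        for other in targets:
            if other != t:
                cost += c_fa * p_non_target * p_fa[(t, other)]
        if p_oos > 0.0:
            cost += c_fa * p_oos * p_fa_oos[t]
        total += cost
    return total / n


# Hypothetical example: every error rate set to 10% for the four target languages.
langs = ["basque", "catalan", "galician", "spanish"]
p_miss = {t: 0.10 for t in langs}
p_fa = {(t, o): 0.10 for t in langs for o in langs if o != t}
p_fa_oos = {t: 0.10 for t in langs}

closed = average_cost(p_miss, p_fa)                       # closed-set condition
open_set = average_cost(p_miss, p_fa, p_fa_oos, p_oos=0.2)  # open-set condition
print(round(closed, 4), round(open_set, 4))
```

With uniform error rates the cost equals that rate in both conditions, which is a quick sanity check that the priors sum to one.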
0:04:45 | well then database way to record it |
---|
0:04:49 | it was found that call |
---|
0:04:50 | okay |
---|
0:04:51 | i recorded the database |
---|
0:04:53 | from T V |
---|
0:04:53 | in my home |
---|
0:04:56 | uh just connecting a recorder to the |
---|
0:04:59 | to the decoder |
---|
0:05:00 | the |
---|
0:05:01 | cable T V decoder |
---|
0:05:03 | and it is described in the paper |
---|
0:05:07 | two thousand two |
---|
0:05:09 | so it it it um includes four target languages |
---|
0:05:12 | spanish catalan basque and galician |
---|
0:05:16 | and also other languages |
---|
0:05:17 | just to allow open set tests |
---|
0:05:20 | uh the languages |
---|
0:05:22 | uh |
---|
0:05:23 | were french portuguese german and english |
---|
0:05:26 | five or two case is not so close to spanish that you can |
---|
0:05:30 | uh |
---|
0:05:31 | fig |
---|
0:05:31 | so |
---|
0:05:33 | for you want to these people uh |
---|
0:05:35 | find many different |
---|
0:05:37 | um the spanish too |
---|
0:05:38 | it should |
---|
0:05:39 | with the language |
---|
0:05:40 | well uh audio files uh were uh wav files |
---|
0:05:44 | um |
---|
0:05:45 | sixteen kilohertz |
---|
0:05:46 | uh |
---|
0:05:47 | uh the |
---|
0:05:49 | sampling frequency |
---|
0:05:50 | yeah well they were single channel |
---|
0:05:52 | sixteen bits per sample uncompressed |
---|
0:05:55 | P C M |
---|
0:05:56 | uh this is another |
---|
0:05:58 | difference with regard to the nist evaluation |
---|
0:06:01 | speech signals were all extracted from T V shows including |
---|
0:06:05 | planned speech or spontaneous |
---|
0:06:06 | speech |
---|
0:06:07 | in all kinds of environment conditions |
---|
0:06:09 | uh yeah |
---|
0:06:11 | for instance there could be |
---|
0:06:12 | three speakers speaking in a thirty |
---|
0:06:15 | second segment |
---|
0:06:17 | so |
---|
0:06:17 | there could be many speakers |
---|
0:06:19 | speaking in the same |
---|
0:06:20 | yeah |
---|
0:06:21 | test set |
---|
0:06:23 | well we defined |
---|
0:06:24 | disjoint subsets of T V shows |
---|
0:06:27 | for training development and evaluation |
---|
0:06:29 | this was to uh guarantee that different |
---|
0:06:33 | more or less |
---|
0:06:34 | that different speakers |
---|
0:06:36 | uh were in each |
---|
0:06:38 | in each |
---|
0:06:38 | subset |
---|
0:06:40 | and finally the whole database is pretty |
---|
0:06:42 | small |
---|
0:06:43 | for |
---|
0:06:44 | the nist standards |
---|
0:06:45 | it's just a fifty fifty hours long |
---|
0:06:49 | is distributed to |
---|
0:06:50 | C D V D but |
---|
0:06:52 | and um |
---|
0:06:53 | we are just now uh |
---|
0:06:56 | talking with |
---|
0:06:56 | the L D C to distribute it |
---|
0:06:59 | through |
---|
0:06:59 | L D C |
---|
0:07:00 | and the train data set in |
---|
0:07:03 | good |
---|
0:07:03 | yeah |
---|
0:07:04 | last |
---|
0:07:05 | and fifty six hours |
---|
0:07:07 | nine hours per target language |
---|
0:07:09 | we don't provide any uh data to uh |
---|
0:07:13 | train |
---|
0:07:14 | or something like that |
---|
0:07:15 | for |
---|
0:07:16 | the out of set languages |
---|
0:07:17 | so just |
---|
0:07:18 | nine hours |
---|
0:07:19 | per target language |
---|
0:07:20 | that's that's all |
---|
0:07:21 | and out of set languages |
---|
0:07:23 | appear uh |
---|
0:07:24 | in the development dataset and in the evaluation dataset |
---|
0:07:28 | which are more or less the we |
---|
0:07:30 | have more or less the same structure |
---|
0:07:32 | but |
---|
0:07:33 | i don't |
---|
0:07:34 | i would then |
---|
0:07:36 | uh well when defining when designing the database |
---|
0:07:40 | uh we only chose uh |
---|
0:07:42 | chose um |
---|
0:07:44 | high snr speech |
---|
0:07:46 | discarding segments with |
---|
0:07:47 | right |
---|
0:07:48 | lemma noise |
---|
0:07:49 | uh speech overlaps |
---|
0:07:51 | they or what all of that all of them |
---|
0:07:54 | well fit the foul |
---|
0:07:55 | and regarding segments for training they have no length |
---|
0:08:00 | restrictions |
---|
0:08:02 | maybe five minutes |
---|
0:08:03 | you could train with a five minute segment |
---|
0:08:05 | with |
---|
0:08:07 | so um but for segments for development and evaluation |
---|
0:08:11 | yeah |
---|
0:08:12 | to cut |
---|
0:08:13 | uh |
---|
0:08:13 | length restrictions |
---|
0:08:15 | um |
---|
0:08:16 | we defined an automatic |
---|
0:08:18 | way of constructing them |
---|
0:08:21 | ensuring that they were enclosed by silence |
---|
0:08:24 | more or less |
---|
0:08:26 | yeah and in fact the subsets |
---|
0:08:29 | well the subset of |
---|
0:08:30 | three second segments |
---|
0:08:31 | is |
---|
0:08:32 | extracted from the subset |
---|
0:08:34 | subset |
---|
0:08:35 | of |
---|
0:08:35 | ten second segments |
---|
0:08:37 | and the same way the ten second segment |
---|
0:08:40 | subset is extracted from the |
---|
0:08:42 | the thirty second |
---|
0:08:44 | segment subset |
---|
0:08:45 | uh quite difficult but |
---|
0:08:47 | what |
---|
0:08:47 | we tried is to ensure that differences in |
---|
0:08:51 | in performance |
---|
0:08:53 | uh would be due only to uh the the length |
---|
0:08:56 | not to |
---|
0:08:57 | um |
---|
0:08:58 | being tested against different material |
---|
0:09:02 | and |
---|
0:09:03 | there was some tolerance in length |
---|
0:09:06 | we use in fact |
---|
0:09:08 | uh segments between |
---|
0:09:09 | three and five seconds |
---|
0:09:10 | ten and twelve seconds and |
---|
0:09:12 | fig the active duty cycle |
---|
0:09:14 | where |
---|
0:09:15 | the door |
---|
0:09:16 | uh interval |
---|
0:09:18 | and finally the development dataset |
---|
0:09:20 | and the same for evaluation |
---|
0:09:22 | has uh |
---|
0:09:23 | one thousand eight hundred segments |
---|
0:09:26 | yeah |
---|
0:09:27 | summing up the three durations so there were |
---|
0:09:31 | six hundred segments per duration |
---|
0:09:33 | and for each duration there were one hundred twenty segments per target language |
---|
0:09:38 | and |
---|
0:09:39 | another |
---|
0:09:40 | one hundred twenty segments of out of set languages |
---|
0:09:43 | it's uh |
---|
0:09:45 | i have to say that |
---|
0:09:46 | this uh |
---|
0:09:48 | it means that |
---|
0:09:49 | yeah |
---|
0:09:50 | the where exactly it |
---|
0:09:52 | oh well too |
---|
0:09:53 | uh yeah |
---|
0:09:56 | twenty percent |
---|
0:09:57 | of um |
---|
0:09:59 | uh segments were |
---|
0:10:00 | from out of set |
---|
0:10:02 | languages |
---|
0:10:02 | in both the development and evaluation datasets |
---|
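The segment counts quoted in this part of the talk can be cross-checked with a little arithmetic. The sketch below assumes the figures are read as 120 segments per target language plus 120 out-of-set segments for each of the three nominal durations; the variable names are made up for illustration.

```python
# Figures as read from the talk (assumed): four target languages,
# three nominal durations, 120 segments per target language and
# 120 out-of-set segments for each duration.
TARGET_LANGUAGES = ["spanish", "catalan", "basque", "galician"]
DURATIONS_SECONDS = [30, 10, 3]
SEGMENTS_PER_TARGET = 120
OUT_OF_SET_SEGMENTS = 120

segments_per_duration = (len(TARGET_LANGUAGES) * SEGMENTS_PER_TARGET
                         + OUT_OF_SET_SEGMENTS)
total_segments = segments_per_duration * len(DURATIONS_SECONDS)
out_of_set_share = OUT_OF_SET_SEGMENTS / segments_per_duration

print(segments_per_duration)  # segments for one duration
print(total_segments)         # segments in one dataset (development or evaluation)
print(out_of_set_share)       # fraction of out-of-set segments
```

Under these assumptions the totals come out as 600 segments per duration, 1800 per dataset, and an out-of-set share of one fifth, which matches the twenty percent quoted by the speaker.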
0:10:06 | which might |
---|
0:10:07 | exactly |
---|
0:10:08 | was what |
---|
0:10:09 | what was defined |
---|
0:10:10 | in the |
---|
0:10:11 | in the right |
---|
0:10:13 | thing |
---|
0:10:15 | um |
---|
0:10:17 | well |
---|
0:10:17 | regarding database design |
---|
0:10:19 | the proportions of unknown languages |
---|
0:10:21 | where mate |
---|
0:10:22 | they rely |
---|
0:10:23 | but difficult for me to promote it |
---|
0:10:25 | different for development and evaluation |
---|
0:10:28 | to avoid uh |
---|
0:10:30 | tuning systems to reject specific languages |
---|
0:10:33 | so uh |
---|
0:10:34 | here you have the table of uh |
---|
0:10:36 | the distribution of segments |
---|
0:10:38 | for development and evaluation |
---|
0:10:40 | you can see that |
---|
0:10:41 | there were |
---|
0:10:42 | seventy segments for french |
---|
0:10:43 | ten for portuguese and |
---|
0:10:45 | forty for english |
---|
0:10:46 | and none for german in the development set |
---|
0:10:50 | and in the evaluation set |
---|
0:10:52 | the proportions were changed |
---|
0:10:54 | between for example portuguese english and german |
---|
0:10:57 | so |
---|
0:10:58 | was |
---|
0:10:59 | may |
---|
0:11:00 | this way |
---|
0:11:02 | uh |
---|
0:11:03 | evaluation do simply |
---|
0:11:05 | there was an evaluation plan very similar to |
---|
0:11:08 | that of nist |
---|
0:11:10 | uh there were four test conditions open set free open set restricted closed set free closed set |
---|
0:11:16 | restricted |
---|
0:11:17 | and three durations |
---|
0:11:19 | to attract |
---|
0:11:20 | five |
---|
0:11:21 | for it |
---|
0:11:22 | for each condition uh teams could present just one single primary system |
---|
0:11:27 | and any number of contrastive or alternative systems |
---|
0:11:31 | they wanted to present |
---|
0:11:32 | uh the results should be |
---|
0:11:34 | submitted by by teams |
---|
0:11:37 | in the nist evaluation format |
---|
0:11:40 | a text file with one trial per |
---|
0:11:43 | line |
---|
0:11:45 | um participants were also committed to specify |
---|
0:11:49 | whether or not the scores may be interpreted |
---|
0:11:52 | as |
---|
0:11:52 | log likelihood |
---|
0:11:53 | uh log |
---|
0:11:53 | likelihood ratios |
---|
0:11:55 | or not |
---|
0:11:56 | and also to send a system description and to participate um |
---|
0:12:01 | in the albayzin two thousand eight |
---|
0:12:03 | evaluation workshop |
---|
0:12:06 | okay |
---|
0:12:07 | systems were ranked |
---|
0:12:08 | uh in fact according to their average cost |
---|
0:12:12 | and are defined that way |
---|
0:12:14 | in this |
---|
0:12:15 | fancy |
---|
0:12:16 | and |
---|
0:12:17 | though |
---|
0:12:18 | was run it |
---|
0:12:19 | and there was an award for the best system |
---|
0:12:23 | the system obtaining |
---|
0:12:25 | the least |
---|
0:12:25 | average cost in the |
---|
0:12:27 | uh |
---|
0:12:28 | mandatory condition |
---|
0:12:30 | closed set restricted |
---|
0:12:31 | on |
---|
0:12:32 | the subset of |
---|
0:12:34 | uh |
---|
0:12:35 | thirty second segments |
---|
0:12:38 | well |
---|
0:12:38 | this was the |
---|
0:12:40 | schedule of the evaluation |
---|
0:12:42 | uh |
---|
0:12:42 | in |
---|
0:12:44 | few words |
---|
0:12:45 | there were |
---|
0:12:46 | three months for developing the systems |
---|
0:12:48 | and there were three weeks to uh |
---|
0:12:50 | uh process the evaluation data |
---|
0:12:53 | and |
---|
0:12:54 | i have to say that uh the database produced the database was produced |
---|
0:12:58 | in |
---|
0:12:59 | three months |
---|
0:12:59 | from april to june |
---|
0:13:01 | depends on a |
---|
0:13:02 | and we also recorded some more um |
---|
0:13:06 | data in september |
---|
0:13:08 | to find something to |
---|
0:13:08 | two |
---|
0:13:09 | uh |
---|
0:13:11 | and uh complete |
---|
0:13:12 | the evaluation on the test |
---|
0:13:14 | okay |
---|
0:13:16 | that that you can find it in the paper |
---|
0:13:19 | well |
---|
0:13:20 | uh |
---|
0:13:21 | now i begin to describe the results |
---|
0:13:24 | yeah |
---|
0:13:25 | there were four participants |
---|
0:13:27 | participating teams |
---|
0:13:29 | presenting systems |
---|
0:13:31 | teams were from spain and portugal |
---|
0:13:34 | and uh there were two teams presenting state of the art systems more or less |
---|
0:13:39 | the two first the first ones |
---|
0:13:41 | T one T two |
---|
0:13:43 | and the other two presented systems not |
---|
0:13:46 | specifically designed for uh |
---|
0:13:48 | language recognition applications so |
---|
0:13:51 | the the source world |
---|
0:13:52 | just the table of uh the average cost |
---|
0:13:55 | for |
---|
0:13:56 | uh thirty second segments |
---|
0:13:58 | you can |
---|
0:13:59 | see that |
---|
0:14:00 | their |
---|
0:14:00 | performance was |
---|
0:14:02 | very bad |
---|
0:14:04 | so uh in the following i will only |
---|
0:14:08 | because |
---|
0:14:08 | talk about |
---|
0:14:09 | results of these two |
---|
0:14:11 | teams |
---|
0:14:13 | okay |
---|
0:14:14 | well |
---|
0:14:15 | now i show some results uh first the |
---|
0:14:18 | condition i |
---|
0:14:20 | talk about |
---|
0:14:21 | is |
---|
0:14:21 | the the mandatory one |
---|
0:14:23 | for which all the teams had to |
---|
0:14:26 | present a system |
---|
0:14:28 | you can see here |
---|
0:14:29 | cool |
---|
0:14:30 | that's good |
---|
0:14:31 | um |
---|
0:14:33 | this uh |
---|
0:14:34 | like what this |
---|
0:14:35 | one is uh |
---|
0:14:37 | for a contrastive system coming from T one which |
---|
0:14:40 | in fact |
---|
0:14:41 | uh got the best result |
---|
0:14:43 | the best average cost |
---|
0:14:45 | but the best primary system was also |
---|
0:14:47 | from T one |
---|
0:14:49 | uh |
---|
0:14:50 | they have the |
---|
0:14:51 | then um |
---|
0:14:54 | okay |
---|
0:14:55 | yeah |
---|
0:14:56 | when |
---|
0:14:57 | when channel |
---|
0:14:57 | this was i |
---|
0:14:58 | okay |
---|
0:14:59 | to say that this was this was in restricted conditions |
---|
0:15:02 | in these systems you can see a |
---|
0:15:04 | big difference between |
---|
0:15:06 | team two and |
---|
0:15:08 | team one |
---|
0:15:09 | uh because uh |
---|
0:15:11 | they were only allowed to uh develop their systems |
---|
0:15:14 | using |
---|
0:15:15 | just the data provided in this evaluation |
---|
0:15:18 | not using |
---|
0:15:19 | any other resources |
---|
0:15:20 | they rely on all the data and all the like |
---|
0:15:23 | okay |
---|
0:15:24 | so when changing to the free condition |
---|
0:15:28 | uh |
---|
0:15:29 | with |
---|
0:15:29 | see the systems uh got |
---|
0:15:32 | uh |
---|
0:15:32 | much better |
---|
0:15:33 | performance |
---|
0:15:34 | around |
---|
0:15:35 | five percent equal error rate |
---|
0:15:37 | but the |
---|
0:15:39 | in fact we were surprised |
---|
0:15:41 | by this result because we expected |
---|
0:15:43 | much better results |
---|
0:15:45 | around one percent or less |
---|
0:15:47 | yeah |
---|
0:15:49 | and uh we uh |
---|
0:15:51 | made some experiments afterwards |
---|
0:15:53 | the the the one mission which |
---|
0:15:55 | on system |
---|
0:15:57 | a system that got on there |
---|
0:16:00 | fig |
---|
0:16:01 | he percent or whatever right |
---|
0:16:02 | in the general language recognition task defined in the nist |
---|
0:16:05 | two thousand seven |
---|
0:16:07 | evaluation |
---|
0:16:08 | and we got |
---|
0:16:09 | five |
---|
0:16:10 | yeah |
---|
0:16:11 | forty five percent whatever right |
---|
0:16:13 | so |
---|
0:16:14 | five |
---|
0:16:15 | it seems that uh this task |
---|
0:16:17 | mm |
---|
0:16:18 | the task defined for |
---|
0:16:19 | for the albayzin evaluation |
---|
0:16:22 | is |
---|
0:16:23 | uh more difficult than |
---|
0:16:24 | expected |
---|
0:16:25 | okay |
---|
0:16:27 | there are some possible issues are the results comparable |
---|
0:16:33 | between the nist |
---|
0:16:34 | evaluation and this evaluation |
---|
0:16:36 | maybe not |
---|
0:16:38 | the statistical significance |
---|
0:16:40 | there are not many uh trials |
---|
0:16:42 | only |
---|
0:16:43 | yeah six hundred |
---|
0:16:44 | uh |
---|
0:16:46 | per duration |
---|
0:16:47 | okay |
---|
0:16:48 | and there are also some possible explanations maybe the acoustic variability speakers channel |
---|
0:16:53 | background noise |
---|
0:16:55 | there were different conditions |
---|
0:16:56 | and also the phonetic and lexical |
---|
0:16:59 | but for these |
---|
0:17:01 | uh |
---|
0:17:01 | the phonetic and lexical similarity among target languages |
---|
0:17:05 | or more than one |
---|
0:17:07 | the same country |
---|
0:17:09 | oh no |
---|
0:17:10 | many years many centuries |
---|
0:17:11 | what we |
---|
0:17:12 | don't leave so |
---|
0:17:14 | maybe this is the |
---|
0:17:15 | the reason |
---|
0:17:16 | in any case |
---|
0:17:18 | as i have said the task seems uh challenging enough |
---|
0:17:22 | to support further research |
---|
0:17:24 | in |
---|
0:17:25 | language recognition technology |
---|
0:17:27 | well |
---|
0:17:27 | yeah this is |
---|
0:17:29 | we have been talking about the closed |
---|
0:17:32 | set |
---|
0:17:32 | condition now i'm talking about |
---|
0:17:35 | the open set condition |
---|
0:17:36 | the best performance in this case was |
---|
0:17:38 | worse |
---|
0:17:39 | like for |
---|
0:17:41 | yeah because there are |
---|
0:17:43 | uh unknown languages |
---|
0:17:44 | in the |
---|
0:17:44 | the trials |
---|
0:17:46 | and the best system got around nine percent equal error rate |
---|
0:17:51 | in this case |
---|
0:17:52 | which is almost |
---|
0:17:53 | two times the error rate in the |
---|
0:17:55 | closed set condition |
---|
0:17:57 | so that |
---|
0:17:57 | three conditions |
---|
0:17:59 | yeah well |
---|
0:18:00 | our conclusion is that some unknown languages are being confused with target languages maybe portuguese or french |
---|
0:18:06 | we don't know |
---|
0:18:09 | well |
---|
0:18:09 | yeah you have uh |
---|
0:18:11 | there was these results |
---|
0:18:12 | uh the second rate for languages for target languages |
---|
0:18:15 | uh uh for the best system |
---|
0:18:18 | so here you can see for the closed |
---|
0:18:21 | set condition and for the open set condition |
---|
0:18:24 | and |
---|
0:18:25 | the green one is for basque |
---|
0:18:28 | which got the best |
---|
0:18:29 | uh performance |
---|
0:18:31 | and then uh |
---|
0:18:32 | the red one is for catalan |
---|
0:18:34 | which got the |
---|
0:18:35 | worst performance |
---|
0:18:36 | in the open set condition |
---|
0:18:38 | and you can see the uh that basque |
---|
0:18:42 | the change in the performance |
---|
0:18:43 | for basque |
---|
0:18:44 | really |
---|
0:18:45 | it's more |
---|
0:18:47 | also forced by means which is the |
---|
0:18:50 | i think |
---|
0:18:51 | but this is the kid a blue |
---|
0:18:53 | and the power point it's not easy and which also |
---|
0:18:56 | uh worsened |
---|
0:18:57 | yeah |
---|
0:18:57 | its performance but not as |
---|
0:18:59 | much as uh for catalan |
---|
0:19:02 | so we have uh analysed this in more detail |
---|
0:19:06 | with |
---|
0:19:06 | this table |
---|
0:19:07 | i have to say that |
---|
0:19:09 | there is an uh error in the paper |
---|
0:19:12 | uh these numbers are |
---|
0:19:13 | five |
---|
0:19:14 | the uh error rates |
---|
0:19:16 | uh |
---|
0:19:17 | you need some before somehow |
---|
0:19:19 | the miss errors |
---|
0:19:21 | in the |
---|
0:19:22 | diagonal and |
---|
0:19:23 | the false alarm errors outside the diagonal |
---|
0:19:25 | um |
---|
0:19:27 | uh we mistook them as costs |
---|
0:19:29 | so yes this |
---|
0:19:30 | the the conclusions are the same but |
---|
0:19:32 | the |
---|
0:19:34 | the numbers are not |
---|
0:19:35 | what the paper says |
---|
0:19:37 | they are |
---|
0:19:37 | okay |
---|
0:19:38 | ah as you can |
---|
0:19:40 | there is a grey level code here |
---|
0:19:43 | the white meaning |
---|
0:19:44 | zero error and black meaning |
---|
0:19:47 | one |
---|
0:19:48 | the maximum possible error |
---|
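The kind of table described here, with miss rates on the diagonal and false alarm rates off the diagonal, can be built from hard decisions as in the sketch below. This is not the authors' scoring tool. The trial format (segment language plus the set of target languages the system accepted) and the function name are assumptions made for illustration, and the example trials are invented.

```python
from collections import defaultdict


def error_matrix(trials, targets):
    """Return {(target, segment_language): error_rate}.

    trials: iterable of (segment_language, accepted_targets) pairs, where
    accepted_targets is the set of target languages the system said "yes" to.
    For target == segment_language the rate is a miss rate (diagonal);
    otherwise it is a false alarm rate (off the diagonal).
    """
    n_segments = defaultdict(int)
    n_errors = defaultdict(int)
    for segment_language, accepted in trials:
        n_segments[segment_language] += 1
        for target in targets:
            if target == segment_language:
                is_error = target not in accepted   # miss
            else:
                is_error = target in accepted       # false alarm
            if is_error:
                n_errors[(target, segment_language)] += 1
    return {(t, s): n_errors[(t, s)] / n_segments[s]
            for t in targets for s in n_segments}


# Invented toy trials: "unknown" stands for an out-of-set language.
trials = [
    ("basque", {"basque"}),
    ("basque", set()),
    ("catalan", {"catalan"}),
    ("unknown", {"catalan"}),
    ("unknown", set()),
]
matrix = error_matrix(trials, ["basque", "catalan"])
for key in sorted(matrix):
    print(key, matrix[key])
```

A value of 0 would be drawn white and a value of 1 black in the grey level coding the speaker describes; a dark cell in the unknown column of a target's row means out-of-set segments are being accepted as that target.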
0:19:51 | uh yeah |
---|
0:19:51 | so this is for the closed set condition |
---|
0:19:54 | and when changing to the open set |
---|
0:19:57 | you can see here |
---|
0:19:59 | really |
---|
0:20:00 | that for catalan |
---|
0:20:01 | unknown languages |
---|
0:20:03 | tyler |
---|
0:20:04 | you find that |
---|
0:20:05 | um |
---|
0:20:06 | many uh |
---|
0:20:07 | trials |
---|
0:20:08 | corresponding to unknown languages were |
---|
0:20:10 | confused with catalan |
---|
0:20:12 | that's |
---|
0:20:12 | the origin of |
---|
0:20:14 | that uh change in the cost |
---|
0:20:17 | for the open set condition |
---|
0:20:19 | okay |
---|
0:20:20 | uh |
---|
0:20:21 | these results i'm not going to comment |
---|
0:20:23 | because it's the same as for nist |
---|
0:20:26 | the the performance |
---|
0:20:27 | uh worsens as |
---|
0:20:29 | the |
---|
0:20:30 | double of the land |
---|
0:20:31 | uh is |
---|
0:20:32 | less |
---|
0:20:34 | of the around |
---|
0:20:35 | the segment |
---|
0:20:37 | and uh |
---|
0:20:38 | this is for |
---|
0:20:39 | more interesting for me |
---|
0:20:41 | is |
---|
0:20:41 | because uh |
---|
0:20:43 | you can see |
---|
0:20:44 | what happens when you restrict |
---|
0:20:46 | the the development conditions |
---|
0:20:49 | uh you have here um |
---|
0:20:51 | yeah two different teams |
---|
0:20:53 | the blue one |
---|
0:20:54 | is team one |
---|
0:20:55 | the red one is team two |
---|
0:20:57 | and |
---|
0:20:58 | for the uh free |
---|
0:21:00 | condition |
---|
0:21:01 | well they uh got more or less the same |
---|
0:21:04 | performance |
---|
0:21:06 | this |
---|
0:21:06 | to go |
---|
0:21:08 | but for the restricted condition when |
---|
0:21:10 | restricting the materials |
---|
0:21:12 | they could use |
---|
0:21:13 | to develop their systems |
---|
0:21:16 | what uh the T one |
---|
0:21:18 | okay |
---|
0:21:19 | the the performance |
---|
0:21:21 | quite close to the |
---|
0:21:22 | to the other condition whereas for the other one |
---|
0:21:25 | uh |
---|
0:21:26 | the performance was much |
---|
0:21:28 | much worse |
---|
0:21:29 | the difference |
---|
0:21:30 | is |
---|
0:21:32 | uh |
---|
0:21:33 | forty percent worse |
---|
0:21:34 | average cost |
---|
0:21:35 | versus uh |
---|
0:21:36 | four hundred percent |
---|
0:21:38 | worse average cost |
---|
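The comparison being made here is a relative increase in average cost when moving from the free to the restricted development condition. A minimal sketch of that calculation follows; the cost values are hypothetical, picked only so that the two results land near the forty and four hundred percent figures mentioned, and the function name is made up.

```python
def relative_increase(cost_free, cost_restricted):
    """Percentage by which the restricted-condition cost exceeds the free-condition cost."""
    return 100.0 * (cost_restricted - cost_free) / cost_free


# Hypothetical average costs, not taken from the evaluation results.
robust_system = relative_increase(cost_free=0.05, cost_restricted=0.07)
sensitive_system = relative_increase(cost_free=0.05, cost_restricted=0.25)

print(round(robust_system), round(sensitive_system))
```

Two systems with the same cost in the free condition can therefore look very different once training material is restricted, which is the point the speaker makes about robustness.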
0:21:40 | these |
---|
0:21:41 | okay |
---|
0:21:41 | so |
---|
0:21:42 | i think this is important because |
---|
0:21:44 | you can |
---|
0:21:45 | i for me |
---|
0:21:46 | this system is much more robust |
---|
0:21:48 | than the other one |
---|
0:21:50 | because it |
---|
0:21:51 | not |
---|
0:21:52 | does |
---|
0:21:52 | does not depend |
---|
0:21:54 | on |
---|
0:21:55 | so much on the materials provided to to train it |
---|
0:21:59 | okay |
---|
0:22:00 | well |
---|
0:22:01 | conclusions |
---|
0:22:04 | um |
---|
0:22:05 | well |
---|
0:22:06 | we have presented uh an evaluation involving |
---|
0:22:10 | the official languages in spain |
---|
0:22:12 | basque catalan galician and spanish |
---|
0:22:14 | using uh material recorded from T V |
---|
0:22:19 | broadcasts |
---|
0:22:20 | systems uh applying |
---|
0:22:22 | state of the art technology got around five percent equal error rate |
---|
0:22:25 | in the closed set |
---|
0:22:26 | free development condition |
---|
0:22:28 | just |
---|
0:22:28 | what's that |
---|
0:22:29 | the rest for for them |
---|
0:22:32 | and |
---|
0:22:33 | we think |
---|
0:22:34 | that |
---|
0:22:34 | the task |
---|
0:22:35 | defined |
---|
0:22:36 | in this uh evaluation |
---|
0:22:38 | may support further developments in language recognition technology |
---|
0:22:42 | and uh we also found |
---|
0:22:44 | that there is a sensitivity to the development restrictions |
---|
0:22:48 | depending depending |
---|
0:22:50 | on the system |
---|
0:22:52 | uh |
---|
0:22:53 | for two different systems |
---|
0:22:54 | the |
---|
0:22:56 | uh |
---|
0:22:57 | increase in cost |
---|
0:22:58 | was different |
---|
0:23:00 | for them |
---|
0:23:01 | my thing to be |
---|
0:23:03 | this |
---|
0:23:03 | uh |
---|
0:23:05 | condition i don't know if |
---|
0:23:06 | you are interested in restricting |
---|
0:23:08 | the materials |
---|
0:23:10 | but i think |
---|
0:23:11 | it could be interesting for me assimilation |
---|
0:23:14 | maybe i don't know |
---|
0:23:15 | you are interested but |
---|
0:23:17 | um |
---|
0:23:19 | and finally we found not the same performance among target languages |
---|
0:23:23 | the best performance was found for basque |
---|
0:23:25 | and the |
---|
0:23:26 | worst performance was found for catalan |
---|
0:23:29 | speculating about this |
---|
0:23:30 | we can uh say that basque |
---|
0:23:33 | is uh a special language |
---|
0:23:35 | not a romance language |
---|
0:23:37 | uh |
---|
0:23:38 | its origins are different |
---|
0:23:40 | from |
---|
0:23:40 | all the other languages in spain |
---|
0:23:43 | and catalan is a romance language |
---|
0:23:45 | which may be |
---|
0:23:46 | usually confused |
---|
0:23:48 | by the systems |
---|
0:23:50 | with portuguese or |
---|
0:23:51 | maybe french |
---|
0:23:53 | or maybe |
---|
0:23:54 | spanish or galician |
---|
0:23:57 | well |
---|
0:23:58 | and |
---|
0:23:59 | finally i have to say our current work |
---|
0:24:01 | is organising |
---|
0:24:03 | in this in this evaluation |
---|
0:24:04 | we are just now |
---|
0:24:06 | organising |
---|
0:24:07 | the albayzin two thousand ten language recognition evaluation |
---|
0:24:12 | yeah yeah |
---|
0:24:12 | we have recorded anew we have extended |
---|
0:24:15 | the |
---|
0:24:16 | kalaka database |
---|
0:24:17 | which was |
---|
0:24:18 | the one used |
---|
0:24:19 | before |
---|
0:24:20 | to uh define kalaka two |
---|
0:24:23 | we have added portuguese and english as target languages |
---|
0:24:27 | maybe you are interested |
---|
0:24:28 | in these languages |
---|
0:24:29 | we have a new set of unknown languages |
---|
0:24:32 | and have included |
---|
0:24:34 | a new test condition for noisy speech |
---|
0:24:39 | this is the schedule |
---|
0:24:40 | you can register |
---|
0:24:42 | yeah until july |
---|
0:24:43 | fifteen |
---|
0:24:45 | i'm september you have |
---|
0:24:47 | more or less three months |
---|
0:24:49 | if you use them now |
---|
0:24:52 | until september twenty seven |
---|
0:24:54 | to uh develop your systems |
---|
0:24:57 | and two weeks |
---|
0:24:58 | to uh process |
---|
0:25:00 | uh the evaluation data |
---|
0:25:02 | then the key file and results will be released |
---|
0:25:05 | one double |
---|
0:25:07 | fifteen |
---|
0:25:08 | and the workshop |
---|
0:25:10 | yeah |
---|
0:25:11 | language recognition what is and what's not |
---|
0:25:13 | will be held |
---|
0:25:14 | in november in vigo spain |
---|
0:25:18 | in a contest of how to do something that is uh |
---|
0:25:22 | what's up uh |
---|
0:25:23 | spain |
---|
0:25:25 | okay |
---|
0:25:26 | uh you can participate in this |
---|
0:25:28 | uh |
---|
0:25:30 | well |
---|
0:25:31 | and if you are interested please |
---|
0:25:33 | participate |
---|
0:25:35 | that's all |
---|
0:25:43 | they should |
---|
0:25:48 | you mentioned at the beginning |
---|
0:25:50 | when you collect the database |
---|
0:25:52 | you might sure |
---|
0:25:53 | uh |
---|
0:25:54 | no repeated speakers |
---|
0:25:56 | no no i'm not sure |
---|
0:25:57 | i |
---|
0:25:58 | try to uh |
---|
0:25:59 | distribute programs |
---|
0:26:01 | in different sets for instance one broadcast one T V show |
---|
0:26:04 | was only for evaluation |
---|
0:26:06 | another T V show was only for development and |
---|
0:26:10 | you |
---|
0:26:11 | for instance yeah |
---|
0:26:12 | this I T V show called colour |
---|
0:26:15 | in basque |
---|
0:26:16 | this T V show was |
---|
0:26:17 | i don't think it all for training |
---|
0:26:20 | but not for development not |
---|
0:26:22 | i'm not sure |
---|
0:26:23 | that if they are not |
---|
0:26:24 | the same speakers |
---|
0:26:26 | i like |
---|
0:26:27 | i i |
---|
0:26:27 | we tried |
---|
0:26:28 | to |
---|
0:26:29 | to um |
---|
0:26:31 | to manage |
---|
0:26:31 | the |
---|
0:26:33 | just understood |
---|
0:26:33 | oh |
---|
0:26:34 | i'm just speculating that |
---|
0:26:36 | uh |
---|
0:26:37 | the speakers are also like |
---|
0:26:39 | what |
---|
0:26:39 | well developed |
---|
0:26:40 | see |
---|
0:26:40 | oh |
---|
0:26:41 | if |
---|
0:26:42 | there's a lot |
---|
0:26:44 | uh_huh |
---|
0:26:45 | repeated speaker in |
---|
0:26:47 | elements |
---|
0:26:48 | no |
---|
0:26:49 | uh |
---|
0:26:50 | you could learn to recognise |
---|
0:26:51 | speaker |
---|
0:26:51 | right |
---|
0:26:52 | language |
---|
0:26:53 | no because uh we try to put on so and |
---|
0:26:57 | and |
---|
0:26:58 | many speakers in in |
---|
0:27:00 | these are not programs like um |
---|
0:27:03 | um |
---|
0:27:04 | broadcast news |
---|
0:27:05 | where there is only one or two speakers speaking all the time |
---|
0:27:08 | or more |
---|
0:27:09 | much yeah |
---|
0:27:10 | much |
---|
0:27:11 | time |
---|
0:27:12 | we try to select various |
---|
0:27:13 | T V shows |
---|
0:27:14 | different T V shows |
---|
0:27:16 | uh essentially debates |
---|
0:27:18 | and talk shows where many people speak |
---|
0:27:21 | and there are also interviews so |
---|
0:27:24 | so |
---|
0:27:25 | maybe it is true what you're telling |
---|
0:27:27 | you are suggesting |
---|
0:27:29 | but |
---|
0:27:30 | i don't think so we |
---|
0:27:32 | i don't know |
---|
0:27:39 | question uh |
---|
0:27:40 | with regard to the data format so you recorded wideband speech |
---|
0:27:45 | and uh |
---|
0:27:46 | you also find that so uh |
---|
0:27:48 | it was a little harder task |
---|
0:27:50 | yeah |
---|
0:27:51 | expect |
---|
0:27:52 | speech |
---|
0:27:52 | in this |
---|
0:27:54 | nation |
---|
0:27:55 | no |
---|
0:27:56 | so you would have more information like |
---|
0:27:59 | speech |
---|
0:28:00 | but |
---|
0:28:01 | um |
---|
0:28:02 | it might also that might be the fact |
---|
0:28:04 | right |
---|
0:28:05 | we stick unique |
---|
0:28:06 | what effect it might be the fact that |
---|
0:28:08 | most systems have been developed for telephone |
---|
0:28:11 | oh |
---|
0:28:12 | speech |
---|
0:28:13 | what would you be in there |
---|
0:28:16 | well |
---|
0:28:16 | i i'm not happy with that obviously |
---|
0:28:18 | because we are developing technology for |
---|
0:28:21 | for T V |
---|
0:28:22 | for for T V signals uh recorded |
---|
0:28:25 | in wideband |
---|
0:28:26 | wideband conditions |
---|
0:28:28 | so uh i i understand the reasons to organise |
---|
0:28:32 | the nist evaluations because |
---|
0:28:34 | was the sponsor or what |
---|
0:28:35 | the sponsor wants |
---|
0:28:36 | to |
---|
0:28:38 | by financing the the |
---|
0:28:40 | the evaluations |
---|
0:28:41 | but |
---|
0:28:41 | from the point of view of the of the research uh community i think |
---|
0:28:45 | we should uh |
---|
0:28:48 | try to |
---|
0:28:49 | to organise another kind of evaluations evaluations more devoted to technology |
---|
0:28:54 | the bottom ends |
---|
0:28:55 | and less to the application i |
---|
0:28:57 | it's my opinion but |
---|
0:28:59 | uh |
---|
0:29:00 | we have to decide |
---|
0:29:01 | we have to maybe to find |
---|
0:29:03 | sponsors and |
---|
0:29:04 | i don't know |
---|
0:29:06 | if |
---|
0:29:06 | uh |
---|
0:29:06 | that's |
---|
0:29:08 | possible or not |
---|
0:29:16 | understood |
---|
0:29:17 | and |
---|
0:29:17 | i just |
---|
0:29:18 | session because the men should discussion going |
---|
0:29:22 | so |
---|
0:29:22 | thank you |
---|
0:29:23 | oh okay |
---|