0:00:06 | yeah |
---|
0:00:08 | i |
---|
0:00:09 | well |
---|
0:00:10 | describe the |
---|
0:00:12 | two thousand and nine |
---|
0:00:14 | NIST |
---|
0:00:15 | language |
---|
0:00:16 | recognition evaluation |
---|
0:00:18 | LRE09 |
---|
0:00:20 | um |
---|
0:00:21 | this |
---|
0:00:23 | uh |
---|
0:00:24 | NIST coordinated this evaluation |
---|
0:00:27 | under |
---|
0:00:28 | U S |
---|
0:00:28 | government sponsorship and |
---|
0:00:30 | this work was largely done with |
---|
0:00:32 | Craig Greenberg, my colleague |
---|
0:00:34 | in the NIST |
---|
0:00:35 | multimodal |
---|
0:00:36 | information group |
---|
0:00:42 | so the two thousand nine |
---|
0:00:44 | evaluation |
---|
0:00:46 | was the fifth in the |
---|
0:00:48 | series of |
---|
0:00:50 | NIST-coordinated LREs; the first was in ninety six |
---|
0:00:54 | and everything yeah |
---|
0:00:56 | and there were evaluations in two thousand |
---|
0:00:59 | three and two thousand |
---|
0:01:00 | five |
---|
0:01:01 | two thousand seven |
---|
0:01:03 | two thousand nine |
---|
0:01:05 | uh oh |
---|
0:01:05 | one might suspect that there |
---|
0:01:07 | could be another evaluation |
---|
0:01:09 | in twenty eleven |
---|
0:01:10 | um |
---|
0:01:12 | turning to two thousand nine |
---|
0:01:15 | key changes |
---|
0:01:16 | uh were in the |
---|
0:01:18 | nature of the data |
---|
0:01:19 | say more about that |
---|
0:01:21 | the treatment of |
---|
0:01:22 | dialects and |
---|
0:01:23 | mutually intelligible languages |
---|
0:01:26 | and in the |
---|
0:01:27 | set of evaluation test conditions |
---|
0:01:29 | we will get to those |
---|
0:01:30 | um the data the |
---|
0:01:33 | for LRE09, as |
---|
0:01:35 | indicated there, there were |
---|
0:01:36 | eighteen total |
---|
0:01:38 | participating sites |
---|
0:01:42 | um |
---|
0:01:43 | the prior |
---|
0:01:44 | nist evaluations |
---|
0:01:47 | used conversational telephone speech |
---|
0:01:50 | this involved |
---|
0:01:51 | paying subjects |
---|
0:01:52 | yeah |
---|
0:01:53 | they call it |
---|
0:01:54 | nature |
---|
0:01:55 | for language recognition you just want them to make a single call |
---|
0:01:58 | in their native language |
---|
0:02:00 | ah |
---|
0:02:01 | in the U.S., preferably under controlled channel conditions |
---|
0:02:04 | this |
---|
0:02:06 | paradigm is becoming expensive and impractical |
---|
0:02:09 | it's hard to pay people to make a single call these days |
---|
0:02:12 | talk to me |
---|
0:02:14 | um |
---|
0:02:14 | um |
---|
0:02:15 | helpful |
---|
0:02:17 | um |
---|
0:02:18 | access |
---|
0:02:19 | is easy |
---|
0:02:21 | so LRE09 |
---|
0:02:22 | attempted to use primarily |
---|
0:02:24 | found data |
---|
0:02:26 | in this case |
---|
0:02:27 | found |
---|
0:02:28 | uh |
---|
0:02:29 | from Voice of America |
---|
0:02:31 | right yes |
---|
0:02:33 | data |
---|
0:02:34 | um |
---|
0:02:35 | this was |
---|
0:02:36 | sampled by the |
---|
0:02:37 | Linguistic |
---|
0:02:38 | uh Data Consortium, actually |
---|
0:02:40 | found data from |
---|
0:02:42 | three different years of |
---|
0:02:44 | um |
---|
0:02:46 | voice american to where you started about data |
---|
0:02:49 | the LDC has |
---|
0:02:51 | at other conferences separately reported on this |
---|
0:02:53 | data collection |
---|
0:02:54 | a feasibility study of using this data for LRE |
---|
0:02:59 | was done beforehand, |
---|
0:03:01 | done by |
---|
0:03:02 | researchers |
---|
0:03:04 | uh |
---|
0:03:04 | here at the Brno University of |
---|
0:03:07 | Technology |
---|
0:03:08 | uh that was a key part |
---|
0:03:10 | of selecting the data for this evaluation |
---|
0:03:13 | um |
---|
0:03:13 | the selected |
---|
0:03:14 | segments that were |
---|
0:03:15 | actually |
---|
0:03:16 | used for testing, and also those for |
---|
0:03:18 | development work |
---|
0:03:20 | uh segments |
---|
0:03:21 | uh |
---|
0:03:22 | were determined by wanting them to involve narrowband |
---|
0:03:24 | speech |
---|
0:03:25 | and we want to get as many different speakers as possible |
---|
0:03:28 | um |
---|
0:03:29 | the evaluation also used |
---|
0:03:30 | CTS data that had been collected previously but for various reasons |
---|
0:03:34 | had not been used in the uh |
---|
0:03:36 | in the prior evaluations |
---|
0:03:41 | so |
---|
0:03:43 | um |
---|
0:03:45 | here is our list of target languages for this evaluation; an advantage of using found data is that |
---|
0:03:51 | we can have |
---|
0:03:52 | um |
---|
0:03:54 | more |
---|
0:03:55 | target languages |
---|
0:03:56 | indeed we had that |
---|
0:03:59 | twenty three |
---|
0:04:00 | in this case |
---|
0:04:01 | ah |
---|
0:04:02 | in some cases we just listed as languages what would previously have been treated as |
---|
0:04:07 | dialects |
---|
0:04:08 | um |
---|
0:04:09 | American English and Indian English, or |
---|
0:04:13 | also uh |
---|
0:04:14 | Hindi |
---|
0:04:16 | and Urdu; so this time we just treated each of these as |
---|
0:04:20 | a single target language; we will talk about the language pair condition |
---|
0:04:25 | um |
---|
0:04:27 | and indeed |
---|
0:04:29 | we specified eight |
---|
0:04:32 | um |
---|
0:04:33 | language pairs as being of |
---|
0:04:34 | particular interest |
---|
0:04:36 | um |
---|
0:04:37 | they are either languages that are |
---|
0:04:39 | similar, or |
---|
0:04:40 | English dialects, uh |
---|
0:04:43 | Hindi |
---|
0:04:44 | Urdu may be viewed as a dialect situation |
---|
0:04:47 | other languages are |
---|
0:04:49 | in many cases mutually intelligible: Bosnian, Croatian |
---|
0:04:53 | Creole |
---|
0:04:54 | Haitian and French are of interest |
---|
0:04:57 | uh |
---|
0:04:57 | we included such pairs |
---|
0:04:59 | as Cantonese-Mandarin, Spanish- |
---|
0:05:01 | Portuguese |
---|
0:05:02 | so we specify these as |
---|
0:05:04 | these eight as being the |
---|
0:05:06 | of particular interest for those who |
---|
0:05:09 | wanted to investigate um |
---|
0:05:13 | uh |
---|
0:05:14 | so the |
---|
0:05:16 | evaluation |
---|
0:05:18 | ah consisted of a long |
---|
0:05:20 | series of trials for each of the |
---|
0:05:23 | conditions |
---|
0:05:24 | and as in the past |
---|
0:05:25 | we |
---|
0:05:26 | charlie's |
---|
0:05:27 | test segments |
---|
0:05:29 | are approximately thirty or approximately ten or |
---|
0:05:31 | approximately |
---|
0:05:32 | three seconds of speech |
---|
0:05:36 | uh i for each trial |
---|
0:05:39 | you have |
---|
0:05:40 | a target language hypothesis |
---|
0:05:43 | and |
---|
0:05:44 | an alternative |
---|
0:05:45 | hypothesis |
---|
0:05:48 | and for each |
---|
0:05:49 | ah |
---|
0:05:50 | a trial we require |
---|
0:05:52 | a hard decision |
---|
0:05:54 | and a score |
---|
0:05:56 | yeah we |
---|
0:05:57 | specify three |
---|
0:05:58 | different |
---|
0:05:59 | test conditions this year |
---|
0:06:01 | the closed-set condition; this is the |
---|
0:06:04 | traditional condition; it has |
---|
0:06:06 | been part of all the evaluations and is the required condition |
---|
0:06:09 | and in this, for each |
---|
0:06:11 | language segment |
---|
0:06:14 | you have one of the target languages |
---|
0:06:17 | as the hypothesis |
---|
0:06:18 | each segment is running |
---|
0:06:19 | really |
---|
0:06:20 | target is a part |
---|
0:06:21 | the alternative hypothesis is |
---|
0:06:23 | it's a different target language |
---|
0:06:25 | one of the other twenty two |
---|
0:06:27 | in the open-set condition |
---|
0:06:29 | the alternative hypothesis is |
---|
0:06:32 | not simply that it is |
---|
0:06:33 | one of those twenty two languages; it could also be some other language, an unknown |
---|
0:06:38 | out-of-set language |
---|
0:06:40 | and finally we introduced this year the language pair condition |
---|
0:06:44 | which is designed to look at |
---|
0:06:46 | ah |
---|
0:06:47 | just distinguishing pairs, so that the |
---|
0:06:50 | alternative hypothesis is in all cases a single |
---|
0:06:53 | language; you know, |
---|
0:06:55 | the target language is English, the alternative |
---|
0:06:57 | uh is that |
---|
0:06:59 | it's French |
---|
0:07:00 | um |
---|
0:07:01 | ah so |
---|
0:07:02 | with twenty three target languages there are two hundred fifty three pairs |
---|
0:07:07 | of languages you could look at this way, and |
---|
0:07:10 | systems were invited to do |
---|
0:07:11 | all of them |
---|
0:07:13 | only a couple chose to do so |
---|
0:07:14 | or selected ones in particular the |
---|
0:07:17 | eight I |
---|
0:07:18 | mentioned above |
---|
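The pair count quoted here is just the number of ways to choose 2 of the 23 target languages. A minimal sketch of that count follows; the language labels are placeholders invented for the example, not the evaluation's actual list.

```python
from itertools import combinations
from math import comb

# Placeholder labels (assumption); the evaluation used 23 named target languages.
languages = [f"lang{i:02d}" for i in range(23)]

# Each unordered pair defines one two-way "language pair" test.
pairs = list(combinations(languages, 2))

print(len(pairs))               # 253
print(comb(len(languages), 2))  # 253, the same count in closed form
```

The eight pairs of particular interest would then be a small hand-picked subset of this list.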
0:07:23 | uh this gives you some |
---|
0:07:24 | indication of the |
---|
0:07:27 | um |
---|
0:07:27 | training and |
---|
0:07:28 | test segments |
---|
0:07:29 | that will provide in there |
---|
0:07:31 | there's a |
---|
0:07:33 | source so |
---|
0:07:34 | a green |
---|
0:07:35 | language it |
---|
0:07:37 | it indicating the number of segments and between segments of each duration |
---|
0:07:41 | uh |
---|
0:07:43 | um what was provided was either VOA training |
---|
0:07:48 | ah yes |
---|
0:07:49 | or CTS, that is, all the |
---|
0:07:51 | CTS data from previous evaluations |
---|
0:07:54 | were also available |
---|
0:07:55 | um |
---|
0:07:56 | and for the VOA training we provided |
---|
0:07:58 | you know we provided lots of data and not just limited to these selected segments but |
---|
0:08:03 | oh |
---|
0:08:04 | a corporate move around |
---|
0:08:06 | terabytes |
---|
0:08:07 | uh |
---|
0:08:08 | a drive the route |
---|
0:08:10 | but we're distributed people but |
---|
0:08:12 | for each language we had about two hundred uh |
---|
0:08:15 | from the VOA data |
---|
0:08:16 | segments of each |
---|
0:08:18 | duration separated |
---|
0:08:19 | but that S yeah |
---|
0:08:21 | we had open |
---|
0:08:22 | three or four hundred |
---|
0:08:23 | alright |
---|
0:08:25 | quite depending on availability |
---|
0:08:26 | and |
---|
0:08:27 | we we had |
---|
0:08:28 | training in languages which |
---|
0:08:32 | i i'm a |
---|
0:08:34 | and that we had with training data |
---|
0:08:36 | are all the languages for which i was not |
---|
0:08:38 | ah |
---|
0:08:40 | previous cts data in many languages that relevant data with cts but the new |
---|
0:08:45 | you data could be the other way |
---|
0:08:47 | um |
---|
0:08:48 | so |
---|
0:08:49 | that's |
---|
0:08:52 | as for numbers, there were eighteen sites |
---|
0:08:54 | they're listed here; many of them are |
---|
0:08:56 | represented in |
---|
0:08:57 | this room |
---|
0:09:00 | the evaluation metric was the |
---|
0:09:02 | traditional metric |
---|
0:09:04 | we have used; it |
---|
0:09:05 | is essentially something like a |
---|
0:09:07 | total error rate |
---|
0:09:09 | we |
---|
0:09:10 | equally weight |
---|
0:09:11 | the cost of a miss and the cost of a false alarm, |
---|
0:09:15 | take an average of the miss rate and the false alarm rate, but we |
---|
0:09:18 | average that over all possible |
---|
0:09:21 | oh |
---|
0:09:21 | uh target languages all possible alternative languages |
---|
0:09:25 | and they ended |
---|
0:09:26 | uh |
---|
0:09:27 | computed this way |
---|
0:09:28 | there's also a weighting indicated for the open-set condition |
---|
0:09:31 | of how we weight |
---|
0:09:33 | the out-of-set alternative relative to the |
---|
0:09:35 | uh |
---|
0:09:36 | actual target languages |
---|
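The averaging described here can be sketched roughly as follows. This is an illustration only: the function name, the dictionary layout, and the default priors (`p_target`, `p_oos`) are assumptions made for the example and are not quoted in the talk; the evaluation plan is the authority on the official parameter values.

```python
def average_cost(p_miss, p_fa, p_fa_oos=None,
                 p_target=0.5, p_oos=0.0, c_miss=1.0, c_fa=1.0):
    """Average detection cost over all target languages (hypothetical helper).

    p_miss[t]      : miss rate when t is the target language
    p_fa[(t, a)]   : false-alarm rate for target t on segments of language a
    p_fa_oos[t]    : false-alarm rate for target t on out-of-set segments
    p_oos          : prior weight of the out-of-set alternative (0 = closed set)
    """
    langs = sorted(p_miss)
    n = len(langs)
    # Remaining prior mass is split evenly over the other in-set languages.
    p_nontarget = (1.0 - p_target - p_oos) / (n - 1)

    total = 0.0
    for t in langs:
        cost = c_miss * p_target * p_miss[t]
        for a in langs:
            if a != t:
                cost += c_fa * p_nontarget * p_fa[(t, a)]
        if p_oos > 0.0:
            cost += c_fa * p_oos * p_fa_oos[t]
        total += cost
    return total / n


# Toy closed-set example with two languages and made-up error rates.
toy_miss = {"A": 0.1, "B": 0.3}
toy_fa = {("A", "B"): 0.2, ("B", "A"): 0.4}
print(average_cost(toy_miss, toy_fa))
```

With equal costs and a target prior of one half, each per-language term reduces to the plain average of the miss rate and the mean false-alarm rate, which matches the "something like a total error rate" description.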
0:09:40 | so let's turn to |
---|
0:09:42 | results; so in |
---|
0:09:44 | terms of the official metric |
---|
0:09:46 | uh these are the results |
---|
0:09:47 | for systems, |
---|
0:09:49 | the average scores uh |
---|
0:09:51 | for the closed-set and open-set |
---|
0:09:53 | conditions |
---|
0:09:54 | uh the scores are cumulative, so that the three-second score |
---|
0:09:59 | is the total of the green and the yellow and the |
---|
0:10:01 | red bar |
---|
0:10:04 | oh |
---|
0:10:05 | opens |
---|
0:10:05 | closed set is on the left, open set on the right; we have |
---|
0:10:08 | labels |
---|
0:10:09 | on some systems indicating |
---|
0:10:12 | the same system in closed set and open set |
---|
0:10:14 | traditionally we have not identified |
---|
0:10:17 | systems with their scores |
---|
0:10:20 | ah |
---|
0:10:21 | in public presentations but you can |
---|
0:10:23 | uh they're open close |
---|
0:10:25 | i was in languages and you know it |
---|
0:10:27 | you know |
---|
0:10:27 | it's really |
---|
0:10:28 | three seconds or ten seconds the three seconds that takes the |
---|
0:10:32 | big performance |
---|
0:10:32 | yeah |
---|
0:10:33 | it |
---|
0:10:33 | close in all three |
---|
0:10:35 | his clothes and language here is a |
---|
0:10:38 | oh but |
---|
0:10:39 | two sides |
---|
0:10:40 | be |
---|
0:10:41 | yeah yeah |
---|
0:10:43 | uh language we wouldn't see |
---|
0:10:45 | the relatively |
---|
0:10:47 | uh |
---|
0:10:48 | good performance as you might expect on |
---|
0:10:50 | and language pairs |
---|
0:10:55 | and we traditionally put these on |
---|
0:10:57 | DET plots |
---|
0:10:58 | uh |
---|
0:11:00 | there are that part with the |
---|
0:11:02 | close to have them alive we have the various uh |
---|
0:11:06 | that's another |
---|
0:11:07 | thirty second |
---|
0:11:08 | uh and of the right for once |
---|
0:11:10 | we give a flavour that |
---|
0:11:11 | different or in thirty seconds |
---|
0:11:13 | ten seconds, |
---|
0:11:14 | three seconds the |
---|
0:11:16 | linearity of |
---|
0:11:17 | most of the curves |
---|
0:11:18 | uh |
---|
0:11:19 | suggests underlying |
---|
0:11:20 | normal distributions |
---|
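The remark about linearity can be checked numerically, assuming the plots are DET-style curves whose axes are on a normal-deviate (probit) scale; that plot type is an assumption here, since the transcript does not name it clearly. In the sketch below the score distributions and thresholds are invented for illustration: when target and non-target scores are both Gaussian, the probit-transformed miss and false-alarm rates fall on a straight line.

```python
from statistics import NormalDist

std_normal = NormalDist()  # used for the probit (inverse normal CDF) transform


def det_point(threshold, target, nontarget):
    """Return (probit(P_fa), probit(P_miss)) at one decision threshold."""
    p_miss = target.cdf(threshold)           # target scores below threshold
    p_fa = 1.0 - nontarget.cdf(threshold)    # non-target scores at/above it
    return std_normal.inv_cdf(p_fa), std_normal.inv_cdf(p_miss)


# Invented Gaussian score models (assumption, not evaluation data).
target = NormalDist(mu=2.0, sigma=1.5)
nontarget = NormalDist(mu=0.0, sigma=1.0)

points = [det_point(t, target, nontarget) for t in (-1.0, 0.0, 1.0, 2.0, 3.0)]

# Slopes between consecutive points; a constant slope means a straight line.
slopes = [(y2 - y1) / (x2 - x1)
          for (x1, y1), (x2, y2) in zip(points, points[1:])]
print(slopes)
```

The common slope equals minus the ratio of the non-target to target standard deviations, so curves that bend away from a line hint at non-Gaussian score distributions.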
0:11:22 | uh |
---|
0:11:24 | it was open set and you can |
---|
0:11:26 | see that |
---|
0:11:27 | problem you taking going |
---|
0:11:29 | what was that the |
---|
0:11:30 | open so that uh |
---|
0:11:34 | oh there we |
---|
0:11:35 | on the right but up of the |
---|
0:11:37 | closed set and open set for each of the |
---|
0:11:39 | three durations, to |
---|
0:11:41 | uh give you |
---|
0:11:43 | a sense there |
---|
0:11:49 | findings and analysis |
---|
0:11:53 | um |
---|
0:11:54 | yeah |
---|
0:11:55 | and i will talk about the effect of |
---|
0:11:57 | averaging |
---|
0:11:58 | or, well, the other term is |
---|
0:12:00 | pooling; |
---|
0:12:02 | we're moving away from the term |
---|
0:12:03 | pooling |
---|
0:12:04 | we had a long discussion at the workshop is it right to |
---|
0:12:09 | average |
---|
0:12:09 | get |
---|
0:12:10 | across multiple; we have the same data in multiple trials for the multiple languages |
---|
0:12:15 | and we |
---|
0:12:16 | then resolve that |
---|
0:12:17 | what with all that but |
---|
0:12:18 | see that |
---|
0:12:20 | funny thing that happened in particular for the |
---|
0:12:22 | pair condition; here is |
---|
0:12:24 | two systems for Russian- |
---|
0:12:26 | ukrainian |
---|
0:12:29 | ah |
---|
0:12:30 | uh |
---|
0:12:31 | so the regions where they |
---|
0:12:33 | Ukrainian language is the target, uh |
---|
0:12:34 | that's in the |
---|
0:12:35 | lou the russian language that uh this |
---|
0:12:38 | and these |
---|
0:12:39 | yeah |
---|
0:12:40 | inherently a symmetry |
---|
0:12:43 | uh |
---|
0:12:44 | uh between these curves, because |
---|
0:12:46 | this is the pair condition; |
---|
0:12:47 | the only possibility is that it's |
---|
0:12:49 | russian or ukrainian |
---|
0:12:51 | and if you |
---|
0:12:52 | average those, pooling together, |
---|
0:12:55 | what happens for system one: the combined curve in black |
---|
0:12:58 | falls right through the middle |
---|
0:12:59 | that's what you |
---|
0:13:01 | expect random one |
---|
0:13:03 | system too |
---|
0:13:06 | the |
---|
0:13:06 | binder |
---|
0:13:09 | where uh |
---|
0:13:11 | uh i mean |
---|
0:13:12 | lester combined performance |
---|
0:13:14 | one is that um |
---|
0:13:18 | uh we show the |
---|
0:13:20 | distributions |
---|
0:13:21 | uh on the road |
---|
0:13:23 | um the rhino records for the two languages and then |
---|
0:13:26 | different shapes |
---|
0:13:27 | and uh another thing to note |
---|
0:13:29 | is the |
---|
0:13:30 | crosses |
---|
0:13:31 | show |
---|
0:13:33 | the actual decision points, the circles the |
---|
0:13:36 | minimum |
---|
0:13:38 | the average |
---|
0:13:38 | point |
---|
0:13:40 | and |
---|
0:13:41 | the first |
---|
0:13:42 | system on |
---|
0:13:43 | they're right on top of one another in the middle of |
---|
0:13:46 | but |
---|
0:13:47 | calibration |
---|
0:13:50 | two |
---|
0:13:51 | the |
---|
0:13:52 | right there |
---|
0:13:53 | way |
---|
0:13:54 | at the extremes |
---|
0:13:56 | in the case and uh with the |
---|
0:13:57 | sort of in the middle, indicating poor calibration |
---|
0:14:01 | combine them |
---|
0:14:02 | but |
---|
0:14:03 | hello |
---|
0:14:04 | to it |
---|
0:14:04 | i is what you see |
---|
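The difference between averaging per-language results and pooling all trials behind one threshold can be illustrated with a toy example. Everything below is made up for the illustration (the scores, the language labels, and the helper `min_avg_error`); it is not the evaluation's scoring code. Each language separates its own trials perfectly, but the scores for language B are shifted, so no single pooled threshold works for both.

```python
def min_avg_error(targets, nontargets):
    """Smallest achievable (P_miss + P_fa) / 2 over all single thresholds."""
    best = 1.0
    candidates = sorted(set(targets) | set(nontargets)) + [float("inf")]
    for thr in candidates:
        # A trial is accepted when its score is at or above the threshold.
        p_miss = sum(s < thr for s in targets) / len(targets)
        p_fa = sum(s >= thr for s in nontargets) / len(nontargets)
        best = min(best, 0.5 * (p_miss + p_fa))
    return best


# Invented scores: target-trial and non-target-trial scores per target language.
scores = {
    "A": {"tar": [2.0, 3.0], "non": [0.0, 1.0]},
    "B": {"tar": [-1.0, 0.0], "non": [-3.0, -2.0]},  # same shape, shifted by -3
}

# Averaging: find the best operating point per language, then average.
per_language = sum(min_avg_error(v["tar"], v["non"])
                   for v in scores.values()) / len(scores)

# Pooling: merge all trials and look for one threshold for everything.
pooled_tar = [s for v in scores.values() for s in v["tar"]]
pooled_non = [s for v in scores.values() for s in v["non"]]
pooled = min_avg_error(pooled_tar, pooled_non)

print(per_language, pooled)  # 0.0 0.25
```

The pooled figure is worse only because the two score sets are not on a common scale, which is one way a combined curve can look much poorer than either per-language curve when calibration differs across languages.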
0:14:07 | um |
---|
0:14:08 | so as i said there are questions |
---|
0:14:10 | is it the right thing to |
---|
0:14:12 | average |
---|
0:14:13 | across languages |
---|
0:14:14 | um |
---|
0:14:15 | we have done so |
---|
0:14:18 | if you look at language pairs |
---|
0:14:22 | uh this is for one system |
---|
0:14:24 | one of the systems that did all the language pairs |
---|
0:14:26 | are we look |
---|
0:14:27 | at |
---|
0:14:28 | George Doddington created |
---|
0:14:29 | the curve i believe |
---|
0:14:31 | ah |
---|
0:14:31 | this looked at |
---|
0:14:32 | all of them, and this shows the ones that have the |
---|
0:14:34 | highest |
---|
0:14:35 | that |
---|
0:14:36 | um |
---|
0:14:37 | average error rate |
---|
0:14:39 | um |
---|
0:14:40 | so all the others were |
---|
0:14:42 | below two percent |
---|
0:14:43 | ah |
---|
0:14:45 | the most confusable, up at the top, were |
---|
0:14:48 | Hindi-Urdu |
---|
0:14:49 | and Bosnian- |
---|
0:14:51 | croatian |
---|
0:14:52 | um |
---|
0:14:53 | these were among the eight |
---|
0:14:55 | pairs of interest, and |
---|
0:14:56 | uh you know these are certainly mutually intelligible; they may be considered dialects |
---|
0:15:01 | and indeed |
---|
0:15:03 | oh yeah |
---|
0:15:04 | it's at least arguable that |
---|
0:15:06 | he |
---|
0:15:06 | these |
---|
0:15:07 | language or dialect distinctions are based |
---|
0:15:09 | on social and political |
---|
0:15:11 | boundaries |
---|
0:15:13 | rather than um |
---|
0:15:17 | on uh more inherent language patterns |
---|
0:15:19 | in any case those two were the most confusable |
---|
0:15:22 | the next ones were Russian-Ukrainian, |
---|
0:15:24 | the |
---|
0:15:24 | english |
---|
0:15:26 | dialects |
---|
0:15:26 | and |
---|
0:15:27 | Dari-Farsi, which are |
---|
0:15:29 | generally considered |
---|
0:15:31 | mutually |
---|
0:15:32 | intelligible |
---|
0:15:33 | you are there is a god in there |
---|
0:15:35 | Creole-French and |
---|
0:15:36 | is |
---|
0:15:37 | is uh |
---|
0:15:39 | in the list |
---|
0:15:39 | um |
---|
0:15:42 | uh when we |
---|
0:15:42 | several of them |
---|
0:15:44 | no |
---|
0:15:46 | a little |
---|
0:15:47 | list of leading |
---|
0:15:48 | one |
---|
0:15:49 | two that were in our list of the |
---|
0:15:51 | pairs of interest, |
---|
0:15:52 | Cantonese and Mandarin, |
---|
0:15:54 | portuguese and spanish |
---|
0:15:55 | maybe certain |
---|
0:15:57 | different ways |
---|
0:15:58 | languages that might be regarded as similar, in fact |
---|
0:16:00 | um |
---|
0:16:02 | maybe aren't, at least |
---|
0:16:03 | for the |
---|
0:16:05 | systems involved, |
---|
0:16:07 | all that hard |
---|
0:16:08 | to distinguish |
---|
0:16:12 | all that we can look at |
---|
0:16:13 | uh |
---|
0:16:14 | the terms were in the right |
---|
0:16:16 | to a particular target languages towards the |
---|
0:16:20 | if you of everything price |
---|
0:16:21 | languages here we do so looking at the training corpus |
---|
0:16:25 | type |
---|
0:16:25 | the |
---|
0:16:26 | they show the various |
---|
0:16:27 | languages for the |
---|
0:16:29 | that had training on the |
---|
0:16:32 | VOA data |
---|
0:16:33 | and then we look at the ones with training on |
---|
0:16:35 | CTS data |
---|
0:16:37 | um |
---|
0:16:37 | you see kind of a movement |
---|
0:16:39 | how would be either way |
---|
0:16:41 | ah yes |
---|
0:16:42 | performance |
---|
0:16:44 | was on |
---|
0:16:45 | one two Q is that we're languages |
---|
0:16:48 | um |
---|
0:16:49 | but uh |
---|
0:16:50 | done previously among many cases the training the cts and the |
---|
0:16:53 | yeah but |
---|
0:16:54 | realigned unless spanish korean |
---|
0:16:57 | mandarin |
---|
0:16:58 | for example were among the best performing languages |
---|
0:17:00 | worst performing were several |
---|
0:17:02 | Indian languages; I mean, there are confusions there: Hindi, |
---|
0:17:05 | Urdu, Indian English |
---|
0:17:12 | oh yeah we look at performance by |
---|
0:17:15 | but the |
---|
0:17:16 | what was it |
---|
0:17:17 | test corpus, whether VOA or CTS, |
---|
0:17:20 | for thirty, ten and three |
---|
0:17:22 | um |
---|
0:17:25 | and |
---|
0:17:25 | one thing we were |
---|
0:17:26 | fairly pleased with |
---|
0:17:27 | you know we just introduced |
---|
0:17:29 | using the VOA data |
---|
0:17:30 | you know with the |
---|
0:17:32 | we we recognise well in fact |
---|
0:17:34 | the overall performance was probably comparable |
---|
0:17:38 | um |
---|
0:17:39 | this even though for some of the VOA languages the |
---|
0:17:42 | training was CTS |
---|
0:17:43 | for |
---|
0:17:44 | some reason, I don't know that we know why, |
---|
0:17:46 | the uh |
---|
0:17:47 | cts |
---|
0:17:48 | curves here appear less linear |
---|
0:17:53 | and some history |
---|
0:17:56 | so we like to |
---|
0:17:58 | look back |
---|
0:17:59 | over the course of several evaluations |
---|
0:18:01 | at how things have changed |
---|
0:18:03 | are we seeing better performance there have yet |
---|
0:18:05 | that that |
---|
0:18:06 | ah |
---|
0:18:08 | okay we have occurs over there |
---|
0:18:09 | evaluations the numbers of target languages |
---|
0:18:12 | going |
---|
0:18:14 | up in recent evaluations, |
---|
0:18:16 | the number of participants also up in the two most recent evaluations, but we're |
---|
0:18:20 | yeah slightly into the nineteen thirty seven seven wonderful |
---|
0:18:23 | hereby try to |
---|
0:18:25 | uh |
---|
0:18:26 | you're simply blah |
---|
0:18:27 | and with an increasing number |
---|
0:18:30 | of um |
---|
0:18:31 | out-of-set languages |
---|
0:18:37 | as for the basic |
---|
0:18:38 | one of the major |
---|
0:18:40 | um |
---|
0:18:43 | for thirty seconds |
---|
0:18:44 | with that |
---|
0:18:46 | nice |
---|
0:18:46 | uh |
---|
0:18:48 | right and uh |
---|
0:18:50 | you know |
---|
0:18:50 | garcia good |
---|
0:18:51 | data exchange languages it |
---|
0:18:53 | type change that but |
---|
0:18:54 | are we think uh improved results for |
---|
0:18:57 | three second |
---|
0:18:58 | four |
---|
0:18:59 | every second for the past |
---|
0:19:01 | couple evaluations are |
---|
0:19:03 | we seem to |
---|
0:19:04 | yeah |
---|
0:19:04 | have but a |
---|
0:19:06 | i'm terms of the |
---|
0:19:07 | the system |
---|
0:19:08 | also note that this year's three second performance was at the level of |
---|
0:19:12 | thirty second performance |
---|
0:19:14 | in nineteen ninety six |
---|
0:19:19 | oh here we |
---|
0:19:19 | do some history looking at the best system |
---|
0:19:22 | you know, caveat: differences reflect |
---|
0:19:25 | well |
---|
0:19:25 | systems |
---|
0:19:26 | and |
---|
0:19:27 | some changes in the task definition and of course |
---|
0:19:29 | different data, and it's |
---|
0:19:31 | hard to sort those out |
---|
0:19:33 | in different evaluations; |
---|
0:19:34 | nonetheless |
---|
0:19:35 | what can we say about how |
---|
0:19:38 | performance is |
---|
0:19:39 | there |
---|
0:19:40 | ah |
---|
0:19:42 | um |
---|
0:19:43 | i think we hinted that before but |
---|
0:19:45 | three seconds um |
---|
0:19:48 | we see a |
---|
0:19:49 | it was lacking |
---|
0:19:50 | oh nine wounded or seven media |
---|
0:19:52 | anything ewing performance improvement but |
---|
0:19:55 | in the |
---|
0:19:56 | there can second bite out in the |
---|
0:19:59 | thirty second maybe we |
---|
0:20:01 | right progress |
---|
0:20:02 | a bit |
---|
0:20:08 | oh really |
---|
0:20:09 | look at a couple of individual languages |
---|
0:20:11 | uh |
---|
0:20:12 | that's for sure |
---|
0:20:13 | tend to do the same language uh |
---|
0:20:15 | oh nine |
---|
0:20:16 | O seven in the |
---|
0:20:17 | of the 'cause O nine minutes of seven and the colours are one of the three durations |
---|
0:20:22 | and here |
---|
0:20:24 | to kind of language in which they were we have language |
---|
0:20:26 | pair since |
---|
0:20:27 | for korean |
---|
0:20:29 | oh |
---|
0:20:31 | we haven't seen improvements |
---|
0:20:32 | throughout |
---|
0:20:33 | right |
---|
0:20:34 | but |
---|
0:20:35 | the recycling three in two thousand nine is |
---|
0:20:38 | uh |
---|
0:20:38 | perfect the results are are |
---|
0:20:41 | ah |
---|
0:20:42 | languages |
---|
0:20:43 | part is one |
---|
0:20:44 | we see the overall having the |
---|
0:20:46 | we sing for the evaluation the whole |
---|
0:20:48 | ah |
---|
0:20:49 | improvement at three seconds uh |
---|
0:20:51 | little change or even regression |
---|
0:20:54 | thirty five |
---|
0:20:55 | and of course there are going to do that or |
---|
0:20:57 | new this year |
---|
0:20:59 | as well |
---|
0:21:04 | oh |
---|
0:21:04 | also here |
---|
0:21:06 | but dialect kind of has to be done previously to that |
---|
0:21:10 | american english and |
---|
0:21:11 | indian english uh |
---|
0:21:14 | uh |
---|
0:21:15 | that and we |
---|
0:21:16 | do see improvement like two thousand nine |
---|
0:21:19 | which is that the minutes |
---|
0:21:21 | thirty seconds |
---|
0:21:24 | and second |
---|
0:21:27 | and even more |
---|
0:21:28 | uh |
---|
0:21:34 | uh |
---|
0:21:35 | predicament |
---|
0:21:37 | a big there's three seconds |
---|
0:21:38 | american indian english |
---|
0:21:42 | and |
---|
0:21:42 | going to |
---|
0:21:44 | Hindi-Urdu |
---|
0:21:46 | do you |
---|
0:21:47 | known to be a challenging language pair |
---|
0:21:49 | but we see improvement thirty seconds |
---|
0:21:52 | three seconds |
---|
0:21:55 | yeah |
---|
0:21:55 | there's ten seconds |
---|
0:21:59 | oh |
---|
0:21:59 | and wait |
---|
0:22:00 | a three seconds |
---|
0:22:01 | ah |
---|
0:22:02 | three seconds |
---|
0:22:03 | well maybe this improvement |
---|
0:22:04 | yeah |
---|
0:22:05 | but have it |
---|
0:22:06 | at three seconds Hindi-Urdu was |
---|
0:22:09 | too hard a |
---|
0:22:10 | comparison, |
---|
0:22:11 | performance little better than |
---|
0:22:13 | random |
---|
0:22:16 | a few words in summary |
---|
0:22:19 | we experimented with a new |
---|
0:22:21 | data collection paradigm |
---|
0:22:23 | and we're reasonably satisfied with that producing |
---|
0:22:26 | an effective evaluation with |
---|
0:22:29 | comparable performance |
---|
0:22:31 | repeating this trick, finding the right data for future evaluations, that remains a challenge |
---|
0:22:36 | uh we saw continued performance improvement |
---|
0:22:39 | uh of having a son |
---|
0:22:41 | a real nice based on the |
---|
0:22:43 | shorter segments |
---|
0:22:46 | um |
---|
0:22:48 | for both closed and open-set conditions |
---|
0:22:51 | the language |
---|
0:22:52 | pair condition was introduced |
---|
0:22:54 | here in particular for marketers it |
---|
0:22:56 | relatively interesting; it poses challenges and is likely |
---|
0:22:59 | to be part of any |
---|
0:23:00 | uh |
---|
0:23:00 | future evaluation that we do |
---|
0:23:03 | um |
---|
0:23:03 | this story |
---|
0:23:05 | an issue we've argued about about |
---|
0:23:07 | whether we should average across languages |
---|
0:23:10 | and i think that yeah |
---|
0:23:11 | uh includes might |
---|
0:23:13 | right off |
---|
0:23:14 | thank you |
---|
0:23:21 | and |
---|
0:23:22 | information |
---|
0:23:30 | this is |
---|
0:23:30 | just |
---|
0:23:31 | a comment |
---|
0:23:32 | on the |
---|
0:23:33 | comparing |
---|
0:23:34 | uh |
---|
0:23:34 | performance |
---|
0:23:36 | between |
---|
0:23:37 | uh |
---|
0:23:37 | that's done |
---|
0:23:38 | nist evaluations yes with the |
---|
0:23:40 | number |
---|
0:23:41 | target languages |
---|
0:23:42 | uh |
---|
0:23:43 | that's |
---|
0:23:45 | uh |
---|
0:23:46 | it |
---|
0:23:47 | uh |
---|
0:23:48 | they're more languages than the weight vanished that's it |
---|
0:23:52 | the hypothesis |
---|
0:23:55 | mostly |
---|
0:23:57 | so |
---|
0:23:58 | there are more languages |
---|
0:24:01 | uh |
---|
0:24:01 | you know list |
---|
0:24:02 | about five months |
---|
0:24:04 | 'cause you know |
---|
0:24:05 | you less sure about which one |
---|
0:24:07 | to be |
---|
0:24:08 | so |
---|
0:24:09 | um |
---|
0:24:10 | that makes it a little bit on that |
---|
0:24:13 | it makes it a little bit harder |
---|
0:24:14 | just one |
---|
0:24:16 | makes it a little bit |
---|
0:24:17 | not not a lot |
---|
0:24:19 | if you were doing just fine |
---|
0:24:21 | identification |
---|
0:24:22 | obviously |
---|
0:24:23 | the number of languages as a strong stick |
---|
0:24:26 | second autistic |
---|
0:24:28 | which |
---|
0:24:29 | which don't have |
---|
0:24:31 | so |
---|
0:24:32 | arguably if we just |
---|
0:24:33 | apparently tread water but made the problem harder by |
---|
0:24:36 | introducing languages people haven't seen before |
---|
0:24:38 | i'd argue that |
---|
0:24:39 | that |
---|
0:24:40 | it'd be apart |
---|
0:24:41 | it's also a pretty good argument for the |
---|
0:24:43 | language pairs condition which |
---|
0:24:45 | well |
---|
0:24:47 | so |
---|
0:24:47 | tenderly |
---|
0:24:48 | i think that affect yeah |
---|
0:24:54 | we should |
---|
0:24:57 | you plan to |
---|
0:24:58 | to use it |
---|
0:24:59 | voice of america |
---|
0:25:01 | uh it uh for the nist evaluation |
---|
0:25:04 | oh |
---|
0:25:06 | there |
---|
0:25:07 | other than your right |
---|
0:25:08 | he |
---|
0:25:08 | it we need to discuss this with i don't |
---|
0:25:11 | think |
---|
0:25:12 | we can hope thing |
---|
0:25:13 | just get more Voice of America data; we're |
---|
0:25:16 | exploring |
---|
0:25:17 | um |
---|
0:25:19 | other |
---|
0:25:20 | similar types of |
---|
0:25:21 | sources that may be available that have multiple languages |
---|
0:25:25 | um are there any |
---|
0:25:26 | recommendations that people have |
---|
0:25:34 | yep |
---|
0:26:02 | uh i'm honestly wondering why |
---|
0:26:04 | four |
---|
0:26:05 | uh |
---|
0:26:06 | identification |
---|
0:26:08 | oh |
---|
0:26:08 | just |
---|
0:26:09 | sure |
---|
0:26:10 | so make or break |
---|
0:26:11 | two |
---|
0:26:12 | cation |
---|
0:26:13 | four |
---|
0:26:14 | i mean |
---|
0:26:14 | to do that |
---|
0:26:15 | uh |
---|
0:26:16 | you should |
---|
0:26:16 | and i'm i'm |
---|
0:26:18 | and you are using uh |
---|
0:26:19 | uh that it |
---|
0:26:20 | editions |
---|
0:26:21 | oh |
---|
0:26:23 | um |
---|
0:26:24 | i would like to do i need to find a direct |
---|
0:26:27 | just |
---|
0:26:27 | hmmm |
---|
0:26:28 | you know |
---|
0:26:29 | to |
---|
0:26:30 | yeah |
---|
0:26:31 | see |
---|
0:26:31 | oh |
---|
0:26:32 | and identification |
---|
0:26:34 | and i wonder why |
---|
0:26:35 | um |
---|
0:26:36 | you you could try your interesting identification with |
---|
0:26:40 | recognition |
---|
0:26:42 | because |
---|
0:26:42 | it's a |
---|
0:26:44 | whatever |
---|
0:26:44 | this |
---|
0:26:46 | if you use |
---|
0:26:47 | you |
---|
0:26:48 | correlation |
---|
0:26:48 | we thank you |
---|
0:26:50 | right but |
---|
0:26:51 | there's no |
---|
0:26:52 | no |
---|
0:26:53 | you you can |
---|
0:26:54 | some |
---|
0:26:55 | oh |
---|
0:26:56 | accuracy |
---|
0:26:57 | yeah |
---|
0:26:57 | yeah |
---|
0:26:58 | i i always wonder why |
---|
0:27:00 | i wanna see how well it does |
---|
0:27:02 | yeah |
---|
0:27:04 | yeah |
---|
0:27:05 | you use you understand |
---|
0:27:06 | well |
---|
0:27:08 | and i am not |
---|
0:27:09 | sure you're saying you're interested in |
---|
0:27:11 | bring in distinguishing particular |
---|
0:27:13 | mostly related to i |
---|
0:27:15 | or are you saying i think of the identification problem |
---|
0:27:19 | yeah the language is one of N possibilities which one is it |
---|
0:27:23 | it yeah i |
---|
0:27:24 | yeah |
---|
0:27:25 | you target for that |
---|
0:27:26 | dialect |
---|
0:27:27 | i think |
---|
0:27:27 | yeah i'm interested in identification |
---|
0:27:30 | like |
---|
0:27:31 | see |
---|
0:27:31 | a comparison |
---|
0:27:33 | uh |
---|
0:27:34 | this |
---|
0:27:35 | if you use your |
---|
0:27:37 | oh |
---|
0:27:38 | but i mean the language here |
---|
0:27:41 | condition does that computing |
---|
0:27:44 | yeah |
---|
0:27:45 | but |
---|
0:27:45 | but |
---|
0:27:46 | yeah |
---|
0:27:47 | what |
---|
0:27:48 | you have |
---|
0:27:49 | oh |
---|
0:27:50 | oh |
---|
0:27:52 | right |
---|
0:27:52 | yeah |
---|
0:27:53 | okay |
---|
0:27:54 | oh |
---|
0:27:55 | uh |
---|
0:27:56 | no |
---|
0:27:58 | one |
---|
0:27:58 | like |
---|
0:27:59 | yes |
---|
0:27:59 | the |
---|
0:28:00 | i comparison |
---|
0:28:03 | oh |
---|
0:28:04 | not |
---|
0:28:06 | yeah |
---|
0:28:06 | and |
---|
0:28:07 | and |
---|
0:28:08 | i |
---|
0:28:09 | okay |
---|
0:28:10 | yeah |
---|
0:28:10 | i think |
---|
0:28:11 | yeah |
---|
0:28:11 | huh |
---|
0:28:12 | and |
---|
0:28:13 | right |
---|
0:28:14 | yeah |
---|
0:28:15 | no |
---|
0:28:16 | yeah |
---|
0:28:19 | as opposed to |
---|
0:28:23 | i'm not okay nectar |
---|
0:28:24 | a couple of that but maybe that's something we can talk about for the wrong one |
---|
0:28:35 | yes |
---|
0:28:35 | right |
---|
0:28:35 | i can like |
---|
0:28:36 | combine |
---|
0:28:37 | um |
---|
0:28:38 | uh |
---|
0:28:39 | i've |
---|
0:28:42 | it's one thing |
---|
0:28:43 | so |
---|
0:28:44 | yes |
---|
0:28:45 | um |
---|
0:28:46 | i've uh |
---|
0:28:47 | i think that's |
---|
0:28:48 | discuss |
---|
0:28:49 | but |
---|
0:28:49 | yeah |
---|
0:28:51 | uh |
---|
0:28:51 | qualitative |
---|
0:28:52 | uh |
---|
0:28:54 | this and |
---|
0:28:55 | and other what this |
---|
0:28:57 | oh |
---|
0:28:58 | someone is |
---|
0:28:59 | so |
---|
0:29:00 | elements of this |
---|
0:29:01 | the |
---|
0:29:01 | the |
---|
0:29:02 | pooling of the data |
---|
0:29:04 | and equal error rate being one point |
---|
0:29:06 | oh |
---|
0:29:07 | on the go |
---|
0:29:08 | it's part of that discussion |
---|
0:29:10 | um |
---|
0:29:11 | so |
---|
0:29:12 | i'm not going to |
---|
0:29:14 | start that again |
---|
0:29:15 | no |
---|
0:29:16 | uh |
---|
0:29:17 | i think i have something useful to say about |
---|
0:29:19 | average |
---|
0:29:20 | which |
---|
0:29:21 | so |
---|
0:29:22 | if you're doing |
---|
0:29:24 | identification |
---|
0:29:25 | uh |
---|
0:29:27 | given a speech segment |
---|
0:29:28 | you're told |
---|
0:29:30 | there are N languages |
---|
0:29:32 | the speech segment can be in one of these N languages |
---|
0:29:35 | then |
---|
0:29:36 | uh |
---|
0:29:36 | you also have to assume some prior |
---|
0:29:39 | so you can assume a flat prior |
---|
0:29:41 | of the of those languages that you would |
---|
0:29:44 | uh |
---|
0:29:47 | yeah |
---|
0:29:47 | likely |
---|
0:29:48 | uh |
---|
0:29:49 | before you look at |
---|
0:29:50 | the speech |
---|
0:29:51 | that that would be |
---|
0:29:52 | uh |
---|
0:29:52 | the identification problem |
---|
0:29:54 | so what nist has done |
---|
0:29:57 | is |
---|
0:29:58 | that i |
---|
0:29:58 | uh |
---|
0:29:59 | so |
---|
0:30:01 | something this problem |
---|
0:30:02 | if they're in languages at in doesn't apply |
---|
0:30:06 | so |
---|
0:30:08 | the in doesn't primes is |
---|
0:30:10 | uh |
---|
0:30:12 | target language number one |
---|
0:30:13 | as a prior |
---|
0:30:14 | oh |
---|
0:30:16 | and |
---|
0:30:16 | all of the other languages |
---|
0:30:18 | uh |
---|
0:30:19 | susan between them |
---|
0:30:20 | uh |
---|
0:30:22 | a probability of a half |
---|
0:30:24 | so it's |
---|
0:30:25 | it's just you try |
---|
0:30:27 | and |
---|
0:30:28 | then |
---|
0:30:28 | you go to the next target |
---|
0:30:30 | two |
---|
0:30:30 | you say you know this one has probability of a half |
---|
0:30:33 | all the others |
---|
0:30:34 | uh |
---|
0:30:35 | i have a smaller probability |
---|
0:30:37 | then |
---|
0:30:38 | you missus |
---|
0:30:39 | D |
---|
0:30:40 | uh |
---|
0:30:41 | uh |
---|
0:30:42 | essentially i didn't |
---|
0:30:43 | cation yeah right |
---|
0:30:45 | given that probably |
---|
0:30:46 | in times |
---|
0:30:47 | and you and all those |
---|
0:30:50 | it it right |
---|
0:30:51 | that's the that's the secret |
---|
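
The speaker's point about priors can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the evaluation's scoring code. It assumes the garbled passage means that each target language in turn gets a prior of one half while the remaining languages share the other half equally. The function names (`flat_prior`, `target_prior`, `posterior`, `identify`) and the toy log-likelihoods are hypothetical.

```python
import math

def flat_prior(n):
    # Every one of the n languages is equally likely before hearing the speech.
    return [1.0 / n] * n

def target_prior(n, target, p_target=0.5):
    # Assumed reading of the talk: the target language gets p_target,
    # and the other n - 1 languages share the remainder equally.
    rest = (1.0 - p_target) / (n - 1)
    return [p_target if i == target else rest for i in range(n)]

def posterior(log_likelihoods, prior):
    # Bayes' rule in the log domain, normalised with the max-shift trick.
    scores = [ll + math.log(p) for ll, p in zip(log_likelihoods, prior)]
    top = max(scores)
    unnorm = [math.exp(s - top) for s in scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def identify(log_likelihoods, prior):
    # Identification: pick the language with the highest posterior.
    post = posterior(log_likelihoods, prior)
    return max(range(len(post)), key=post.__getitem__)

# Toy per-language log-likelihoods for one speech segment (invented values).
log_likelihoods = [-1.0, -1.2, -3.0]
n = len(log_likelihoods)

flat_decision = identify(log_likelihoods, flat_prior(n))
# One identification problem per target language, each with its own prior.
per_target_decisions = [
    identify(log_likelihoods, target_prior(n, t)) for t in range(n)
]
```

With these toy numbers the decision can change as the prior moves from one target language to the next, which is why averaging over the N priors is not the same as a single flat-prior identification task.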
0:30:54 | um |
---|
0:30:56 | so |
---|
0:31:00 | okay |
---|
0:31:02 | and he to be |
---|
0:31:03 | to go on |
---|
0:31:04 | the next |
---|
0:31:04 | speaker |
---|
0:31:05 | again |
---|
0:31:07 | interesting |
---|
0:31:12 | yeah |
---|