0:00:07 | good morning everyone, my name is Raymond, and
0:00:09 | we are from the Chinese University of Hong Kong and the Institute for Infocomm Research in Singapore
0:00:15 | first, I think I have to stress two points which characterise our work today
0:00:20 | well, the first is, unlike previous presentations, which at least touched on something about speaker recognition, our work is exclusively on language recognition here today
0:00:31 | and the second point is we tried a kind of alternative approach, focusing on a very specific scenario of language recognition
0:00:40 | we find that in the previous evaluation, LRE 2009, there are some very difficult languages, so we just focus on these scenarios
0:00:49 | and that's why we have this work on application-dependent score calibration for language recognition
0:00:56 | so this is the outline of today's presentation: first I will introduce the problem, and then we will have a little bit about detection cost
0:01:04 | and then we will illustrate our calibration in two parts: the first is pairwise language recognition, and then general language recognition
0:01:11 | finally is the summary
0:01:15 | so for the language recognition task, we define it as follows: given the target language, the task of language recognition is to detect the presence of the target in the testing trial
0:01:26 | in practice, the language recogniser calculates a score indicating the presence of the target, and then makes decisions
0:01:33 | when an erroneous decision is made, then there is a detection cost
0:01:37 | so a typical detection cost, which I think most of the audience are familiar with, is a weighted sum of detection misses and false alarms
0:01:46 | and in our work we interpret score calibration as the adjustment of the magnitudes of the scores, which in turn affects the detector's decisions
0:01:54 | and the objective is to do calibration in order to have a minimum detection cost
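The weighted miss/false-alarm cost the speaker refers to can be sketched as follows; the unit costs and the target prior here are illustrative placeholders, not the actual parameters of the NIST evaluation's cost function.

```python
def detection_cost(scores, labels, threshold, c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Weighted sum of detection misses and false alarms at a fixed threshold.

    labels: +1 for target trials, -1 for non-target trials. A miss is a
    target trial scored below the threshold; a false alarm is a non-target
    trial scored at or above it.
    """
    targets = [s for s, y in zip(scores, labels) if y == +1]
    nontargets = [s for s, y in zip(scores, labels) if y == -1]
    p_miss = sum(s < threshold for s in targets) / len(targets)
    p_fa = sum(s >= threshold for s in nontargets) / len(nontargets)
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa

# Example: two target trials (one missed) and two non-target trials (one false alarm).
cost = detection_cost([1.2, -0.3, 0.4, -1.0], [+1, +1, -1, -1], threshold=0.0)
```

Calibration, as interpreted in the talk, adjusts the scores so that this cost, evaluated at the fixed threshold, goes down.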
0:01:59 | more generally, in global calibration, or, as some researchers call it, application-independent calibration, the parameters of the detection cost function are usually ignored
0:02:15 | and the result of that is, global calibration transforms the likelihood scores in a global manner, and it does not pay special attention to highly confusable trials
0:02:25 | we do not say whether it is good or bad, but in this work we are going to do it another way
0:02:31 | in Language Recognition Evaluation 2009 there are some pairs of related languages, listed already in the specification
0:02:38 | so detection of these related languages becomes a bottleneck, because typically it is easy to mix them up, for example Russian and Ukrainian, or Hindi and Urdu
0:02:47 | so in the following we will focus on these pairs of languages
0:02:51 | we always take one pair at a time: for example, we call Russian the target language, and then we have a related language, Ukrainian
0:02:59 | and afterwards we have the target language being Ukrainian, and the related language becomes Russian
0:03:05 | and then we have ten rounds of calibration, such that the final error will be reduced
0:03:14 | so next is a very brief recap on the detection cost, because you will be looking at a lot of diagrams, so this is just to help you comprehend what we are going to do
0:03:29 | for example, we have two classes, H_T and H_R: the target language and the related language
0:03:37 | and then we have the log-likelihood ratio for the target language, so we call that lambda_HT; it is the score from the detector of H_T
0:03:47 | so let k be the index of the test trial, and then if we plot lambda_HT against k it would be like this
0:03:54 | you see a lot of trials here, and this is the score of one trial
0:04:00 | and there are circles and triangles: the circles stand for the trials whose true class is the target H_T, and the triangles represent the trials whose true class is the related class H_R
0:04:18 | so we focus on the filled circles and triangles
0:04:22 | it is easy to understand that the filled triangles are false alarms, because they are above the threshold, and the filled circles are detection misses, because they are under the threshold
0:04:35 | so again we keep it very simple: the objective is only to reduce the misses and false alarms
0:04:43 | but when we say that, it means we want to reduce the count of these filled circles and these filled triangles, and that is a kind of discrete thing
0:04:51 | we don't want to do that; we want to do it in a quantitative way
0:04:54 | so all of this can be done by minimising the erroneous deviation with respect to the detection threshold, which means we want to minimise the total distance of these filled triangles and filled circles from the detection threshold
0:05:08 | and we assume that this detection threshold is already fixed at the very beginning
0:05:17 | so now we can introduce how we do the pairwise language recognition
0:05:25 | first we make a simple hypothesis: because, as we told you, there are related pairs of languages, the log-likelihood ratios of these two related languages, lambda_HT and lambda_HR, contain very similar and complementary information
0:05:44 | before, we showed you only the log-likelihood ratio for the target H_T
0:05:51 | and now we introduce another dimension, lambda_HR, which is the detection likelihood ratio for the related hypothesis
0:05:59 | and the trend of the scores normally follows this manner
0:06:04 | to understand this easily, we can just pick any trial from the target class H_T: it is natural that it has a very high score of lambda_HT, because it is detected against the target class, and it has a low score of lambda_HR, because it does not belong to the related class H_R
0:06:25 | and similarly, for another trial in the related class, it has a high score in lambda_HR and a low score in lambda_HT
0:06:34 | so this shape simply prompts us to think: how about if we rotate the whole score space, such that we can obtain a new score space and a detection threshold like this?
0:06:49 | and mathematically, it means that when we determine the detection threshold, we not only consider lambda_HT but also lambda_HR
0:07:01 | which means we use the detection scores from the two detectors, for the target language and the related language, in order to reach the final decision of whether this trial belongs to the class H_T
0:07:16 | so mathematically we formulate it like this
0:07:19 | we talked about doing this in a quantitative way, to minimise the total erroneous deviation, which is the distance of these points from the threshold
0:07:30 | so we take this equation step by step
0:07:35 | we look into this lambda-minus-theta term: this is the displacement of the score from the threshold
0:07:45 | so for a detection miss, the score is below the threshold, so this difference is negative, and for a false alarm the difference is positive
0:07:52 | and y represents the ground-truth label of the detection trial: if it belongs to the target class, the label is one, and if it does not belong to the target class, it is negative one
0:08:06 | so we can see that by multiplying y and this lambda-minus-theta term, for the two cases of error we always have a positive value, and for correct acceptance and rejection we always have a negative value, and then we use the max operation to remove these correct acceptance and rejection scores
0:08:30 | so finally, what is left over is only the erroneous deviation, and then we sum over the whole database, with an appropriate weight for each trial, as in the last row
0:08:39 | and we would like to adjust the detection log-likelihood ratio, so we have the adjusted likelihood ratio, lambda-dash, which produces this total erroneous deviation
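The max-based objective walked through above can be written down as a short sketch. The sign convention here (y = +1 for target trials, with the product negated so that errors contribute positively) is my reading of the slide, not a quote of the paper's exact formula.

```python
def erroneous_deviation(scores, labels, threshold):
    """Sum of distances of erroneous trials from the detection threshold.

    labels: +1 for target trials, -1 for non-target trials. The product
    -y * (score - threshold) is positive exactly for misses (target below
    the threshold) and false alarms (non-target above it); max(0, .)
    discards correct acceptances and correct rejections.
    """
    return sum(max(0.0, -y * (s - threshold)) for s, y in zip(scores, labels))

# A miss at distance 0.7 and a false alarm at distance 0.2 from threshold 0;
# the correctly accepted trial at 1.5 contributes nothing:
dev = erroneous_deviation([-0.7, 0.2, 1.5], [+1, -1, +1], 0.0)
```

Unlike simply counting errors, this quantity is continuous in the scores, which is what makes the threshold-relative optimisation in the talk tractable.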
0:08:56 | so perhaps I should go back to the last slide, because what we do is to reduce the erroneous deviation, the distance of these errors from the threshold
0:09:07 | and the way we do that is by rotating the score space
0:09:13 | and the rotation of the score space is actually accomplished by this equation, because we do a linear combination of the scores from the two detectors, and the result is that the score space is rotated
0:09:28 | and here the whole problem is now formulated: we have the objective function of erroneous deviation, and then we want to minimise that, subject to the linear combination with the vector alpha
0:09:41 | and then we also have a little constraint, just to make sure that the final updated log-likelihood ratios would not be out of range
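A minimal sketch of the rotation-as-linear-combination step: the combined score is lambda' = alpha_T * lambda_HT + alpha_R * lambda_HR, and here a unit-norm alpha is found by brute-force search over angles, minimising the erroneous deviation on development scores. The unit-norm constraint and the grid search are my own simplifications; the paper's actual constraint set and optimiser may differ.

```python
import math

def erroneous_deviation(scores, labels, threshold):
    # Sum of distances of misses / false alarms from the threshold (labels +1/-1).
    return sum(max(0.0, -y * (s - threshold)) for s, y in zip(scores, labels))

def rotate_scores(lam_t, lam_r, alpha):
    # Linear combination of target-detector and related-detector scores.
    return [alpha[0] * a + alpha[1] * b for a, b in zip(lam_t, lam_r)]

def fit_alpha(lam_t, lam_r, labels, threshold=0.0, steps=360):
    """Grid-search a unit-norm combination weight minimising erroneous deviation."""
    best_alpha, best_dev = (1.0, 0.0), float("inf")
    for i in range(steps):
        angle = math.pi * i / steps - math.pi / 2  # sweep a half circle
        alpha = (math.cos(angle), math.sin(angle))
        dev = erroneous_deviation(rotate_scores(lam_t, lam_r, alpha), labels, threshold)
        if dev < best_dev:
            best_alpha, best_dev = alpha, dev
    return best_alpha, best_dev
```

Since the sweep includes alpha = (1, 0), which recovers the original lambda_HT, the search can only keep or reduce the development-set deviation.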
0:09:51 | and after we have done this optimisation of rotating the score space with the development set, we apply these alpha parameters to the evaluation dataset
0:10:02 | and then we go back to the normal error metrics, which is the detection cost
0:10:07 | and because this time we illustrate the pairwise language recognition process, we have one miss term and one false alarm term in the errors
0:10:21 | so this is the schematic diagram of our systems
0:10:26 | what we use is a phonotactic and prosodic fusion system; admittedly, we only use one subsystem
0:10:36 | it is not a bad system to start with, but what we want to try is the effectiveness of this score calibration in this particular scenario
0:10:48 | so how do we get the scores from the different detectors?
0:10:52 | we have located ten difficult target languages
0:10:57 | then for each target language we choose the log-likelihood ratio of the target, and then the log-likelihood ratio of the related class
0:11:06 | and then we do the parameter optimisation, which means we rotate the score space such that we have the updated log-likelihood ratio, lambda-dash
0:11:18 | the training data we use is the NIST 1996 to 2007 corpora, and the evaluation data is the NIST 2009 evaluation set
0:11:27 | to give you a brief idea of the amount of data we have for the general task, which you will see in later slides, the number of trials is about ten thousand for twenty-three languages
0:11:43 | and to train this alpha parameter for rotating the score space we use a development set
0:11:49 | the development set comes from the 2007 evaluation set and excerpts from the 2009 development set, and there is a total of six thousand trials, and the test utterances are all of thirty-second duration
0:12:03 | so this is the result of the pairwise language recognition
0:12:09 | the original EER given here is about twenty percent for all these difficult languages, and after we apply the score calibration the error is about nineteen percent, which is about five percent relative EER reduction
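Equal error rate, the metric quoted here, can be computed from scores and labels by sweeping the threshold until the miss rate and false-alarm rate cross. This is a generic textbook sketch, not the evaluation's official scoring tool.

```python
def equal_error_rate(scores, labels):
    """Return the EER: the operating point where P(miss) ~ P(false alarm).

    labels: +1 for target trials, -1 for non-target trials. The threshold
    is swept over the observed scores and the EER is taken at the sweep
    point where the two error rates are closest.
    """
    targets = [s for s, y in zip(scores, labels) if y == +1]
    nontargets = [s for s, y in zip(scores, labels) if y == -1]
    best_gap, eer = float("inf"), 1.0
    # Candidate thresholds: every observed score, plus one above the maximum.
    for t in sorted(set(scores)) + [max(scores) + 1.0]:
        p_miss = sum(s < t for s in targets) / len(targets)
        p_fa = sum(s >= t for s in nontargets) / len(nontargets)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2.0
    return eer
```

Perfectly separated scores give an EER of zero; completely overlapped scores give 0.5, which brackets the roughly 0.20 figure quoted for the difficult pairs.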
0:12:26 | we can see the Bosnian-Croatian confusion cannot be reduced by this method, which I guess is because the two languages are mixed up very seriously in our scores
0:12:41 | and within a related language pair, the confusion reduction is more significant for the worse-performing language
0:12:47 | for example, if we compare Dari and Farsi, the error reduction there is more significant with the help of the detection score of the related language
0:13:07 | so the improvement of pairwise language recognition is not very significant, but we want to extend this method to the general language recognition task, and then we will see a more significant error reduction there
0:13:21 | so we just revisit the average cost function: for the pairwise language recognition, again we have one miss term and one false alarm term
0:13:34 | but if we move to the general task, then the cost function becomes more complicated, because there are more target languages
0:13:42 | and for the detection of each language there is one miss term and twenty-two false alarm terms to pool into the average cost
0:13:51 | so as you see, I highlighted that part in red
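The structure just described, one miss term plus twenty-two false-alarm terms per target language, pooled over all languages, can be sketched like this. The equal 0.5/0.5 weighting and the per-pair score lookup are illustrative assumptions, not the evaluation's exact cost parameters.

```python
def average_cost(scores, trial_langs, languages, threshold=0.0):
    """Average detection cost over all target languages.

    scores: dict mapping (target_language, trial_index) -> detection score.
    trial_langs: true language of each trial. For each target language there
    is one miss term and one false-alarm term per other language (22 of them
    in a 23-language task), all pooled into the average.
    """
    total = 0.0
    for target in languages:
        target_trials = [i for i, l in enumerate(trial_langs) if l == target]
        p_miss = sum(scores[(target, i)] < threshold for i in target_trials) / len(target_trials)
        fa_terms = []
        for other in languages:
            if other == target:
                continue
            other_trials = [i for i, l in enumerate(trial_langs) if l == other]
            fa_terms.append(
                sum(scores[(target, i)] >= threshold for i in other_trials) / len(other_trials)
            )
        total += 0.5 * p_miss + 0.5 * sum(fa_terms) / len(fa_terms)
    return total / len(languages)
```

With many false-alarm terms sharing one average, each individual pair's false alarms weigh little, which is the motivation given next for stressing detection misses.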
0:13:59 | because previously we have been working on data for two languages only, so there are only circles and triangles
0:14:06 | but now, when we expand to the general task, there is also the out-of-set data, which is the data that does not reside in these two related languages
0:14:20 | so this out-of-set data is marked in red circles here
0:14:25 | and again I show you the general trend of the data in the detection scores of the two classifiers
0:14:34 | the log-likelihood ratios of H_T and H_R give a very similar trend, because these two languages are very similar, so what has a high score in lambda_HT also gives a high score in lambda_HR
0:14:52 | actually, there are some modifications we have to do when we proceed from the two-language case to the general twenty-three-language case
0:15:00 | the first is, as said, we have many out-of-set data, and we don't want to touch this out-of-set data, because we are afraid that this may affect the detection of these other language classes
0:15:13 | the second thing is, as mentioned, in the general cost function there are twenty-two false alarm terms, so the false alarm for each language pair becomes minor
0:15:27 | and we have to put more stress on reducing detection misses than on reducing detection false alarms, in order to have a low average detection cost
0:15:43 | so these are the three rules we applied when we proceed from pairwise language recognition to the general task
0:15:50 | the first rule is we only select detection trials which are likely to belong to the two related languages H_T and H_R
0:15:57 | of course, we do not know in advance which language they belong to, so we apply a heuristic method, which is not included in the paper, to choose only these languages to operate on
0:16:08 | and rule two is we weight the cost of a detection miss twenty-two times heavier
0:16:14 | as you saw in an earlier slide, we formulate the erroneous deviation optimisation function so that there is a miss term and a false alarm term, and we put a weight twenty-two times larger on the miss term
0:16:26 | and rule three is we have shifted the reference point for the calculation of the total erroneous deviation
0:16:32 | the point of doing this can be explained together with rule one: we have said that detection misses are more important, so we have to put more focus on detection misses in the calibration
0:16:43 | and if we go back to the original detection threshold picture here, if you still remember, we have the filled circles here and their erroneous deviation
0:16:52 | and then we also have these circles on the right, because these trials are supposed to fall into the region of correct acceptance, and they would not be handled in any way if we don't do anything
0:17:07 | so if detection misses are so important, why don't we also try to look at these boundary points, by moving the detection threshold to be higher?
0:17:19 | so we actually allow the detection threshold to fluctuate, and then we try to find the best shift which gives us the lowest general language recognition error
0:17:32 | so this is the revised objective function: basically it is exactly the same problem as in the previous slide for the calibration with two languages, but now we have the three modifications, as shown in red here
0:17:45 | and after we have done the calibration with the development set, then we go back to the evaluation set and use the conventional average cost function to evaluate the EER
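Putting the modifications together, a hedged sketch of the revised objective: misses are weighted beta = 22 times heavier, and the reference point for measuring target-trial deviations is shifted up by zeta so that near-miss trials just above the threshold also contribute. The exact placement of zeta and the weighting scheme are my reading of the talk, not the paper's verbatim formula.

```python
def revised_deviation(scores, labels, threshold=0.0, beta=22.0, zeta=0.0):
    """Weighted erroneous deviation with a shifted reference for target trials.

    labels: +1 target / -1 non-target. Target trials are measured against
    threshold + zeta (so correct acceptances close to the threshold still
    count), and their deviations are weighted beta times heavier, mirroring
    the one-miss-versus-22-false-alarms structure of the average cost.
    """
    total = 0.0
    for s, y in zip(scores, labels):
        if y == +1:  # potential miss, measured against the shifted reference
            total += beta * max(0.0, (threshold + zeta) - s)
        else:        # potential false alarm, measured against the original threshold
            total += max(0.0, s - threshold)
    return total
```

With zeta = 0 this reduces to the weighted form of the pairwise objective; raising zeta pulls borderline target trials into the optimisation.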
0:18:02 | this page is maybe a little bit intimidating, so allow me some time to explain
0:18:12 | so there are four diagrams here
0:18:14 | we use the development set to tune the alpha parameters for score-space rotation
0:18:16 | so this is the scatter of lambda_HT against lambda_HR before rotation, and this is after the rotation
0:18:28 | as you can see, we only choose a subset: the black dots are the log-likelihood scores for the target class, and the blue dots are the scores for the related class
0:18:38 | so only the black and the blue dots are selected, and they are rotated a little bit
0:18:44 | and this is the result for the evaluation set, which of course looks more messy, and there is also some rotation here
0:18:56 | so what we want to do with the rotation is that more target-class scores stay in the upper end of the y-axis, so that there will be fewer detection misses
0:19:09 | in the development set it isn't very clear, because the target-class black dots are already high on the y-axis
0:19:19 | but in the evaluation set we can see that those black dots scattering down the curve here, which mix up with the red and green dots, have already moved up after the rotation of the score space
0:19:37 | so this is the overall result of the equal error rate after applying the score-space rotation
0:19:44 | before, we have a 4.45 percent equal error rate, and we use a single detection threshold for the detection of all languages
0:19:52 | and after this rotation the error is reduced to about 3.3 percent, which is about a twenty-five percent relative reduction of EER
0:20:07 | and we also introduced before a parameter zeta, which accounts for the shifting of the detection threshold
0:20:19 | and if we shift it up, or if zeta becomes larger and larger, that means we become more and more concerned with these boundary points, to possibly reclaim the near-miss points
0:20:34 | so we tried different settings of zeta, and at zeta equal to 3.5 we got the lowest equal error rate
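The zeta selection the speaker describes is a one-dimensional sweep: try each candidate shift and keep the one with the lowest resulting error. The helper and the toy error curve below are illustrative; in the talk the error figure would be the development-set EER after recalibrating with that shift.

```python
def pick_zeta(candidates, evaluate):
    """Return the threshold-shift candidate with the lowest error figure.

    `evaluate` is any callable mapping a zeta value to an error measure
    (e.g. development-set EER after recalibration with that shift).
    """
    return min(candidates, key=evaluate)

# Toy stand-in for a development-set error curve with its minimum at 3.5,
# echoing the value reported in the talk; the curve itself is made up.
best = pick_zeta([0.5 * k for k in range(11)], lambda z: (z - 3.5) ** 2 + 3.3)
```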
0:20:41 | so here comes the summary of today's talk
0:20:44 | in Language Recognition Evaluation 2009, language-pair detection was performed for five pairs of related languages
0:20:50 | a linear combination of detection scores between the target language and the related language brings about a 5.8 percent relative EER reduction
0:20:58 | and we revised the parameters of the optimisation for this score-space rotation
0:21:05 | and the application-dependent calibration can be applied to the general detection task, which brings about a twenty-five percent relative reduction of EER
0:21:13 | so for future work, we have been thinking of some unsupervised methods to find these related targets, because in this work we start with the predefined target pairs, with no derivation, because this is already included in the specification
0:21:28 | and we have also thought about application to other detection tasks, but we understand this work is very specific to this particular language recognition task, and we think special care has to be taken if we migrate it to other detection tasks
0:21:46 | and this is the end of today's presentation, thank you very much
0:21:57 | before the questions, there is a short announcement from the organising committee
0:22:09 | so, any questions for Raymond?
0:22:19 | I have been taking part in these evaluations, well, the speaker recognition evaluation, which seems to relate to what you're doing
0:22:25 | it sounds like what you're saying is that when you're testing a segment and you see this trial, in the training you can do anything you want: you can do the training and the calibration, and you can set your threshold anywhere you want
0:22:46 | but in the testing, you know, you see this segment, it looks very much like Russian, and my task is to detect Russian, but I happen to know from the Ukrainian model that it looks more like Ukrainian
0:22:59 | are you allowed to do that in testing, or is this somehow forbidden?
0:23:06 | so you can look at all the languages and see which is closest?
0:23:16 | okay, so you are just forced to assume that
0:23:20 | so actually, in doing the test, we just have the scores of how likely the trial comes from different languages, and then we compare them to choose: okay, this is maybe possibly Russian
0:23:31 | okay
0:23:36 | you still use a linear combination?
0:23:38 | mm, yeah
0:23:39 | I mean, all languages are related, so why not use a linear combination of all of them, not just a single one?
0:23:47 | oh, we have actually tried, and the thing is that it only works for these kinds of related languages
0:23:56 | because I think a very simple answer to why this works is: if the two languages are more and more similar, then the scores from these two detectors have more complementary effects
0:24:11 | say Russian and Ukrainian are very similar; that means if I use the score combination of these two languages, then I can have a higher confidence to reject languages which are not Russian and not Ukrainian
0:24:26 | and this is the main performance improvement we get: we get a significant reduction of false alarms to the other languages, but not to the related language, by doing this
0:24:52 | are there more questions? if not, the discussion can continue during lunch
0:24:56 | let's thank the speaker again
---|