0:00:16 | So my name is Marc Dymetman, and I'm presenting today work which was actually done by my intern, Shubham Agarwal, at the beginning of the year. Okay. |
0:00:32 | So this work is based on the same dataset that was just presented now by Ondřej, whose first release was in March, with an updated version in June actually. |
0:00:48 | This dataset, as you have just seen, consists of about fifty thousand meaning representation / reference utterance pairs of the following kind, as you can see here. |
0:01:02 | And the starting point of this work was to test an idea which has been advocated in a blog post quite well known in the RNN and deep learning community, "The Unreasonable Effectiveness of Recurrent Neural Networks" by Andrej Karpathy, which especially stresses character-level models, where no notion of words is required. |
0:01:29 | And we wanted to test this simple idea: how far can we go with an out-of-the-box character-based seq2seq model, with minimal intervention on our part? |
0:01:42 | So in April, about the same time as the data for the challenge was first released, this framework by Denny Britz and collaborators was released: tf-seq2seq, a TensorFlow sequence-to-sequence framework. |
0:02:00 | It was originally done for massive experiments with different configurations, options and so on, in neural machine translation, and it has many options and parameters which are pretty simple to configure: namely the number of layers of the RNN, whether it is a GRU or an LSTM, optimization regimes with stochastic gradient descent of different types, bidirectional encoding (we saw bidirectional encoding in the previous talk), different attention mechanisms also, and the option of word-based as opposed to character-based processing. |
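To make those choices concrete, here is a purely illustrative sketch in Python; the key names are hypothetical stand-ins, not the framework's actual configuration options.

```python
# Hypothetical hyperparameter choices for an encoder-decoder model of the
# kind explored in tf-seq2seq; key names are illustrative stand-ins, NOT
# the framework's real configuration options.
config = {
    "cell_type": "lstm",            # GRU vs. LSTM
    "num_layers": 2,                # depth of the RNN
    "bidirectional_encoder": True,  # encode the source in both directions
    "attention": "bahdanau",        # one of several attention mechanisms
    "optimizer": "sgd",             # stochastic gradient descent variants
    "unit": "char",                 # word-based vs. character-based
}
```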
0:02:45 | And this is the pictorial representation from that paper: an overview of the model, a standard encoder-decoder setup with these possibilities, these options that we need. |
0:03:01 | So we directly trained a complex version of this framework, with bidirectional encoding of the source plus an attention mechanism, on the data. Namely, this means that if you look at this data, the meaning representation that you see listed here, we take that as a simple string of characters, without any preprocessing, any change to that. |
0:03:26 | And similarly for the utterance, the generated utterance which here was human-produced: we take the string of characters, we don't do any pre- or post-processing, we don't do any tokenization, no lowercasing, and, maybe very importantly, no delexicalization. |
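To make this concrete, a minimal sketch (with illustrative names, and an MR / reference pair of the kind found in this dataset) of feeding both sides as plain character sequences:

```python
# Both sides of a training pair are plain character sequences: no
# tokenization, no lowercasing, no delexicalization.
def to_char_sequence(text: str) -> list[str]:
    return list(text)

mr = "name[The Eagle], eatType[coffee shop], customer rating[5 out of 5]"
utterance = "The Eagle is a coffee shop with a 5 out of 5 customer rating."

source = to_char_sequence(mr)         # ['n', 'a', 'm', 'e', '[', 'T', ...]
target = to_char_sequence(utterance)
```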
0:03:48 | Delexicalization, as has been presented in some talks, is the process by which you replace certain named entities, typically by slot symbols, at training time. |
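For contrast, here is a minimal sketch of the delexicalization step that this approach skips; the placeholder scheme is an illustrative assumption:

```python
# Replace named-entity values with slot placeholders, so the model never
# has to generate the (possibly rare) names themselves.
def delexicalize(slot_values: dict[str, str], utterance: str) -> str:
    for slot, value in slot_values.items():
        utterance = utterance.replace(value, f"<{slot.upper()}>")
    return utterance

delexicalize({"name": "The Eagle"}, "The Eagle is a coffee shop.")
# -> '<NAME> is a coffee shop.'
```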
0:04:01 | So I want to make a note that there is a well-known problem with word-based models in seq2seq, called the rare word problem, which is due to the fact that you need to have very big vocabularies, and that the vanilla-type seq2seq model doesn't know how to copy words. |
0:04:25 | It only knows how to learn that a word in the source corresponds to a word in the target, and you need to learn these things independently of each other. So delexicalization is a way to avoid this problem, and there are other mechanisms, like copying mechanisms, to handle this problem too. |
0:04:45 | But with a char-based model you don't have this problem at all, because the vocabulary of symbols is very small: in our case, on the order of fifty to seventy characters were used in total, and there is no need to delexicalize. |
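A small sketch of why the inventory stays that small: the vocabulary is just the set of distinct characters seen in training (the corpus contents here are illustrative):

```python
from collections import Counter

def char_vocabulary(corpus: list[str]) -> list[str]:
    # The symbol inventory is just the distinct characters in the data.
    counts = Counter(ch for line in corpus for ch in line)
    return sorted(counts)

corpus = [
    "name[The Eagle], eatType[coffee shop]",
    "The Eagle is a coffee shop.",
]
print(len(char_vocabulary(corpus)))  # a few dozen symbols, not thousands
```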
0:05:00 | And then we conducted an evaluation of our results on the original dataset. |
0:05:10 | The BLEU score there was around twenty-five, which is pretty low, but this was due to the fact that the original dataset did not group the human references: there are several, around five, human references per meaning representation, and the first release did not group them. This means that the BLEU evaluation we did was basically a single-reference evaluation, which gives much lower results than the more recent evaluation that we did on the properly grouped, multi-reference version; that one gave us on the order of seventy BLEU points, which is much higher. |
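A sketch of that regrouping step, using NLTK's corpus-level BLEU as a stand-in for the actual scoring script; `predict` is a hypothetical handle on the trained model:

```python
from collections import defaultdict
from nltk.translate.bleu_score import corpus_bleu

def group_references(pairs):
    # The ungrouped release repeats each MR once per human reference,
    # so we regroup all references under their MR.
    grouped = defaultdict(list)
    for mr, reference in pairs:
        grouped[mr].append(reference.split())
    return grouped

def multi_reference_bleu(grouped, predict):
    mrs = list(grouped)
    references = [grouped[mr] for mr in mrs]          # all refs per MR
    hypotheses = [predict(mr).split() for mr in mrs]  # one hypothesis per MR
    return corpus_bleu(references, hypotheses)
```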
0:05:49 | We also ran a small-scale human evaluation to assess the outputs, and what we found there is that the predictions of the model were almost perfect in terms of linguistic quality, in terms of grammaticality and naturalness. There were no unknown words produced, no invented words, which can happen with character-based models, because they produce output character by character and have no built-in notion of words. |
0:06:19 | And the annotators often judged that the prediction from the model was superior to the human reference, which sometimes was not great linguistically. |
0:06:33 | On the content side, though, there were some important semantic adequacy issues: the top prediction of the model was semantically correct in only fifty percent of the cases, and the main problem, actually almost the only problem, was omissions, sometimes omission of semantic material. |
0:07:00 | All in all, in around fifty percent of the cases the top prediction was a perfect solution, both linguistically and from the point of view of semantic content, and a perfect solution was found in the twenty-best list in around seventy percent of the cases. |
0:07:17 | And this is where we stopped at submission time, but since then we have been working on trying to exploit reranking models and similar things. |
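A minimal sketch of that reranking idea, under the purely illustrative assumption that coverage of MR values is the reranking signal; since omissions were the dominant error, it rewards candidates that mention more of the MR:

```python
def coverage(mr_values: list[str], candidate: str) -> int:
    # Count how many MR values the candidate actually mentions.
    return sum(value.lower() in candidate.lower() for value in mr_values)

def rerank(mr_values: list[str], nbest: list[tuple[float, str]]) -> str:
    # nbest holds (model log-probability, candidate) pairs from a beam
    # of around 20; prefer higher coverage, tie-break by model score.
    return max(nbest, key=lambda p: (coverage(mr_values, p[1]), p[0]))[1]
```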
0:07:29 | So I am skipping many details, because I want you to come and see the poster also. |