0:00:16 So my name is Marc Dymetman, and I'm presenting today work which was actually done
0:00:21 by my intern at the beginning of the year, although I got involved as well. Okay.
0:00:32 So this work is based on the same dataset that was just presented by Ondřej. The first release of that dataset was in March, with an updated version in June, actually.
0:00:48 This dataset, as you have just seen, consists of about 50K meaning representation / reference utterance pairs of the following kind, as you can see here.
0:01:02 A starting point in this work was to test an idea which has been advocated in a blog post quite well known in the RNN and deep learning community, "The Unreasonable Effectiveness of Recurrent Neural Networks" by Andrej Karpathy,
0:01:21 which actually especially stresses character-level models, not word-based ones.
0:01:29 And we wanted to test this simple idea: how far can we go with an out-of-the-box char-based seq2seq model, with minimal intervention on our part?
0:01:42 So in April, about the same time as the data for the challenge was first released, this framework by Denny Britz and collaborators was released, called the tf-seq2seq framework,
0:02:00 which was originally intended for massive experiments with different configurations, options, and so on, in neural machine translation, with many options and parameters which are pretty simple to configure:
0:02:20 namely the number of layers of the RNNs, whether it is a GRU or an LSTM, optimization regimes with stochastic gradient descent of different types, bidirectional encoding (we saw bidirectional encoding in the previous talk), different attention mechanisms also, and the option of word-based as opposed to char-based.
0:02:45 And this is the pictorial representation from that paper, an overview of the model: a standard encoder-decoder setup with these possibilities, these options, which is what we needed.
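As a rough illustration of the kind of knobs being described (this is only a minimal PyTorch sketch with illustrative names, not the actual tf-seq2seq code), a character-level bidirectional encoder might look like this:

import torch.nn as nn

class CharEncoder(nn.Module):
    # Illustrative only: number of layers, cell type (GRU vs LSTM) and
    # bidirectionality are the kinds of options mentioned in the talk;
    # nothing here is taken from the original system.
    def __init__(self, vocab_size, emb_dim=64, hidden_size=256,
                 num_layers=1, cell="gru", bidirectional=True):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        rnn_cls = nn.GRU if cell == "gru" else nn.LSTM
        self.rnn = rnn_cls(emb_dim, hidden_size, num_layers=num_layers,
                           bidirectional=bidirectional, batch_first=True)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) indices into a small character vocabulary
        outputs, state = self.rnn(self.embed(char_ids))
        return outputs, state  # outputs would feed an attention-based decoder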
0:03:01 So we directly trained a fairly complex version of this framework, with bidirectional encoding of the source plus an attention mechanism, on the data.
0:03:11 Namely, this means that if you look at this data, namely the meaning representation here, we take that as a simple string of characters, without any preprocessing, any change to it.
0:03:26 And similarly for the utterance, the reference utterance that was human-produced: we take this string of characters, we don't do any pre- or post-processing, we don't do any tokenization, no lowercasing, and, maybe very importantly, no delexicalization.
0:03:48 Delexicalization, as we have seen in some talks, is the process where you replace certain named entities, typically by placeholder symbols, at training time.
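As a small illustration of the difference (the MR string below is only an example in the style of the data, not quoted from it):

# What delexicalization would do, versus the raw character stream used here.
mr = "name[The Eagle], eatType[coffee shop], area[riverside]"

# Delexicalized variant: open-class values replaced by placeholders,
# to be re-inserted into the generated text afterwards.
delex_mr = "name[NAME], eatType[coffee shop], area[AREA]"

# Char-based input: the untouched string, seen as a sequence of characters.
char_input = list(mr)
print(char_input[:12])  # ['n', 'a', 'm', 'e', '[', 'T', 'h', 'e', ' ', 'E', 'a', 'g']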
0:04:01 I want to make a note here that there is a well-known problem with word-based models in seq2seq, called the rare word problem, which is due to the fact that you need very big vocabularies, and that a vanilla-type seq2seq model doesn't know how to copy words: it only knows that a word that occurs in the source corresponds to a word in the target, and you need to learn these correspondences independently of each other. So delexicalization is a way to avoid this problem, and there are other mechanisms, like copy mechanisms, to handle this problem too.
0:04:45 But with a char-based model you don't have this problem at all, because the vocabulary of symbols is very small: in our case, on the order of fifty to seventy characters were used in total, and there is no need to delexicalize.
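A minimal sketch of checking this, assuming a hypothetical read_pairs helper that yields (MR, utterance) string pairs from the training file:

def char_vocab(pairs):
    # Collect every distinct character appearing in the MRs and utterances.
    vocab = set()
    for mr, utterance in pairs:
        vocab.update(mr)
        vocab.update(utterance)
    return vocab

# vocab = char_vocab(read_pairs("e2e_train.csv"))  # read_pairs and the file name are hypothetical
# print(len(vocab))  # on the order of 50-70 symbols, so no delexicalization needed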
0:05:00 And then we conducted an evaluation of our results on the original dataset.
0:05:10 The BLEU score there was around twenty-five, which is pretty low, but this was due to the fact that the original dataset did not group the human references: each meaning representation has several, around five, human references, and the dataset did not group them, meaning that the BLEU evaluation we did was basically a single-reference evaluation,
0:05:36 which gives a much lower result than a more recent evaluation that we did on the properly grouped, multi-reference version, and this gave us on the order of seventy BLEU points, which is much higher.
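A sketch of that multi-reference setup, using NLTK's corpus_bleu (variable names are illustrative; this is not the evaluation script that was actually used):

from collections import defaultdict
from nltk.translate.bleu_score import corpus_bleu

def multi_ref_bleu(test_items, predictions):
    # test_items: list of (mr, reference_utterance) pairs from the raw data;
    # predictions: dict mapping each mr to the system output.
    refs_by_mr = defaultdict(list)
    for mr, ref in test_items:
        refs_by_mr[mr].append(ref.split())

    references, hypotheses = [], []
    for mr, refs in refs_by_mr.items():
        references.append(refs)                 # around five references per MR
        hypotheses.append(predictions[mr].split())
    return corpus_bleu(references, hypotheses)  # multi-reference corpus BLEU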
0:05:49 We also did a small-scale human evaluation, and what we found there is that the predictions of the model were almost perfect in terms of linguistic quality, in terms of grammaticality and naturalness.
0:06:07 There were no unknown words produced, no invented or isolated words, which can happen with char-based models because they produce output character by character and don't have a notion of word.
0:06:19 And the annotators often judged that the prediction from the model was superior to the human reference, which sometimes was not great in terms of linguistic quality. On the content side, however, there were some important semantic adequacy issues: the top prediction of the model was semantically correct in only fifty percent of the cases, and the main problem, actually almost the only problem, was omissions, sometimes an omission of semantic material.
0:07:00 All in all, in around fifty percent of the cases in the test the top prediction was a perfect solution, both linguistically and from the point of view of semantic content; such a perfect solution was found somewhere in the top-twenty beam list in around seventy percent of the cases.
0:07:17 And this is where we had stopped at submission time, but since then we have been working on exploring, trying to exploit reranking models and similar things.
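One crude way to exploit the fact that a good candidate is usually somewhere in the beam is to rescore the n-best list with a semantic-coverage heuristic; the sketch below is a generic illustration of that idea, not the reranker actually developed after the submission:

def coverage_score(mr, candidate):
    # Count how many slot values from an MR such as
    # "name[The Eagle], area[riverside]" literally appear in the candidate
    # utterance -- a crude proxy for semantic adequacy (omissions lower it).
    values = [part.split("[", 1)[1].rstrip("]")
              for part in mr.split(", ") if "[" in part]
    return sum(value.lower() in candidate.lower() for value in values)

def rerank(mr, beam_candidates):
    # beam_candidates: the top-k strings produced by beam search.
    return max(beam_candidates, key=lambda cand: coverage_score(mr, cand))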
0:07:29 So I am skipping many details, because I want you to be able to see the posters also.