0:00:18 | So we should move on, I think. |
---|
0:00:24 | So, that's the next speaker. |
---|
0:00:29 | So, the next paper |
---|
0:00:33 | is "Towards End-to-End Learning for Efficient Dialogue Agent by Modeling Looking-Ahead Ability". |
---|
0:00:41 | Unfortunately, the authors of the paper had some problems, |
---|
0:00:45 | so we have a stand-in |
---|
0:00:48 | presentation here, |
---|
0:00:49 | which is the main issue. |
---|
0:00:52 | So I think, for that reason, |
---|
0:00:57 | it's not possible to ask questions, unfortunately. |
---|
0:01:02 | But please, |
---|
0:01:03 | go ahead. |
---|
0:01:35 | Hello, everyone. I would like to present the work in this paper, entitled "Towards End-to-End |
---|
0:01:39 | Learning |
---|
0:01:41 | for Efficient Dialogue Agent by Modeling Looking-Ahead Ability". |
---|
0:01:46 | It is joint work by authors |
---|
0:01:49 | from |
---|
0:01:52 | IBM Research China and the Institute of Technology. |
---|
0:02:09 | First, let me introduce the background. |
---|
0:02:11 | Dialogue systems have attracted a lot of attention recently, |
---|
0:02:15 | due to their huge value in reducing manual work in many commercial domains, |
---|
0:02:21 | like |
---|
0:02:22 | restaurant reservation and travel planning. |
---|
0:02:25 | Unlike their chitchat counterparts, |
---|
0:02:28 | the majority of dialogue agents |
---|
0:02:30 | with goals |
---|
0:02:31 | are expected to be efficient, |
---|
0:02:34 | that is, to complete tasks with as few dialogue turns as possible. |
---|
0:02:40 | The first |
---|
0:02:42 | example on the right shows that a chitchat bot |
---|
0:02:46 | prefers |
---|
0:02:48 | the dialogue to be as long as possible. |
---|
0:02:51 | However, |
---|
0:02:52 | the example below shows that, |
---|
0:02:54 | in goal-oriented dialogues, |
---|
0:02:57 | the bots should be efficient, with as few dialogue turns as possible. |
---|
0:03:06 | Here, the two example dialogues express the same idea: |
---|
0:03:12 | that we want to book a table at twelve o'clock. |
---|
0:03:15 | The inefficient example takes four turns, |
---|
0:03:18 | while the efficient one only needs two turns. |
---|
0:03:22 | Now, looking at the inefficient example: |
---|
0:03:25 | the human |
---|
0:03:26 | says, "We don't have empty tables at eleven o'clock tomorrow |
---|
0:03:30 | at our restaurant." |
---|
0:03:31 | The agent replies, |
---|
0:03:33 | "What time is available?" |
---|
0:03:36 | The human says, "Twelve o'clock is okay." |
---|
0:03:39 | The agent replies, "All right, we want that." |
---|
0:03:42 | So it took four turns. |
---|
0:03:44 | But in the efficient example, |
---|
0:03:46 | the human |
---|
0:03:47 | says, "We don't have empty tables at eleven o'clock tomorrow at our restaurant," and the agent |
---|
0:03:52 | can reply, |
---|
0:03:54 | "How about twelve o'clock? |
---|
0:03:56 | That's also okay." |
---|
0:03:58 | In this way, it only takes two turns. |
---|
0:04:06 | As shown in the right figure, |
---|
0:04:08 | the dialogue manager is widely considered to be responsible for efficiency. |
---|
0:04:14 | So our problem is |
---|
0:04:16 | how to learn an efficient dialogue model, |
---|
0:04:18 | or |
---|
0:04:19 | dialogue manager, from the data. |
---|
0:04:23 | The problems of existing works are two-fold. |
---|
0:04:27 | Either they need too many manual efforts; for example, for reinforcement learning |
---|
0:04:32 | we have to design the strategy and the reward function, and provide expert data for training and |
---|
0:04:38 | testing. |
---|
0:04:39 | Or, |
---|
0:04:40 | for sequence-to-sequence methods, |
---|
0:04:42 | they tend to generate |
---|
0:04:44 | generic responses, |
---|
0:04:45 | for example, |
---|
0:04:47 | "I don't know" |
---|
0:04:48 | or "Yes, okay," |
---|
0:04:50 | and they cannot distinguish different contexts. |
---|
0:04:59 | In this paper, |
---|
0:05:04 | we address the problem |
---|
0:05:06 | from the perspective of end-to-end dialogue modeling, in order to reduce the human |
---|
0:05:11 | intervention in system design, |
---|
0:05:14 | and propose a new sequence-to-sequence model by modeling the looking-ahead ability. |
---|
0:05:22 | Our intuition is that, |
---|
0:05:23 | by predicting several future turns, |
---|
0:05:27 | the agent can make a better decision about what to say |
---|
0:05:32 | in the current turn, |
---|
0:05:33 | so as to achieve the dialogue goal as soon as |
---|
0:05:36 | possible. |
---|
0:05:39 | Our method has several advantages: |
---|
0:05:42 | it reduces human involvement and does not need too much manual work, |
---|
0:05:46 | and our experiments show it is more efficient than naive sequence-to-sequence methods. |
---|
0:05:59 | This is the architecture |
---|
0:06:01 | of the overall model. |
---|
0:06:03 | From the bottom |
---|
0:06:05 | to the top, |
---|
0:06:06 | there are mainly three components: |
---|
0:06:09 | the first is called the encoding module, |
---|
0:06:12 | then the intermediate looking-ahead module, and the top one is called the decoding module. |
---|
0:06:19 | In the encoding module, |
---|
0:06:21 | we encode three kinds of information with bidirectional GRUs. |
---|
0:06:26 | They are the historical utterances, |
---|
0:06:29 | the current utterance, |
---|
0:06:31 | and the goals. |
---|
0:06:35 | The goals are represented by one-hot vectors, |
---|
0:06:38 | similar to a bag of words. |
---|
0:06:41 | After getting the five kinds of representations, |
---|
0:06:44 | that is, the representation of the historical utterances, |
---|
0:06:48 | the bidirectional representations of the current utterance, and the bidirectional representations of the |
---|
0:06:57 | goals, |
---|
0:06:58 | these five kinds of representations are concatenated together, |
---|
0:07:01 | and they will be |
---|
0:07:05 | fed into the looking-ahead module as its input, so the input will |
---|
0:07:10 | be the sequence x_1 to x_n. |
---|
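As a rough sketch of what this encoding module could look like (assuming a PyTorch-style implementation, with made-up dimensions, and with the one-hot goal vector simply projected rather than run through a GRU; this is an illustration, not the authors' actual code):

```python
import torch
import torch.nn as nn

class EncodingModule(nn.Module):
    """Encodes history, current utterance, and goal into one context vector."""
    def __init__(self, vocab_size, goal_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRUs over the historical utterances and the current utterance.
        self.hist_gru = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        self.curr_gru = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        # The goal is a one-hot / bag-of-words vector, projected to a comparable size.
        self.goal_proj = nn.Linear(goal_size, 2 * hid_dim)

    def forward(self, history_ids, current_ids, goal_vec):
        # history_ids, current_ids: (batch, seq_len) token ids; goal_vec: (batch, goal_size).
        _, h_hist = self.hist_gru(self.embed(history_ids))    # final states: (2, batch, hid_dim)
        _, h_curr = self.curr_gru(self.embed(current_ids))
        hist_repr = torch.cat([h_hist[0], h_hist[1]], dim=-1)  # forward + backward directions
        curr_repr = torch.cat([h_curr[0], h_curr[1]], dim=-1)
        goal_repr = self.goal_proj(goal_vec)
        # Concatenate everything; this plays the role of the looking-ahead module's input.
        return torch.cat([hist_repr, curr_repr, goal_repr], dim=-1)
```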
0:07:17 | The looking-ahead module is a little different from the bidirectional |
---|
0:07:21 | GRU over the utterances: |
---|
0:07:24 | we have only one direction, |
---|
0:07:26 | but |
---|
0:07:27 | the prediction of the future turns starts from the first hidden state, |
---|
0:07:32 | because that state will be used to predict the output system utterance for the current turn. |
---|
0:07:38 | So |
---|
0:07:38 | the information will be propagated forward and backward. |
---|
0:07:42 | Then, combining the information from the two directions, |
---|
0:07:47 | the future turns are predicted by the GRU model; |
---|
0:07:51 | they will be the hidden states h_1, h_2, h_3, and |
---|
0:07:55 | up to h_k. |
---|
0:07:58 | And this modeling looks like a shoelace. |
---|
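One way to picture this "shoelace"-like forward-and-backward propagation is the sketch below: it unrolls k future hidden states from the encoded context and then runs a backward pass so that the first state also summarizes the predicted future. The cell wiring and names here are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class LookAheadModule(nn.Module):
    """Unrolls future-turn states h_1..h_k and feeds a backward summary into h_1."""
    def __init__(self, ctx_dim, hid_dim=256, k_steps=4):
        super().__init__()
        self.k = k_steps
        self.init_proj = nn.Linear(ctx_dim, hid_dim)
        self.fwd_cell = nn.GRUCell(hid_dim, hid_dim)   # forward unrolling of future turns
        self.bwd_cell = nn.GRUCell(hid_dim, hid_dim)   # backward pass over the same states

    def forward(self, context):
        # context: (batch, ctx_dim) produced by the encoding module.
        h = torch.tanh(self.init_proj(context))
        forward_states = []
        for _ in range(self.k):                 # predict h_1, h_2, ..., h_k
            h = self.fwd_cell(h, h)
            forward_states.append(h)
        # Propagate information backward so earlier states see the predicted future.
        b = torch.zeros_like(h)
        backward_states = []
        for h_t in reversed(forward_states):
            b = self.bwd_cell(h_t, b)
            backward_states.append(b)
        backward_states.reverse()
        # h_1 combined with its backward counterpart drives the current-turn response.
        first_state = torch.cat([forward_states[0], backward_states[0]], dim=-1)
        return forward_states, backward_states, first_state
```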
0:08:01 | So we design a new algorithm to learn the model. |
---|
0:08:06 | Each time of training, |
---|
0:08:08 | part of the parameters are fixed and the others are updated, and then in turn the other way around. |
---|
0:08:14 | So it seems like an |
---|
0:08:17 | expectation-maximization algorithm, |
---|
0:08:19 | but it is not really EM. |
---|
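A minimal sketch of such an alternating update scheme, assuming the model exposes a `lookahead` submodule and a `compute_loss` helper (both hypothetical names); only the freezing pattern is the point here:

```python
import torch

def alternating_train(model, batches, epochs=10):
    """Alternately freeze one part of the parameters and update the other (EM-like)."""
    lookahead_params = list(model.lookahead.parameters())
    other_params = [p for name, p in model.named_parameters() if not name.startswith("lookahead")]
    opt_look = torch.optim.Adam(lookahead_params, lr=1e-3)
    opt_rest = torch.optim.Adam(other_params, lr=1e-3)

    for epoch in range(epochs):
        update_lookahead = (epoch % 2 == 0)      # which half of the parameters moves this round
        for p in lookahead_params:
            p.requires_grad_(update_lookahead)
        for p in other_params:
            p.requires_grad_(not update_lookahead)
        opt = opt_look if update_lookahead else opt_rest
        for batch in batches:
            opt.zero_grad()
            loss = model.compute_loss(batch)     # assumed to return the total training loss
            loss.backward()
            opt.step()
```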
0:08:23 | In the decoding module, |
---|
0:08:25 | we summarize the predicted future dialogues |
---|
0:08:28 | and generate the real system utterance |
---|
0:08:31 | with an attention model. |
---|
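A small sketch of one attention-based decoding step over the look-ahead states; the dot-product attention and the layer shapes are assumptions, since the talk does not spell out the exact attention form:

```python
import torch
import torch.nn as nn

class DecodingModule(nn.Module):
    """Generates the current system utterance while attending over look-ahead states."""
    def __init__(self, vocab_size, hid_dim=256, emb_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.GRUCell(emb_dim + hid_dim, hid_dim)
        self.attn = nn.Linear(hid_dim, hid_dim)       # scores decoder state against h_1..h_k
        self.out = nn.Linear(hid_dim * 2, vocab_size)

    def step(self, dec_state, prev_token, lookahead_states):
        # lookahead_states: (batch, k, hid_dim), the predicted future-turn states.
        scores = torch.bmm(lookahead_states, self.attn(dec_state).unsqueeze(-1)).squeeze(-1)
        weights = torch.softmax(scores, dim=-1)                                   # (batch, k)
        summary = torch.bmm(weights.unsqueeze(1), lookahead_states).squeeze(1)    # (batch, hid_dim)
        x = torch.cat([self.embed(prev_token), summary], dim=-1)
        dec_state = self.cell(x, dec_state)
        logits = self.out(torch.cat([dec_state, summary], dim=-1))
        return logits, dec_state
```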
0:08:34 | The loss function contains three terms. |
---|
0:08:37 | The first is |
---|
0:08:38 | for modeling the language model, the second is for modeling the looking-ahead |
---|
0:08:44 | ability, and the last is for predicting the final state of the conversation. |
---|
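Written out, the overall objective has the shape below; the weights lambda_1 and lambda_2 are placeholders, since the talk does not give the exact combination:

```latex
\mathcal{L} \,=\,
\underbrace{\mathcal{L}_{\mathrm{LM}}}_{\text{language modeling}}
\,+\, \lambda_1\,\underbrace{\mathcal{L}_{\mathrm{look}}}_{\text{looking-ahead ability}}
\,+\, \lambda_2\,\underbrace{\mathcal{L}_{\mathrm{state}}}_{\text{final-state prediction}}
```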
0:08:56 | In our experiments, |
---|
0:08:57 | the datasets should have goals. |
---|
0:09:00 | We use two kinds of datasets; |
---|
0:09:02 | one is on negotiation, from an object-division task. |
---|
0:09:06 | The datasets |
---|
0:09:08 | are generated together with the goals, |
---|
0:09:12 | and you can refer to the paper for the details. |
---|
0:09:15 | For preparing the training and testing samples, |
---|
0:09:18 | let's see the example. |
---|
0:09:20 | At every turn, |
---|
0:09:21 | we build a sample with the |
---|
0:09:24 | historical utterances and the current utterance. In total, we have about |
---|
0:09:29 | thirty K samples for dataset one and ten K for dataset two. |
---|
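A hypothetical sketch of how such per-turn samples could be assembled from a goal-annotated dialogue (the field names and the agent/user roles are illustrative assumptions):

```python
def build_samples(dialogue, goal, k_future=4):
    """For every agent turn, pair (history, current utterance, goal) with the target
    response and the next k future turns used for the looking-ahead prediction."""
    samples = []
    for i, turn in enumerate(dialogue):
        if turn["speaker"] != "agent":
            continue
        samples.append({
            "goal": goal,
            "history": [t["text"] for t in dialogue[:i - 1]] if i > 1 else [],
            "current_utterance": dialogue[i - 1]["text"] if i > 0 else "",
            "target_response": turn["text"],
            "future_turns": [t["text"] for t in dialogue[i + 1 : i + 1 + k_future]],
        })
    return samples
```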
0:09:40 | We use the goal achievement ratio and the average dialogue turns as the two metrics. |
---|
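For concreteness, the two metrics could be computed roughly as follows, over a hypothetical list of finished dialogues:

```python
def evaluate(dialogues):
    """dialogues: list of dicts like {"achieved_goal": bool, "num_turns": int}."""
    achieved = sum(1 for d in dialogues if d["achieved_goal"])
    goal_achievement_ratio = achieved / len(dialogues)
    average_turns = sum(d["num_turns"] for d in dialogues) / len(dialogues)
    return goal_achievement_ratio, average_turns
```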
0:09:46 | To evaluate our method, |
---|
0:09:48 | we use a user simulator; it is a sequence-to-sequence model with goals. |
---|
0:09:54 | And also, two human evaluators are invited to talk with the agents. |
---|
0:09:59 | We implemented the network, and the parameter settings and other details |
---|
0:10:05 | can be found in the paper. |
---|
0:10:12 | These are the experimental results. |
---|
0:10:15 | In the table there are four models. |
---|
0:10:18 | Sequence-to-sequence |
---|
0:10:21 | with the goal means |
---|
0:10:23 | encoding the historical utterances |
---|
0:10:25 | and the goals together, |
---|
0:10:27 | then outputting the system utterance. |
---|
0:10:30 | Sequence-to-sequence with the goal plus the classifier means |
---|
0:10:33 | it also predicts the final conversation state, agree or disagree. |
---|
0:10:39 | Sequence-to-sequence with the goal plus looking ahead |
---|
0:10:41 | means it can look ahead, |
---|
0:10:44 | but not predict the final state. |
---|
0:10:47 | The last one means it can do everything. |
---|
0:10:51 | We can find |
---|
0:10:52 | that the model with the looking-ahead ability shows |
---|
0:10:57 | the best efficiency |
---|
0:10:59 | performance, |
---|
0:11:00 | by both the simulator and the human evaluators. |
---|
0:11:04 | Below is the parameter tuning. |
---|
0:11:07 | The left |
---|
0:11:07 | four figures |
---|
0:11:09 | show the performance with different looking-ahead steps. |
---|
0:11:14 | We find that |
---|
0:11:15 | setting the step |
---|
0:11:17 | to four |
---|
0:11:17 | should be best. |
---|
0:11:19 | We think this parameter is empirical and depends on the datasets. |
---|
0:11:25 | The right four figures |
---|
0:11:26 | show the performance with different hidden state dimensions, |
---|
0:11:30 | from one hundred and twenty-eight |
---|
0:11:32 | to one thousand and twenty-four. |
---|
0:11:34 | We find that setting it to two hundred and fifty-six can be better. |
---|
0:11:46 | Here are example dialogues with the agents. |
---|
0:11:49 | The left one |
---|
0:11:50 | shows that sometimes, if the agent tends to agree, |
---|
0:11:55 | it will waste more dialogue turns. |
---|
0:11:59 | The right one shows that, |
---|
0:12:01 | although all the dialogues |
---|
0:12:02 | end in agreement, |
---|
0:12:04 | our model |
---|
0:12:06 | spends the fewest dialogue turns. |
---|
0:12:08 | Of course, |
---|
0:12:09 | here we removed the unknown words, because the language generation is not so perfect. |
---|
0:12:19 | To summarize, this paper proposes an end-to-end model towards the problem of |
---|
0:12:24 | how to learn an efficient dialogue manager without taking too much manual work. |
---|
0:12:29 | Experiments on two datasets illustrate that our model is more efficient. |
---|
0:12:36 | The contributions include a new problem from the perspective of deep learning, |
---|
0:12:41 | a novel method to model the looking-ahead ability, |
---|
0:12:45 | and the effective experiments. |
---|
0:12:48 | In the future, we will investigate other methods for the problem. |
---|
0:12:53 | Also, |
---|
0:12:53 | the language generation quality should be paid more attention. |
---|
0:13:00 | That would be all for this paper. |
---|