So we should move on, I think. So, that's the next speaker. The next paper is "Towards End-to-End Learning for Efficient Dialogue Agent by Modeling Looking-ahead Ability". Unfortunately the authors of the paper had some problems, so we have a stand-in presenting here. For that reason it will not be possible to ask questions, unfortunately. But please, go ahead.
Hello everyone. I will present this paper, entitled "Towards End-to-End Learning for Efficient Dialogue Agent by Modeling Looking-ahead Ability". It is a joint work by authors from IBM Research China and the Beijing Institute of Technology.
First, let me introduce the background. Dialogue systems have attracted a lot of attention recently, due to their huge value in reducing human labor in many commercial domains, like restaurant reservation and travel planning. Unlike their chit-chat counterparts, the majority of dialogue agents with goals are expected to be efficient, that is, to complete tasks within as few dialogue turns as possible.
The first example on the right shows a chit-chat bot, which does not need the dialogue to be as short as possible. However, the example below shows that in goal-oriented dialogues, the agent should be efficient, with as few dialogue turns as possible. Here the goal is that we want to book a table at twelve o'clock. The inefficient example takes four turns, while the efficient one only needs two turns.
Looking at the inefficient example: the human says, "We don't have empty tables at eleven o'clock tomorrow at our restaurant." The agent replies, "What other time is available?" The human says, "Twelve o'clock is okay." The agent replies, "Alright, we want that." So it took four turns. In the efficient example, when the human says, "We don't have empty tables at eleven o'clock tomorrow at our restaurant," the agent can directly reply, "How about twelve o'clock?" The human says, "That's also okay." So it only takes two turns.
As shown in the right figure, the dialogue manager module is mainly considered to be responsible for efficiency. So our problem is: how to learn an efficient dialogue model, or a dialogue manager, from the data.
Existing works are twofold. Either they need too many manual efforts: for reinforcement learning, for example, we have to design the strategy, the reward function, and an expert for training and testing. Or, for sequence-to-sequence methods, they tend to generate generic responses, for example "I don't know" or "Yes, okay," and they cannot distinguish different contexts.
In this paper, we address the problem from the perspective of end-to-end dialogue modeling, in order to reduce the human intervention in system design, and propose a new sequence-to-sequence model that models the looking-ahead ability. Our intuition is that, by predicting several future turns, the agent can make a better decision about what to say at the current turn, so as to achieve the dialogue goals as soon as possible. Our method has several advantages: it is end-to-end and does not require too much manual work, and our experiments show that it is more efficient than naive sequence-to-sequence methods.
This is the architecture of our overall model. From the bottom to the top, there are mainly three components: at the bottom the encoding module, in the middle the looking-ahead module, and at the top the decoding module.
In the encoding module, we encode three kinds of information with bidirectional GRUs: the historical utterances, the current utterance, and the goals. The goals are represented by one-hot vectors, similar to a bag of words. We thereby get three kinds of representations: the representation of the historical utterances, the bidirectional representation of the current utterance, and the bidirectional representation of the goals.
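As a rough illustration of this encoding step, here is a toy NumPy sketch. The GRU cell, the dimensions, and all variable names are simplified assumptions for illustration, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
EMB, HID, VOCAB = 8, 16, 10   # toy sizes; the real model's dimensions differ

def gru_step(x, h, W, U, b):
    """One GRU step: update gate z, reset gate r, candidate state n."""
    z = 1 / (1 + np.exp(-(W[0] @ x + U[0] @ h + b[0])))
    r = 1 / (1 + np.exp(-(W[1] @ x + U[1] @ h + b[1])))
    n = np.tanh(W[2] @ x + r * (U[2] @ h) + b[2])
    return (1 - z) * h + z * n

def make_params():
    return (rng.normal(0, 0.1, (3, HID, EMB)),
            rng.normal(0, 0.1, (3, HID, HID)),
            np.zeros((3, HID)))

def encode_bidir(seq, fwd, bwd):
    """Run a GRU over token embeddings in both directions, concat final states."""
    hf = np.zeros(HID)
    for x in seq:
        hf = gru_step(x, hf, *fwd)
    hb = np.zeros(HID)
    for x in reversed(seq):
        hb = gru_step(x, hb, *bwd)
    return np.concatenate([hf, hb])

# Three information sources: dialogue history, current utterance, and goals.
history = [rng.normal(size=EMB) for _ in range(6)]
current = [rng.normal(size=EMB) for _ in range(4)]
goal = np.zeros(VOCAB)
goal[[2, 5]] = 1.0            # one-hot / bag-of-words style goal vector

h_hist = encode_bidir(history, make_params(), make_params())
h_curr = encode_bidir(current, make_params(), make_params())
encoder_output = np.concatenate([h_hist, h_curr, goal])
print(encoder_output.shape)   # (74,) = 2*HID + 2*HID + VOCAB
```

The three representations end up side by side in one vector, which is what the next module consumes.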
These three representations are concatenated together, and the result is the input to the looking-ahead module. The looking-ahead module is a little different from the bidirectional GRU over utterances: it has only one direction, but the future turns are predicted starting from the first hidden state, because that state is used to predict the appropriate system utterance for the current turn. So the information is propagated forward and then backward, and by combining the information from the two directions, each future turn is predicted by the GRU model; that is, we obtain the hidden states h_1, h_2, and so on up to h_k.
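A minimal sketch of the looking-ahead idea: a recurrent cell is unrolled k steps into the future with no new input, and the resulting states stand for predicted future turns. The real module is a GRU trained end-to-end; the cell and sizes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, K = 16, 4                       # toy hidden size and look-ahead depth

W = rng.normal(0, 0.1, (DIM, DIM))   # recurrent weights of the look-ahead cell

def lookahead(h0, k=K):
    """Unroll k future-turn states from the encoder's output state.
    states[0] conditions generation of the current system utterance;
    the later states are predicted future turns used as a training signal."""
    states, h = [], h0
    for _ in range(k):
        h = np.tanh(W @ h)
        states.append(h)
    return states

h0 = rng.normal(size=DIM)            # pretend this came from the encoder
future = lookahead(h0)
print(len(future), future[0].shape)  # 4 (16,)
```

The point is only the shape of the computation: the current decision is conditioned on states that anticipate where the dialogue is heading.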
This modeling looks like a shoelace, so we design a new algorithm to learn the model. In each round of training, part of the parameters are fixed and the others are updated, in turn. It resembles the expectation-maximization (EM) algorithm, but it is not really EM.
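The alternating scheme can be sketched with a toy objective: two parameter groups are updated in turn, each step holding the other group fixed. The quadratic objective, the group names, and the learning rate below are stand-ins, not the paper's loss:

```python
import numpy as np

rng = np.random.default_rng(2)
theta_gen = rng.normal(size=3)    # e.g. generation-side parameters (assumed split)
theta_look = rng.normal(size=3)   # e.g. looking-ahead parameters (assumed split)
target = np.array([1.0, -2.0, 0.5])

def loss(a, b):
    # toy stand-in objective coupling both parameter groups
    return float(np.sum((a + b - target) ** 2))

lr = 0.05
for step in range(200):
    grad = 2 * (theta_gen + theta_look - target)
    if step % 2 == 0:
        theta_gen -= lr * grad    # theta_look held fixed this round
    else:
        theta_look -= lr * grad   # theta_gen held fixed this round

print(round(loss(theta_gen, theta_look), 6))  # 0.0 (converged)
```

Unlike true EM there is no latent-variable posterior here; only the freeze-and-update alternation is shared.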
In the decoding module, we first predict the future dialogue turns and then generate the real system utterance with an attention model. The loss function contains three terms: the first is for modeling the language model, the second is for modeling the looking-ahead ability, and the last is for predicting the final state of the conversation.
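Schematically, such an objective can be written as a weighted sum of three cross-entropy terms. The weights lam1 and lam2 and the toy distributions below are assumptions for illustration, not the paper's actual values:

```python
import numpy as np

def cross_entropy(probs, target_idx):
    """Negative log-likelihood of the target index under a distribution."""
    return -float(np.log(probs[target_idx]))

# Toy predicted distributions for the three terms:
p_word = np.array([0.1, 0.7, 0.2])    # language-model term: next token
p_future = np.array([0.3, 0.6, 0.1])  # looking-ahead term: predicted future-turn token
p_state = np.array([0.8, 0.2])        # final-state term: agree vs. disagree

lam1, lam2 = 1.0, 0.5                 # illustrative weights
loss = (cross_entropy(p_word, 1)
        + lam1 * cross_entropy(p_future, 1)
        + lam2 * cross_entropy(p_state, 0))
print(round(loss, 4))                 # 0.9791
```

Training then minimizes this sum, so the model is pushed simultaneously toward fluent utterances, accurate look-ahead, and correct final-state prediction.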
Now let's look at the experiments. Since the datasets should contain goals, we use two kinds of datasets: one is a negotiation dataset from an object-division task, and the other is generated by simulation with goals. You can refer to the paper for the details. For preparing the training and testing samples, let's see the examples: at every turn, we build a sample with the historical utterances and the current utterance. In total we have about thirty K samples for the first dataset and about ten K for the second.
We use the goal achievement ratio and the average dialogue turns as the two metrics. To evaluate our method, we use a user simulator, which is a sequence-to-sequence model with goals, and we also invited two human evaluators to talk with the agents. We implemented the network, and the parameter settings and other details can be found in the paper.
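The two evaluation metrics are straightforward to compute from a set of finished dialogues. A toy sketch, where the dialogue records are made-up examples:

```python
# Each record: (goal achieved?, number of dialogue turns) — made-up data
dialogues = [(True, 2), (True, 4), (False, 6), (True, 3)]

achieved = [d for d in dialogues if d[0]]
goal_achievement_ratio = len(achieved) / len(dialogues)   # fraction of goals met
avg_turns = sum(t for _, t in dialogues) / len(dialogues) # efficiency measure

print(goal_achievement_ratio, avg_turns)  # 0.75 3.75
```

A more efficient agent should keep the achievement ratio high while driving the average turn count down.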
These are the experimental results. In the table there are four models. "Seq2Seq with goal" means encoding the historical utterances and the goals together, then outputting the system utterance. "Seq2Seq + goal + class" means it additionally predicts the final conversation state, agree or disagree. "Seq2Seq + goal + look" means it can look ahead but does not predict the final state. The last one, our full model, can do everything. We find that the models with the looking-ahead ability show the best efficiency performance, for both the simulator and the human evaluators.
Next is the parameter tuning. The left four figures show the performance with different looking-ahead steps. We find that setting the step to four is best; we think this parameter has no general rule and depends on the dataset. The right four figures show the performance with different hidden state dimensions, from 128 to 1024. We find that setting it to 256 works best.
Here are example dialogues with the agent. The left one shows that sometimes, if the agent tends to agree, it can save dialogue turns. The right one shows that, among all the dialogues ending in agreement, our model spends the fewest dialogue turns. Of course, here we removed the unknown words, because the language generation is not yet perfect.
To summarize, this paper proposed an end-to-end model for the problem of how to learn an efficient dialogue manager without taking too much manual work. Experiments on two datasets illustrate that our model is more efficient. The contributions include a new problem from the perspective of deep learning, a novel method to model the looking-ahead ability, and effective experiments. In the future we will investigate other methods for this problem; also, the language generation quality should be paid more attention. That is all for this paper.