and
it's reference resolution in situated dialogue with learned semantics
so
let's first look at
situated dialogue
which is dialogue situated in an
environment
like a
for example
for human robot interaction
in this image this human was trying to teach this robot to learn the map
of the physical environment in this room
and the next example is an intelligent tutoring system
and this tutor was trying to teach college students to use a computer to
solve complex problems
so as we can see the natural language dialogues in those environments are highly
related to the environment they frequently refer to the objects or events
that happen in the environment
but here is an example from the
tutorial dialogue which is about java programming
in each tutorial session there is a human tutor and a human student this
tutor was trying to teach this student java programming as we can see
everything here in the dialogue is related to the content of the java code
say they talk about the objects
in the java code
so to build an intelligent tutoring system that understands the dialogue of the user we
have to understand
the dialogue within this environment including
interpreting the referring expressions
so
the problem is defined as
given a referring expression which is a sequence of words or tokens
and an environment
in this case we simplified the environment as a set of objects so the
goal here is to find the most compatible
object for this referring expression
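Written as a formula, the definition above might look like the following. The symbols r, E and compat are notation introduced here for illustration and are not taken from the talk.

```latex
\hat{o} = \arg\max_{o \in E} \operatorname{compat}(r, o),
\qquad r = (w_1, \dots, w_n), \quad E = \{o_1, \dots, o_m\}
```

Here r is the referring expression as a token sequence, E is the environment simplified to a set of objects, and compat is whatever score says how well an object fits the expression.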
so for the rest of this talk i'll introduce the corpus we used and
the challenges related work our solution experiments and finally future work
so the corpus we used is
a tutorial dialogue corpus it's a set of tutorial dialogues on java programming
those dialogues are between human tutors and human students
here's the interface we used to collect the data which is an eclipse plug-in so this plug-in
lets the tutor and the student work remotely like in
different rooms
just like using google docs so whenever the student
edits the code the tutor will see it
and they can also send text messages to each other
within
this tool
so we collected the dialogue between them and
all of the editing behaviors
so
the tutorial dialogue is mostly on introductory
java programming
which involves creating traversing
and modifying parallel
arrays this data was collected in two thousand seven and includes forty five tutoring
sessions
almost five thousand utterances in total each session lasts about one
hour and has on average one hundred and eight
utterances
so there are some challenges in doing reference resolution in such a
setting
the easy case is when the user refers to something in the java code and only
uses the name the proper name
intuitively we can just compare
the strings
from the object and from the referring expression to see whether they match or not
but this only accounts for about a third of
all the cases
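The easy case described above, comparing strings from the object and from the referring expression, can be sketched in a few lines. This is only an illustration under assumptions: the function name and the identifiers in the example are made up and do not come from the corpus.

```python
import re

def match_by_name(referring_expression, object_names):
    """Return the object names that appear as whole tokens in the expression."""
    # Lowercase first so that matching is case-insensitive.
    tokens = set(re.findall(r"[a-z_][a-z0-9_]*", referring_expression.lower()))
    return [name for name in object_names if name.lower() in tokens]

# Hypothetical identifiers, not taken from the corpus.
names = ["table", "i", "numRows"]
print(match_by_name("what does numRows hold", names))     # ['numRows']
print(match_by_name("the two dimensional array", names))  # []
```

The second call returns nothing, which is exactly the situation the harder cases are about: the expression describes attributes and never uses the name.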
it could be harder which means the user refers to something
in the java code using only the attributes
not the name
like the two dimensional array or the array
and it could be even harder when they refer to something that
is not a properly defined
java concept
which could be just a piece of code
for example here
you could apply and use that code if you want
so that code is just a random
line of code here
those are the three challenges or
actually two
and
the last one is that the number of objects in the java code could be very
large which includes the methods the variables objects or any piece of code and
it is dynamic because as the programming goes on
there could be objects removed from the code or introduced
so
that's it
and
then i'll talk about some closely related papers on
how people
dealt with this challenge before the first one is the iida et al
paper they worked on reference resolution
for dialogues from a collaborative game which is called tangram in this
game there are seven objects and there are two players one
is the instructor the other one
will apply the instructions from the instructor to manipulate those objects
so they used dialogue history and task history where dialogue history
is
any objects that were mentioned
recently or from the beginning of the dialogue
and task history is
any objects
that were manipulated
from the beginning of the task
that's how they did it
and
the next one
is
the kennington and schlangen twenty fifteen paper
they used a words as classifiers approach to learn the
relationship between
referring expression tokens
and
physical attributes
in their setting
of a set of objects so they use a kind of co-location
information like for a token
they find all of the
co-located attributes they manually match the referring
expression and the referent so they find the co-location information
between
tokens and attributes
so they use the learned
relationship
to predict the referent for a newly given referring expression
so in this paper we follow iida et al and we use
similar dialogue history and task history features
so here's an example from the corpus
here the student just typed a line of code
array equals new
int
then
after another line of code the tutor says
the array looks like it is set up correctly now so we can see
there is a relationship
between the editing behavior
and
the referring behavior so after that the tutor says in the for loop what should
you be storing in that array so these also come very close together so they will
refer to the same thing repeatedly and locally
so that's why we think the dialogue history and
task history are very important
so we use them
the third kind of information we use is semantic information
given the referring expression which is a noun phrase this noun phrase has different segments
each segment could indicate some kind of attribute of the referent of this referring
expression so we used a
conditional random field to segment and label this referring expression
to find out
the attribute information it gives
so
after this segmentation and labeling we can find the attribute segments
like in this referring expression the array is a category
and the two dimensional
indicates
the dimension of this array
so after that we extract the attribute value from each segment
here
and we use these attribute values to make the attribute vector so this attribute vector
is the set of attributes that the referent of
this referring expression should have
if we do it correctly
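To make the step from labeled segments to an attribute vector concrete, here is a minimal sketch. It assumes the segmenter has already produced (label, text) pairs. The label names beyond category and dimension, the number-word table and the function name are assumptions for illustration, and the conditional random field itself is not shown.

```python
# Hypothetical normalizers: only "category" and "dimension" are named in the talk.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}

def attribute_vector(labeled_segments):
    """Collect attribute values from (label, text) segments, skipping other labels."""
    normalizers = {
        "dimension": lambda text: NUMBER_WORDS.get(text.split()[0]),
        "category": lambda text: text.split()[-1],
    }
    vector = {}
    for label, text in labeled_segments:
        text = text.strip().lower()
        if label in normalizers and text:
            value = normalizers[label](text)
            if value is not None:
                vector[label] = value
    return vector

# Assumed segmenter output for "the two dimensional array".
segments = [("other", "the"), ("dimension", "two dimensional"), ("category", "array")]
print(attribute_vector(segments))  # {'dimension': 2, 'category': 'array'}
```

A dictionary is used instead of a fixed-length vector so that missing attributes are simply absent, which keeps the later matching step simple.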
and so
before starting the reference resolution task we want to make a
candidate list because the number of objects in the
java code could be very large
because
it
contains like everything you know
so it is a very intuitive approach first we use all of the
mentioned objects so far
from the beginning of the session
and
we include all of the manipulated objects
from the beginning of the session in the candidate list
and finally we include all of the objects that match any attribute
in this referring expression
so the reason
here
to match only one attribute is we don't want to miss any
real referent just
from a mistake in parsing the
semantics so that's how we
create the candidate list
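The three sources of candidates just described can be sketched as a set union. The data layout (objects as a dictionary of attribute dictionaries) and all identifiers are assumptions made for this example.

```python
def build_candidates(objects, mentioned_ids, manipulated_ids, attribute_vector):
    """Union of mentioned objects, manipulated objects, and any-attribute matches.

    objects: dict mapping object id -> dict of attributes (assumed layout).
    """
    candidates = set(mentioned_ids) | set(manipulated_ids)
    for obj_id, attrs in objects.items():
        # Matching a single attribute is enough, so that one parsing mistake
        # does not remove the real referent from the list.
        if any(attrs.get(key) == value for key, value in attribute_vector.items()):
            candidates.add(obj_id)
    return sorted(candidates)

# Hypothetical objects in the student's code.
objects = {
    "table": {"category": "array", "dimension": 2},
    "names": {"category": "array", "dimension": 1},
    "i": {"category": "variable"},
    "count": {"category": "variable"},
    "main": {"category": "method"},
}
print(build_candidates(objects, ["i"], ["main"], {"category": "array", "dimension": 2}))
# ['i', 'main', 'names', 'table']
```

Note that names is kept even though its dimension is wrong, because it matches on category alone, while count is dropped since it was neither mentioned, manipulated, nor matched.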
and the
here
the reference resolution task is defined as finding the most compatible referent the most compatible
object from the candidate list
for this referring expression so
this probability is defined as the output
of a classification function
so for the classification function here
we use four different kinds of classifiers to see
how they work in this setting
we used logistic regression decision tree
naive bayes and neural networks
so here
we can see the probability
of
a given referring expression referring to a
candidate in the candidate list
so we can rank this
probability for all of the candidates and pick the candidate with the highest probability as
the referent
so that's how we
did it
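The ranking step can be sketched as follows: score each candidate with a classifier and keep the one with the highest probability. The logistic scorer below uses hand-set weights purely as a stand-in. The weights, feature names and candidate ids are assumptions, not the trained models from the talk.

```python
import math

def logistic_score(features, weights, bias=0.0):
    """Probability from a logistic model with the given (assumed) weights."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def resolve(candidates, featurize, score):
    """Return (best_candidate, probability), or (None, 0.0) for an empty list."""
    best, best_p = None, 0.0
    for candidate in candidates:
        p = score(featurize(candidate))
        if best is None or p > best_p:
            best, best_p = candidate, p
    return best, best_p

# Hypothetical feature values and weights for two candidates.
features = {
    "table": {"semantic_match": 1.0, "recently_mentioned": 1.0},
    "names": {"semantic_match": 0.5, "recently_mentioned": 0.0},
}
weights = {"semantic_match": 2.0, "recently_mentioned": 1.0}
best, p = resolve(
    ["table", "names"],
    featurize=lambda c: features[c],
    score=lambda f: logistic_score(f, weights, bias=-1.5),
)
print(best, round(p, 3))  # table 0.818
```

Because resolve only needs a function from features to a probability, any of the four classifier types mentioned could be plugged in through the score argument.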
so
here
are the features we use the first group is the dialogue history features
which are
whether this object
was mentioned
how long ago
was it mentioned
the second
group of features is
the task history features
like how long ago was this object
manipulated like typed
or selected or
something like this
the third group of features is
the semantic features which measure how well
the semantics of
the referring expression match
a given candidate
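The three feature groups could be computed along these lines. This is a sketch under assumptions: the event format (time, object id), the 1 / (1 + distance) recency scaling and every feature name are choices made for this example, not details given in the talk.

```python
def recency(events, obj_id, now):
    """Steps since obj_id last appeared in events, or None if it never did."""
    times = [t for t, o in events if o == obj_id and t <= now]
    return now - max(times) if times else None

def candidate_features(obj_id, obj_attrs, attribute_vector, mentions, edits, now):
    """Dialogue history, task history and semantic features for one candidate."""
    since_mention = recency(mentions, obj_id, now)
    since_edit = recency(edits, obj_id, now)
    matched = sum(1 for k, v in attribute_vector.items() if obj_attrs.get(k) == v)
    return {
        # Dialogue history: was it mentioned, and how long ago.
        "was_mentioned": float(since_mention is not None),
        "mention_recency": 1.0 / (1 + since_mention) if since_mention is not None else 0.0,
        # Task history: was it typed or selected, and how long ago.
        "was_manipulated": float(since_edit is not None),
        "edit_recency": 1.0 / (1 + since_edit) if since_edit is not None else 0.0,
        # Semantics: fraction of expression attributes the candidate matches.
        "semantic_match": matched / len(attribute_vector) if attribute_vector else 0.0,
    }

# Hypothetical history: "table" mentioned at step 3 and edited at step 5.
feats = candidate_features(
    "table", {"category": "array", "dimension": 2},
    {"category": "array", "dimension": 2},
    mentions=[(3, "table")], edits=[(5, "table"), (2, "i")], now=6,
)
print(feats["mention_recency"], feats["edit_recency"], feats["semantic_match"])
# 0.25 0.5 1.0
```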
so
for the experiments we used
six sessions
of the tutorial dialogue
and
they contain three hundred sixty four referring expressions
and we manually
labeled their referents from the java code
and
we had two annotators
and
we got a kappa of zero point six five
and we used six fold cross validation which is basically taking one session out and
doing the training with
the other five sessions and then testing on
the
last one
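Leave-one-session-out cross validation, as described above, can be sketched like this. The toy sessions and the majority-referent model are placeholders invented for the example. Only the fold structure mirrors the talk.

```python
from collections import Counter

def leave_one_session_out(sessions):
    """Yield (held_out_id, training_examples, test_examples) once per session."""
    for held_out in sessions:
        train = [ex for sid, exs in sessions.items() if sid != held_out for ex in exs]
        yield held_out, train, list(sessions[held_out])

def cross_validated_accuracy(sessions, train_fn, predict_fn):
    """Pool correct predictions over all folds and return overall accuracy."""
    correct = total = 0
    for _, train, test in leave_one_session_out(sessions):
        model = train_fn(train)
        for expression, gold in test:
            correct += int(predict_fn(model, expression) == gold)
            total += 1
    return correct / total if total else 0.0

# Toy data: (referring expression, gold referent) pairs per session.
sessions = {
    "s1": [("the array", "table"), ("it", "table"), ("that array", "table")],
    "s2": [("the loop", "for1")],
    "s3": [("the array", "table"), ("the for loop", "for1")],
}

# Placeholder model: always predict the most frequent referent seen in training.
def train_fn(train):
    return Counter(gold for _, gold in train).most_common(1)[0][0]

def predict_fn(model, expression):
    return model

for held_out, train, test in leave_one_session_out(sessions):
    print(held_out, len(train), len(test))
print(cross_validated_accuracy(sessions, train_fn, predict_fn))
```

Splitting by session rather than by individual expression keeps all expressions from one tutoring session on the same side of the split, so the model is always tested on a session it has not seen.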
to
evaluate our approach we compare with two baseline models the first one is the iida
baseline
model they use the dialogue history and task history in
their approach
so to make it fair
we added a handcrafted lexicon
to provide some
semantic
information for this model
the second baseline model is the kennington baseline model
because it was a
weakly supervised approach
and
it didn't perform
reference resolution
in a dialogue setting so
to make it fair we added
the dialogue history and task history features to this approach
after that here are
the results
we got
as we can see our approach got
a higher
accuracy on reference resolution and the reason why it is higher
is
the semantics we learn using the conditional random fields which has a higher accuracy on the
semantics
so
actually there are two groups of referring expressions
for the reference resolution task because
some of the referring expressions
contain some semantic information
the estimate are indicates it's
and some of them are just
pronouns
which do not have semantic
information in them
so our work here
the contribution here is mostly on the reference resolution for
those referring expressions that contain semantic information
and
so to see how well our
approach could work given better
semantic information we tested it using gold standard semantic labels which were made manually
so
here
using the gold standard semantics to run the same approach again we got
a higher
accuracy
so
this means the semantic information here is very important in doing this reference resolution task
but
there is still room to improve because
the human agreement
is eighty five percent which is a lot higher than the
results from the approach
as
for the future work
i think it will be promising to consider the structure
of the dialogue
and
also an unsupervised or weakly supervised approach would also be very interesting since it doesn't
require much annotation
that's it
and i want to
thank our colleagues for their input
and
thank our sponsors
thank you
i'll repeat the question so you were asking whether we should have different approaches for
referring expressions
like
the pronouns and the non-pronouns
right
yes
the difference here is only in the semantic information because
the main contribution of this work
is employing the
semantic information from a referring expression but pronouns are pretty simple we don't have
much information from them so we can run
this approach
by splitting
the set of referring expressions they look kind of similar but
yes i think we will consider this when we really build the entire tutoring system
thank you
yes
the eye gaze would definitely give us more information
like when we do the reference resolution
it
comes back to an assumption here
they would look at the object when they refer to it
so that
could be
another feature directly added to this approach
or maybe there will be some more
sophisticated way to use this kind of information
thank you
the mouse cursor
actually we use
the selection
which is when
the student might select
this part of the code and ask how does it look or
ask a question about it
that's kind of
one case of using the mouse or
the cursor
yes but that's definitely also very interesting
information to consider in this case
well
actually i haven't had a very deep consideration on this
i just
feel like
in different
positions
of
the discourse structure this could give some
interesting information on determining
the referent
the details