Speech Transcript - Reference Resolution in Situated Dialogue with Learned Semantics

and

it's reference resolution in situated dialogue with learned semantics

first let's look at

situated dialogue

situated dialogue is dialogue situated in an

environment

like

for example

human robot interaction

in this image this human was trying to teach this robot to learn the map

of the physical environment in this room

and the next example is an intelligent tutoring system

where this avatar was trying to teach college students to use a computer to

solve complex problems

so as we can see the natural language dialogues in those environments are highly

related to the environment they frequently refer to the objects or events

that happen in the environment

here is an example from the

tutorial dialogue which is about java programming

in each tutorial session there is a human tutor and a human student the

tutor was trying to teach the student java programming as we can see

everything here in the dialogue is related to the content of the java code

say they talk about the objects

in the java code

so to build an intelligent tutoring system that understands the dialogue of the user we

have to understand

the dialogue within this environment including

interpreting the referring expressions

the problem is defined as

given a referring expression which is a sequence of words or tokens

and an environment

in this case we simplified the environment as a set of objects so the

goal here is to find the most compatible

object for this referring expression
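The problem definition above can be sketched as an argmax over the objects in the environment. This is a minimal illustration, not the system from the talk: `CodeObject`, `resolve`, and the toy scoring function `toy_overlap` are hypothetical names, and the actual compatibility score is described later in the talk as a classifier output.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class CodeObject:
    """One object in the simplified environment (hypothetical representation)."""
    name: str
    category: str


def resolve(
    tokens: Sequence[str],
    environment: Sequence[CodeObject],
    compatibility: Callable[[Sequence[str], CodeObject], float],
) -> CodeObject:
    """Return the object in the environment most compatible with the expression."""
    if not environment:
        raise ValueError("environment must contain at least one object")
    return max(environment, key=lambda obj: compatibility(tokens, obj))


def toy_overlap(tokens: Sequence[str], obj: CodeObject) -> float:
    # Toy stand-in score: count tokens equal to the object's name or category.
    return float(sum(tok in (obj.name, obj.category) for tok in tokens))
```

Any scoring function with the same signature can be passed in place of `toy_overlap`.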

so for the rest of this talk i'll introduce the corpus we used and

the challenges and related work solution experiments and finally future work

so the corpus we used is from

tutorial dialogue it's a set of dialogues from java programming tutoring

those dialogues are between human tutors and human students

here's the interface we used to collect the data which is an eclipse plug-in so this plugin

lets the tutor and the student work remotely like in

different rooms

just like using google docs so whenever the student

edits the code the tutor will see it

and they can also send text messages to each other

within this

this tool

so we collect all of the dialogue between them and

all of the editing behaviors

at it

the tutorial dialogue is mostly on introductory

programming in java

which involves creating traversing

and modifying parallel

arrays this corpus was collected in two thousand seven and includes forty five

tutoring sessions

almost five thousand utterances in total each session lasted about one

hour and has on average one hundred and eight

utterances

so there are some challenges to doing reference resolution in such a

setting

the easy case is when the user refers to something in the java code and only

uses the name the proper name

intuitively we can just compare

the string

from the object and from the referring expression to see whether they match or not

but this only accounts for about a third of

all the cases
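The easy case described above, comparing the object's name with the referring expression, might look like the following sketch. The function name `matches_by_name` and the identifier pattern are assumptions for illustration. Matching is case sensitive because Java identifiers are.

```python
import re


def matches_by_name(referring_expression, object_names):
    """Return the object names that appear as whole tokens in the expression."""
    # Split the expression into identifier-like tokens.
    tokens = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", referring_expression))
    return [name for name in object_names if name in tokens]
```

An expression that uses only attributes, such as "the two dimensional array", returns no match here, which is the harder case the talk turns to next.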

it could be harder which means the user refers to something

in the java code using only the attributes

not the name

like the two dimensional array or the array

and it could be even harder when they refer to something that

is not properly defined

a general concept or

which could be just a piece of code

for example here

you could apply and use that code if you want to

so that code is just a random

line of code which is difficult here so

those are the three challenges or

actually two

and the

last one is that the number of objects in the java code could be very

large which includes the methods the variables objects or any piece of code and

it is dynamic because as the programming goes on

there could be objects removed from the code or introduced

that's it

and

then i'll talk about some closely related papers on

how people

dealt with these challenges before the first one is the iida et al twenty ten

paper they worked on reference resolution

for dialogues from a collaborative game which is called tangram in this

game there are seven objects and there are two players one

is the instructor and the other one

will follow the instructions from the instructor to manipulate those objects

so they used dialogue history and task history where dialogue history

is

any object that was mentioned

recently or from the beginning of the dialogue

and task history is

any object

that was manipulated

from the beginning of the task

that's how they did it
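The two histories described above can be kept as simple records of when each object was last mentioned or manipulated. This is a minimal sketch under assumed names (`History`, `mention`, `manipulate`), not the data structure used in the cited work.

```python
class History:
    """Records the most recent time each object was mentioned or manipulated."""

    def __init__(self):
        self.last_mentioned = {}    # dialogue history: object -> time of last mention
        self.last_manipulated = {}  # task history: object -> time of last manipulation

    def mention(self, obj, time):
        self.last_mentioned[obj] = time

    def manipulate(self, obj, time):
        self.last_manipulated[obj] = time

    def mentioned_objects(self):
        """All objects mentioned since the beginning of the dialogue."""
        return set(self.last_mentioned)

    def manipulated_objects(self):
        """All objects manipulated since the beginning of the task."""
        return set(self.last_manipulated)
```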

and

the next one

is the kennington and schlangen twenty fifteen paper

they used a words as classifiers approach to learn the

relationship between

referring expression tokens and

physical attributes

in this setting

there is a set of objects and they use a kind of co-location

information for a token

they find all of the

co-located attributes they manually match the referring

expression and the referent so they find the co-location information

between

tokens and attributes

so they use the learned

relationship

to predict the referent for a newly given referring expression
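The co-location idea above can be illustrated with plain co-occurrence counts between tokens and the attributes of matched referents. This is a deliberately simplified counting stand-in, not the words-as-classifiers model from the cited paper; the function names and the attribute strings in the usage example are hypothetical.

```python
from collections import Counter, defaultdict


def learn_cooccurrence(pairs):
    """pairs: iterable of (tokens, attributes of the matched referent).

    Returns a mapping token -> Counter of attributes seen with that token.
    """
    counts = defaultdict(Counter)
    for tokens, attributes in pairs:
        for token in tokens:
            counts[token].update(attributes)
    return counts


def predict(tokens, candidates, counts):
    """candidates: dict name -> set of attributes.

    Picks the candidate whose attributes co-occurred most with the tokens.
    """
    def score(attrs):
        return sum(
            counts[token][attr]
            for token in tokens
            if token in counts  # skip unseen tokens
            for attr in attrs
        )

    return max(candidates, key=lambda name: score(candidates[name]))
```

For instance, after training on pairs such as `(["red", "cross"], {"color:red", "shape:cross"})`, a new expression containing "red" favors candidates that carry the `color:red` attribute.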

so in this paper we follow iida et al twenty ten and we use

similar dialogue history and task history features

so here's an example from the corpus

look here the student just typed a line of code

array equals new

int but well

then

after that line of code the tutor said that

the array looks like it is set up correctly now so we can see

here is a relationship

kind of like

between the editing behavior

and the

referring behavior so after that the tutor said in the for loop what should

you be storing in that array so these also come very close together so they will

refer to the same thing kind of repeatedly locally

so that's why we think the dialogue history and

task history are very important

so we use them

the third kind of information we use is semantic information

given the referring expression which is a noun phrase this noun phrase has different segments

each segment could indicate some kind of attribute of the referent for this referring

expression so we used a

conditional random field to segment and label this referring expression

to find out

the attribute information it gives so

after this segmentation and labeling we can find the attribute segments

like in this referring expression array is a category

and two dimensional

indicates

the dimension of this array

so after that we extract the attribute value from each segment

here

and we use these values to make the attribute vector so this attribute vector

is the set of attributes that the referent of

this referring expression should have

if we do it correctly right
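The step from labeled segments to an attribute vector might look like the sketch below. It assumes the segmentation and labeling have already been done (for example by a conditional random field) and arrive as `(segment text, label)` pairs; the label names, the `"O"` label for non-attribute segments, and the `value_lexicon` lookup are all hypothetical.

```python
def attribute_vector(labeled_segments, value_lexicon):
    """Build a dict of attributes the referent should have.

    labeled_segments: list of (segment_text, label) pairs, as a sequence
        labeler might output; the label "O" marks segments with no attribute.
    value_lexicon: label -> {segment_text: normalized value} (hypothetical).
    """
    vector = {}
    for text, label in labeled_segments:
        if label == "O":
            continue
        # Fall back to the raw segment text when no normalized value is known.
        vector[label] = value_lexicon.get(label, {}).get(text, text)
    return vector
```

With segments `[("the", "O"), ("two dimensional", "DIMENSION"), ("array", "CATEGORY")]` and a lexicon mapping "two dimensional" to 2, the result is a vector with a dimension of 2 and a category of "array".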

and so

before starting the reference resolution task we want to make a

candidate list because the number of objects in the

java code could be very large

because it

contains like everything you know

so it is a very intuitive approach first we use all of the

mentioned objects so far

from the beginning of the session

and

we include all of the manipulated objects

from the beginning of the session in the candidate list

and finally we include all of the objects that match any attribute of

this

referring expression

so the reason

here

to match only one attribute is that we don't want to miss any

real referent just

from a mistake in parsing the

semantics so that's how we

create the candidate list
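The three sources of candidates described above combine as a set union. In this sketch the function name and the dictionary representation of objects and attributes are assumptions; the "match any one attribute" rule follows the talk.

```python
def build_candidates(mentioned, manipulated, all_objects, expression_attributes):
    """Return the candidate set for one referring expression.

    mentioned, manipulated: object names seen since the start of the session.
    all_objects: dict name -> dict of attributes for objects in the code.
    expression_attributes: dict of attributes parsed from the expression.
    """
    # An object qualifies if at least one attribute agrees, so that a single
    # parsing mistake does not exclude the real referent.
    matching = {
        name
        for name, attrs in all_objects.items()
        if any(attrs.get(key) == value for key, value in expression_attributes.items())
    }
    return set(mentioned) | set(manipulated) | matching
```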

and the

here

the reference resolution task is defined as finding the most compatible

object from the candidate list

for this referring expression so

this probability is defined as the output

of a classification function

so for the classification function here

we use four different kinds of classifiers to see

how they work in this setting

we used logistic regression decision tree

naive bayes and neural networks

so here

then

we can see the probability

given this referring expression and

a candidate in the candidate list

so we can rank this

probability for all of the candidates and pick the candidate with the highest probability as

the referent

so that's how we

did it
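The ranking step can be sketched with a hand-written logistic function standing in for a trained classifier. The weights here are assumed inputs rather than learned values, and the function names are hypothetical; any classifier that outputs a probability per candidate could take the place of `logistic_probability`.

```python
import math


def logistic_probability(features, weights, bias=0.0):
    """Logistic function of a weighted feature sum, computed stably."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def rank_candidates(candidate_features, weights, bias=0.0):
    """candidate_features: dict name -> feature list.

    Returns (name, probability) pairs sorted from most to least probable;
    the first entry is taken as the referent.
    """
    scored = [
        (name, logistic_probability(feats, weights, bias))
        for name, feats in candidate_features.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```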

here

are the features we use the first group is the dialogue history features

which are

whether this object

was mentioned

how long ago

was it mentioned

the second

group of features are

the task history features

like how long ago was this object

manipulated like typed

or selected or

something like this

the third group of features are

the semantic features which measure how

the semantics of

the referring expression match

a given candidate
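One way to turn the three feature groups above into a numeric vector is sketched below. The exact features in the original system are not given in the talk, so the choice of elapsed time and fraction of matching attributes, the `never` placeholder value, and all names are assumptions.

```python
def extract_features(candidate, now, last_mentioned, last_manipulated,
                     candidate_attributes, expression_attributes, never=1e6):
    """Return [time since mention, time since manipulation, semantic match].

    last_mentioned / last_manipulated: dict object -> time of last event.
    never: placeholder distance for objects with no recorded event.
    """
    # Dialogue history feature: how long ago the object was mentioned.
    since_mention = now - last_mentioned[candidate] if candidate in last_mentioned else never
    # Task history feature: how long ago the object was typed or selected.
    since_manipulation = (
        now - last_manipulated[candidate] if candidate in last_manipulated else never
    )
    # Semantic feature: fraction of expression attributes the candidate matches.
    if expression_attributes:
        matched = sum(
            candidate_attributes.get(key) == value
            for key, value in expression_attributes.items()
        )
        semantic_match = matched / len(expression_attributes)
    else:
        semantic_match = 0.0
    return [since_mention, since_manipulation, semantic_match]
```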

so

for the experiments we used

six sessions of

the tutorial dialogue

and

they contain three hundred sixty four referring expressions

and we manually

labeled their referents from the java code

and

we had two annotators

and

the

we got a kappa of zero point six five

and we used six fold cross validation which is basically take one session out and

do the training with

the other five sessions and then test on

the

the last one
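With six sessions, the six fold cross validation described above amounts to holding out one session at a time. A minimal sketch, assuming the data is grouped in a dictionary keyed by session; the function name is hypothetical.

```python
def leave_one_session_out(sessions):
    """sessions: dict session_id -> list of examples.

    Yields (held_out_id, training_examples, test_examples), one fold per session.
    """
    for held_out in sessions:
        train = [
            example
            for session_id, examples in sessions.items()
            if session_id != held_out
            for example in examples
        ]
        yield held_out, train, list(sessions[held_out])
```

Splitting by session rather than by individual referring expression keeps all expressions from one dialogue on the same side of the split.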

to

evaluate our approach we compare with two baseline models the first one is the iida

baseline

model they used the dialogue history and task history in their

approach

so to make it fair

we added a handcrafted lexicon

to provide some

semantic

information for this model

the second baseline model is the kennington baseline model

because it was a

weakly supervised approach

and it

didn't perform

reference resolution

in a dialogue setting so

to make it fair we added

the dialogue history and task history features to this approach

after that here are the

results

we got

as we can see our approach got

a higher

accuracy on the reference resolution and the reason why it is higher

is

the semantics we learn using the conditional random field which has a higher accuracy on the

semantics

so

actually there are two groups of referring expressions

for the reference resolution task because

some of the referring expressions

contain some semantic information

the attributes it indicates

and some of them are just

pronouns

which do not have semantic

information in them

so our work here

the contribution here is mostly on the reference resolution for

those referring expressions that contain semantic information

and

so to see how our

approach could work given better

semantic information we tested it using gold standard semantic labels which were made manually

here again

using the gold standard semantics to run the same approach again we got

a higher

accuracy

so

this means the semantic information here is very important in doing this reference resolution task

but

there is still room to improve because

the human agreement

is

like eighty five percent which is a lot higher than the

results from our approach

so for the future work

i think it will be promising to consider the structure of

the dialogue

and

also an unsupervised or weakly supervised approach will also be very interesting since it doesn't

require much annotation

well

that's it

and i want to

thank our colleagues for their input

and

thank our sponsors

thank you

i'll repeat the question so you were saying should we have different approaches for

referring expressions

that are pronouns and non-pronouns

right

yes

the difference here is only on the semantic information because

this

the main contribution of this work

is employing the

semantic information from a referring expression but pronouns are pretty simple we don't have

much information from them so we can like run this

this model this approach

by splitting

the set of referring expressions they kind of look similar but

yes i think we will consider this when we really build the entire tutoring system

thank you

yes

the eye gaze would definitely give us more information

like when we do the reference resolution

it

comes back to like an assumption here

they would look at the object when they refer to it

so that

could be

another feature directly added to this approach

or maybe there will be some more

sophisticated way to use this kind of information

thank you

the mouse cursor

actually we use the

the selection

which is

the student highlights

this part of the code and asks how does this look or

asks a question about it

kind of like

one case of using the mouse or

the cursor

yes but that's definitely also very interesting

information to consider in this case

a well

actually i haven't had a very deep consideration on this

i just

feel like

in different

positions

of

the discourse structure this could give some

interesting information on determining

like the referent

the details

Reference Resolution in Situated Dialogue with Learned Semantics

Oral Session 5: Semantics: Learning and Inference

Xiaolong Li and Kristy Boyer