0:00:15 | and |
---|
0:00:16 | it's reference resolution in situated dialogue with learned semantics |
---|
0:00:21 | so |
---|
0:00:23 | first let's look at |
---|
0:00:26 | situated dialogue |
---|
0:00:29 | situated dialogue is dialogue situated in an |
---|
0:00:32 | environment |
---|
0:00:33 | like a |
---|
0:00:35 | for example |
---|
0:00:36 | for human robot interaction |
---|
0:00:38 | in this image this human was trying to teach this robot to learn the map |
---|
0:00:45 | of the physical environment in this room |
---|
0:00:48 | and the next example is the intelligent tutoring system |
---|
0:00:53 | and this tutor was trying to teach college students to use a computer to |
---|
0:01:00 | solve complex problems |
---|
0:01:02 | so as we can see the natural language dialogues in those environments are highly |
---|
0:01:10 | related to the environment they frequently refer to the objects or events |
---|
0:01:15 | that happen in the environment |
---|
0:01:18 | but here is an example from the |
---|
0:01:22 | tutorial dialogue which is about java programming |
---|
0:01:28 | in each tutorial session there is a human tutor and a human student this |
---|
0:01:34 | tutor was trying to teach this student java programming as we can see |
---|
0:01:38 | everything here in the dialogue is related to the content of the java code |
---|
0:01:44 | they talk about the objects |
---|
0:01:48 | in the java code |
---|
0:01:50 | so to build an intelligent tutoring system that understands the dialogue of the user we |
---|
0:01:56 | have to understand |
---|
0:01:58 | the dialogue within this environment including |
---|
0:02:01 | interpreting the referring expressions |
---|
0:02:04 | so |
---|
0:02:05 | the problem is defined as |
---|
0:02:07 | given a referring expression which is a sequence of words or tokens |
---|
0:02:13 | and an environment |
---|
0:02:15 | in this case we simplified the environment as a set of objects so the |
---|
0:02:20 | goal here is to find the most compatible |
---|
0:02:25 | object for this referring expression |
---|
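The problem statement above can be written compactly. This is only a notational sketch: the symbols $r$, $E$, and $f$ are assumed here for illustration and are not taken from the talk. For a referring expression $r = (w_1, \dots, w_n)$ and an environment $E = \{o_1, \dots, o_m\}$, the resolved referent is the object with the highest compatibility score:

```latex
\hat{o} = \operatorname*{arg\,max}_{o \in E} f(r, o)
```

Here $f(r, o)$ stands for whatever compatibility measure is used between the expression and an object.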
0:02:29 | so for the rest of this talk i'll introduce the corpus we used and |
---|
0:02:35 | the challenges related work our solution experiments and finally future work |
---|
0:02:42 | so the corpus we used is |
---|
0:02:47 | a tutorial dialogue corpus it's a set of tutorial dialogues about java programming |
---|
0:02:53 | those dialogues are between a human tutor and a human student |
---|
0:02:58 | here's the interface we used to collect the data which is an eclipse plug-in so this plug-in |
---|
0:03:05 | lets the tutor and the student work remotely like in |
---|
0:03:10 | different rooms |
---|
0:03:12 | just like using google docs so whenever the student |
---|
0:03:15 | edits the code the tutor will see it |
---|
0:03:17 | and they can also send text messages to each other |
---|
0:03:20 | within this |
---|
0:03:22 | this tool |
---|
0:03:23 | so we collected the dialogue between them and |
---|
0:03:26 | all of the editing behaviors |
---|
0:03:28 | so |
---|
0:03:31 | at it |
---|
0:03:31 | the tutorial dialogue is mostly on introductory |
---|
0:03:36 | programming in java programming |
---|
0:03:38 | which involves creating traversing |
---|
0:03:41 | and modifying parallel |
---|
0:03:42 | arrays this data was collected in two thousand seven and includes forty five tutoring |
---|
0:03:48 | sessions |
---|
0:03:49 | almost five thousand utterances in total each session lasted about one |
---|
0:03:55 | hour and has on average one hundred and eight |
---|
0:04:00 | utterances |
---|
0:04:02 | so there are some challenges in doing reference resolution in such a |
---|
0:04:08 | setting |
---|
0:04:10 | the easy case is when the user refers to something in the java code using only |
---|
0:04:15 | the name the proper name |
---|
0:04:17 | if you |
---|
0:04:18 | intuitively we can just compare |
---|
0:04:22 | the strings |
---|
0:04:23 | from the object and from the referring expression to see whether they match or not |
---|
0:04:28 | but this only accounts for about a third of |
---|
0:04:30 | all the cases |
---|
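The easy case described above, comparing the object's name with the words of the referring expression, could be sketched as follows. This is a minimal illustration only; the function name `name_match` and the whitespace tokenization are assumptions, not the method used in the talk.

```python
def name_match(expression_tokens, object_name):
    """Return True if the object's name appears verbatim among the tokens.

    Comparison is case-insensitive; this is an assumed simplification.
    """
    lowered = [token.lower() for token in expression_tokens]
    return object_name.lower() in lowered


# Example: an expression that uses the object's proper name.
tokens = "the table array".split()
matches = name_match(tokens, "table")
```

As the talk notes, this kind of matching only covers the expressions that actually contain the name, so it cannot be the whole solution.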
0:04:32 | it could be harder when the user refers to something |
---|
0:04:37 | in the java code using only the attributes |
---|
0:04:40 | not the name |
---|
0:04:41 | like the two dimensional array or the array |
---|
0:04:45 | and it could be even harder when they refer to something that |
---|
0:04:49 | is not a properly defined |
---|
0:04:51 | java concept |
---|
0:04:55 | which could be just a piece of code |
---|
0:04:57 | for example here |
---|
0:05:01 | you could apply and use that code if you want |
---|
0:05:05 | so that code is just a random |
---|
0:05:08 | line of code which is difficult here so |
---|
0:05:12 | those are the three challenges or |
---|
0:05:14 | actually two |
---|
0:05:15 | and the |
---|
0:05:16 | the last one is that the number of objects in the java code could be very |
---|
0:05:21 | large which includes the methods the variables the objects or any piece of code and |
---|
0:05:27 | it is dynamic because as the programming goes on |
---|
0:05:32 | there could be objects removed from the code or introduced |
---|
0:05:37 | so |
---|
0:05:38 | that's it |
---|
0:05:39 | and |
---|
0:05:40 | then i'll talk about some closely related papers on |
---|
0:05:45 | how people |
---|
0:05:46 | dealt with this challenge before the first one is the iida et al two thousand |
---|
0:05:51 | ten |
---|
0:05:52 | paper they worked on reference resolution |
---|
0:05:56 | for dialogues from a collaborative game called tangram in this |
---|
0:06:01 | game there are seven objects and there are two players one |
---|
0:06:08 | is the instructor the other one |
---|
0:06:10 | will follow the instructions from the instructor to manipulate those objects |
---|
0:06:15 | so they used dialogue history and task history where dialogue |
---|
0:06:20 | history is |
---|
0:06:21 | any object that were mentioned |
---|
0:06:23 | recently or from the beginning of the dialogue |
---|
0:06:26 | and task history was |
---|
0:06:28 | any objects |
---|
0:06:29 | that were manipulated |
---|
0:06:31 | from the beginning of the task |
---|
0:06:34 | that's how they do it |
---|
0:06:35 | and |
---|
0:06:36 | the next one |
---|
0:06:38 | is |
---|
0:06:39 | the kennington and schlangen two thousand fifteen paper |
---|
0:06:42 | they used a words-as-classifiers model to learn the |
---|
0:06:46 | relationship between |
---|
0:06:49 | referring expression tokens |
---|
0:06:51 | and |
---|
0:06:54 | physical attributes |
---|
0:06:56 | in this setting |
---|
0:06:58 | there is a set of objects so they used a kind of co-location |
---|
0:07:04 | information like for a token |
---|
0:07:07 | they find all of the |
---|
0:07:11 | co-located attributes they manually match the referring |
---|
0:07:17 | expression and the referent so they find the co-location information |
---|
0:07:22 | between |
---|
0:07:24 | tokens and attributes |
---|
0:07:27 | so they used the learned |
---|
0:07:28 | information |
---|
0:07:30 | to predict the referent for a newly given referring expression |
---|
0:07:35 | so in this paper we follow iida et al and we use |
---|
0:07:41 | similar dialogue history and task history features |
---|
0:07:46 | so here's an example from the corpus |
---|
0:07:52 | look here the student just typed a line of code |
---|
0:07:56 | a rate goes to new |
---|
0:07:58 | int but well |
---|
0:08:00 | then |
---|
0:08:01 | another line of code there are only if that |
---|
0:08:05 | the array looks like it is set up correctly now so we can see |
---|
0:08:11 | here is a relationship |
---|
0:08:13 | kinda like a |
---|
0:08:15 | between the behavior |
---|
0:08:17 | and the |
---|
0:08:17 | the referring behavior so after that the tutor asked what should |
---|
0:08:25 | you be storing in that array so it is also coming very close so they will |
---|
0:08:31 | refer to the same thing kind of repeatedly locally |
---|
0:08:35 | so that's why we think dialogue history and |
---|
0:08:39 | task history are very important |
---|
0:08:42 | so we use them |
---|
0:08:43 | the third kind of information we use is semantic information |
---|
0:08:47 | given the referring expression which is a noun phrase this noun phrase has different segments |
---|
0:08:52 | each segment could indicate some kind of attribute of the referent for this referring |
---|
0:08:58 | expression so we used a |
---|
0:09:05 | conditional random field to segment and label this referring expression |
---|
0:09:09 | to find out |
---|
0:09:11 | the attribute information it gives so |
---|
0:09:18 | so |
---|
0:09:19 | after this segmentation and labeling we can find the attribute segments |
---|
0:09:26 | like in this referring expression array is the category |
---|
0:09:31 | and two dimensional |
---|
0:09:33 | indicates |
---|
0:09:34 | the dimension of this array |
---|
0:09:36 | so after that we extract the attribute value from each segment |
---|
0:09:42 | here |
---|
0:09:44 | and we use these values to make the attribute vector so this attribute vector |
---|
0:09:49 | is the set of attributes that the referent of |
---|
0:09:55 | this referring expression should have |
---|
0:09:59 | if we do it correctly right |
---|
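The step from labeled segments to an attribute vector could be sketched like this. It is a minimal illustration under stated assumptions: the label names (`CATEGORY`, `DIMENSION`, `O`), the `VALUE_MAP` table, and the function `attribute_vector` are hypothetical, and the segment labels are written by hand here rather than produced by a conditional random field.

```python
# Hypothetical labeler output for "the two dimensional array":
# (segment text, label) pairs; "O" marks a segment that carries no attribute.
labeled_segments = [
    ("the", "O"),
    ("two dimensional", "DIMENSION"),
    ("array", "CATEGORY"),
]

# Assumed table that normalises surface forms into attribute values.
VALUE_MAP = {"two dimensional": 2, "array": "array"}


def attribute_vector(segments):
    """Collect the attribute values the referent is expected to have."""
    vector = {}
    for text, label in segments:
        if label == "O":
            continue  # skip segments that indicate no attribute
        # Fall back to the raw text when no normalised value is known.
        vector[label] = VALUE_MAP.get(text, text)
    return vector


expected_attributes = attribute_vector(labeled_segments)
```

The resulting dictionary plays the role of the attribute vector: it lists what the referent should look like if the segmentation and labeling were done correctly.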
0:10:01 | and so after |
---|
0:10:05 | before starting the reference resolution task we want to make a |
---|
0:10:12 | candidate list because the number of objects in the |
---|
0:10:18 | in the java code could be very large |
---|
0:10:21 | because |
---|
0:10:22 | it |
---|
0:10:23 | contains like everything you know |
---|
0:10:25 | so it is a very intuitive approach firstly we use all of the |
---|
0:10:32 | mentioned objects so far |
---|
0:10:35 | from the beginning of the session |
---|
0:10:36 | and |
---|
0:10:38 | we include all of the manipulated objects |
---|
0:10:41 | from the beginning of the session into the candidate list |
---|
0:10:44 | and finally we include all of the objects that match any attribute of |
---|
0:10:50 | this |
---|
0:10:52 | referring expression |
---|
0:10:56 | so the reason |
---|
0:10:57 | here |
---|
0:10:58 | to match only one attribute is we don't want to miss any |
---|
0:11:02 | real referent just |
---|
0:11:05 | from a mistake in parsing the |
---|
0:11:08 | semantics so that's how we |
---|
0:11:11 | create the candidate list |
---|
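The three sources of candidates described above (mentioned objects, manipulated objects, and objects matching at least one attribute) could be combined as in the sketch below. The data layout (a dictionary from object id to attribute dictionary) and the name `build_candidates` are assumptions made for illustration.

```python
def build_candidates(objects, mentioned_ids, manipulated_ids, expr_attrs):
    """Union of mentioned objects, manipulated objects, and attribute matches.

    objects: dict mapping object id -> dict of attribute name -> value.
    expr_attrs: attribute vector extracted from the referring expression.
    """
    candidates = set(mentioned_ids) | set(manipulated_ids)
    for obj_id, attrs in objects.items():
        # A single matching attribute is enough, so that one parsing
        # mistake does not exclude the real referent.
        if any(attrs.get(name) == value for name, value in expr_attrs.items()):
            candidates.add(obj_id)
    return candidates


# Toy environment with hypothetical objects from a Java program.
objects = {
    "table": {"CATEGORY": "array", "DIMENSION": 2},
    "scores": {"CATEGORY": "array", "DIMENSION": 1},
    "i": {"CATEGORY": "variable"},
    "main": {"CATEGORY": "method"},
}
candidates = build_candidates(
    objects, ["i"], [], {"CATEGORY": "array", "DIMENSION": 2}
)
```

Matching on any single attribute keeps recall high at the cost of a longer candidate list, which is the trade-off the talk motivates.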
0:11:14 | and the |
---|
0:11:16 | here |
---|
0:11:17 | the reference resolution task is defined as finding the most compatible referent the most compatible |
---|
0:11:26 | object from the candidate list |
---|
0:11:29 | for this referring expression so |
---|
0:11:32 | this probability is defined as the output |
---|
0:11:35 | of a classification function |
---|
0:11:37 | so for the classification function here |
---|
0:11:40 | we used four different kinds of classifiers to see |
---|
0:11:43 | how they work in this setting |
---|
0:11:45 | we used logistic regression decision tree |
---|
0:11:48 | naive bayes and neural networks |
---|
0:11:51 | so here |
---|
0:11:54 | when |
---|
0:11:55 | we can see the probability |
---|
0:11:58 | of |
---|
0:12:00 | a given referring expression referring to a |
---|
0:12:03 | candidate in the candidate list |
---|
0:12:05 | so we can rank this |
---|
0:12:08 | probability for all of the candidates and pick the candidate with the highest probability as |
---|
0:12:14 | the referent |
---|
0:12:14 | so that's how we |
---|
0:12:16 | did it |
---|
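The ranking step, picking the candidate with the highest probability, could be sketched as follows. The function `resolve` and the toy scoring table are hypothetical; in the talk the score is the output of a trained classifier, which is replaced here by a fixed lookup so the example stays self-contained.

```python
def resolve(expression, candidates, score):
    """Return the candidate with the highest score(expression, candidate).

    score is any callable that returns a probability-like number;
    returns None when there are no candidates.
    """
    candidates = list(candidates)
    if not candidates:
        return None
    return max(candidates, key=lambda candidate: score(expression, candidate))


# Stand-in for a trained classifier: fixed, made-up probabilities.
TOY_PROBABILITIES = {"table": 0.8, "scores": 0.3, "i": 0.1}


def toy_score(expression, candidate):
    return TOY_PROBABILITIES.get(candidate, 0.0)


referent = resolve("the two dimensional array", ["i", "table", "scores"], toy_score)
```

Any of the four classifier types mentioned in the talk could sit behind `score`, as long as it returns a comparable number per candidate.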
0:12:18 | so |
---|
0:12:20 | here |
---|
0:12:24 | are the features we use the first group is the dialogue history features |
---|
0:12:29 | which are |
---|
0:12:31 | whether this object |
---|
0:12:34 | was mentioned |
---|
0:12:36 | how long ago |
---|
0:12:38 | was it mentioned |
---|
0:12:40 | the second |
---|
0:12:41 | group of features are |
---|
0:12:44 | the task history features |
---|
0:12:46 | like how long ago was this object |
---|
0:12:50 | manipulated like typed |
---|
0:12:52 | or selected or |
---|
0:12:55 | things like this |
---|
0:12:56 | the third group of features are |
---|
0:13:00 | the semantic features which measure how well |
---|
0:13:03 | the semantics of |
---|
0:13:05 | the referring expression match |
---|
0:13:07 | a given candidate |
---|
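The three feature groups described above could be computed for one candidate roughly as in this sketch. The feature names, the use of timestamps in seconds, and the match-ratio definition are all assumptions for illustration; the talk does not spell out the exact feature set.

```python
def extract_features(obj_id, now, last_mention, last_edit, expr_attrs, obj_attrs):
    """Build dialogue history, task history, and semantic features.

    last_mention / last_edit: dicts mapping object id -> time of the most
    recent mention / manipulation (assumed to be in seconds).
    """

    def recency(history):
        # Time elapsed since the last event, or None if it never happened.
        return now - history[obj_id] if obj_id in history else None

    matched = sum(
        1 for name, value in expr_attrs.items() if obj_attrs.get(name) == value
    )
    return {
        "mentioned_before": obj_id in last_mention,
        "time_since_mention": recency(last_mention),
        "manipulated_before": obj_id in last_edit,
        "time_since_manipulation": recency(last_edit),
        # Share of the expression's attributes that the candidate satisfies.
        "attribute_match_ratio": matched / len(expr_attrs) if expr_attrs else 0.0,
    }


example = extract_features(
    "table",
    now=100,
    last_mention={"table": 90},
    last_edit={},
    expr_attrs={"CATEGORY": "array", "DIMENSION": 2},
    obj_attrs={"CATEGORY": "array", "DIMENSION": 1},
)
```

A real classifier would need the `None` values encoded numerically, for example with a separate indicator feature as done here with `mentioned_before`.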
0:13:10 | so |
---|
0:13:13 | for the experiments we used |
---|
0:13:17 | six sessions |
---|
0:13:18 | of the tutorial dialogue |
---|
0:13:21 | and |
---|
0:13:22 | they contain three hundred sixty four referring expressions |
---|
0:13:27 | and we manually |
---|
0:13:29 | labeled their referents in the java code |
---|
0:13:32 | and |
---|
0:13:34 | we had two annotators |
---|
0:13:35 | and |
---|
0:13:37 | the |
---|
0:13:38 | we got a kappa of zero point six five |
---|
0:13:42 | and we used six fold cross validation which basically takes one session out |
---|
0:13:49 | does the training with |
---|
0:13:50 | the other five sessions and tests on the |
---|
0:13:53 | the |
---|
0:13:55 | the last one |
---|
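The six fold scheme described above, where each fold holds out one whole session, could be generated as in this sketch. The session identifiers and the function name `leave_one_session_out` are placeholders.

```python
def leave_one_session_out(sessions):
    """Yield (training sessions, held-out session) pairs, one per session."""
    for held_out in sessions:
        train = [session for session in sessions if session != held_out]
        yield train, held_out


# Six placeholder session identifiers give six folds.
sessions = ["s1", "s2", "s3", "s4", "s5", "s6"]
folds = list(leave_one_session_out(sessions))
```

Splitting by session rather than by individual referring expression keeps expressions from the same dialogue out of both training and test data at once.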
0:13:59 | to |
---|
0:14:01 | evaluate our approach we compared with two baseline models the first one is the iida et al |
---|
0:14:07 | baseline |
---|
0:14:08 | model they used dialogue history and task history in their |
---|
0:14:14 | task |
---|
0:14:15 | in their approach |
---|
0:14:16 | so to make it fair |
---|
0:14:19 | we added a handcrafted lexicon |
---|
0:14:23 | to provide some |
---|
0:14:25 | semantic |
---|
0:14:26 | information for this model |
---|
0:14:29 | the second baseline model is the kennington and schlangen baseline model |
---|
0:14:33 | because it was |
---|
0:14:36 | a weakly supervised approach |
---|
0:14:38 | and it |
---|
0:14:39 | did not perform |
---|
0:14:41 | reference resolution |
---|
0:14:43 | in a dialogue setting so |
---|
0:14:45 | to make it fair we added |
---|
0:14:47 | the dialogue history and task history features to this approach |
---|
0:14:53 | after that here are the |
---|
0:14:56 | the results |
---|
0:14:58 | we got |
---|
0:14:58 | as we can see our approach got |
---|
0:15:02 | a higher |
---|
0:15:04 | accuracy on reference resolution the reason why it is higher |
---|
0:15:10 | is |
---|
0:15:11 | is the |
---|
0:15:12 | the semantics we learned using the conditional random field which has a higher accuracy on the |
---|
0:15:18 | semantics |
---|
0:15:20 | so |
---|
0:15:22 | actually there are two groups of referring expressions |
---|
0:15:27 | for the reference resolution task because |
---|
0:15:29 | some of the referring expressions |
---|
0:15:31 | contain some semantic information |
---|
0:15:34 | that the segments indicate |
---|
0:15:36 | and some of them are just |
---|
0:15:38 | pronouns |
---|
0:15:39 | which do not have semantic |
---|
0:15:43 | information in them |
---|
0:15:46 | so our work here |
---|
0:15:49 | the contribution here is mostly on reference resolution for |
---|
0:15:55 | those referring expressions that contain semantic information |
---|
0:16:02 | and |
---|
0:16:03 | so to see how our |
---|
0:16:07 | approach could work given better |
---|
0:16:09 | semantic information we tested it using gold standard semantic labels which were made manually |
---|
0:16:17 | so |
---|
0:16:18 | here again |
---|
0:16:20 | using the gold standard semantics to run the same approach again we got |
---|
0:16:26 | a higher |
---|
0:16:28 | accuracy |
---|
0:16:30 | so |
---|
0:16:31 | this means the semantic information here is very important in doing this reference resolution task |
---|
0:16:39 | but |
---|
0:16:44 | there is still room to improve because |
---|
0:16:47 | the human agreement |
---|
0:16:49 | is |
---|
0:16:51 | eighty five percent which is a lot higher than the |
---|
0:16:58 | result from the approach |
---|
0:17:00 | so |
---|
0:17:02 | for the future work |
---|
0:17:05 | i think it will be promising to consider the structure |
---|
0:17:10 | of the dialogue |
---|
0:17:12 | and |
---|
0:17:14 | also an unsupervised or weakly supervised approach would also be very interesting since it doesn't |
---|
0:17:22 | require much annotation |
---|
0:17:25 | we're |
---|
0:17:27 | that's it |
---|
0:17:29 | and i want to |
---|
0:17:31 | thank our colleagues for their input |
---|
0:17:34 | and |
---|
0:17:35 | thank our sponsors |
---|
0:17:39 | thank you |
---|
0:18:10 | i'll repeat the question so you were asking whether we have different approaches for |
---|
0:18:14 | referring expressions |
---|
0:18:17 | like |
---|
0:18:18 | the pronouns and the non-pronouns |
---|
0:18:20 | right |
---|
0:18:21 | yes |
---|
0:18:23 | the difference here is only on the semantic information because we |
---|
0:18:28 | this |
---|
0:18:28 | the main contribution of this work |
---|
0:18:30 | is employing the |
---|
0:18:33 | semantic information from a referring expression but pronouns are pretty simple we don't have |
---|
0:18:39 | much information from them so we can run this |
---|
0:18:44 | this model this approach |
---|
0:18:46 | by splitting this |
---|
0:18:48 | the set of referring expressions they look kind of similar but |
---|
0:18:53 | yes i think we will consider this when we really build the entire tutoring system |
---|
0:18:59 | thank you |
---|
0:19:28 | yes |
---|
0:19:29 | the eye gaze would definitely give us more information |
---|
0:19:33 | like when we do the reference resolution |
---|
0:19:35 | it |
---|
0:19:36 | comes back to an assumption here |
---|
0:19:40 | they would look at the object when they refer to it |
---|
0:19:43 | so that |
---|
0:19:45 | could be |
---|
0:19:46 | another feature directly added to this approach |
---|
0:19:49 | or maybe there will be some more |
---|
0:19:53 | sophisticated way to use this kind of information |
---|
0:19:58 | thank you |
---|
0:20:07 | the mouse cursor |
---|
0:20:10 | actually we use the |
---|
0:20:13 | the selection |
---|
0:20:15 | which is |
---|
0:20:16 | the student might select |
---|
0:20:18 | this part of the code and ask how does it look or |
---|
0:20:23 | ask a question about it |
---|
0:20:25 | that is kind of |
---|
0:20:27 | one case of using the mouse or |
---|
0:20:29 | the cursor |
---|
0:20:32 | yes but that's definitely also very interesting |
---|
0:20:37 | information to consider in this case |
---|
0:20:52 | a well |
---|
0:20:55 | actually i haven't had a very deep consideration on this |
---|
0:21:00 | i just |
---|
0:21:01 | feel like |
---|
0:21:03 | in different |
---|
0:21:05 | like |
---|
0:21:06 | positions |
---|
0:21:08 | of |
---|
0:21:09 | the discourse structure this could give some |
---|
0:21:12 | interesting information on determining |
---|
0:21:15 | like the referent |
---|
0:21:22 | the details |
---|