0:00:16 | so i'm presenting this on behalf of a whole team of people from my |
---|
0:00:21 | agency some of whom are here today |
---|
0:00:28 | and this is gonna be a little bit different because we're gonna have no |
---|
0:00:32 | neural networks no corpora and no f-scores |
---|
0:00:38 | no numbers |
---|
0:00:39 | so it's gonna be a little different |
---|
0:00:41 | so here's the problem that |
---|
0:00:45 | we're at |
---|
0:00:46 | so the current state of the art in dialogue systems actually a couple of previous speakers and |
---|
0:00:52 | you know others have had a similar slide |
---|
0:00:57 | what we're doing mostly is very simple parsing based on keywords phrases and so on |
---|
0:01:04 | and regular expressions as well |
---|
0:01:08 | very simple dialogue models based on either finite state machines or frame systems with slot |
---|
0:01:18 | filling and so on |
---|
0:01:20 | engineered for a specific application |
---|
0:01:24 | and there are |
---|
0:01:24 | thousands of applications like these |
---|
0:01:28 | but |
---|
0:01:29 | every single dialogue system is developed for that specific application in some cases |
---|
0:01:33 | like the well known assistants they are |
---|
0:01:37 | multi-domain but essentially there are separate dialogue systems that kind of work together |
---|
0:01:41 | behind a single interface |
---|
0:01:44 | but importantly there is no transfer between these domains there is no generic |
---|
0:01:52 | capability in these systems to transfer from |
---|
0:01:56 | one domain to another |
---|
0:01:59 | and as far as the kinds of interactions that these systems allow |
---|
0:02:04 | there's |
---|
0:02:05 | no effective verification or correction the kind of dialogue they allow is actually very |
---|
0:02:12 | limited |
---|
0:02:15 | so here's our position |
---|
0:02:17 | dialogue is an activity that can be and should be modeled independently of the |
---|
0:02:23 | application domain |
---|
0:02:26 | we need deep understanding of language to effectively and robustly handle the broad range of |
---|
0:02:32 | user utterances because the same |
---|
0:02:36 | intention can be expressed in so many different ways |
---|
0:02:41 | and |
---|
0:02:42 | most of these |
---|
0:02:45 | finite state based systems with simple parsing are hopeless at handling the data seen in a |
---|
0:02:50 | given domain |
---|
0:02:53 | unless somebody's willing to spend years just |
---|
0:02:57 | encoding what regular expressions i suppose |
---|
0:03:00 | and we also think that the community needs frameworks to facilitate the development |
---|
0:03:05 | of these complex mixed-initiative systems with very sophisticated back-end reasoning and i think there's |
---|
0:03:12 | a dearth of such tools |
---|
0:03:15 | we've seen this for example in parsing with the stanford tools or nltk or |
---|
0:03:21 | various other tools |
---|
0:03:23 | people adopted them and started using them and they got better outcomes from that |
---|
0:03:29 | but in the dialogue maybe we don't have sophisticated enough tools |
---|
0:03:34 | tools that allow people to develop such systems |
---|
0:03:39 | so |
---|
0:03:41 | as you saw in the title our model is |
---|
0:03:45 | based on collaborative problem solving so what is collaborative problem solving |
---|
0:03:50 | well when agents collaborate what do they do they develop solutions jointly they |
---|
0:03:57 | identify and resolve errors and problems they monitor progress as the |
---|
0:04:03 | task is going on |
---|
0:04:07 | they jointly perform actions and of course they can negotiate roles |
---|
0:04:13 | and they learn from one another |
---|
0:04:14 | and all these things are done through communication right it's not necessarily by language communication |
---|
0:04:21 | could be gestures it could be other kinds of communication but it is by communication |
---|
0:04:30 | so |
---|
0:04:30 | our central thesis is that essentially all or at least most of human machine |
---|
0:04:36 | language based communication can be modeled effectively |
---|
0:04:40 | as collaborative problem solving |
---|
0:04:44 | so |
---|
0:04:45 | what does a collaborative problem solving model entail |
---|
0:04:50 | so what we mean by this is that we need to model the |
---|
0:04:54 | shared intentional space between the two agents some people have |
---|
0:05:01 | asked me something about |
---|
0:05:04 | multi-agent dialogue sort of |
---|
0:05:07 | multi-party |
---|
0:05:09 | agent dialogue here we just limit ourselves to two but |
---|
0:05:13 | even with multiple agents the same principles apply |
---|
0:05:16 | so what is in this shared intentional space what kind of objects are we dealing with |
---|
0:05:22 | these are partial solutions |
---|
0:05:24 | and understandings common ground and so on |
---|
0:05:29 | and all this shared understanding |
---|
0:05:32 | arises from communication we need to communicate and agree on things and so on |
---|
0:05:39 | one agent can't |
---|
0:05:40 | create a collaborative goal or a solution there has to be agreement to pursue something together |
---|
0:05:48 | obviously a single agent can't collaborate without |
---|
0:05:51 | the other person |
---|
0:05:53 | so this is a picture taken from a paper by allen and a couple |
---|
0:05:59 | of my other colleagues from |
---|
0:06:01 | two thousand two |
---|
0:06:03 | the model places the tasks in |
---|
0:06:10 | four different areas communicative interaction collaborative problem solving problem solving and individual problem |
---|
0:06:19 | solving |
---|
0:06:20 | and of course my interest in this talk is just about the |
---|
0:06:24 | collaborative problem solving here |
---|
0:06:26 | and if you look at the objects in there they really reflect the problem solving objects |
---|
0:06:32 | the same kind of thing |
---|
0:06:33 | except that they are shared |
---|
0:06:35 | so |
---|
0:06:37 | the central thesis that we had back in two thousand two and |
---|
0:06:43 | it wasn't just us other people have had the same idea is that at that level |
---|
0:06:48 | you can reason in a domain independent way and represent things in a domain independent |
---|
0:06:54 | way |
---|
0:06:55 | but this has never been realized properly and we also didn't we had problems |
---|
0:07:02 | we had a large prototype but we never really did it so here today i'm announcing |
---|
0:07:08 | that we now do |
---|
0:07:10 | and |
---|
0:07:12 | this |
---|
0:07:13 | this architecture will be familiar to all of you it doesn't look very different from |
---|
0:07:17 | other things that we've seen so far |
---|
0:07:19 | so we have natural language understanding there's a lexicon an ontology |
---|
0:07:23 | then dialogue management which is really the collaborative problem solving agent that we have |
---|
0:07:28 | in the centre |
---|
0:07:29 | there's the backend problem solving or as we call it here |
---|
0:07:33 | a behavioral agent and there's generation so this doesn't look very different from other systems |
---|
0:07:41 | the parts that are in colour are the components of cogent |
---|
0:07:46 | which is domain independent right so by itself if people look at that you're not gonna |
---|
0:07:51 | have a dialogue system just by having that but you can have a |
---|
0:07:56 | dialogue system by adding to it |
---|
0:08:01 | the behavioral agent which is domain specific and not to mention the language generation and of course |
---|
0:08:07 | for generation you could probably have some higher level domain independent generation components but |
---|
0:08:14 | we don't have that |
---|
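As an aside for readers, the division of labor just described can be sketched in a few lines of Python. This is a hypothetical illustration, not COGENT's actual API: the NLU and CPS agent stand in for the domain-independent shell, while the behavioral agent and generator are the per-application parts.

```python
# Hypothetical sketch of the architecture described in the talk (illustrative
# names, not COGENT's actual API). The NLU and CPS agent are the
# domain-independent shell; the behavioral agent and generator are supplied
# per application.

class EchoNLU:
    """Stand-in for the parser: lexicon plus ontology produce a logical form."""
    def parse(self, utterance):
        return {"speech_act": "TELL", "content": utterance}

class CPSAgent:
    """Domain-independent dialogue management: intention recognition."""
    def interpret(self, lf):
        # Map the communicative act to a collaborative problem solving act.
        return [("PROPOSE-GOAL", lf["content"])]

class DBLookupBA:
    """Domain-specific behavioral agent: here, a trivial lookup table."""
    def __init__(self, table):
        self.table = table
    def respond(self, cps_acts):
        _act, content = cps_acts[0]
        return self.table.get(content, "clarify")

class TemplateGenerator:
    """Per-application surface generation (template-based, as in the talk)."""
    def realize(self, response):
        return "System: " + response

def run_turn(utterance, nlu, cps, ba, gen):
    return gen.realize(ba.respond(cps.interpret(nlu.parse(utterance))))

reply = run_turn("find hotels", EchoNLU(), CPSAgent(),
                 DBLookupBA({"find hotels": "searching hotels"}),
                 TemplateGenerator())
```

Only the two classes in the middle would need to be rewritten per domain, which is the point being made in the talk.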
0:08:16 | so a lot of people just do sort of in-domain generation |
---|
0:08:22 | and |
---|
0:08:23 | so |
---|
0:08:25 | i'm just gonna talk a little bit about these components |
---|
0:08:29 | so for natural language understanding the workhorse of everything that we've done for the last |
---|
0:08:35 | twenty some years is the trips parser |
---|
0:08:39 | it's a d |
---|
0:08:45 | the that is too sparse to use a very representation of the meaning of every |
---|
0:08:50 | a sentence it has a very sure principle ontology it has a very large lexicon |
---|
0:08:57 | some of it or ten thousand maybe more |
---|
0:09:00 | hand-built lexical entries we extend it by learning from wordnet but in |
---|
0:09:07 | addition some we derive automatically from freebase for example we derive automatically the roles |
---|
0:09:13 | from the definitions |
---|
0:09:16 | and so on |
---|
0:09:19 | and |
---|
0:09:22 | i'm not gonna go into too many details but it is available online |
---|
0:09:26 | and you can actually check it out there's a web service for the basic |
---|
0:09:30 | parser and a number of variations of the parser as well |
---|
0:09:34 | the output |
---|
0:09:51 | i don't see that there |
---|
0:09:54 | so i don't think this is actually visible but |
---|
0:09:58 | so this is the |
---|
0:10:00 | web interface i just put in a sentence from earlier something that came up earlier i |
---|
0:10:06 | need a hotel in the centre of cambridge |
---|
0:10:09 | and that's |
---|
0:10:10 | what the parse looks like and you can see |
---|
0:10:15 | everything so there's a speech act at the top |
---|
0:10:18 | every single word represented here has a type in the ontology |
---|
0:10:24 | so for hotel it's accommodation for need it's want |
---|
0:10:28 | and centre is a geographic region |
---|
0:10:31 | even with the british spelling it got that right |
---|
0:10:37 | and if you look for example at the next one i prefer very nice hotels |
---|
0:10:42 | you can see that prefer is also want just like need which is something |
---|
0:10:47 | that you probably want |
---|
0:10:51 | and you can see how adjectives have |
---|
0:10:55 | very interesting types here nice here is basically a value on a scale |
---|
0:11:00 | of some attribute and so on |
---|
0:11:03 | so you get a very rich representation |
---|
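To make the preceding description concrete, here is a hypothetical sketch of such an ontology-typed logical form for "I need a hotel in the centre of Cambridge". The structure is illustrative only and differs from the actual TRIPS output format; the word-to-type mappings are the ones named in the talk.

```python
# Hypothetical sketch of an ontology-typed logical form (illustrative only,
# not the actual TRIPS output format). Every content word gets a type in
# the ontology, as described in the talk.
logical_form = {
    "speech_act": "TELL",
    "terms": [
        {"var": "V1", "type": "ONT::WANT", "word": "need",
         "experiencer": "V0", "neutral": "V2"},
        {"var": "V2", "type": "ONT::ACCOMMODATION", "word": "hotel",
         "location": "V3"},
        {"var": "V3", "type": "ONT::GEOGRAPHIC-REGION", "word": "centre"},
    ],
}

# "I prefer very nice hotels" would map "prefer" to the same ONT::WANT type
# as "need", which is why both can be handled uniformly downstream.
types = {t["word"]: t["type"] for t in logical_form["terms"]}
```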
0:11:14 | well |
---|
0:11:15 | there are additional things here dealing with reference resolution ellipsis processing ontology mapping |
---|
0:11:21 | i'm not gonna talk too much about these |
---|
0:11:25 | one thing to note here is that there's conventional speech act identification sometimes |
---|
0:11:29 | you can ask a question by actually making an assertion or you can |
---|
0:11:36 | make an assertion by asking a question or make a request by asking a question |
---|
0:11:41 | so there's a conventional mapping between the surface speech act and the intended speech act |
---|
0:11:49 | that we use |
---|
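The conventional surface-to-intended mapping described here can be illustrated with a toy rule table. The cues and act names below are hypothetical, not the system's actual rules:

```python
# Toy illustration of conventional mappings between surface speech acts and
# intended speech acts (hypothetical cues, not the system's actual rules).
CONVENTIONAL_MAPPINGS = [
    # (surface act, opening cue) -> intended act
    (("QUESTION", "can you"), "REQUEST"),         # "Can you open the door?"
    (("ASSERTION", "i need"), "REQUEST"),         # "I need a hotel ..."
    (("QUESTION", "did you know"), "ASSERTION"),  # rhetorical question
]

def intended_act(surface_act, utterance):
    text = utterance.lower()
    for (act, cue), intended in CONVENTIONAL_MAPPINGS:
        if act == surface_act and text.startswith(cue):
            return intended
    return surface_act  # default: the surface act is the intended act

result = intended_act("QUESTION", "Can you open the door?")
```

A real system would of course condition on more than the opening words, but the shape of the mapping is the same.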
0:11:51 | so on to the cps agent |
---|
0:11:54 | so a |
---|
0:11:56 | essentially the output of all these natural language analyses feeds into the |
---|
0:12:01 | collaborative problem solving agent and what it does is it provides a domain independent model of |
---|
0:12:06 | communication adaptable to new domains |
---|
0:12:10 | on |
---|
0:12:11 | one side it does |
---|
0:12:13 | what really could be called just intention recognition |
---|
0:12:16 | so there's a communicative act coming in from the user utterance we want to understand what the |
---|
0:12:22 | intention of the user is and we map that to a collaborative problem solving act |
---|
0:12:28 | and obviously there's the other side as well i'm not gonna spend as much time on |
---|
0:12:32 | that |
---|
0:12:34 | if the system itself wants to communicate to the user it will do that by |
---|
0:12:39 | actually creating a collaborative problem solving act which then gets sent to the generation component |
---|
0:12:45 | and eventually gets realized as language |
---|
0:12:50 | so this agent does that and essentially maintains the collaborative problem solving state |
---|
0:12:57 | and |
---|
0:12:58 | all these acts together essentially drive the conversational structure so that's why it is |
---|
0:13:03 | a dialogue model |
---|
0:13:06 | and again i'm going to repeat myself here but this is the prime idea that there |
---|
0:13:09 | is a domain independent semantics of language that supports |
---|
0:13:14 | reasoning about intentions |
---|
0:13:18 | so |
---|
0:13:20 | there is a tension here between the desire for domain independent processing and the need for |
---|
0:13:26 | very often domain specific processing so |
---|
0:13:29 | understanding the intention of the user is almost never possible to do in a purely domain |
---|
0:13:35 | independent way so the way we deal with this problem is that essentially the collaborative |
---|
0:13:41 | problem solving agent's understanding of the user intention is a hypothesis |
---|
0:13:48 | and then this is handed over to the behavioral agent which can do sort of grounding of |
---|
0:13:52 | all the objects and is actually trying to figure out does this make any sense in |
---|
0:13:57 | this particular state of the task does this make sense at all and if so then it |
---|
0:14:04 | can be |
---|
0:14:05 | committed to so if it's a goal then the system can commit to it |
---|
0:14:12 | as a shared goal but if not there can be clarification and so on going on |
---|
0:14:19 | so the way this is done is based on the propose evaluate commit |
---|
0:14:24 | model |
---|
0:14:25 | so the collaborative problem solving agent will figure out a probable problem solving act which |
---|
0:14:32 | explains the user utterance |
---|
0:14:35 | it will send an evaluation an evaluate act to the behavioral agent |
---|
0:14:40 | and if the behavioral agent agrees it will send back an acceptable and only then |
---|
0:14:45 | do we have a commit and the goal becomes shared |
---|
0:14:51 | and this is the same way that we're dealing with requests and proposals and |
---|
0:14:57 | questions as well |
---|
0:15:00 | if the ba |
---|
0:15:03 | doesn't |
---|
0:15:04 | like |
---|
0:15:05 | the evaluation there are several different ways it can deal |
---|
0:15:10 | with this one is to just send a rejection actually i think this should be unacceptable |
---|
0:15:15 | but anyway |
---|
0:15:16 | but |
---|
0:15:17 | we use reject to do this and it can actually give a reason |
---|
0:15:23 | for example we don't have enough blocks for the proposed goal |
---|
0:15:28 | it is also possible to propose an alternative and that then gets relayed to the |
---|
0:15:34 | user |
---|
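The propose-evaluate-commit exchange just described can be sketched as follows. The message shapes and the blocks-world agent are hypothetical; the real act inventory is described in the paper.

```python
# Minimal sketch of propose-evaluate-commit (hypothetical message shapes).
# The CPS agent proposes, the behavioral agent evaluates, and only an
# "acceptable" verdict leads to a commit that makes the goal shared.

def propose_goal(goal, behavioral_agent, shared_goals):
    verdict = behavioral_agent.evaluate(goal)
    if verdict["status"] == "acceptable":
        shared_goals.append(goal)  # commit: the goal becomes shared
        return {"act": "COMMIT", "goal": goal}
    # On rejection the BA may supply a reason and/or an alternative,
    # which gets relayed back to the user.
    return {"act": "REJECT",
            "reason": verdict.get("reason"),
            "alternative": verdict.get("alternative")}

class BlocksWorldBA:
    """Toy behavioral agent that accepts goals it has enough blocks for."""
    def __init__(self, blocks):
        self.blocks = blocks
    def evaluate(self, goal):
        if goal["blocks_needed"] <= self.blocks:
            return {"status": "acceptable"}
        return {"status": "unacceptable",
                "reason": "not enough blocks",
                "alternative": {"blocks_needed": self.blocks}}

shared = []
ba = BlocksWorldBA(blocks=3)
ok = propose_goal({"blocks_needed": 2}, ba, shared)
bad = propose_goal({"blocks_needed": 5}, ba, shared)
```

The rejection path carrying a reason and an alternative corresponds to the clarification behavior described in the talk.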
0:15:36 | i'm gonna skip this aspect it's just more of the model |
---|
0:15:39 | so in the paper there is a very detailed description of the various collaborative |
---|
0:15:44 | problem solving acts |
---|
0:15:46 | so i'm not gonna go into the details there's a number of them that have |
---|
0:15:50 | to deal with goals so we can adopt or select a goal we can defer a goal if |
---|
0:15:55 | we don't wanna deal with it right now we can completely abandon the goal or |
---|
0:15:59 | we can release it releasing means that it's completed |
---|
0:16:02 | satisfactorily or not |
---|
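The goal-related acts just listed (select, defer, abandon, release) can be viewed as a small lifecycle. The sketch below uses illustrative state names, not the exact act inventory from the paper.

```python
# Illustrative goal lifecycle for the acts described above (names are
# illustrative, not the exact inventory from the paper).
from enum import Enum

class GoalState(Enum):
    PROPOSED = "proposed"
    SELECTED = "selected"    # actively being pursued
    DEFERRED = "deferred"    # set aside for now
    ABANDONED = "abandoned"  # dropped entirely
    RELEASED = "released"    # completed, satisfactorily or not

TRANSITIONS = {
    GoalState.PROPOSED: {GoalState.SELECTED, GoalState.DEFERRED,
                         GoalState.ABANDONED},
    GoalState.SELECTED: {GoalState.DEFERRED, GoalState.ABANDONED,
                         GoalState.RELEASED},
    GoalState.DEFERRED: {GoalState.SELECTED, GoalState.ABANDONED},
}

def apply_act(state, new_state):
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"cannot go from {state} to {new_state}")
    return new_state

state = apply_act(GoalState.PROPOSED, GoalState.SELECTED)
state = apply_act(state, GoalState.RELEASED)  # completed, well or not
```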
0:16:06 | and there's a bunch that support knowledge making an assertion that is |
---|
0:16:10 | actually once it is committed to that means that the agent now believes whatever |
---|
0:16:18 | the human user intended to convey |
---|
0:16:25 | questions can be asked both yes no questions and wh |
---|
0:16:30 | questions |
---|
0:16:31 | and you can see here a number of examples |
---|
0:16:33 | quite complicated examples these are actual examples from system use |
---|
0:16:38 | including something like a conditional does the amount of |
---|
0:16:41 | so and so |
---|
0:16:43 | change if we increase the amount of whatever some other proteins |
---|
0:16:47 | and so on |
---|
0:16:48 | or wh questions with choices which of the genes you propose are regulated by |
---|
0:16:53 | such and such |
---|
0:16:55 | so this is all in the model and there's a number of acts related to the |
---|
0:16:58 | problem solving status so again acceptable and unacceptable are essentially evaluation answers where |
---|
0:17:06 | the ba says i like that or i don't like it and then the goal can be |
---|
0:17:11 | rejected |
---|
0:17:13 | there can be failures of execution answers to questions and execution status which can |
---|
0:17:20 | be either |
---|
0:17:21 | done at the very end but it can also be used to |
---|
0:17:25 | just report progress i'm still working on this |
---|
0:17:30 | okay well |
---|
0:17:35 | so |
---|
0:17:36 | what does it mean to add a behavioral agent to actually have a dialogue system based |
---|
0:17:42 | on cogent |
---|
0:17:43 | so |
---|
0:17:45 | you can think of the cps acts as establishing a sort of a protocol you implement the |
---|
0:17:49 | protocol and ensure that the obligations that these acts create |
---|
0:17:55 | are satisfied |
---|
0:17:57 | and after that there's nothing else to do essentially there's no requirement for how the |
---|
0:18:02 | behavioral agent represents things internally |
---|
0:18:05 | it could be a planning system or a very simple database lookup |
---|
0:18:09 | whatever back-end complexity it has |
---|
0:18:12 | however many sub-agents are out there as long as there's a single |
---|
0:18:17 | interface a single overarching agent everything should be fine |
---|
0:18:21 | whatever models it has |
---|
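The claim above, that the shell places no constraints on the behavioral agent's internals as long as the protocol obligations are met, can be sketched as a minimal interface. The method names and status strings here are hypothetical:

```python
# Hypothetical minimal interface for a behavioral agent: the shell only
# requires that protocol obligations are met (answering evaluations,
# reporting execution status), not any particular internal representation.
from abc import ABC, abstractmethod

class BehavioralAgentInterface(ABC):
    @abstractmethod
    def evaluate(self, cps_act):
        """Return 'acceptable' or 'unacceptable' for a proposed act."""

    @abstractmethod
    def execute(self, goal):
        """Return an execution status: 'done', 'progress', or 'failure'."""

class LookupBA(BehavioralAgentInterface):
    """A plain database lookup satisfies the same obligations a planner would."""
    def __init__(self, db):
        self.db = db
    def evaluate(self, cps_act):
        return "acceptable" if cps_act in self.db else "unacceptable"
    def execute(self, goal):
        return "done" if goal in self.db else "failure"

ba = LookupBA({"find-hotel"})
status = ba.evaluate("find-hotel")
```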
0:18:24 | there are also some limited ways of affecting how the natural language understanding works |
---|
0:18:30 | but sometimes you really want to use this and actually |
---|
0:18:36 | change how the natural language understanding works because it's not good enough you know |
---|
0:18:42 | it's never a hundred percent reliable |
---|
0:18:45 | so we have a number of cogent based systems implemented in very different domains |
---|
0:18:51 | with very different interaction styles |
---|
0:18:54 | so a biocuration |
---|
0:18:57 | assistant a biologist's assistant and a bunch of systems that have |
---|
0:19:03 | to do with the blocks world |
---|
0:19:05 | more or less |
---|
0:19:07 | and some others that are sort of music composition visual storytelling that's creating |
---|
0:19:13 | scenarios for making movies essentially with animated characters so very different domains very different |
---|
0:19:20 | vocabulary very different interaction styles |
---|
0:19:25 | so i'm not gonna go too much into the systems we have built |
---|
0:19:29 | but one of the reviewers wanted to see the biocuration system |
---|
0:19:34 | and i couldn't put too much into the paper because it wasn't published and it |
---|
0:19:38 | still isn't really |
---|
0:19:40 | but i'm gonna give you a little video of the system |
---|
0:19:45 | so these are all systems except for the one that was presented the other |
---|
0:19:50 | day all these systems were not developed by the people behind cogent they were developed |
---|
0:19:56 | on their own |
---|
0:19:57 | so let's look at a bit of a dialogue |
---|
0:20:09 | [setting up the video playback] |
---|
0:20:27 | sorry about that |
---|
0:20:31 | alright so here we have sort of the dialogue history and there's a |
---|
0:20:36 | diagram the system builds |
---|
0:20:39 | and what the goal here is i want to |
---|
0:20:45 | find out how one gene is related |
---|
0:20:50 | to these two genes |
---|
0:20:54 | and there's just an outline i think it's probably easiest to work through it |
---|
0:20:58 | so i'm so what is the goal here i want to find an explanation so |
---|
0:21:03 | it's a very interesting type of goal of how this happens |
---|
0:21:09 | and the way the system knows how to provide an answer to that is to |
---|
0:21:13 | build a model of the molecular interactions |
---|
0:21:18 | and then try to find out |
---|
0:21:20 | whether the one gene which is kind of the source |
---|
0:21:25 | regulates the other genes in these particular cells |
---|
0:21:30 | so |
---|
0:21:31 | i'm gonna move on |
---|
0:21:39 | so the user then asks how does this gene regulate that one now |
---|
0:21:43 | how do they know about these genes you can see here |
---|
0:21:47 | because they're biologists obviously this is not a system for novices |
---|
0:21:51 | and what the system does it actually looks there's a huge array of |
---|
0:21:57 | biology specific agents |
---|
0:21:59 | including ones that go look up pathways in pathway databases |
---|
0:22:05 | there's one that actually reads papers and can extract information from there |
---|
0:22:11 | so it finds a path between these two |
---|
0:22:16 | genes and it creates a network that the user can use as a source |
---|
0:22:22 | of information |
---|
0:22:24 | so i'm gonna speed up because i know my time is already running out is that |
---|
0:22:29 | okay |
---|
0:22:31 | so it creates a |
---|
0:22:35 | i'm just gonna skip ahead because it is slow |
---|
0:22:39 | so now the user creates with the system a very specific model |
---|
0:22:44 | of this |
---|
0:22:46 | the system based on what it sees can suggest additional information based on |
---|
0:22:52 | what it knows |
---|
0:22:54 | and the user can look at it and say well okay that's |
---|
0:22:56 | good enough or actually i know something even more specific than that |
---|
0:23:00 | and the system comes back you can see here |
---|
0:23:03 | but |
---|
0:23:05 | to actually explain |
---|
0:23:07 | the original question that the user asked |
---|
0:23:11 | and there's more it can actually take this and create a dynamic model and it |
---|
0:23:17 | can ask questions for example is the amount of whatever protein high and you can |
---|
0:23:23 | see all kinds of useful information so i'll stop here |
---|
0:23:44 | for |
---|
0:23:46 | plan recognition we actually don't |
---|
0:23:49 | in the cps agent |
---|
0:23:53 | we don't actually use right now plan recognition i know that it |
---|
0:23:58 | matters |
---|
0:23:59 | for |
---|
0:24:01 | understanding dialogue |
---|
0:24:03 | but |
---|
0:24:04 | for now we don't do it |
---|
0:24:11 | so |
---|
0:24:12 | you can see essentially the way i'd answer this question |
---|
0:24:17 | is why were we successful with this where we weren't before and hadn't done |
---|
0:24:23 | more before because of the way we split |
---|
0:24:27 | what can be done in a domain independent way from what can be done in |
---|
0:24:30 | a domain dependent way |
---|
0:24:32 | so a lot of the time as i said in this evaluate commit cycle |
---|
0:24:36 | we basically just throw things over the fence and say well you figure it out |
---|
0:24:39 | so most of the situational context and there is no model of user |
---|
0:24:43 | modelling in this thing but where all of this would actually reside right now |
---|
0:24:48 | is in the ba obviously you'd want at the cps level to |
---|
0:24:52 | have some of it |
---|
0:24:53 | to be able to do some more reasoning but right now we don't |
---|
0:25:10 | we don't offer generation at all the teams that have worked on this have |
---|
0:25:15 | essentially created template based generation on their own so no we don't |
---|
0:25:25 | provide it |
---|
0:25:34 | no |
---|
0:25:35 | shortcuts |
---|
0:25:37 | would be very difficult |
---|
0:25:51 | well we started with similar goals right as with collagen there are |
---|
0:26:00 | actually some of these older papers dealing more with that question about the differences |
---|
0:26:07 | and |
---|
0:26:08 | there are some limitations in the collagen model and there are some really good features in the |
---|
0:26:14 | collagen model |
---|
0:26:15 | so i think we go in the same direction but kind of |
---|
0:26:20 | tackle things a little bit differently but actually i just learned recently that |
---|
0:26:29 | some others have put together a dialogue toolkit |
---|
0:26:35 | moving in the same direction |
---|
0:26:37 | although as far as i understand i haven't seen it in practice theirs |
---|
0:26:42 | is more task oriented |
---|
0:26:47 | so in a way they can narrow their expectations kind |
---|
0:26:52 | of reduce their expectations |
---|
0:26:57 | so i don't know if the source is on the slide |
---|
0:27:03 | there's a link you can actually download it i recommend you to use it |
---|
0:27:07 | at least the parser you can actually do much better than what many people do |
---|
0:27:10 | and if you want to use the whole system that would be |
---|