so i'm presenting this on behalf of a whole team of people from my
agency a couple of them are here j is not
and this is gonna be a little bit different 'cause we're gonna have no
neural networks knock or run with of the pause and no f scores
no numbers
so it's gonna be a little different
so here's the problem that
we are addressing
so we start with the state of the art in dialogue systems actually a couple of people
and others have had a similar slide
what we're doing mostly is very simple parsing based on keywords phrases and so on
regular expressions as well
very simple dialogue models based on either finite state machines or frame systems with slot
filling and so on
engineered for a specific application
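To make the keyword and regular expression style of parsing concrete, here is a minimal sketch of a slot-filling frame for a hotel request. The patterns, the slot names and the `parse` function are illustrative assumptions, not taken from any system mentioned in the talk.

```python
import re

# Hypothetical hand-written rules for one application (hotel search).
# Every new phrasing or domain needs more rules like these.
INTENT_PATTERN = re.compile(r"\b(need|want|book)\b.*\bhotel\b")
SLOT_PATTERNS = {
    "area": re.compile(r"\bin the (centre|center|north|south|east|west)\b"),
    "price": re.compile(r"\b(cheap|moderate|expensive)\b"),
}


def parse(utterance):
    """Return a filled frame, or None if no keyword rule fires."""
    text = utterance.lower()
    if not INTENT_PATTERN.search(text):
        return None
    frame = {"intent": "find_hotel"}
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            frame[slot] = match.group(1)
    return frame


print(parse("I need a hotel in the centre"))
print(parse("I prefer very nice hotels"))
```

The first utterance fills the frame, while the second expresses a closely related intention and still falls through the rules, which is the kind of brittleness the talk is pointing at.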
and there's
thousands of applications for these
but
every single dialog system is developed for that specific application and in some cases
you hear that this is a
multi-domain system but essentially these are sort of separate dialogue systems that kind of work together
with a single interface
but importantly there is no transfer between these domains there is no generic
capability in these systems to transfer from
one domain to another
and as far as the kind of interactions that these systems allow
there's
no effective verification or correction the kind of dialogue that they allow is actually very
limited
so here's our position
dialogue is an activity that can be and should be modeled independently of the
application domain
we need understanding of language to effectively and robustly handle a broad range of
user utterances given that the same
intention can be expressed in so many different ways
and
most of these
finite state based systems with simple parsing just can't do that
unless somebody's willing to spend years just
encoding lots of regular expressions i suppose
and we also think that the community needs frameworks to facilitate the development
of these complex mixed-initiative systems with very sophisticated back-end reasoning and i think there's
a dearth of such tools
we see for example in parsing with the stanford tools or nltk or
other various tools
people adopted them and they started using them and they got better outcomes out of that
but in dialogue maybe we don't have sophisticated enough tools
tools that allow people to develop such systems
so
as you saw in the title our model is
based on collaborative problem solving so what is collaborative problem solving
well when people collaborate what do they do they deliberate they jointly develop solutions they
identify and resolve errors and problems they keep track of the progress as the
task is going on
they jointly perform actions and of course they can negotiate roles
and they learn from one another
and all these things are done through communication right it's not necessarily language communication
it could be gestures it could be other kinds of communication but it is by communication
so we need to
so
our central thesis is that essentially all or at least most of the human machine
language based communication can be modeled effectively
as collaborative problem solving
so
what does the collaborative problem solving model entail
so what we mean by this is that we need to model the
shared intentional space between the two agents some people actually have
said something about
multiple agents sort of
multi
agent dialogue here we just limit ourselves to two but
even with multiple agents the same principles apply
so what is this intentional space what kind of objects are we dealing with
these are goals solutions
an understanding of the common ground the situation and such things
and all this shared understanding
arises from communication we need to communicate and agree on things and so on
one agent can't
create a collaborative goal or a solution alone the two have to pursue something together
obviously a selection japan like to go without
the other person
so this is a picture taken from a paper by some
of my colleagues from
two thousand two
the model places the tasks in
four different areas communicative interaction collaborative problem solving problem solving individual problem
solving actually
and of course my interest in this talk is just about the
collaborative problem solving here
where if you look at the objects in there they really reflect the problem solving objects
the same kind of thing
except that their properties
so
the central thesis that we had back in the two thousands early two thousands and
it wasn't just us other people had the same idea is that at that level
you can reason in a domain independent way and represent things in a domain independent
way
but this has never been validated properly and we also didn't do it ourselves
we had a large prototype but we never really did it so here today i'm announcing
that we have now done that
and
this
this architecture would be familiar to all of you it doesn't look very different from
other things that we have seen so far
so we have natural language understanding there's a lexicon an ontology
the dialogue management which is really the collaborative problem solving agent that we have
is in the centre
there's the backend problem solving or as we call it here
a behavioral agent there's generation so this doesn't look very different from other systems
the parts that are in colour are the components of cogent
cogent is a domain independent shell right so by itself if you look at that you're not gonna
have a dialogue system just by having that but you can have a
dialogue system by adding to that
the behavioral agent which is domain specific not to mention language generation and of course
for generation you could perhaps have some higher level domain independent generation components but
we don't have it
so a lot of people can do sort of in domain generation
and
so
i'm just gonna talk a little bit about the components there
so the natural language understanding the workhorse of everything that we did for the last
twenty some years is the trips parser
it's a deep
parser that produces a very rich representation of the meaning of every
sentence it has a very rich principled ontology it has a very large lexicon
some of it ten thousand maybe more
are hand built lexical entries we extend it by learning from wordnet and the
rest we derive automatically so for verbs for example we derive automatically the roles that
they have from definitions
and so on
and
i'm not gonna go into too many details but it is available online
and you can actually check it there's a web service for the basic
parser and a number of variations of the parser as well
the output
positions
i don't see that
data
so i don't think this is actually visible but
so this is the
web interface i just put in a sentence earlier something that came up earlier i
need a hotel in the centre of calibration
and that's
what the parser outputs and you can see that
everything so there's a speech act and all
every single word represented here has a type in the ontology
so for hotel it's accommodation for need it's want
centre is a geographic region
even with the british spelling it got that right
and if you look for example at the next one i prefer very nice hotels
then you can see that prefer is also want just like need which is something
that you probably want
and you can see how adjectives have
very interesting types here the space here is basically a value on a scale of
expressiveness as it for and so on
so you get a very rich representation
well
there's additional things here dealing with reference resolution ellipsis processing ontology mapping
i'm not gonna talk too much about this
one of the things here is that there's conventional speech act identification sometimes
you can ask a question by making essentially an assertion or you can
make a request by asking a question
so there's a conventional mapping between the surface speech act and the user's intended speech act but
you just really
so now to the cps agent
so
essentially the output of all this natural language analysis is fed into the
collaborative problem solving agent and what it does is it provides a domain independent model of
communication adaptable to new domains
on
one side it does
what really could be called just intention recognition
so there's a communicative act coming in from a user utterance you want to understand what the
intention of the user is and we call that a collaborative problem solving act
and obviously on the other side i'm just not going to spend much time on
that
if the system itself wants to communicate to the user it will do that by
actually creating a collaborative problem solving act which gets sent to the generation component
and eventually will get turned into language
so this agent does that and essentially maintains the collaborative state
which
all these acts together essentially drive the conversational structure so that's why it is
a dialogue model
and again i'm going to repeat myself here but this is premised on the idea that there
is a domain independent semantics of language that supports
reasoning about intentions
so
there is a tension here between the desire for domain independent processing and the need for
very domain specific processing so
understanding the intention of the user is almost never possible to do in just a domain
independent way so the way we deal with this problem is that essentially the collaborative
problem solving agent's understanding of the user intention is a hypothesis
and then this is handed over to the behavioral agent which does sort of grounding of
all objects and is actually trying to figure out does this make any sense in
this particular state of the task does this make sense and if so then that
gets
committed as shared if it's a goal then the system can commit to it as a
shared goal but if not there can be clarification and so on going on
so the way this is done is based on an evaluate commit protocol
so the collaborative problem solving agent will figure out a plausible collaborative problem solving act which
explains the user utterance
it will send it for evaluation with an evaluate to the behavioral agent
and if the behavioral agent agrees it will send back an acceptable and only then
we have a commit to the goal as shared
and this is the same way that we're dealing with requests proposals and
questions as well
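The evaluate and commit exchange described here can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the cogent implementation: the class names, the dictionary used for act content and the blocks-counting rule are all hypothetical, and only the act labels (evaluate, acceptable, unacceptable, commit) come from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """The CPS agent's domain independent guess at the user's intention."""
    act: str        # e.g. "ADOPT" for taking on a goal (label assumed)
    content: dict   # hypothetical payload describing the goal


@dataclass
class Verdict:
    status: str     # "ACCEPTABLE" or "UNACCEPTABLE"
    reason: str = ""


class BlocksBehavioralAgent:
    """Toy domain specific agent that only knows how many blocks exist."""

    def __init__(self, blocks_available):
        self.blocks_available = blocks_available

    def evaluate(self, hypothesis):
        # Domain specific check: does the goal make sense in this task state?
        if hypothesis.content["blocks"] <= self.blocks_available:
            return Verdict("ACCEPTABLE")
        return Verdict("UNACCEPTABLE", "not enough blocks")


@dataclass
class CPSAgent:
    behavioral_agent: BlocksBehavioralAgent
    shared_goals: list = field(default_factory=list)

    def handle(self, hypothesis):
        # Step 1: send the hypothesis to the behavioral agent for evaluation.
        verdict = self.behavioral_agent.evaluate(hypothesis)
        # Step 2: commit to the goal as shared only if it was found acceptable.
        if verdict.status == "ACCEPTABLE":
            self.shared_goals.append(hypothesis.content)
        return verdict


agent = CPSAgent(BlocksBehavioralAgent(blocks_available=3))
print(agent.handle(Hypothesis("ADOPT", {"goal": "build-tower", "blocks": 2})))
print(agent.handle(Hypothesis("ADOPT", {"goal": "build-tower", "blocks": 5})))
print(agent.shared_goals)
```

The point of the split is that the CPS side never inspects the domain: it only relays a hypothesis and reacts to the verdict, so a different behavioral agent could be swapped in without touching it.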
if the ba
doesn't
like
the evaluation there are several different ways it can deal
with this one is to just send a rejection actually i think this should be unacceptable
but anyway
but
we usually like to do this and it can actually give a reason
it's a horizontal we don't have enough box for corporate law
it is also possible to propose alternative way and together that for a to the
resulting
i'm gonna skip some aspects of this model
so in the paper there is a very detailed description of the various collaborative
problem solving acts
so i'm not gonna go into the details so there's a number of them that have
to deal with goals so we can adopt select or defer a goal if
you don't wanna deal with it right now you can completely abandon the goal or
we can release it which means that it's completed
satisfactorily or not
and there's a bunch of acts about knowledge you can make an assertion that
once it is committed means that the agent now believes whatever
the human user intends to put in the belief
and for questions there's ask-if and ask-wh just two types of
questions
you can see a number of examples here
quite complicated examples these are actual examples from systems
including something like doesn't amount of sorely
at the conditional you
at a one that if we increase the amount of whatever the some other proteins
all
or i wh with choices of the gt wagner propose which are regulated by a
reinstall
so that's all of those and there's a number of acts related to the
problem solving status so again acceptable and unacceptable are essentially interpretation acts where
the ba says i like that or i don't like it then goals can be
rejected
there can be failures of execution answers to questions and execution status which can
be either
done at the very end but it can also be used to
just report progress i'm still working on this
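The groups of acts mentioned in this part of the talk can be laid out as a small lookup table. The act names below are transcribed from the talk, so their exact spelling is an assumption (the paper has the authoritative list), and the `ActGroup` enum and `acts_in` helper are hypothetical names introduced only for this sketch.

```python
from enum import Enum


class ActGroup(Enum):
    GOAL = "goal"            # acts that manage goals
    KNOWLEDGE = "knowledge"  # assertions and questions
    STATUS = "status"        # problem solving status reports


# Act names as heard in the talk; spelling and grouping are assumptions.
CPS_ACTS = {
    "ADOPT": ActGroup.GOAL,
    "SELECT": ActGroup.GOAL,
    "DEFER": ActGroup.GOAL,
    "ABANDON": ActGroup.GOAL,
    "RELEASE": ActGroup.GOAL,
    "ASSERTION": ActGroup.KNOWLEDGE,
    "ASK-IF": ActGroup.KNOWLEDGE,
    "ASK-WH": ActGroup.KNOWLEDGE,
    "ACCEPTABLE": ActGroup.STATUS,
    "UNACCEPTABLE": ActGroup.STATUS,
    "REJECTED": ActGroup.STATUS,
    "FAILURE": ActGroup.STATUS,
    "ANSWER": ActGroup.STATUS,
    "EXECUTION-STATUS": ActGroup.STATUS,
}


def acts_in(group):
    """List the act names that belong to one group, in table order."""
    return [name for name, g in CPS_ACTS.items() if g is group]


print(acts_in(ActGroup.KNOWLEDGE))
```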
okay well as you one is the u
so
what does it mean to add a behavioral agent to actually have a dialogue system based
on cogent
so
you can think of the cps acts as establishing sort of a protocol once you implement the
protocol and ensure that the obligations that these things create
are satisfied
then after that there's nothing else to do essentially there's no requirement for how the
behavioral agent represents things internally
whether it's a planning system or a very simple database lookup
what kind of complexity it has
how many sub agents are out there as long as there's a single
interface a single overarching ba everything should be fine
with it has a models alone
there are some related ways of affecting how the natural language understanding works
but is somewhat so you really want to use this and actually
change how the natural language understanding work because it's not good enough you ask the
did you never i'm not reliable
so we have a number of implemented cogent based systems in very different domains
with very different interaction styles
so biocuration
an assistant a biologist's assistant and a bunch of systems that have
to do with the blocks world
more or less
and some others that are sort of music composition visual storytelling that's creating
scenarios for making movies essentially with animated characters so very different domains very different
vocabulary very different interaction style
so i'm not gonna go too much into any of these systems
but one of the reviewers wanted to see the biocuration system
and i couldn't put too much into the paper because it wasn't published and it
still isn't really
but i'm gonna give you a little video of the system and
so these are all systems except for the one that was presented the other
day all these systems were not developed by the people behind cogent they were developed on
their own
so let's look at a bit of a dialogue
providing you understand looks like logical systems like
was there
one is going to be sensor
but the trees are a little bit
the rule machine i don't want the one here
sorry but
alright so here we would have sort of a the dialogue history then is a
idea a system by averages
what you from an implementation and what you what the goal here i want to
find out how you be shown in the
b equal to these two genes
and there's just outline i think it's probably best work
so what is the goal here i want to find an explanation so
it's a very interesting type of goal of how this happens
and the way the system knows how to provide an answer to that is to
build a model of the molecular interactions
and try to find out
one that you are maybe we which is kind of the source
useless is g the joan i in this particular cells
so
i'm gonna you go your
so the user then asks how does your maybe if we regulate pi okay now
why did they know you can see here about the p eight we hate you
'cause they're biologists obviously this is not a system for novices
and what the system does it actually looks there's a huge array of
biology specific agents
including ones that go look up pathways in databases
there's one that actually reads papers and can extract information from there
so it finds a path between these two
genes and it creates a network that the user can use as a source
of information
so i'm gonna speed up because i know my time is almost up
okay
so a and creates a so
i'm just gonna lexical and only because it is below
so now the user creates with the system a very specific model
of this
the system actually based on what it sees can suggest additional information based on
what it knows
and the user can look at it and say well okay that's
good enough or actually i know something even more specific than that
and the system comes back you can see here
but
to actually explain
the original question that the user asked
and there's more it can actually take this and create a dynamic model out of it
you can ask questions for example is the amount of whatever protein high and you can
see all kinds of useful information about that so i'll stop here
for
plan recognition we actually don't do it
in the cps agent
we don't actually use plan recognition right now i know that i
more me
running when you
understanding dialog
we
for now we don't at high
so
i think essentially the way i've been answering this question
is why were we successful with this where we weren't before even though we had done
more work before it's because of the way we split
what can be done in a domain independent way from what can be done in
a domain specific way
so a lot of the time in this evaluate commit protocol
we basically just throw things over the fence and say well you figure it out
so most of the situational context and there is not a model of user
modelling in this thing but all of this would actually reside right now
in the ba obviously you want at the cps level to
have some of it
to be able to do some more reasoning but right now we don't
we don't offer generation all the teams that have worked on this have
essentially created template based generation on their own and so we don't
provide
no
shortcuts
would be very difficult
well we started with similar goals right with collagen there are
actually some older papers dealing more with that question about the differences
i
there are some limitations in the collagen model there are some really good features in the
collagen model
so i think we go in the same direction but kind of
tackle things a little bit differently but actually i just learned recently that
the chart for each and others that have put together idea i toolkit
moving in the same direction
although as far as i understand i haven't seen it in practice theirs
is more task oriented kind of like reading floor
so you know what you know way they can move their expectations as the kind
of reduce their expectations
so i don't know if you saw it on the slides but there was a
link you can actually download it i recommend you use
at least the parser you can actually do much better than what people do
and if you want to use the whole system will be