thank you for that kind introduction i'm so excited that you invited
me it's been several years since i've been to this conference
and in looking over the papers at the conference i was really excited to
see that some of them are
on topics that some of us have been interested in for a long time but we have
been unable to wrestle under
control enough to study in some cases and some of us have managed to
make some headway there
so things like flexible initiative in interaction interruptions reference resolution
entrainment incremental understanding nonverbal behaviors sarcasm social strategies
non-native speakers and also non-task-oriented dialog
so for those of us who've been studying dialogue for a while it's really
gratifying to see that social and cognitive questions that were so interesting yet so
ill formed a couple of decades ago
are now making their way into real systems at least
aspirationally if not functionally
so i want to ask you guys a question
i meant is that it would be okay if i did a little history of
some early findings in this talk
and i want to ask you
when did you first start studying dialogue and why do you study dialogue
i wanted to know did you choose this topic or did it choose you
when you first started working on dialogue were you fortunate enough to be in an
environment where everyone around you was also interested in dialogue or did you have to swim
upstream
i have the feeling that some of you had to swim upstream and you just kind
of
fit the topic into whatever lab you were working in
i want to just say that back in the mid eighties when lyn walker and
i were working at the hewlett packard natural language group
we had a manager well at least he tried to manage us but we were pretty
much unmanageable given what you know we were talking about and
he never really appreciated why we were so interested in pronouns okay
we were really interested in pronouns so we started off basically trying to
study connected discourse because we were shallow people who wanted to become famous okay so
that that's why this computational guy one that we were starting from its okay and
so basically while working at hewlett packard
and
waffling about whether to go back to grad school and whether to get my phd in
psychology or linguistics or computer science
i discovered candy sidner and barbara grosz's work in ai on discourse and i
just found it incredibly exciting i thought it was very exciting that they were actually
looking at language and trying to explain the structure of language via rules or whatever
okay
and then when i went to my first little workshop on discourse and dialogue
i found it incredibly exciting that so many of the big names in the
field were women barbara grosz candy sidner julie a bunny karen's marginal and
all of these people i found really inspiring so i knew that was my field and
i wound up in psychology
not quite a flip of the three sided coin but close to it okay
now i'm very much a psychologist
so i want to show you an early example of the kind of thing that
led to the kind of psychology experiments i did later on this comes from
back in the days when i was working at hewlett packard in the natural language
group
and there was a system that came out a database query system called
q and a from symantec and it enabled people to do typed dialogues like
this one where the user could type in a question about a little database
who has a computer you imagine a database of
equipment employees managers et cetera
the system responds with shall i do the following create a report showing the full name and
manager and equipment from all forms on which equipment includes computer
now back in those days there were many people who just loved this little system
and thought that there was nothing at all wrong with a dialogue like that okay
so one of the first things that i thought when i got there was this
just seems kind of atrocious so they put me in charge of working on the
articulator
for a while and so
i basically thought the most obvious hack to fix that was just to provide answers
that parallel the form of the question that elicited them so
i just took the parse trees and i ran them backwards and i stuck the answer
into the wh position in the parse tree
this solution was not endorsed or popular with
the linguists on the project they found it unprincipled and
yes it didn't work all the time they thought it was kind of cheesy but
it worked ninety five percent of the time and so that's good enough for a demo
as you well know and so that's what we had for a while
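A minimal sketch of the hack just described, under a strong simplifying assumption: a flat list of words stands in for the parse tree, and the answer is dropped into the slot held by the wh-word. The function name and the word list are hypothetical and are not taken from the original system.

```python
# Hypothetical sketch: the system described in the talk worked on parse trees;
# a flat token list with a single wh-word stands in for the tree here.
WH_WORDS = {"who", "what", "where", "which", "when"}


def answer_in_question_form(question_tokens, answer):
    """Return the question's words with the first wh-word replaced by the answer."""
    out = []
    replaced = False
    for token in question_tokens:
        if not replaced and token.lower() in WH_WORDS:
            out.append(answer)  # the answer takes over the wh position
            replaced = True
        else:
            out.append(token)
    return " ".join(out)


print(answer_in_question_form(["who", "has", "a", "computer"], "dan"))
```

This only behaves well when the wh-word sits where the answer would go in a statement, which fits the remark that the trick did not work all the time.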
so basically noticing this problem early on and the obvious fix
began to kind of fire up an interest in what i now think of as
entrainment in dialogue which is this connectedness between
utterances where people are using information that the other person may have endorsed or introduced
and creating
together something that i think of as a conceptual pact to proceed in this way
and of course this is a flexible pattern that can be adjusted very rapidly when the
situation changes
so back to this dialogue system the other things we did
were
basically in conversation when something goes wrong usually the kind of feedback you get
from your partner is enough to
set you on a course of initiating a repair that will fix things
so the system at that time
didn't display answers in any understandable way it just kind of
brought up little representations of the database objects in response to queries
this particular one was the
rather boring managers and equipment and employees so we switched to van gogh's paintings
that was more fun to work on although it was just as simple
and so basically you know this was a very modular system you know good
software engineering design and all that and modularity made it very
straightforward to implement
i mean so if there was a break in the system's response to a user query like
where van gogh paint starry night which is missing
are where we do rather
then it would get a break in the syntactic module and give you an error
message basically in the form of please try rephrasing that
if the break was within the lexical module where there was something uninterpretable as
a word it would repeat the word and then say that's an unknown word
or if it was just an outside of domain error for how large is ringo
it could interpret how large is starry night and give you the dimensions
of the painting but it couldn't tell you how large ringo was
it would come back with sorry that's not in the database so again this was an
attempt early on we tried to give whatever basic rudimentary dialogue capabilities
the system could possibly have to reflect what people might expect now
this is way back in the days before we had a dialogue manager that's something
that lyn and others worked on
once we got going and realized that was important
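The module-specific feedback described above can be sketched roughly as follows. This is an illustration only: the word list, the crude auxiliary check standing in for the syntactic module, and the database contents are all assumptions, and the checks run lexical first, which may not match the order in the original system.

```python
# Hypothetical stand-ins for the lexical, syntactic and domain modules.
KNOWN_WORDS = {"where", "did", "van", "gogh", "paint", "starry", "night",
               "how", "large", "is", "ringo"}
AUXILIARIES = {"is", "did"}  # crude proxy for "the query parses"
PAINTINGS = {"starry night": "<dimensions stored in the database>"}


def respond(query):
    """Return feedback that reflects which module failed, if any."""
    words = query.lower().split()
    # Lexical module: repeat the offending word back to the user.
    for word in words:
        if word not in KNOWN_WORDS:
            return f"{word} is an unknown word"
    # Syntactic module: a query with no auxiliary is treated as unparseable.
    if not AUXILIARIES & set(words):
        return "please try rephrasing that"
    # Domain module: only sizes of known paintings can be answered.
    if words[:3] == ["how", "large", "is"]:
        topic = " ".join(words[3:])
        if topic in PAINTINGS:
            return f"{topic}: {PAINTINGS[topic]}"
    return "sorry that's not in the database"


for q in ["where van gogh paint starry night",
          "how large is guernica",
          "how large is ringo",
          "how large is starry night"]:
    print(q, "->", respond(q))
```

The point of the sketch is that each failure produces a different message, so the user gets a hint about what kind of repair to attempt.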
so note that there is no anthropomorphism in these messages that was
a debate that was active back then and so i avoided assiduously
having the system
refer to itself as i
at the time
and so that gets to pronouns so back in those days when we were working on
the natural language processing system
there was all this talk about what will people do when they talk to computers
will they be able to use all the english that they are used to using or will they
use basic english or something called tiny english at the time i think it was called
and will they just kind of avoid using anything but a restricted subset will they
take it on board that the computer is a restricted partner
and so a psychologist named raymonde guindon in nineteen eighty seven did
what i think of as the first wizard-of-oz study or the
first one i became aware of and that was a really exciting technique when she
did this i was like
i thought well that's great you don't have to put people through a system that
doesn't work very well you can simulate it on the other hand
and she found data to support this hypothesis the prediction that people did not
use pronouns
when they were
talking to a system that provided advice about statistics now when you think about that
again she was looking at things like personal pronouns like he she and it
well there's not very many chances you have to talk about he she and it
when you're asking
about your anova or to explain a t-test or something like that
so that struck me as a little bit you know a premature conclusion it didn't
apply
so basically i and others that i worked with including hold with their
the time decided not to take that idea too seriously
and so we forged ahead despite that and we started working on pronouns so then
i audited two really wonderful classes at stanford taught by barbara
grosz and ray perrault and phil cohen
and we got a hold of this wonderful draft paper by grosz joshi and weinstein
on centering
called towards a computational theory of discourse
back in nineteen eighty six this was and it was actually published much later
and you've probably seen that but we had the old version that
said very prominently across the top do not cite of course we did
with barbara's blessing eventually we cited it anyway
and so basically we took some of the ideas in that paper and
we used them to interpret pronouns in hpnl so i'm not gonna go
into the details but this little box represents some of the rules for
transitions between sentences and the attentional shifts associated with these transitions and
so
this was revolutionary because
at the time i was really interested in what people do and the cognition
that's going on when you're interacting with a system and
many of the people around us were very interested just in the formal representations
and just trying to parse sentences to begin with and what they were doing was
very wonderful and elegant as well but
we were interested in the fact that you're really thinking about the psychology of a
user when you're parsing syntax and interpreting referring expressions
so
so this box represents the algorithm and we were doing really simple kinds of sentences
like dan works with derek at hp he supervises derek derek is a programmer he
and the question is
who does the he represent in these various situations
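As a rough illustration of the idea behind ranking candidates by grammatical role, here is a sketch that contrasts a most-recent-mention rule with a subject-first preference. It is not the centering algorithm from the paper: the role ranking, the data layout and the function names are assumptions chosen to keep the example small.

```python
# Lower rank means more salient; this ordering is an illustrative assumption.
ROLE_RANK = {"subject": 0, "object": 1, "other": 2}


def most_recent_antecedent(entities, gender):
    """Schoolbook rule: pick the last-mentioned entity that agrees in gender."""
    candidates = [e for e in entities if e["gender"] == gender]
    return candidates[-1]["name"] if candidates else None


def most_salient_antecedent(entities, gender):
    """Subject-first rule: pick the agreeing entity with the most salient role."""
    candidates = [e for e in entities if e["gender"] == gender]
    if not candidates:
        return None
    return min(candidates, key=lambda e: ROLE_RANK[e["role"]])["name"]


# Entities in order of mention for: "dan works with derek"
previous_utterance = [
    {"name": "dan", "role": "subject", "gender": "male"},
    {"name": "derek", "role": "other", "gender": "male"},
]

print(most_recent_antecedent(previous_utterance, "male"))
print(most_salient_antecedent(previous_utterance, "male"))
```

With these inputs the two rules disagree about a following "he": recency picks derek, while the subject-first preference picks dan.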
now i stopped working on centering i still get a few papers every year to
review but i usually turn them down as i'm no longer
doing that kind of research and part of the reason for that is that it became
obvious to me that there is much more going on
in pronoun interpretation and the interpretation of referring expressions than just a simple algorithm so
you know i think the centering approach from grosz joshi and weinstein is wonderful and
was groundbreaking in taking account of the hearer's and the speaker's centers of attention
but it's much messier than that so i decided to go back to grad
school and get my phd in psychology with herb clark and then i entered the
messy world of human behaviour
in which we all live at least when we're not at work
and so my very first experiment used a language game in which i tested
some of the predictions that we had derived from centering theory namely that mentioning
something in subject position as opposed to as the object of a sentence
made it salient and thus able to be pronominalized
so i was thinking of a pronoun as a cue that picks out the most
salient representation in your partner's mental model of the situation
not as
something to trigger a search among all the possible referents for that pronoun which is
how the algorithm worked and that works well for a computer but it
is certainly not how people do it okay
so from the hearer's perspective i saw the interpretation of the pronoun just as the
selection of the best out of all possible interpretations
because it was most salient not because of a search
and so basically i recorded pairs of stanford students who were naive subjects they
both were basketball fans and i had them do something that they
found engaging one of them watched a video of a basketball game with the sound off
and gave a running description play by play
and the other one behind a screen had to keep track of who had the
ball at any given moment
and they had to write down who had the ball when a light went on which
was kind of random
and they could speak to each other as much as they liked so this language
game got people to generate chains of referring expressions to the same object
with lots and lots of third person singular male
entities in the discourse which is just what i was after
and so that generated things like and now wolverines have the ball they're
going down number thirty passes it up to forty one forty one goes up
for a shot and he misses
now your eighth grade english
teacher would probably have taught us that
a pronoun refers to the most recent thing that agrees in gender and number we all
know that's not true and so you can see in this pattern
the speaker repeats the noun phrase forty one rather than pronominalizing
that
referring expression and so
basically this task worked really well because the semantics of the task were
biased against the centering prediction
so the semantics of the task basically say that you
can't shoot unless you have the ball that's obvious right so you should be able
to get away with the pronoun here where forty one is underlined but you don't
you follow the predictions of centering
on average not always because people
don't ever do something all the time they do it with some probability okay
and so again the action was very fast paced and so the speaker had to
try very hard to keep up
and so it's always shorter to use a pronoun than a full noun phrase sometimes
they would use a full noun phrase like
you know
number forty one the degree chi force whatever you know they really were getting into
this task and providing colourful descriptions
so basically
just as the centering algorithm predicted when people referred to an entity
as the object
and then they referred to it again they were much more likely to re-refer to
it
by repeating it verbatim
as a full noun phrase instead of pronominalizing it and so they would move
it into subject position and then they would pronominalize it that was
one of the predictions we derived from centering
and so the other thing is if a pronoun was used in that position
as with forty one so number thirty passes it up to forty one he goes up
for the shot and he misses the pronoun would get stressed
and that was also an interesting discovery so
speakers have several different techniques at their disposal for maintaining a joint focus of attention
with their addressees and we found evidence for both those things
so at this point it was pretty clear that the algorithm was not psychologically plausible
okay herb clark was kind of horrified that i was even working on this when
i said my paper is ready to submit do you wanna be my coauthor he said
no that's okay go ahead submit it and i was horribly offended i think he did
me a favour in retrospect but you know i was
kind of cross but i was eager to move on to the
world of psychology rather than just centering
so clearly cognitive and perceptual accessibility
is important in both speech planning and reference resolution
but in centering theory the entities
are not allowed to decay the discourse context is pretty much the whole sentence at
least the way our algorithm worked
and so it really requires segmented discourse in order to pull off the centering algorithm
and so what i was learning as a student of psycholinguistics was that language planning
and interpretation are really incremental highly incremental more incremental than i could've imagined
not even word by word but as soon as you hear two hundred milliseconds of
a word
you start to work on it as a listener
and so certainly that information was just coming on the scene around nineteen
ninety five when mike tanenhaus and his lab
published their really important early work on visual worlds and
and so i was very eager
back in grad school to
move on to the world of psychology and do something that was plausible and yet
computationally interesting
so
back to you guys
i don't know if you've thought about what caused you to
start working on dialogue maybe we can talk about this at the end in the
question period but i also want you to think about what you think dialogue
is
what is dialogue well kinda obvious right
the rest of this talk is really about what is dialogue and the assumptions that
you make about what dialogue is what the essence is and what simplifying assumptions
are safe to make
and don't destroy the phenomena of interest
and what things are okay to control in your experiment
and won't destroy the thing that you're trying to study okay
so the question is what do we need to preserve in our research in order to
model dialogue appropriately
so
so if you think about the way we approach dialogue with respect both to machines
and humans
we start with some kind of data
these might be data from previous experiments or examples that we find compelling and we
wish to implement or embody in a system or in an experiment
maybe a storyboard of how someone interacts with an intelligent personal assistant
or maybe it's the corpora we're looking at where we're looking at distributions of behavior over
lots and lots of people aggregated
or maybe a description of some product that somebody thinks should be built okay and
then we take those data those examples and we do something with them
and on the left what we do i guess i can't point with my cursor
and i can't point with the microphone on the left we have engineering
where we're trying to create computational formalisms for dialogue processing and management on the
right we actually have reverse engineering what we're doing is we're trying to figure out how
human processing works
with all of its cognitive social and neural constraints
so think about the very different tasks
these two different
things involve okay
so when we think about how dialogue is implemented in dialogue systems we have no
limits on working memory you know if we want to remember the past and
create a space in which you can search for the referent of a referring expression
go back years it can cover thousands of users it doesn't necessarily cover that individual
that came up yesterday the attentional focus doesn't need to be modeled like a human's machines
don't have the same kinds of interruptions despite the fact that we're now trying to
have them do more than one thing at a time in these personal assistants
okay
but their performance need not decay on any one of these things
while they're doing it okay
and the inferences are represented logically and they're computed no matter what okay whereas
with people you know often people hear a pronoun and they don't even bother to resolve it
if they don't need to if it's difficult if it doesn't just pick something out
of their center of attention easily okay
so people don't always make the inferences that you think that maybe they should be
making in the hike
the architecture because we are
some of us have been software engineers at various points and still are maybe
you know that modularity makes things a lot more elegant okay
so that tends to be the architectural choice when you're modeling dialogue systems
and there's also very limited perceptual ability for monitoring now there's work presented at this
conference on reading people's facial expressions and looking at these kinds of wonderful
nonverbal
things that we'll talk about a lot in a few minutes
and that's really important if you're really going to be a full dialogue partner and
deal with pragmatics in a way that is easy for people to deal with
so in the mind brain we have limited working memory okay
we have an attentional focus that emerges from biological constraints
okay so
and it's probably evolutionarily really good that we forget things that we don't always have
things active in working memory so forgetting is an important skill it turns out
inferences are often associative and they're not always made and some of my talk will
be about how certain important kinds of inferences are made and how they are deployed
in dialogue processing
and whether it's done really immediately and easily and automatically or later as a kind
of laborious repair okay
and then the architecture has to admit incremental processing now the first time
i met amanda stent was when i went to
rochester to work with her and a team that included mike tanenhaus
we were trying to write an nsf grant
that would enable parsers to be incremental i suppose we didn't get funding but
it was a wonderful thing to do because i met amanda stent
so i
okay no bitter comments on that okay lots of other good things to get
a fixed iq and stuff
and so that architecture is a difficult one to implement and i'm delighted that
many of the people at this conference are really acknowledging that it's not necessary to always
do that with
spoken dialog systems but sometimes it might be desirable especially if you really wanna make
something human like
and again you can have abundant monitoring of perceptual information and of plans
you know people monitor their own upcoming speech errors as they're speaking they monitor all
kinds of feedback coming in from the world and so
that kind of monitoring isn't part of most spoken dialogue systems just
so that's our question if you're a computational linguist or an engineer you make various
sorts of simplifying assumptions okay and all of these assumptions move our research forward
but i think it's important not to lose track of what we had to set
aside in order to proceed because it might come back to haunt us
so it's good to make these assumptions explicit so the way in which you stage an
experiment often depends on your implicit theory of what a dialogue is
here's what i think a dialogue is here's a good example from my collection now
i tend to use the same example over and over in different talks making different
points so apologies if you've seen some of my examples before
i'm presenting this in a different context right now
so this is i think a good example of what i think a spoken dialogue is
that this one comes out a bit lower so if the
if someone's adjusting the audio or maybe actually just like computers
now this was collected by tanya kraljic who works at nuance now
who was one of my early grad students and
we were trying to collect examples of spontaneous getting to know you dialogues
from
bilinguals who didn't know each other they were bilingual and they were recruited to the
lab so these are two strangers
i
ordering
right
what
what
what and why i
right
i love this example it has so much in it and when my
ipa can model this then i will be happy
i will retire then okay so what i love about this is that you know there's
all this really interesting stuff there's code switching but you can see the
little constituents the little increments that each speaker presents what they're doing they're face
to face they can see each other
they're grounding with each other since they're trying to get to know each other they're
giving each other constant feedback about how something has been interpreted and
that's kind of manifested in the simultaneous speech around the asterisks where they both say
something at the exact same time
and one jumps in to define the spanish term that the other one has presented
so it's very clear that they very quickly establish this common ground
and they use that as a foundational part of their conversation
and so you know we can
okay we can observe abundant examples of referring expressions in any given task dialogue but
it's really important to think about the language game in which people are finding themselves
in this particular language game
they have the ability to fully establish common ground with each other and there's nothing
restricting them from doing that
and so in psychology experiments as you know there are often very
weird language games you know
students come at the level paranoid what about today
trying to read my minded you know there are they all have this notion of
social psychology experiments which often have a large with them and then list of their
board to that in a kind of experiment where they
or just getting a cube right now
you are very different language case of so
the language game here is nothing like that one
but in most language games that we set up in the lab we're trying to get
many observations from someone so we can have enough power to draw a conclusion so
that we can find out something
create new knowledge about dialogue
and do statistics on it so in a typical referential communication experiment we have
two people come to the lab sign consent
and then we initiate this mysterious language game and then they meet each other they're
seated with a barrier in between them
they're given identical sets of picture cards something like this perhaps
and they need to get them matching in the same order
and then they have these conversations i'm not gonna belabour this next example
it's only here if you're someone who hasn't seen this kind of stuff before which
i think you probably are not or you wouldn't be in this room
because you know what dialogue is but
here's the kind of dialogue that two subjects might produce about a particular card you
can see it's very lengthy and full of disfluencies and provisional
utterances
like a for this one are it looks kind of like the top their squares
that the looks i know and then be goes a
meeting i don't not quite sure yet i'm trying and you have sort of another
like rectangle shape and then like rectangle angle than on the bottom it's are under
what that is clash eight already i think i got it
it's almost like a person kind of in a weird way like a much prettier
something which is interesting here because b doesn't know what it is and yet b ends up
proposing the perspective that they end up taking throughout the rest of the experiment
and so we have them refer to this over and over again okay
and so later on you know about eleven cards later after we scramble them and
put them out again be gets to be the director this time and b goes
right to that unite a number nine is that one training and it goes you
open about eleven cards later a now is the director and that's number three michael
okay so this is entrainment so what these people have done is they have proposed
and kind of entrained on a perspective agreeable to both of them
and then they both use that
so this i found very striking and even more striking was what people do
in different pairs talking about the same object
people come up with very different perspectives
this one you've probably seen in other talks just as an example
you know you might call it the anchor the candle
the symmetrical one shapes on top of shapes or my favourite the man jumping in
the air with bell bottoms on
and they continue to refer to it throughout the experiment as
a slightly shortened version of that okay
so it's really amazing to me there's so much variation in language that's probably one of
the things that attracts us to study it we love trying to explain that
variation
but there's very little variation it turns out when people have had a chance to
entrain on something okay so as system designers you can exploit that
in terms of your intelligent personal assistant you can constrain
the set of things people say not because of tiny english or anything like that
but because people become entrained on these things
and so we view this as people setting up a conceptual pact that was the
term that herb clark suggested to me one night when we were casting about for
the right term to capture what it was people ended up with after entraining
on something okay
so in our first set of experiments we used both tangrams and
these common objects
and you know we used the tangrams just to throw them in there so people
would get distracted because you know since this is a language game people are gonna try
to reverse engineer what you're doing to them and you don't want them focusing on
trying to guess what your hypothesis is
so what we were interested in was what people would call things like shoes dogs
cars and fish
they thought we were studying the tangrams so they focused entirely on that
and so basically what we found in this conceptual pacts experiment was that
people
don't just follow the expected gricean thing of giving as much information as
is necessary to distinguish an object from a set of objects which you find it
in so they would start calling this something like the really cool red car the
cork article right powerful right particle red car
and then when it was the only car in the set they didn't go right
to car they continued calling it by the longer term so that was our main finding
we also found that the extent to which they did this was probabilistic and depended
on how many chances they had gotten in the entrainment part of the experiment
before
the critical trial
so just when herb and i thought we were done we'd shown something that you
know was pretty tangible and useful a controversy erupted okay
so to be fair we were a little bit overreaching in the conclusions we were
drawing from these data
so one thing we did in the three experiments in that paper was we
had partners switch for the last experiment
and we found that people who switched partners
would often go back to the basic level term and just start calling it a
car when they didn't have to
consider the conceptual pact that they had established with a particular partner
but if they did that same trial with the
old partner then they would
continue to use you know the entrained term okay
so we were arguing that audience design or this kind of entrainment thing with
partners was what we thought we had shown
but it turns out that
we didn't really show it in terms of an online demonstration that you're taking this
information really into account with your partner okay
so basically
again this is the summary of our findings which i just covered
speakers were not just as informative as possible and they continued to
follow the conceptual pact that they had established with a particular partner
but they did not when they were working with someone else
okay
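One way a system designer might represent the partner-specific finding summarized above is a small memory keyed by partner and object. This is only a sketch under stated assumptions: the class, its method names and the example terms are hypothetical, and it is deterministic, whereas the behaviour reported in the talk was probabilistic.

```python
class ReferringMemory:
    """Hypothetical store of conceptual pacts, keyed by (partner, object)."""

    def __init__(self, basic_level_terms):
        # object id -> basic-level term, used when no pact exists
        self.basic_level_terms = basic_level_terms
        # (partner, object id) -> term the pair has entrained on
        self.pacts = {}

    def entrain(self, partner, obj, term):
        """Record the term agreed on with this particular partner."""
        self.pacts[(partner, obj)] = term

    def refer(self, partner, obj):
        """Use the pact with this partner if there is one, else the basic term."""
        return self.pacts.get((partner, obj), self.basic_level_terms[obj])


memory = ReferringMemory({"card_7": "car"})
memory.entrain("old partner", "card_7", "red car")  # illustrative term only

print(memory.refer("old partner", "card_7"))
print(memory.refer("new partner", "card_7"))
```

Keying on the partner as well as the object is the design choice that matters here: a store keyed on the object alone would model simple association in memory, not a pact with a particular person.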
so
i want to just briefly present a play in five acts so
this is the series of experiments
that talks about a little fight we had in the literature and what i learned
was you know i was a young assistant professor back then
and of course the stomach acid that you get when someone attacks your work is
considerable right but then what it turns out is it can really be a wonderful
thing you can engage collegially with your
worthy opponent and you can both improve your research which is what i'm happy
to say is the ending of this play in five acts or at least i think it
is i'm not sure my opponent would agree but probably anyway
so
so the first question after herb and i published our paper
was the question is entrainment really partner specific or is it just based on
just a simple association in memory okay
and so basically to demonstrate that something really is partner specific to an individual
and not to just any old individual not just due to priming
in memory a simple association between an object and the term and maybe a
link to that person
you need to really show that two people with different perspectives or knowledge so the
speaker and the hearer
can adapt to each other from the earliest moments of processing this is hard because
most of the time when you're in a conversation
you're really similar you're sharing the same context and you may just happen to
get it right by chance and that probably happens a lot of the time right
so dale barr and boaz keysar published this paper called anchoring comprehension in linguistic precedents
and basically they were inspired by an anecdote that boaz had where
you just happen to interpret something egocentrically i'm not gonna go into the details
but his proposal was that listeners expect precedents and it doesn't matter who the speaker
is
and then if you do adjust to the speaker you do it laboriously afterwards
inferentially as a late occurring repair
and certainly to be fair some of the data that we presented in the
original brennan and clark paper had little samples of the dialogue where people would say the
first one is the car kind of read where red and strategy or something like
that so
so you can see sometimes it is presented after-the-fact and other times it would
be the first one is that right but rowdy and
so you can see evidence in the syntax for early adjusting to the partner and
late adjusting
so boaz gave a talk about this at cuny two thousand one in philadelphia
and charles metzing one of my graduate students and i were in the audience at the
time
and basically i'm gonna go through this quickly
basically boaz and dale found no evidence for partner specific processing so
they had people in these somewhat unnatural situations where they're talking to someone but then
the subject is wearing headphones and also getting things in their ear from some disembodied
voice somewhere else
that was pre-recorded okay
and so some of the time they found interference between these two things okay
and so
basically hearing the precedent expression the expression that they had entrained on with the
interactive partner
was no faster than hearing it from the new partner
and so barr and keysar said that this is the evidence that entrainment was not
partner specific okay
i mean so what's wrong with this picture
would be that if let's say you and i talk about something we call it
provides that's nancy read moderately and then i'll then walks and then she says i
that's that read matter it out side that would mean that we should be slower
to interpret it probably in
because she wasn't there in the entrainment phase that doesn't make any sense it doesn't
prevent her from using the same phrase that we've talked about just because
she wasn't there when we introduced it and that's what this
argument was based on
so basically the criticism i had and i raised my hand during the talk and
i said
okay you've filled this cell the old partner using the old term again
you've got the new partner using the same old term and a new partner using
a new term
and you're finding that these two are both faster than this one
but what about this cell that one would be really interesting when you have the
old partner inexplicably break the conceptual pact what about that does that take any
longer and if you compare that to the new partner using a new term
if it's not partner specific then they should be the same if it is
partner specific then it should take much longer okay
so boaz said in response to my question well that's not an interesting cell so
we didn't bother with that one okay fine so childlike jump to the train and
ratio for training and with that of use that of his experiment we are you
have the set of objects are still in from by then young child story boxes
note when you're still but
little things that don't really have lexicalized expressions for them and we put them in
an array and we basically
had a confederate speaker refer to the object as either the shiny cylinder
or the silver pipe where these are equally good for that expression
and so basically a naive matcher and a confederate director repeatedly matched
the objects and the director have the spoken use kind of show the object what
he was doing you know i have to tell you to get it into this
arrangement but the subject didn't know that a few of the utterances were highly scripted the rest
were completely natural
so after they entrained on one of these terms then the director would go okay
it's time for me to get up and leave the room
subjects had been told this experiment is about how you follow directions from different people
so they were given the appropriate cover story this was not too weird a
language game
and so the director would get up and go out and either the same person would come back
in or a different person would come back in so we had two confederates
so here is our lab manager darren and then the lovely joy hanna who was
also my collaborator was serving as the second confederate
so what we have is the same partner using the original expression on
this critical trial the new partner happening to use the same expression
the new partner happening to use the new expression since they weren't there
during the entrainment thing
or the original partner inexplicably breaking a conceptual pact okay
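
The four cells just listed can be laid out as a small table of predictions. This is only an illustrative sketch: the function names and the ordinal cost values are assumptions made for the example, not data or analysis code from the experiments. It shows why the old partner with a new expression cell is the diagnostic one: a priming-only account predicts it costs the same as a new partner using a new expression, while a partner-specific account predicts it costs more.

```python
from itertools import product

PARTNERS = ("old", "new")       # who speaks on the critical trial
EXPRESSIONS = ("old", "new")    # entrained term vs. a different term


def predicted_cost(hypothesis: str, partner: str, expression: str) -> int:
    """Ordinal processing cost (0 = fastest). Values are illustrative assumptions."""
    if hypothesis == "priming_only":
        # Only the expression matters: a new term is slower whoever says it.
        return 0 if expression == "old" else 1
    if hypothesis == "partner_specific":
        if expression == "old":
            return 0
        # A new term from the old partner breaks the conceptual pact.
        return 2 if partner == "old" else 1
    raise ValueError(f"unknown hypothesis: {hypothesis}")


def design_table(hypothesis: str) -> dict:
    """Predicted cost for every (partner, expression) cell of the 2x2 design."""
    return {
        (p, e): predicted_cost(hypothesis, p, e)
        for p, e in product(PARTNERS, EXPRESSIONS)
    }


if __name__ == "__main__":
    for hypothesis in ("priming_only", "partner_specific"):
        print(hypothesis, design_table(hypothesis))
```

Leaving out the old partner with new expression cell makes the two accounts indistinguishable in this sketch, which is the point of the criticism raised during the talk.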
and so i might be interested time i won't up ladies but what you would
see is this one is much lower world just play it quickly
in the next one
to reach into the frame and follow the instructions to look like this comes out
kind of low i think sound
so
okay so i don't know useful work
still
so
okay but you can imagine so basically what's happening in this one is we're recording
the eye gaze of the subject and
we find that they look all around the array when they hear a new term
from the old partner but not when they hear the new term from the
other partner
and so if you look at the time that it takes in that one broken conceptual
pact cell
it takes significantly longer okay
what was the surprise in this was that this one was so fast basically
when the
when the new partner used the new expression if you just looked at barr
and keysar's argument that anything that's new should take longer than something that was
already primed that's all
you would expect this to be a bit higher but it wasn't and it turns
out that we had normed both of these expressions they were equally good for the
object that's probably why that happened
so act three okay
so basically in act two we had shown that
you know there is evidence that you take a partner's
identity and your entrainment with them into account really early on
so act three came along and now every young professor works hard on their data
they would rather die than publish something that wasn't true
now we are all concerned about replicability or if you're not you should be
and it's really important that you do something that's replicable but i always feared
that well the things i tend to do these experiments are so
complicated and weird that who wants to try to replicate someone's time-consuming complicated weird experiment
well i was really delighted when somebody did so this is act three so let
me just say about our
experiment which is act two
that we only had eight critical observations for the whole
session for each pair that is for each subject and confederate
so basically we had two old expressions by new speakers two old expressions by old
speakers two new expressions by new speakers
and two new expressions by old speakers we only had two instances out of
the eight critical trials
where the conceptual pact was broken before that the experiment
was taken up by all the entrainment phases because it took quite a while for people
to entrain on these objects because we wanted it to be natural
so basically what's interesting is in that last case the broken conceptual
pact is infelicitous
so when your partner does something infelicitous once or twice
it's not a big deal maybe their attention wandered or whatever but when
they do it over and over again
are you playing the same language game or not i mean this is a psychology
experiment okay so matthews lieven and tomasello with little kids aged three
and five
replicated our experiment they didn't use eyetracking but they emulated the design otherwise exactly
they had only
these eight critical trials and only two of them were broken conceptual pacts
okay
and so basically you know there was the experimenter present who told the children to
move these objects around
and then they just videotaped the children so they could code basically how quickly they were
able to respond and where they were looking their gaze okay
and so basically what you see here is the eight critical trials in those
four different conditions okay and so
here we have the original partner
and here we have the new partner in the darker colour
okay
so what we see here is that when the original partner breaks the conceptual
pact it takes a long time to process when the new partner
uses a different term that breaks the precedent
it's fine it is much faster
but the effect really diminishes on the second occurrence with the three year olds
and also to some extent with the five year olds
so this suggests that even little children are exquisitely sensitive to infelicity in dialogue
okay
and you know charles and i had been sad we couldn't get more
power we were happy when the effect came out but if we had done what we
would have if we could have done anything we would've had like a hundred broken
cps and we probably wouldn't have gotten our effect
and so in retrospect we're very glad and so basically putting people in infelicitous situations
too often is unwise
so
basically after that act four
is kronmüller and barr edmundo kronmüller and dale barr
one of boaz's or dale's students
tried a more detailed eye tracking experiment they argued that we were not as methodologically
sophisticated as we should have been in our analysis and so what we should have looked
at early in the trial was not just people's first look to the
object
that was the target object but the looks to other things as well
and they argued that if we had done that we would have found evidence that precedent
or using the old expression regardless of who said it
was important early on and then only later did the partner specific part kick
in
so we thought okay we'll try that they looked for speaker specific
effects and found them but only later on okay
and so joy hanna came along and we reanalyzed the metzing and brennan
data and we actually early in the experiment did not find any effect of precedent that
would be the black and blue lines here and the higher the lines the more
likely they are to look at the correct target but this is noise in
there there are no differences between any of these
lines right here
so we didn't find any evidence of the old term priming people but we
did find this evidence of the broken conceptual pact here is the rise in the
looks to the correct target object when the cp is broken and the other lines
are essentially indistinguishable
right
and so this still supported our conclusion now note that in all the barr experiments
they have these pre-recorded partners going on in the kronmüller and barr experiment they had
a pre-recorded partner we had an interacting partner
and so act five finally really quickly dale barr did this with new colleagues
in the scanner so he's doing this in an meg study okay very similar design to
metzing and brennan with two live confederates outside the scanner and one person in
the scanner and so again he's now looking for
evidence of mentalizing in the theory of mind network so called
which consists in most accounts of three different areas one of them's frontal one separately as
the ones that are on right temporoparietal region
and so he found no evidence for
mentalizing in his experiment basically
but the problem is that subjects in the scanner experienced broken conceptual pacts at times
and that was twice as many times as they experienced maintained or felicitous
conceptual pacts so that's another issue
so basically my takeaway from this is that
the language game you put people into matters and can dramatically change your results
and so anna kuhlen and i wrote a little position paper on this
and basically
the ways in which confederates are deployed
can make a big difference in the results that you get
and also the ways in which experimenters choose to deploy confederates differ depending on what
they think the essence of dialogue is
okay what they choose to control what features to make explicit what they choose to
let the confederate just run with without instructing them what to do okay
okay right
so i'm gonna take you through the argument pretty quickly here so we use confederates
because we want conversational partners who show up to the lab you know it's
harder to get two subjects to show up than one subject so if you have
one subject i see some that are not even going i'm my heart results you
if you have four people coming in which i've done then
that's even worse right and so that really makes this research difficult
so that's one thing that people can do to solve that problem it maximizes the
efficiency in your data collection and it gives you a lot of experimental control because
as kay bock once noticed people say whatever they want to say she called this
exuberant responding which is one of my favourite
noun phrases of the whole world and the editors always try to correct it if i
quote it in a paper
but it's called exuberant responding
and so to the extent that you can control one partner's behaviour then
you can
reduce the variance and maybe get more power to conclude what the other subject is
doing okay
so basically a lot of dialog experiments involve mild deception and that's okay every
experiment
has some deception in that we don't tell you exactly what the hypotheses are before
you're in it okay
so things are often not as they appear you might be interacting with a
computerised dialogue system
or with a person who provides rulebased responses and sometimes it can be unclear
you can be interacting over an intercom with another student in the next room
or maybe that's pre-recorded you don't know
if you're not allowed to interact with them
or you can be interacting with another student or with an experimenter and so studies
do these different things depending on what they think dialogue is okay
and so the question is when might using a confederate really threaten
your conclusions in the dialogue experiment
and again this depends on the purpose and on what you think dialogue is
so if you think dialogue is just like language processing by yourself only more engaging
that's one possibility or you might think it's a set of alternating monologues that's
kind of the way
the message model assumes dialogue is and a lot of spoken dialogue systems assume this
where they're just looking
at your move and then my move where you're the computer and i'm the user
your move my move your move my move
we're just doing these alternating monologues sometimes
or maybe a little more sophisticated comprehension and production coactivate one another
and
or maybe it's really shaped continuously by the interaction between partners
okay
now in this first one
the mere presence is what makes a partner make dialogue real and engaging if you
think just having someone there the audience is what does it
then you can see this is really just social facilitation theory okay
and so basically having a partner just
offers a projection space for the user to produce more natural
dialogue okay
okay
and so that has had a long and distinguished history in social psychology ever since milgram
and asch and all of those other experiments
if you think it's alternating monologues again
this is a view that is widely used by many people who do
ai research computer scientists linguists psychologists who don't actually do research on dialogue people
like that
and it may be fine for some purposes right comprehension and production coactivated
this is a view popularised by martin pickering and simon garrod in their interactive alignment
model
and basically this is interesting "'cause" it leads to parity meaning the speaker and hearer
using the same representations and acting on them
and that could be what you think of as common ground
but they argue that it's really just due to priming they try to explain the whole
thing because of the simple association
and they also argue on the same
kind of logic that barr and keysar were using that priming really will explain
all of this
so called partner adaptation
okay unless there's late repair
and then finally basically i could go into the pickering and garrod thing which i'm
not going to
really brought about five used in this wonderful picture and the bbn thing which that
fall do the priming
right here we see the
one partner as partner a on this side partner b on this side and you
know my semantics just primes yours somehow through the air i'm not quite sure how
that happens but
you know and this is highly modular too but
the problem is that if you assume that a and b are carbon copies of
each other as interlocutors we know that's not the case
my semantic network differs from yours if i hear the word
eunice i think mother because my mother's name is eunice and she's going to be
ninety in a week or so
and you think something else right you might think will eat on one is the
old telephone operator on t v comedy
whereas you know your mother might be named it'll travel and you know that will
think you have in your network so people are different
partners are not carbon copies of each other
priming is not an explanation for this i argue okay
and so just to get to the last one if it's shaped by the coordination between partners
then this is a different view okay and you might decide to use partners differently
if you believe that you might use confederates differently if you believe that
so there are general concerns that you have in play when you use a confederate
you know basically confederates can be biased if they
well let me just give an overview of the concerns right now that anna
and i talked about in our paper there's the biased confederate the covert confederate done
in secret the know-it-all confederate who knows too much about the experiment in
terms of the task that they're doing at that moment
as opposed to the first one who knows about the hypotheses
and the scripted confederate
those are the four concerns that we go over
so basically to deal with the biased confederate ideally your confederate should be blind to
the experimental hypotheses and to the conditions
that can't always be the case but that would be ideal
and alternatively you can script the confederate's behaviour in a few critical places
and not in other places
with the covert confederate this is we never use this in my lab we
never fool people into thinking that this is a real subject
other experiments that use confederates have this
very dramatically stage managed thing where the confederate pretends to arrive late all stressed
pretends to be a subject needing extra instruction "'cause" they're clueless so
they're trying to kind of pretend to be not a confederate
but during the experiment itself they just behave however they are usually not given instructions
for how to behave and so that role is sometimes concealed at great length but
then neglected
see
so i want to just say these are examples of two different studies one from
boaz's lab and one from hanna and tanenhaus's lab where they
basically deal with these concerns very differently so
with the experiment on the left which found no evidence for audience design or partner
specific processing and concluded that language comprehension is egocentric okay versus the one on the
right which found audience design and that language comprehension takes the partner's knowledge into account they
dealt with these concerns very differently so on the right
the confederate was blind to the condition they did not have hidden knowledge during the
task okay and subjects were told that the confederate was someone
from the lab who you know
was gonna play this game along with them okay
so they didn't hide the status of the confederate and it's really the opposite on
the left side where it was stage managed and so basically those found two very different
results okay
with these other concerns
basically
the know-it-all confederate this is when someone's knowledge doesn't match what they're supposed to
be so if your confederate has been sitting there as a listener forty times in the
experiment and knows the story that the naive subject is telling them better than the
subject does
their feedback is going to indicate that unless they are an extremely good actor
the problem is that when you're using a confederate as a speaker in an experiment you
can script that if you want and
we know what speaking involves for the most part we don't know very much about
what listening involves there are no formal models of what backchannels people give at any given moment
really
and so
the experimenters are more likely to let that one run wild and so
therefore if the confederate addressee has too much knowledge they'll display it to the subject
and that's a problem
so let me try to speed up a bit i want
to allow time for questions but we'll see what actually happens
so it is important for the addressee not to have too much knowledge about the
experiment
and then finally i'm gonna skip over the scripted one okay but if you
want to take a look at
our examples for
how different
experiments come up with different results depending on how the confederate is deployed
you can take a look at the paper
okay
so it turns out that even if an addressee is distracted
and
the speaker will tell a story differently depending on if the addressee shows that they're
distracted or not
but what interacts with that is if the speaker expects the addressee to be distracted
that also interacts with what the addressee does so
i'm gonna skip over this pretty quickly
but in a study with this kind of design that anna kuhlen and i did
i'll skip the details basically if a
speaker expects an addressee who is attentive and they get one that's good
if they expect an addressee who's attentive but they get one who's counting the number
and in everything they say and pressing a button to the chair secretly whenever they
do that
but the speaker doesn't know exactly why they're distracted
then
or they don't even know that they are distracted they're expecting them to be attentive
then they have that interesting condition then if you tell the speaker the addressee is
going to be doing some secret task
here they're getting an attentive addressee but here they're not getting what
they expect
so you get different results depending on all of these different cells
okay so it's not only the feedback that matters but the expectation that the speaker
enters the experiment with
so i'm gonna jump ahead
to our recommendations regarding confederates no i guess i've already covered most of those basically
try not to have the confederate have
information they shouldn't have at that point in the task and take into account both
what the subject is expecting and what they actually see
just to
give you another example of audience design and partner specific processing you know
in things like gestures gestures are a little more ambiguous than words people can
project all kinds of things onto a gesture but even with gestures people gesture
differently when they're talking to someone who already knows what they're talking about versus when
someone doesn't okay so in this next study
alexia galati who is now weight are said right now suppose start working with
retail
did this neat study
by having people describe roadrunner cartoons she comes from the mcneill lab in chicago and
so she is all about gesture
and so you have people
watching these roadrunner cartoons and either telling the story to
a new partner okay and then retelling it to that same partner or retelling
it to a different partner
so you have three conditions and the two partners were counterbalanced for order
now i don't know if this will play the video
no okay so basically the idea is that
this person is telling this to
a new partner versus an old partner
basically
the gesture space that she uses is much smaller the second time around so you see
this kind of diminishing of information that she provides to partners who already have
the information
whereas when there's a new partner it goes right back up again and the gestures
are large and
demonstrative as long as you give the speaker the right cover story so this isn't a
weird language game for them
okay
and so
let's see
i'm going to jump ahead and since the videos aren't working i'm going to
jump ahead a little bit
and just say that computationally again you can either model adaptation as a slow inferential
process or an immediate nimble process so that if it's activated in memory and you don't
have to make the inference "'cause" you've already made it
then you can use it just like any other information in memory it's not modular
you're not stuck
with using partner specific information late
in these situations here
when i looked at a number of experiments that had shown clear
evidence for partner specific processing sometimes very early in the interaction they were all
simple situations that didn't have any weird unnatural
recordkeeping that a naive subject would have to do but things that are very perceptually clear
like does my partner speak english or not does my partner speak this particular dialect
or not is the partner looking at the thing we're talking about or not and
when you have that simple a binary situation
then you can think of this as a very simple partner model so obviously a
more elaborate partner model is computationally expensive to keep track of your ipa
can do all the keeping track it wants because it has almost
limitless computing power but with a human if it's simple enough they too can keep
track of the information and show partner specific processing
so if you just acknowledge that you know these situations are quite different and
then be aware of when humans can keep track of partner specific information
then that could provide insight into what you want to do you don't always
want to emulate humans sometimes you can do it better but
you may want to take that into account
and then very much take into account what language game you're playing
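
A minimal sketch of what such a one-bit partner model could look like, assuming a single perceptually obvious yes or no cue. The class name, field name and example expressions here are hypothetical, chosen only for illustration; nothing in the experiments was implemented this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartnerCue:
    """One yes/no fact about the current partner (hypothetical field)."""
    shares_history: bool  # was this partner present when the term was established?


def choose_expression(cue: PartnerCue, entrained: str, generic: str) -> str:
    """Pick a referring expression from a single binary cue.

    No inference chain is needed: the cue is simply looked up, the way any
    other information already active in memory would be.
    """
    return entrained if cue.shares_history else generic


if __name__ == "__main__":
    old_partner = PartnerCue(shares_history=True)
    new_partner = PartnerCue(shares_history=False)
    print(choose_expression(old_partner, "the shiny cylinder", "the metal object"))
    print(choose_expression(new_partner, "the shiny cylinder", "the metal object"))
```

The design choice is that the model stores one boolean per partner rather than a full record of who knows what, which is the kind of bookkeeping the talk suggests people can manage from the earliest moments of processing.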
so i'm getting near the end i know i'm running a little bit late but
just to wind up i wanna just say that i agree with some of the
discussion yesterday that we are only at the tip of the iceberg
concerning our understanding of the pragmatics around dialogue
but it's still really important to better understand how systems should perform the role of
dialogue partner and how best they should adapt to a human dialogue partner and i
have some concern about using the wonderful successful applications we already have like calendar management
information access
and trying to use that to project ahead to everything because in other socially interesting complex
pragmatic situations there's a lot more going on and that is
many of us to comments from the audience nuclear
but i just wanna play two very short clips from the internet i just put on
this morning
because i think they're relevant when we think about what a conversational partner really
is okay so first of all i call this the chauncey gardiner effect and
basically i think people using these intelligent personal assistants
they're projecting a lot of relevance and sensibility onto things that are not so sensible
when there's ambiguity people do their best to make it what they think something sensible
would be and so there's a movie from a few decades back with peter
sellers playing this
sort of character
and it's described in the clip as a simple sheltered gardener becomes an
unlikely trusted adviser to a powerful businessman and an insider in washington politics okay
and so what i wanna do quickly is just
see if this will cooperate
see
if i can make it full screen
okay how many people have seen this movie
being there okay so you youngsters have not okay good score
alright so here we have chauncey gardiner walking with the important adviser the
trusted adviser to the president
and eventually chauncey gets promoted to being the trusted adviser to the president
we want a which present it is but you kernel have your own fears okay
then going to know this is done by a single between is television was to
present them
you are much smaller
but i guess what
alright so basically you know this is somebody who really is very simple but everything
he says is taken as a
and word is the rooms do not suffer
well as well known we will
and we got a unicorn
i know and the related to brazil would be to
okay so that's that so basically you know people will try their best to
make sense out of whatever they're experiencing okay and
we just get five two
okay
so basically even though we try to make sense of the main message that we're
hearing even when we have evidence to the contrary or it's ambiguous
we are really exquisitely sensitive when the non-verbal signals are wrong okay we may not
even be aware of what it is we're
reacting to okay
and so if you don't take that into account then you might as well just
be on a date with contractually still some of you remember this clip from a
you they years back any one thing the scope for
okay about again less than half of you so you know what enter actually assistant
really highly successful dialogue system from years ago that
from what you may even have been involved with i'm not sure
and i just wanna play that could really quickly and then we'll
see
alright so make it big
right here it can tell
okay
here
of course i
i
i
all right next to the map to the lack there which i
i think there is no shot i
since i last i that's why i had to i
i just wanted to the two
it's a much
so we got them i want to know i
sure that i
and that of that
okay
so
back to the
and doing
to be of course faster i were able to that things properly
so the point of all that is that there are these little implicit right
do you
verbal and nonverbal collateral signals as herb clark calls them
to which people are implicitly and exquisitely sensitive and when you get this wrong
it really shifts you into a different language game and of course the funny part
of this is that he doesn't get that he's
he's believing her but we all get it at that point
so basically the language game in any experiment in any kind of application varies quite
a bit and to assume that it doesn't matter is to really miss out on this
i think it's something really important to take into account when you're designing a personal
assistant and then switching applications to something that has very different pragmatics okay
and so
it's better to acknowledge what your assumptions are what you really think dialogue is and
how you've constrained the language game and what you've sacrificed basically and so basically
language processing in general with humans is extremely flexible
and yet it's extremely important you get these bits right and i think there's a
lot more to learn and
i thank you for being a wonderful audience and making audience design easier
i had so much too much material to present
and i just want to thank my collaborators and my home institution and stuff thank
you
sorry to run like yes when
we could hear you
okay
well try to summarise
right
right so the question is about crowdsourcing and whether you can just lift responses out of
a crowd and stick them in your application
which i love the idea of crowdsourcing many things right
right exactly right
right so the point with entrainment is that you're restricting the particular pact the content
of the particular perspective that you take and that's indexed by this lexical item that
you're using
so basically if you have a domain where that's not important where the domain is
choppy and people are referring over and over to the same thing then crowdsourcing might
well work
if you've got a domain where you and your i p a have had some preliminary
discourse about something and you've agreed on calling something a term which is also in
big you know it would be ambitious to have a spoken dialogue system that can
entrain i would love that
but most of the time people end up adopting the terms their computers use or
their systems use 'cause they have no other choice
but if there were such a thing that were flexible then this would be a
very incoherent dialog if it just dropped in
utterances from a crowd because they wouldn't be lexically constrained in the same way to
index this joint perspective that we think dialogue is about if you don't think dialogue is
about
working off the joint perspective that you've achieved with a particular individual
then you then crowdsourcing will work if your application fits that assumption
if it doesn't then it won't
i think
thank you for raising that i think that
that really
makes it clear
justine
only to thank you
or fact
right
no i don't think it we have i don't think
we should abdicate that responsibility i think sometimes
you have a different grounding criterion depending on the situation if you're entertaining yourself with
theory
then it doesn't matter
and you are fine attributing all kinds of bizarre responses to being contingent on
your own when they may not be
but when it's something important
like referring to an object and you want the right object
then it does become important and so
you have to use the right term or else your partner thinks you mean something
else even if it's a perfectly good crowdsourced term that many people will like
and so it requires i don't think this is contradictory though i'm sorry i didn't
summarise your question because it was impossible but
but i think everyone in the room probably heard it i'm not sure that people on
the weapon are converted
but you know the question is you know if you can basically take the partner
into account as justine just said you know in the micro sense moment-by-moment depending
on the feedback in the macro sense what you know about them i would also
say at the beginning you start with the expectation about the partner we've done a
lot of work with that
and so all of those things it becomes very powerful the evidence you get moment-by-moment
you revise the initial model of the partner and the model can be elaborate or
simple depending on how computationally expensive you wanna get
and then at the end you have some information in long term memory that you take
with you and the next time you talk to them
then some of that gets downloaded right
so it takes a little bit to download but once it's in working memory it's
fast right
and so i think i don't think these things are contradictory at all sometimes meaning matters
and needs to be achieved painfully and other times it doesn't and it depends on
the joint purpose the two people presume in a conversation that's not always the same
purpose but it often is
any
are we done
have to stop i think you are