Speech Transcript - More than meets the ear: Processes that shape dialogue

thank you for that kind introduction i'm so excited that you invited

me it's been several years since i've been to this conference

and in looking over the papers at the conference i was really excited to

see that some of them are

on topics that some of us have been interested in for a long time but we have

been unable to wrestle under

control enough to study in some cases and some of us have managed to

make some headway there

so things like flexible interaction and interruptions reference resolution

entrainment incremental understanding nonverbal behaviors sarcasm social strategies

non-native speakers and also nontask-oriented dialog

so for those of us who've been studying dialogue for a while it's really

gratifying to see that social and cognitive questions that seemed so interesting and so

ill formed a couple of decades ago

are now making their way into real systems at least

aspirationally if not functionally

so i want to ask you guys a question

i meant is that it would be okay if i did a little history of

some early findings in this talk

and i want to ask you

when did you first start studying dialogue and why do you study dialogue

i wanted to know did you choose this topic or did it choose you

when you first started working on dialogue were you fortunate enough to be in an

environment where everyone around you was also interested in dialogue or did you have to swim

upstream

i have the feeling that some of you had to swim upstream and you just kind of

met the topic in whatever lab you were working in

i want to just say that back in the mid eighties when lynn walker and

i were working at the hewlett packard natural language group

we had a manager well at least he tried to manage us but we were pretty

much unmanageable given what you know we were talking about and

he didn't really appreciate why we were so interested in pronouns okay

we were really interested in pronouns so what we started off basically trying to

study was connected discourse because we were shallow people who wanted to become famous okay so

that's the computational angle that we were starting from okay and

so basically while working at hewlett packard

and

waffling about whether to go back to grad school and whether to get my phd in

psychology or linguistics or computer science

i discovered candy sidner and barbara grosz's work in ai on discourse and i

just found it incredibly exciting i thought it was very exciting that they were actually

looking at language and trying to explain the structure of language via rules or whatever

okay

and then when i went to my first little workshop on discourse and dialogue

i found it incredibly exciting that so many of the big names in the

field were women barbara grosz candy sidner julie a bunny karen's marginal and

all of these people i found really inspiring so i knew that was my field and

i wound up in psychology

not quite a flip of the three sided coin but close to it okay

and now i'm very much a psychologist

so i want to show you an early example of the kind of thing that

led to the kind of psychology experiments i did later on this comes from

back in the days when i was working at hewlett packard in the natural language

group

and there was this system that came out a natural language database query system called

q and a from symantec and it enabled people to do typed dialogues like

this one where the user could type in a question about a little database

who has a computer you imagine a database of

equipment employees managers et cetera

the system responds with shall i do the following create a report showing full name

manager and equipment from all forms on which equipment includes computer

now back in those days there were many people who just loved this little system

and thought that there was nothing at all wrong with a dialogue like that okay

so one of the first things that i said when i got there was this

just seems kind of atrocious so they put me in charge of working on the

articulator

for a while and so

i basically thought the most obvious hack to fix that was just to provide answers

that paralleled the form of the question that elicited them so

i just took the parse trees and i ran them backwards and i stuck the answer

into the wh position in the parse tree

this solution was not endorsed by or popular with

the linguists on the project who found it unprincipled and

yes it didn't work all the time i've got to admit it was kind of cheesy but

it worked ninety five percent of the time and so that's good enough for demos

as you all know and so that's what we had for a while
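
A minimal sketch of that hack, assuming a toy parse tree written as nested tuples of the form (label, child, ...) in which a node labelled WH marks the wh position. The tree format, the labels, and the function name `realize` are hypothetical and chosen only for illustration; they are not the representation the hewlett packard system actually used.

```python
def realize(node, answer):
    """Walk a parse tree left to right and emit its words.

    The node labelled "WH" emits the answer instead of the wh word,
    so the reply mirrors the form of the question.
    """
    if isinstance(node, str):          # a leaf is just a word
        return [node]
    label, *children = node
    if label == "WH":                  # hypothetical label for the wh position
        return [answer]
    words = []
    for child in children:
        words.extend(realize(child, answer))
    return words


# "who manages derek" as a toy tree (illustrative only)
question = ("S", ("WH", "who"), ("VP", ("V", "manages"), ("NP", "derek")))
reply = " ".join(realize(question, "dan"))
```

This sketch only produces a sensible reply when the answer can be dropped into the wh slot without reordering anything else, which is one way a shortcut of this kind can fail some of the time.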

so basically noticing this problem early on and the obvious fix

began to kind of fire up an interest in what i now think of as

entrainment in dialogue which is this connectedness between

utterances where people are using information that the other person may have endorsed or introduced

and creating

together something that i think of as a conceptual pact to proceed in this way

and of course this is a flexible pact that can be adjusted very rapidly when the

situation changes

so back to this dialogue system the other things we did

were

basically in conversation when something goes wrong usually the kind of feedback you get

from your partner is enough to

set you on a course of initiating a repair that will fix things

so the system at that time

didn't display answers in any understandable way it just kind of

brought up little representations of the database objects in response to queries

for this particular one

we got bored with managers and equipment and employees so we switched to van gogh's paintings

that was more fun to work with although it was just as simple

and so basically you know this was a very modular system you know good

software engineering design and all that and the modularity made this very

straightforward to implement

i mean so if the break the system experienced came in response to a user query like

where van gogh paint starry night which is missing

a word

then it would get a break in the syntactic module and give you an error

message basically in the form of please try rephrasing that

if the break was in the lexical module where there was something uninterpretable as

a word it would repeat the word and then say that's an unknown word

or if it was just an outside of domain error like how large is ringo

it could interpret how large is starry night and give you the dimensions

of the painting but it couldn't tell you how large ringo was

it would come back with sorry that's not in the database so again this was an

attempt early on to give the system whatever basic rudimentary dialogue capabilities

the system could possibly have to reflect what people might expect now

this is way back in the days before we had a dialogue manager that's something

that lynn and others worked on

once we got going and realized that was important
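
The three kinds of feedback described above can be sketched as a small pipeline in which each module reports its own kind of break. Everything here is an assumption made for illustration: the toy lexicon, the one-pattern grammar, the placeholder database entry, the order in which the modules are checked, and the function name `respond`. None of it is taken from the original system. The messages follow the wording quoted in the talk and avoid having the system call itself "i".

```python
# Hypothetical toy lexicon and database; these entries are illustrative only.
LEXICON = {"how", "large", "is", "starry", "night", "ringo"}
DATABASE = {"starry night": "dimensions of starry night"}  # placeholder value


def respond(query):
    words = query.lower().split()
    # Lexical module: stop at the first word that is not in the lexicon.
    for word in words:
        if word not in LEXICON:
            return f"{word}: that's an unknown word"
    # Syntactic module: the toy grammar accepts only "how large is <entity>".
    if len(words) < 4 or words[:3] != ["how", "large", "is"]:
        return "please try rephrasing that"
    # Domain module: the entity must be something the database knows about.
    entity = " ".join(words[3:])
    if entity not in DATABASE:
        return "sorry, that's not in the database"
    return DATABASE[entity]


replies = [respond(q) for q in (
    "how large is starry night",   # answered from the database
    "how large starry night",      # break in the syntactic module
    "how large is zorp",           # break in the lexical module
    "how large is ringo",          # outside the domain
)]
```

Because each module fails in its own way, the user gets a hint about what kind of repair to attempt, which is the point the talk makes about modularity.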

so you'll note that there is no anthropomorphism in these messages that was

a debate that was active back then and so i avoided assiduously

having the system

refer to itself as i

at the time

and so back to pronouns so back in those days when we were working on

the natural language processing system

there was all this talk about what will people do when they talk to computers

will they use all the english that they are used to using or will they

use basic english or something called tiny english at the time i think it was called

and will they just kind of avoid using anything but a restricted subset will they take

the cue that the computer is a restricted partner

and so a psychologist named raymonde guindon in nineteen eighty seven did

what i think of as the first wizard-of-oz study or the

first one i became aware of and that was a really exciting technique when she

did this

i thought well that's great you don't have to put people through a system that

doesn't work very well you can simulate it on the other hand

and she found data to support this hypothesis the prediction that people did not

use pronouns

when they were

talking to a system that provided advice about statistics now when you think about that

again she was looking at things like personal pronouns like he she and it

well there's not very many chances you have to talk about he she and it

when you're asking

about your anova or to explain a t-test or something like that

so that struck me as a little bit you know of a premature conclusion i didn't

buy it

so basically lynn and i and others that we worked with including carl pollard at

the time decided not to take that idea too seriously

and so we forged ahead despite that and we started working on pronouns so then

lynn and i audited a really wonderful class at stanford taught by barbara

grosz and ray perrault and phil cohen

and we got a hold of this wonderful draft paper by grosz joshi and weinstein

on centering

called towards a computational theory of discourse

back in nineteen eighty six this was and it was actually later published much later

and you've probably seen that but we had the old version that

said very prominently across the top do not cite of course we did

with barbara's blessing eventually cited it anyway

and so basically we just took some of the ideas in that paper and

we used them to interpret pronouns in each you know so i'm not gonna go

into the details but this little box represents some of the rules for

transitions between sentences and the attentional shifts associated with these transitions and

i thought this was revolutionary because

at the time i was really interested in what people do and the cognition

that's going on when you're interacting with a system and

and many of the people around us were very interested just in the formal representations

and just trying to parse sentences to begin with and what they were doing was

very wonderful and all that as well but

we were interested in the fact that you're really thinking about the psychology of a

user when you're parsing syntax and interpreting referring expressions

so this box represents the algorithm and we were doing really simple kinds of sentences

like dan works with derek at hp he supervises derek derek is a programmer he

and so the question is

who does the he represent in these various situations
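
To make the flavour of the question concrete, here is a deliberately simplified sketch. It is not the centering algorithm from the paper; it only illustrates one idea the talk draws from centering, that an entity mentioned in subject position is more salient than one mentioned elsewhere. The role ranking, the dictionary layout, and the function name `resolve_pronoun` are all assumptions made for this example.

```python
# Lower rank means more salient; this ordering is an assumption of the sketch.
ROLE_RANK = {"subject": 0, "object": 1, "other": 2}


def resolve_pronoun(gender, previous_utterance):
    """Return the most salient entity of the previous utterance that agrees in gender."""
    candidates = [e for e in previous_utterance if e["gender"] == gender]
    if not candidates:
        return None
    return min(candidates, key=lambda e: ROLE_RANK[e["role"]])["name"]


# "dan works with derek" followed by "he ..."
utterance = [
    {"name": "dan", "role": "subject", "gender": "m"},
    {"name": "derek", "role": "other", "gender": "m"},
]
referent = resolve_pronoun("m", utterance)
```

A ranking like this picks dan for the pronoun because dan was the subject, even though derek was mentioned more recently.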

now i stopped working on centering i still get a few papers every year to

review but i usually turn them down as i'm no longer

doing that kind of research and part of the reason for that is that it became

obvious to me that there is much more going on

in pronoun interpretation and the interpretation of referring expressions than just a simple algorithm so

you know i think the centering approach from grosz joshi and weinstein is wonderful and

was groundbreaking in taking account of the hearer's and the speaker's centers of attention

but it's much messier than that so i decided to go back to grad

school and get my phd in psychology with herb clark and then i entered the

messy world of human behaviour

in which we all live at least when we're not at work

and so my very first experiment used a language game in which i tested

some of the predictions that we had derived from centering theory namely that mentioning

something in subject position as opposed to as the object of a sentence

makes it salient and thus able to be pronominalized

so i was thinking of a pronoun as a cue that picks out the most

salient representation in your partner's mental model of the situation

not as

something to trigger a search among all the possible referents for that pronoun which is

how the algorithm worked and that works well for a computer but it

is certainly not how people do it okay

so from the hearer's perspective i saw the interpretation of the pronoun just as the

selection of the best out of all possible interpretations

that's because it was most salient not because of the search

and so basically i recorded pairs of stanford students who were naive subjects they

both were basketball fans and i had them do something that they

found engaging one of them watched a video of a basketball game without the sound

and gave a running description play by play

while the other one behind a screen had to keep track of who had the

ball at any given moment

and they had to write down who had the ball whenever a light went on which

was kind of random

and they could speak to each other as much as they liked so this language

game got people to generate chains of referring expressions to the same object

with lots and lots of third person singular male

entities in the discourse which is just what i was after

and so that generated things like and now we'll train set of all they're

going down number thirty passes it up to forty one forty one goes up

for a shot and he misses

now your eighth grade english

teacher would probably have taught us that

a pronoun refers to the most recent thing that agrees in gender and number we all

know that's not true and so you can see with this pattern the

speaker repeats the noun phrase forty one rather than pronominalizing

that

referring expression and so

basically this task worked really well because the semantics of the task are

biased against the centering prediction

so the centering prediction well the semantics of the task basically say that you

can't shoot unless you have the ball that's obvious right so you should be able

to get away with a pronoun here where forty one is underlined but you don't

you follow the predictions of centering

on average not always because people don't always

don't ever do something all the time they do it with some probability okay

and so again the action was very fast paced and so the speaker had to

try very hard to keep up

and so it's always shorter to use a pronoun than a full noun phrase sometimes

they would use a full noun phrase like

you know

number forty one the degree chi force whatever you know they really were getting into

this task and providing colourful descriptions

so basically

just as the centering algorithm predicted when people referred to an entity

as the object

and then they re-referred to it they were much more likely to re-refer to it

by repeating it verbatim

as a full noun phrase instead of pronominalizing it and so they would move

it into subject position and then they would pronominalize it that was

one of the predictions we derived from centering

and so the other thing is if a pronoun was used in that position

for forty one so number thirty passes it up to forty one he goes up

for the shot and he misses the pronoun would get stressed

and that was also an interesting discovery so

speakers have several different techniques at their disposal for maintaining a good focus of attention

with their addressees and we found evidence for both those things

so at this point it was pretty clear that the algorithm was not psychologically plausible

okay herb clark was kind of horrified that i was even working on this so when

i said my paper is ready do you wanna be my coauthor he said

no that's okay go ahead and submit it and i was horribly offended i think he did

me a favour in retrospect but you know i was

kind of cross but i was eager to move on to the wider

world of psychology than just centering

so clearly cognitive and perceptual accessibility

is important in both speech planning and reference resolution

but in centering theory the entities

are not allowed to decay the discourse context is pretty much the whole sentence at

least the way our algorithm worked

and so it really required segmented discourse in order to pull off the centering algorithm

and so what i was learning as a student of psycholinguistics was that language planning

and interpretation are really incremental highly incremental more incremental than i could've imagined

not even word by word but as soon as you hear two hundred milliseconds of

a word

you start to work on it as a listener

and so certainly that information was just coming on the scene around nineteen

ninety five when mike tanenhaus and his lab

published their really important early work on visual worlds and

and so i was very eager

back in grad school to

move on to the world of psychology and do something that was plausible and yet

computationally interesting

back to you guys

i don't know if you've thought about what caused you to

start working on dialogue maybe we can talk about this at the end in the

question period if there is one but i also want you to think about

what is dialogue well kinda obvious right

the rest of this talk is really about what is dialogue and the assumptions that

you make about what dialogue is what the essence is and what simplifying assumptions

are safe to make

and don't destroy the phenomena of interest

and what things are okay to control in your experiment

and won't destroy the thing that you're trying to study okay

so the question is what do we need to preserve in our research in order to

model dialogue appropriately

so if you think about the way we approach dialogue with respect both to machines

and humans

we start with some kind of data

these might be data from previous experiments or examples that we find compelling and we

wish to implement or embody in a system or in an experiment

maybe a storyboard of how someone interacts with an intelligent personal assistant

or maybe it's corpora where we're looking at distributions of behavior over

lots and lots of people aggregated

or maybe a description of some product that somebody thinks should be built okay and

then we take those data those examples and we do something with them

and on the left i guess i can point with my cursor

no i can't well i can point the microphone on the left we have engineering

where we're trying to create computational formalisms for dialogue processing and management on the

right we actually have reverse engineering where we're trying to figure out how

human processing works

with all of its cognitive social and neural constraints

so think about the very different tasks

these two different

things involve okay

so when we think about how dialogue is implemented in dialogue systems we have no

limits on working memory you know if we want to remember the past and

create a space in which you can search for the referent of a referring expression

go back here it can cover thousands of users it doesn't necessarily cover that individual

that came up yesterday the attentional focus doesn't need to be modeled like a human's machines

don't have the same kinds of interruptions despite the fact that we're now trying to

have them do more than one thing at a time in these personal assistants

okay

but their performance need not decay on any one of these things

while they're doing it okay

and the inferences are represented logically and they're computed no matter what okay whereas

with people you know often people hear a pronoun and they don't even bother to resolve it

if they don't need to if it's difficult if it doesn't just pick something out

of their center of attention easily okay

so people don't always make the inferences that you think that maybe they should be

making in the hike

the architecture because

some of us have been software engineers at various points and still are maybe

you know that modularity makes things a lot more elegant okay

so that tends to be the architectural choice when you're modeling dialogue systems

and there's also very limited perceptual ability for monitoring now there's work presented at this

conference on reading people's facial expressions and looking at these kinds of wonderful

nonverbal

things that i'll talk about a little in a few minutes

and that's really important if you're really going to be a full dialogue partner and

deal with pragmatics in a way that is easy for people to deal with

so in the mind and brain we have limited working memory okay

we have an attentional focus that emerges from biological constraints

okay so

and it's probably evolutionarily really good that we forget things that we don't always have

things active in working memory so forgetting is an important skill it turns out

inferences are often associative and they're not always made and some of my talk will

be about how certain important kinds of inferences are made and how they are deployed

in dialogue processing

and whether it's done really immediately and easily and automatically or later as a kind

of laborious repair okay

and then the architecture has to admit incremental processing now the first time

i think the first time that i met amanda stent was when i went to

rochester to work with her and a team that included mike tanenhaus

we were trying to write an nsf grant

that would enable parsers to be incremental head pose we didn't get funding but

it was a wonderful thing to do because i met amanda stent

so

okay no bit or comments on that okay lots of other good things to get

a fixed iq and stuff

and so that architecture is a difficult one to implement and i'm delighted that

many of the people at this conference are really acknowledging that it's not necessary to always

do that with

spoken dialog systems but sometimes it might be desirable especially if you really wanna make

something human like

and again you can have abundant monitoring of perceptual information and of plans

you know people monitor their own upcoming speech errors as they're speaking they monitor all

kinds of feedback coming in from the world and so

that kind of monitoring isn't part of most spoken dialogue systems just

so that's our question if you're a computational linguist or an engineer you make various

sorts of simplifying assumptions okay and all of these assumptions move our research forward

but i think it's important not to lose track of what we had to set

aside in order to proceed because it might come back to haunt us

so it's good to make these assumptions explicit so the way in which you stage an

experiment often depends on your implicit theory of what a dialogue is

here's what i think a dialogue is here's a good example from my collection now

i tend to use the same example over and over in different talks making different

points so apologies if you've seen some of my examples before

i'm presenting this in a different context right now

so this is i think a good example of what i think a spoken dialogue is

that this one comes out a bit lower so if the

if someone's adjusting the audio or maybe actually just like computers

now this was collected by trying to crawl g who works that you want now

with one of my early grad students and

we were trying to collect examples of spontaneous getting to know you dialogues

from

bilinguals who didn't know each other they were bilingual and they were recruited to the

lab so these are two strangers

ordering

right

what

what and why i

right

i love this example it has so much in it and when my

i p a can model this then i will be happy

i will retire then okay so what i love about this is that you know there's

all this really interesting stuff there's code switching but you can see the

little constituents the little increments that each speaker presents they're face

to face they can see each other

they're grounding with each other since they're trying to get to know each other they're

giving each other constant contingent feedback about how something's been interpreted and

that's kind of manifested in the simultaneous speech around the asterisks where they both say

something at the exact same time

and one jumps in to define the spanish term that the other one has presented

so it's very clear that they very quickly establish this common ground

and they use that as a foundational part of their conversation

and so you know we can

okay we can observe abundant examples of referring expressions in any given task dialogue but

it's really important to think about the language game in which people are finding themselves

in this particular language game

they have the ability to fully establish common ground with each other and there's nothing

restricting them from doing that

and so in psychology experiments as you know there are often very

weird language games you know

students come into the lab a little paranoid what are they

trying to do read my mind you know they all have this notion of

social psychology experiments which often have a large with them and then list of their

board to that in a kind of experiment where they

or just getting a cube right now

those are very different language games so

the language game here is nothing like that one

but in most language games that we set up in the lab we're trying to get

many observations from someone so we can have enough power to draw a conclusion so

that we can find out something

create new knowledge about dialogue

and do statistics on it so in a typical referential communication experiment we have

two people come into the lab and sign consent forms

and then we initiate this mysterious language game and then they meet each other they're

seated with a barrier in between them

they're given identical sets of picture cards something like this perhaps

and they need to get them matching in the same order

and then they have these lengthy conversations i'm not gonna belabour this next example

it's only here if you're someone who hasn't seen this kind of stuff before which

i think you probably are not or you wouldn't be in this room

'cause you know what dialogue is but

here's the kind of dialogue that two subjects might produce about a particular card you

can see it's very lengthy and it's full of disfluencies and provisional

utterances

like a goes for this one uh it looks kind of like on the top there's a square

that looks diagonal and then b goes uh huh

meaning i'm not quite sure yet i'm trying and you have sort of another

like rectangle shape and then like a triangle angled and on the bottom it's i don't know

what that is glass shaped all right i think i got it

it's almost like a person kind of in a weird way like a monk praying or

something which is interesting here because b doesn't know what it is and yet b is

proposing the perspective that they end up taking throughout the rest of the experiment

and so we have them refer to this over and over again okay

and so later on you know about eleven cards later after we scramble them and

put them out again b gets to be the director this time and b goes

right to that and says number nine is that monk praying and a goes yup

and about eleven cards later a now is the director and says number three monk

okay so this is entrainment so what these people have done is they have proposed

and kind of homed in on a perspective agreeable to both of them

and then they both use that

so this i found very striking and even more striking was what people do

in different pairs talking about the same object

people come up with very different perspectives

this one you've probably seen in other talks just as an example

you know you might call it the anchor the candle

the symmetrical one shapes on top of shapes or my favourite the man jumping in

the air with bell bottoms on

and they continue to refer to it throughout the experiment with

a slightly shortened version of that okay

so it's really amazing to me there's so much variation in language that's probably one of

the things that attracts us to study it we love trying to explain that

variation

but there's very little variation it turns out when people have had a chance to

entrain on something okay so as system designers you can exploit that

in terms of your intelligent personal assistant you can constrain

the set of things people say not because of tiny english or anything like that

but because people become entrained on these things

and so we view this as people setting up a conceptual pact that was the

term that eve clark suggested to herb and me one night when we were casting about for

the right term to capture what it was people ended up with after entraining

on something okay

so in our first set of experiments we used both tangrams and

these common objects

and you know we used the tangrams just to throw them in there so people

would get distracted because you know since this is a language game people are gonna try

to reverse engineer what you're doing to them and you don't want them focusing on

trying to guess what your hypothesis is

so what we were interested in was what people would call things like shoes or dogs

cars and fish

they thought we were studying the tangrams so they focused entirely on that

and so basically what we found in this conceptual pacts experiment was that

people

don't just follow the expected gricean thing of giving as much information as

is necessary to distinguish an object from the set of objects in which you find

it so they would start calling this something like the really cool red car the

cork article right powerful right particle red car

and then when it was the only car in the set they didn't go right

to car they continued calling it a typical rank are so that was our main finding

we also found that the extent to which they did this was probabilistic and depended

on how many chances they had gotten in the entrainment part of the experiment

before

the critical trial

and just as herb and i thought we were done we'd shown something that you

know was pretty tangible and useful a controversy erupted okay

so to be fair we were a little bit overreaching in the conclusions we were

drawing from these data

so one thing we did in the three experiments in that early paper was we

had partners switch toward the end for the last experiment

and we found that people who switched partners

would often go back to the basic level term and just start calling it a

car when they do you do

consider the conceptual pact that they had established with a particular partner

but if they did that same trial with the

old partner then they would

continue to use you know the correct are okay

so we were arguing that audience design or this kind of entrainment thing was

partner specific that was what we thought we had shown

but it turns out that

we didn't really show it in terms of an online demonstration that you're taking this

information really into account with your partner okay

so basically

again this is the summary of our findings which i just covered

speakers were not just as informative as possible and they did continue to

follow the conceptual pact they had established with a particular partner

but they did not when they were working with someone else

okay
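
One way to picture a partner specific conceptual pact is as a memory keyed by both the partner and the object, with a fall back to the basic level term when no pact exists for that partner. This is a minimal sketch under that assumption; the class name `Speaker`, its methods, and the labels "old partner", "new partner", and "card 7" are hypothetical. It is also deterministic, whereas the talk describes the real effect as probabilistic and dependent on how many chances people had to entrain.

```python
class Speaker:
    """Keeps one referring expression per (partner, object) pair."""

    def __init__(self):
        self.pacts = {}

    def entrain(self, partner, obj, term):
        # Record the term this speaker and this partner settled on.
        self.pacts[(partner, obj)] = term

    def refer(self, partner, obj, basic_level_term):
        # Reuse the pact with this partner if there is one; otherwise fall back.
        return self.pacts.get((partner, obj), basic_level_term)


speaker = Speaker()
speaker.entrain("old partner", "card 7", "the really cool red car")
with_old = speaker.refer("old partner", "card 7", "the car")
with_new = speaker.refer("new partner", "card 7", "the car")
```

Keying the memory on the object alone would give the same term to every partner, which is the simple association account that the rest of the talk sets against the partner specific one.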

i want to just briefly present this as a play in five acts so

this is a series of experiments

that talks about a little fight we had in the literature and what i learned

was you know i was a young assistant professor back then

and of course the stomach acid that you get when someone attacks your work is

considerable right but then what it turns out is it can really be a wonderful

thing you can engage collegially with your

worthy opponent and you can both improve your research which is what i'm happy

to say is the ending of this play in five acts or at least i think it

is i'm not sure my opponents agree but probably anyway they probably do

so the first question after herb and i published our paper

was the question is entrainment really partner specific or is that just based on

priming which is just a simple association in memory okay

and so basically you need to demonstrate that something really is partner specific to an individual

and not to just any old individual and not just due to priming

in memory a simple association between an object and the term and maybe a

link to that person

and you need to really show that two people with different perspectives or knowledge so the

speaker and the hearer

can adapt to each other from the earliest moments of processing this is hard because

most of the time when you're in a conversation

you're really similar you're sharing the same context and you may just happen to

get it right by chance and that probably happens a lot of the time right

so dale barr and boaz keysar published this paper called anchoring comprehension in linguistic precedents

and basically they were inspired by an anecdote that boaz had where

you just happen to interpret something egocentrically i'm not gonna go into the details

but his proposal was that listeners expect precedents and it doesn't matter who the speaker is

and then if you do adjust to the speaker you do it laboriously afterwards

inferentially as a late occurring repair

and certainly to be fair some of the data that we presented in the

original brennan and clark paper had little samples of dialogue where people would say the

first one is the car kind of red or something like

that so

so you can see sometimes it is presented after the fact and other times it would

be the first one is that right but rowdy and

so you can see evidence in the syntax for early adjusting to the partner and

late adjusting

so boaz gave a talk about this at cuny two thousand one in philadelphia

and charles metzing one of my graduate students and i were in the audience at the

time

and basically i'm gonna go through this quickly

basically boaz and dale found no evidence for partner specific processing so

they had people in these somewhat unnatural situations where they're talking to someone but then

the subject's wearing headphones and also getting things in their ear from some disembodied

voice somewhere else

that was pre-recorded okay

and so some of the time they found interference between these two things okay

and so

basically hearing the precedent expression the expression that they had entrained on from the

interactive partner

was no faster than hearing it from the new partner

and so barr and keysar took that as evidence that entrainment was not

partner specific okay

i mean so what's wrong with this picture

would be that if let's say you and i talk about something we call it

provides that's nancy read moderately and then i'll then walks and then she says i

that's that read matter it out side that would mean that we should be slower

to interpret it probably in

because she wasn't there in the entrainment phase that doesn't make any sense it doesn't

prevent her from using the same phrase that we've talked about just because

she wasn't there when we introduced it and that's what this

argument was based on

so basically the criticism i had and i raised my hand during the talk and

i said

okay you've got the old partner using the old term again

you've got the new partner using the same old term and a new partner using

a new term

and you're finding that these two are both faster than this one

but what about the cell that i thought was really interesting where you have the

old partner inexplicably break the conceptual pact what about that does that not take any

longer and if you compare that to this

if it's not partner specific then it should be the same if it is

partner specific then it should take much longer okay

so boaz said in response to my question well that's not an interesting cell so

we didn't bother with that one okay fine so charles and i jumped on the train and

raced home and we did this experiment where you

have a set of objects that i stole from my then young child's toy boxes

note when you're still but

little things that don't really have lexicalized expressions for them and we put them in

an array and we basically

had a confederate speaker refer to the key object as either the shiny cylinder

or the silver pipe where these are equally good expressions for that object

and so basically a naive matcher and a confederate director repeatedly matched

the objects and the director had to speak and kind of show the subject what

he was doing you know i have to tell you to get it into this

arrangement but the subject didn't know that a few of the utterances were highly scripted the rest

were completely natural

so after they entrained on one of these terms then the director would go okay

it's time for me to get up and leave the room

subjects have been told this experiment is about how you follow directions from different people

so they were given the appropriate cover story this was not too weird in that

language game

and so the director would get up and then either the same person would come back

in or a different person would come back in so we had two confederates

so here is our lab manager darren and then joy hanna who was

also my collaborator who is serving as the second confederate

so to summarize what we have is the same partner using the original expression on

this critical trial the new partner happening to use the same expression

the new partner happening to use the new expression since they weren't there

during the entrainment

or the original partner inexplicably breaking a conceptual pact okay

and so in the interest of time i won't play these but what you would

see is this one is much slower well i'll just play it quickly

in the next one

to reach into the frame and follow the instructions to look like this comes out

kind of low i think sound

okay so i don't know if this will work

still

okay but you can imagine so basically what's happening in this one is we're recording

the eye gaze of the subject and

we find that they look all around the array when they hear a new term

from an old partner but not when they hear the new term from the

other partner

and so if you look at the time that it takes in that one broken conceptual

pact cell

it takes significantly longer okay

what surprised us was that this one was so fast basically

when the

when the new partner used the new expression if you just looked at barr

and keysar's argument that anything that's new should take longer than something that was

already primed that's all

you would expect this to be a bit higher but it wasn't and it turns

out that we had normed both of these expressions they were equally good for the

object that's probably why that happened

so act three okay

so basically we thought we had shown that

you know there is evidence that you take a partner's

identity and your entrainment with them into account really early on

so act three came along and now every young professor works hard on their data

they would rather die than publish something that wasn't true

now we are all concerned about replicability or if you're not you should be

and it's really important that you do something that's replicable but i always feared

that well the things i tend to do these experiments are so

complicated and weird that who wants to try to replicate someone's time-consuming complicated weird experiment

but i was really delighted when somebody did so this is act three so let

me just say about our

experiment which is act two

that we only had eight critical observations for the whole

session for each pair that is for each subject and confederate

so basically we had two old expressions by new speakers two old expressions by old

speakers two new expressions by new speakers

and two new expressions by old speakers we only had two instances out of

the eight critical trials

where the conceptual pact was broken before that the experiment

was taken up by all the entrainment phases because it took quite a while for people

to entrain on these objects because we wanted it to be natural

so basically what's interesting is in that last case the broken conceptual

pact is infelicitous

so when your partner does something infelicitous once or twice

it's not a big deal maybe their attention wandered or whatever but when

they do it over and over again

are you playing the same language game or not i mean this is a psychology

experiment okay so matthews lieven and tomasello with little kids aged three

and five

replicated our experiment they didn't use eye tracking but they emulated the design otherwise exactly

they had only

these eight critical trials and only two of them were broken conceptual pacts

okay

and so basically you know there was the experimenter present who told the children to

move these objects around

and then they just videotaped the children so they could code basically how quickly they were

able to position them and where they were looking their eye gaze okay

and so basically what you see here is the eight critical trials in those

four different conditions okay and so

here we have the original partner

and here we have the new partner in the darker colour

okay

so what we see here is that when the original partner breaks the conceptual

pact it takes a long time to process when the new partner

uses a different term that breaks the precedent

it's fine it is much faster

but the effect really diminishes on the second occurrence with the three year olds

and also to some extent with the five year olds

so this suggests that even little children are exquisitely sensitive to infelicity in dialogue

okay

and you know charles and i had been sad we couldn't get more

power we were happy the effect came out but if we had done what we

would have if we could have done anything we would've had like a hundred broken

cps and we probably wouldn't have gotten our effect

and so in retrospect we're very glad and so basically putting people in infelicitous situations

too often is unwise

basically after that act four

is kronmüller and barr edmundo kronmüller and dale barr

one of dale's students

tried a more detailed eye tracking experiment they argued that we were not as methodologically

sophisticated as we should have been in our analysis and so what we should have looked

at early in the trial was not just people's first look to the

object

that was the target object but looks to other things as well

and they argued that if we had done that we would have found evidence that precedent

or using the old expression regardless of who said it

was important early on and then only later did the partner specific part kick in

so we thought okay we'll try that they were able to look for speaker specific

effects and found them but only later on okay

and so joy hanna and i came along and we reanalyzed the metzing and brennan

data and we actually early in the trial did not find any effect of precedent that

would be the black and blue lines here and the higher the lines the more

likely they are to get to the correct target but this is just noise

there is no difference between these

lines right here

so we didn't find any evidence of the old term priming people but we

did find this evidence of the broken conceptual pact here is the rise in the

looks at the correct target object when the cp is broken and the other lines

are essentially indistinguishable

right

and so this still supported our conclusion now note that in all the barr experiments

they have these pre-recorded partners going on in the kronmüller and barr experiment they had

a pre-recorded partner we had an interacting partner

and so act five finally really quickly dale barr did this with new colleagues

in the scanner so he's doing this in an m e g study okay very similar design to

metzing and brennan with two live confederates out by the scanner and one person in the

scanner and so again he now is looking for

evidence of mentalizing in the theory of mind network so called

which consists on most accounts of three different areas one of them's frontal one's precuneus and

the one that's in the right temporoparietal region

and so he found no evidence for

mentalizing in his experiment basically

but the problem is that subjects in the scanner experienced broken conceptual pacts at times

and that was twice as many times as they experienced maintained

conceptual pacts so that's another issue

so basically my takeaway from this is that

the language game you put people into matters it can really dramatically change your results

and so anna kuhlen and i wrote a little position paper on this

and basically

the ways in which confederates are deployed

can make a big difference in the results that you get

and also the ways in which experimenters choose to deploy confederates differ depending on what

they think the essence of dialogue is

okay what they choose to control what features to make explicit what they choose to

let the confederate just run with without instructing them what to do okay

okay right

so i'm gonna take you through the argument pretty quickly here so we use confederates

because we want conversational partners who show up to the lab you know it's

harder to get two subjects to show up than one subject so if you have

one subject i see some that are not even going i'm my heart results you

if you want if you have four people coming in which i've done that then

that's even worse right and so it really makes this research difficult

so that's one thing that people can do to solve that problem it maximizes the

efficiency in your data collection and it gives you a lot of experimental control because

as kay bock once noticed people say whatever they want to say she called this

exuberant responding which is one of my favourite

noun phrases in the whole world and the editors always try to correct it if i

quote it in a paper

but it's called exuberant responding

and so to the extent that you can control one partner's behaviour then

you can

reduce the variance and maybe get more power to conclude what the other subject is

doing okay

so basically a lot of dialog experiments involve mild deception and that's okay every

experiment

has some deception in that we don't tell you exactly what the hypotheses are before

you're in it okay

so things are often not as they appear you might be interacting with a

computerised dialogue system

or with a person who provides rule-based responses and sometimes it can be unclear

you can be interacting over an intercom with another student in the next room

or maybe that's pre-recorded you don't know

if you're not allowed to interact with them

or you can be interacting with another student or with an experimenter and so studies

do these different things depending on what they think dialogue is okay

and so the question is when might using a confederate really threaten

your conclusions in the dialogue experiment

and again this depends on the purpose and on what you think dialogue is

so if you think dialogue is just like language processing by yourself only more engaging

that's one possibility or you might think it's a set of alternating monologues that's

kind of the way

the message model assumes dialogue is and a lot of spoken dialogue systems assume this

where they're just looking

at your move and then my move where you're the computer and i'm the user

your move and my move your move and my move

we're just doing these alternating monologues sometimes

or maybe a little more sophisticated comprehension and production activate one another

and

or maybe it's really shaped continuously by the interaction between partners

okay

now in this first one

the mere presence of a partner is what makes dialogue real and engaging if you

think just having someone there an audience is what does it

then you can see this is really just social facilitation theory okay

and so basically having a partner just

acts as a projection space for the user to produce more natural

dialogue okay

okay

and so that has had a long and distinguished history in social psychology ever since milgram

and asch and all of those other experiments

if you think it's alternating monologues again

this is a view that is widely used by many people who do

a i research computer science linguists psychologists who don't actually do research on dialogue people

like that

and it may be fine for some purposes right comprehension and production activate one another

this is a view popularised by martin pickering and simon garrod in their interactive alignment

model

and basically this is interesting 'cause it leads to parity meaning the speaker and hearer

using the same representations and acting on them

and that could be what you think of as common ground

but they argue that it's really just due to priming they try to explain the whole

thing because of the simple association

and they also argue on the same

kind of logic that barr and keysar were using that priming really will explain

all of this

so called partner adaptation

okay unless the late repair

and then finally basically i could go into the pickering and garrod thing which i'm

not going to

really but they used this wonderful picture in the b b s article which

shows all the priming

right here we see the

one partner as partner a on this side partner b on this side and you

know my semantics just primes yours somehow through the air i'm not quite sure how

that happens but

you know and this is highly modular too but

the problem is that it assumes that a and b are carbon copies of

each other as interlocutors and we know that's not the case

my semantic network differs from yours if i hear the word

eunice i think mother because my mother's name is eunice and she's going to be

ninety in a week or so

and you think something else right you might think of the

old telephone operator on a t v comedy

whereas you know your mother might be named it'll travel and you know that will

think you have in your network so people are different

partners are not carbon copies of each other

priming is not an explanation for this i argue okay

and so just to get to the last one if it's shaped by the coordination between partners

then this is a different view okay and you might decide to use partners differently

if you believe that like you might use confederates differently if you believe that

so there are general concerns that you have in play when you use a confederate

you know basically confederates can be biased if they

well let me just give an overview of the concerns right now that anna

and i talked about in our paper there's the biased confederate the covert confederate done

in secret the know-it-all confederate who knows too much about the experiment in

terms of the task that they're doing at that moment

as opposed to the first one who knows about the hypotheses

and the scripted confederate

those are four concerns that we go over

so basically to deal with the biased confederate ideally your confederate should be blind to

the experimental hypotheses and to the conditions

that can't always be the case but that would be ideal

and alternatively you can you can script the confederate behaviour in a few critical places

and not in other places

with the covert confederate we never use this in my lab we

never fool people into thinking that this is a real subject

other experiments that use confederates have this

very dramatically stage managed thing where the confederate pretends to arrive late all stressed

pretends to be a subject needing extra instruction 'cause they're clueless so

they're trying to kind of pretend to not be a confederate

but during the experiment itself they just behave however they are usually not given instructions

for how to behave and so that role is sometimes concealed at great length but

then neglected

see

so i want to just say these are examples of two different studies one from

boaz's lab and one from hanna and tanenhaus's lab where they

basically deal with these concerns very differently so

with the experiment on the left which found no evidence for audience design or partner

specific processing and concluded that language comprehension is egocentric okay versus the one on the

right which found audience design and that language comprehension takes the partner's knowledge into account they

dealt with these concerns very differently so on the right

the confederate was blind to the condition they did not have hidden knowledge during the

task okay and the subjects were told that the confederate was someone

from the lab who was naive and you know

was gonna play this game along with them okay

but they didn't hide the status of the confederate and it's really the opposite on

the other side with elaborate staging and so basically those found two very different

results okay

with these other concerns

basically

the know-it-all confederate this is when someone's knowledge doesn't match what they're supposed to

be so if your confederate's been sitting there as a listener forty times in the

experiment and knows the story that the naive subject is telling them better than the

subject does

their feedback is going to indicate that unless they are an extremely good actor

the problem is that when you're using a confederate as a speaker in an experiment you

can script that if you want and

we know what speaking involves for the most part but most of us don't know very much about

what listening involves there are no formal models of what backchannels people give at any given moment

really

and so

the experimenters are more likely to let that one run wild and so

therefore if the confederate addressee has too much knowledge they'll display it to the subject

and that's a problem

so let me try to speed up a bit if i want

to allow time for questions but we'll see what actually happens

so it is important for the addressee not to have too much knowledge about the

experiment

and then finally i i'm gonna skip over the scripted one okay but if you

want to take a look at

our examples for

how different

experiments come up with different results depending on how the confederate is deployed

you can take a look at the paper

okay

so it turns out that even if an addressee is distracted

and

the speaker will tell a story differently depending on if the addressee shows that they're

distracted or not

and what interacts with that is if the speaker expects the addressee to be distracted

that also interacts with what the addressee does so

i'm gonna skip over this pretty quickly

but in a study with this kind of design that anna kuhlen and i

did and i'll skip the details basically if a

speaker expects an addressee who is attentive and they get one that's good

if they expect an addressee who's attentive but they get one who's counting the number

of n's in everything they say and pressing a button under the chair secretly whenever they

do that

but the speaker doesn't know exactly why they're distracted

then

or they don't even know that they are distracted they're expecting them to be attentive

then you have that interesting condition then if you tell the speaker the addressee is

going to be doing some secret task

here they're getting an attentive addressee but here they're now getting what

they expect

so you get different results depending on all of these different cells

okay so it's not only the feedback that matters but the expectation that the speaker

enters the experiment with

so i'm gonna jump ahead

to our recommendations about confederates now i guess i've already covered most of those basically

try not to have the confederate have

information they shouldn't have at that point in the task and take into account both

what the subject is expecting and what they actually see

just to

give you another example of audience design and partner specific processing you know

in things like gesture gestures are a little more ambiguous than words people can

project all kinds of things onto a gesture but even with gestures people gesture

differently when they're talking to someone who already knows what they're talking about versus when

someone doesn't okay so in this next study

a lexical it who is now weight are said right now suppose start working with

retail

did this neat study

by having people describe roadrunner cartoons she comes from the manual average in chicago and

so she is all about gesture

and so you have people

watching these roadrunner cartoons and describing them either telling the story to

a new partner okay and then retelling it to that same new partner or retelling

it to a different partner

so you have three conditions but the two partners were counterbalanced for order

now i don't know if this will play the video

no one okay so basically the idea is that

this person is telling this to

a new partner versus an old partner

basically

the gesture space that she uses is much smaller the second time around you see

this kind of diminishing of information that she provides to partners who already have

the information

whereas when there's a new partner it goes right back up again and the gestures

are large and

demonstrative as long as you give the speaker the right cover story so this isn't a

weird language game for them

okay

and so

let's see

i'm going to jump ahead and since the videos aren't working i'm going to

jump ahead a little bit

and just say that computationally again you can either model adaptation as a slow inferential

process or an immediate nimble process where if it's activated in memory you don't

have to make the inference 'cause you've already made it

then you can use it just like any other information in memory it's not modular

you're not stuck

with using partner specific information like

in these situations here

what i did was i looked at a number of experiments that had shown clear

evidence for partner specific processing sometimes very early in the interaction and they were all

simple situations that didn't have any weird or unnatural

recordkeeping that a naive subject would have to do but things that are very perceptually clear

like does my partner speak english or not does my partner speak this particular dialect

or not is the partner looking at the thing we're talking about or not and

when you have that simple a binary situation

then you can think of this as a very simple partner model so obviously a

more elaborate model is computationally expensive to keep track of your i p

a can do all the keeping track it wants because it has all

this computing power but with a human if it's simple enough they too can keep

track of the information and show partner specific processing

so if you just acknowledge that you know these situations are quite different and

then be aware of when humans can keep track of partner specific information

then that could provide insight into what you want to do you don't always

want to emulate humans sometimes you can do it better but

you may want to take that into account

and then very much take into account what language game you're playing

so i'm getting near the end i know i'm running a little bit late but

just to wind up i wanna just say that i agree with some of the

discussion yesterday that we are only at the tip of the iceberg

concerning our understanding of the pragmatics around dialogue

but it's still really important to better understand how systems should perform the role of

dialogue partner and how best they should adapt to a human dialogue partner and i

have some concern about using the wonderful successful applications we already have like calendar management

information access

and trying to use that to project ahead to everything because in other socially interesting complex

pragmatic situations there's a lot more going on and that is

many of us to comments from the audience nuclear

but i just wanna play two very short clips from the internet i just put in

this morning

because i think they are relevant when we think about what a conversational partner really

is okay so first of all i call this the chauncey gardiner effect and

basically i think people using these intelligent personal assistants

they're projecting a lot of relevance and sensibility onto things that are not so sensible

when there's ambiguity people do their best to make it what they think something sensible

would be and so there's a movie from a while back way back with peter

sellers playing this

simple type of character

and it's described in the clip as a simple sheltered gardener becomes an

unlikely trusted adviser to a powerful businessman and an insider in washington politics okay

and so what i wanna do quickly is just

see if this will cooperate

see

if i can make it full screen

okay how many people have seen this movie

being there okay so you youngsters have not okay good score

alright so here we have chauncey gardiner walking with the important businessman the

trusted adviser to the president

and eventually chauncey gets promoted to being the trusted adviser to the president

we want a which present it is but you kernel have your own fears okay

then going to know this is done by a single between is television was to

present them

you are much smaller

but i guess what

alright so basically you know this is somebody who really is very simple but everything

he says is taken as a

and word is the rooms do not suffer

well as well known we will

and we got a unicorn

i know and the related to brazil would be to

okay so that's that so basically you know people will try their best to

make sense out of whatever they're experiencing okay and

we just get five two

okay

so basically even though we try to make sense of the main message that we're

hearing even when we have evidence to the contrary or it's ambiguous

we are really exquisitely sensitive when the non-verbal signals are wrong okay we may not

even be aware of what it is we're

reacting to okay

and so if you don't take that into account then you might as well just

be on a date with amtrak julie so some of you remember this clip from a

few years back has anyone seen this before

okay again less than half of you so you know amtrak julie is a

really highly successful dialogue system from years ago that

some of you may even have been involved with i'm not sure

and i just wanna play that clip really quickly and then we'll

see

alright so make it big

right here it can tell

okay

here

of course i

all right next to the map to the lack there which i

i think there is no shot i

since i last i that's why i had to i

i just wanted to the two

it's a much

so we got them i want to know i

sure that i

and that of that

okay

back to the

and doing

to be of course faster i were able to that things properly

so the point of all that is that there are these little implicit right

verbal and nonverbal collateral signals that's what herb clark calls them

to which people are implicitly and exquisitely sensitive and when you get these wrong

it really shifts you into a different language game and of course the funny part

of this is that he doesn't get that he's

he's believing her but we all get it at that point

so basically the language game in any experiment in any kind of application varies quite

a bit and to assume that it doesn't matter is to really miss out on this

i think it's something really important to take into account when you're designing a personal

assistant but switching applications to something that has very different pragmatics okay

and so

it's better to acknowledge what your assumptions are what you really think dialogue is and

how you've constrained the language game and what you've sacrificed basically and so basically

language processing in general with humans is extremely flexible

and yet it's extremely important you get these bits right and i think there's a

lot more to learn and

i thank you for being a wonderful audience and making audience design easier by

didn't have so much too much material to present

and i just want to thank my collaborators and my home institution and stuff thank

you

sorry to run like yes when

we could hear you

okay

well try to summarise

right

right so the question's about crowdsourcing and whether you can just lift responses out of

a crowd and stick them in your application

which i love the idea of crowdsourcing many things right

right exactly right

right so the point with entrainment is that you're restricting the particular pact the content

of the particular perspective that you take and that's indexed by this lexical item that

you're using

so basically if you have a domain where that's not important where the domain is

choppy and people are referring over and over to the same thing then crowdsourcing might

well work

if you've got a domain where you and your i p a have had some preliminary

discourse about something and you've agreed on calling something a term which is also in

big you know it would be ambitious to have a spoken dialogue system that can

entrain i would love that

but most of the time people end up adopting the terms their computers use or

their systems use 'cause they have no other choice

but if there were such a thing that were flexible then this would be a

very incoherent dialog if it just dropped in

utterances from a crowd because they wouldn't be lexically constrained in the same way to

index this joint perspective that we think dialogue is about if you don't think dialogue is

about

working off the joint perspective that you've achieved with a particular individual

then you then crowdsourcing will work if your application fits that assumption

if it doesn't then it won't

i think

thank you for raising that i think that

that really

makes it clear

justine

only to thank you

or fact

right

no i don't think it we have i don't think

we should abdicate that responsibility i think sometimes

you have a different grounding criterion depending on the situation if you're entertaining yourself with

siri

then it doesn't matter

and you are fine attributing all kinds of bizarre responses to being contingent on

your own when they may not be

but when it's something important

like referring to an object and you want the right object

then it does become important and so

you have to use the right term or else your partner thinks you mean something

else even if it's a perfectly good crowdsourced term that many people will like

and so it requires i don't think this is contradictory though i'm sorry i didn't

summarise your question because it was impossible but

but i think everyone in the room probably heard it i'm not sure that people on

the weapon are converted

but you know the question is you know if you can basically take the partner

into account as justine just said you know in the micro sense moment-by-moment depending

on the feedback in the macro sense what you know about them i would also

say at the beginning you start with the expectation about the partner we've done a

lot of work with that

and so all of those things it becomes very powerful the evidence you get moment-by-moment

you revise the initial model of the partner and the model can be elaborate or

simple depending on how computationally expensive you wanna get

and then at the end you have some information in long term memory that you take

with you and the next time you talk to them

then some of that gets downloaded right

so it takes a little bit to download but once it's in working memory it's

fast right

and so i think i don't think these things are contradictory at all sometimes meaning matters

and needs to be achieved painfully and other times it doesn't and it depends on

the joint purpose the two people presume in a conversation that's not always the same

purpose but it often is

any

are we done

have to stop i think you are

More than meets the ear: Processes that shape dialogue

Keynotes

Susan Brennan (Stony Brook University)