Okay, I think we can get started. I'm extremely glad that our keynote speaker was able to come. He is a professor of linguistics at the University of California, San Diego, and in the past he has worked at SRI International and Sun Microsystems. He has served as an associate editor for a journal on logic and computation, and on its executive board as well. His research focuses on discourse interpretation: he does a lot of computational modeling, but also a great deal of experimental work, spanning the semantic, pragmatic, and psycholinguistic aspects of language. Among the things he is best known for is his work on coherence and reference, including his book Coherence, Reference, and the Theory of Grammar, which is widely cited. When we were putting together a set of possible keynote speakers who would appeal broadly to people who think about and work on dialogue, he seemed like an ideal choice. He has been talking with a number of you yesterday and today, and since you come from a variety of backgrounds, computational and psycholinguistic, hopefully by the end you will all have something to ask him. I'll just let him take it from there.
Okay, thank you. Are we on? Alright.
Well, thank you very much for having me here, and for coming in this morning.

So, Zipf famously posited two competing desiderata in language design. One he called the auditor's economy, which is biased towards hearers: languages should enable hearers to recover the speaker's intended message with minimal interpretive and inferential effort. That pushes languages towards having more prolixity and less ambiguity. And that's what we would like language to have when we're building systems: we want the information right there where we can grab it. Unfortunately for systems, there's a competing desideratum, the speaker's economy, which says that languages should allow speakers to get their message across with minimal articulatory effort. That pushes towards less prolixity and greater amounts of ambiguity. In the limit you get something like Groot from Guardians of the Galaxy: Groot always says "I am Groot," and everybody else has to infer what he means by that.
so
one way to speakers can be economical
i and still be expressive in getting them there'll a message across
it's a designer utterances would take advantage to be here's
cognitive apparatus mental state incapacity for inputs
so is to be able to convey more information than what they explicitly say
and this voices problem week constantly face when we're building discourse and dialogue systems because
the systems don't have that same apparatus thing capability that languages kind of wrapped itself
around
Now, of course, the source of these pragmatically determined aspects of meaning has been a focus of pragmatics since its birth, and it has become an industry of its own since the seminal work of Grice. What I'm going to focus on in this part is a type of pragmatic enrichment that I'll claim doesn't fit neatly into any of the other kinds of enrichment of interest in the linguistics and philosophy literatures.
So let's illustrate by diving right into some examples. Take (1a), 'The jogger was hit by a car in Palo Alto last night.' You're probably getting not just that the victim was somebody who jogs, but that he was actually jogging at the time. The sentence doesn't entail that, and you can see that by comparing it with (1b), 'The farmer was hit by a car in Palo Alto last night.' There it's far less inevitable that you get the inference that the victim was farming at the time; in fact, if you know anything about how farms work, it's pretty unlikely, even though it could be that the car veered off the road and went so far into the field that it hit the farmer on his tractor. You're probably not getting it; it's not that you couldn't draw that inference, but a case like (1b) would cause one to ask why you're getting it, in a way that (1a) doesn't.
And it's not limited to the choice of nominal; you get it with adjectives as well. Take (2a), 'The drug-addled undergrad fell off the cliff.' You're probably getting not only that the victim fell off the cliff and was on drugs, but that he fell off the cliff because he was on drugs. But if you compare that with (2b), 'The well-liked undergrad fell off the cliff,' you're probably not thinking: why would being well liked cause somebody to fall off a cliff? And in (2c), 'The normally risk-averse undergrad fell off the cliff,' you're probably getting a kind of contrary-to-expectation inference: you're wondering why somebody who is risk-averse would find themselves in such a predicament.
Finally, you get it with relative clauses in referring expressions as well. Take (3a), 'The company fired the manager who was embezzling money.' Again, you're probably getting not merely that they were embezzling money and that they were fired, but that they were fired because they were embezzling money. You can compare that with (3b), 'The company fired the manager who was hired in 2002,' which doesn't send you off on a search for why being hired in 2002 would cause one to be fired. And (3c) is another case of the violated-expectation kind of inference.

If you think about a dialogue system, it would be perfectly natural to respond to (3b) by saying 'Why?' But it would be a little odd to respond that way to (3a), because the speaker was trying to convey the reason for the firing; if you ask why, you haven't picked up on the inference that the speaker intended to get across.
So, for want of an appropriate term, I'm going to brand these as conversational elicitures. The term is meant to play on the other terms in pragmatics (implicature, explicature, impliciture, and so forth), which we'll be talking about in a moment, and to get at the idea that what you have is a speaker who is choosing her referring expressions among alternatives so as to trigger inferences on the part of her hearer that wouldn't otherwise be drawn.
So here's what I want to do in this talk. First, I'm going to dig in a little bit on the linguistics and philosophy side (so bear with me on that) and talk about why this is a new type of enrichment in the literature, and about what aspects of people's cognitive apparatus speakers are taking advantage of in being able to communicate this extra content. That's largely joint work with Jonathan Cohen in the philosophy department at UCSD. Then I'm going to go experimental, with joint work with Hannah Rohde at the University of Edinburgh, and talk about how elicitures are not just important for getting all of the content out of the message, but also actually impact the interpretation of language in unexpected places, in this case illustrated with pronoun interpretation. And then I will conclude with some slides on the ramifications of the model that we'll build for computational work in the area.
So if you know Gricean pragmatics, you probably reacted to these examples by thinking: that sounds familiar, those sound like they could be cases of Gricean implicature. I assume a lot of you know what implicature is, so I won't go into detail, but the important thing is that, according to Grice, implicature results from assumptions of rationality and cooperativity among interlocutors. He cashed that out in terms of four maxims. I won't read them all, but we'll be most interested in the first Quantity maxim, which says to make your contribution as informative as is required; the Manner maxim, which says to be brief and avoid unnecessary prolixity; and finally the Relation maxim, which says to be relevant.
The important thing I want to focus on is that implicature is a failure-driven process, meaning the hearer encounters a problem and draws an implicature to fix it. Basically, what happens is that the speaker says something that has a literal meaning, call it P, and the hearer thinks: gee, if she really just meant P, she wouldn't be being very cooperative. But rather than conclude that, if I can identify some additional proposition Q that I can assume she's trying to convey, then she becomes cooperative again. And so I take it that, in fact, she intended that I do this whole calculation and draw the inference Q in addition to the content P.

So, to illustrate: we're going to be talking about referring expressions in this talk,
and Grice was the first to note that the choice of referring expression can in some cases have the hallmarks of implicature. His rather dated example was 'X is meeting a woman this evening,' which would normally implicate that the woman being mentioned is not X's wife, sister, or mother, even though those are all women. The idea is that if the speaker had been talking about X's wife, she would have said 'wife,' in accordance with the maxim of Quantity: give as much information as is required. Since the speaker didn't do that, we draw the inference that, in fact, the space of possibilities for the referent of 'a woman' doesn't include those other salient possible referents that would be denoted by terms like 'his wife' or 'his sister.'
So implicatures can be diagnosed with standard tests. Basically, when you have implicated content, you can do a few things with it. You can actually assert it and put it on the record; that's reinforcement: you can say 'X is meeting a woman this evening,' where the implicatum is 'not his wife,' and then actually add 'but not his wife,' and that doesn't have a strong sense of redundancy. You can cancel it: 'in fact, she is his wife.' Or you can get on the record that you don't know the truth status of the implicated content, as in 'in fact, that's possibly his wife'; that's suspension. Well, our eliciture examples satisfy these tests as well. You can say 'The company fired the manager who was embezzling money; in fact, that's why he got fired': a reinforcement. 'But that's not why he got fired': cancellation. And 'that may be why he got fired': suspension.
So are elicitures just implicatures? There is one person who I think has really given a serious pragmatic analysis of examples of the general character that I'm talking about here. I took this example from the first Hillary Clinton versus Donald Trump presidential debate in the US. Lester Holt of NBC was the moderator, and as he was getting ready to ask Trump a question,
he did not say (7a): 'Mr. Trump, for five years you perpetuated a false claim that Barack Obama was not a natural-born citizen.' That is not what he said. What he said instead was (7b): 'Mr. Trump, for five years you perpetuated a false claim that the nation's first Black president was not a natural-born citizen.' Those two sentences are extensionally equivalent: they differ only in the overt referring expressions, which denote the same individual. But (7b) goes beyond (7a) in giving rise to the idea that there could be some kind of causal relation between Trump hassling Obama about this and Obama's status as the nation's first Black president. And unfortunately, nothing has happened since then to make us stop worrying about rampant racism.
And if we compare that with (7c), which is the same except that the referring expression picks out Obama by way of some other, unrelated distinction, things get a little confusing: you're left wondering why the speaker used that referring expression, even though it does successfully pick out Obama.
So the Gricean analysis would go like this: you see that these referring expressions are longer and more descriptive than they need to be, so they violate the brevity submaxim of Manner and the maxim of Quantity. And basically what you do, as happens with some kinds of implicature, is rescue the utterance by way of another maxim, in this case Relation: you find a relevance relationship that justifies the use of the more prolix, more informative referring expression. There's a lot of technical detail here that I'm just going to gloss over.
Okay, so have I made the case so far that elicitures are a species of implicature? In general, these cases do not pattern with implicatures. Could they be triggered by the maxim of Manner? Not really. Prolixity isn't the issue, and prolixity isn't required. In (8a), 'John fired the employee who was always late,' you get the eliciture. In (8b), 'John fired the employee who has red hair,' we generally don't; the relative clause is just picking out one salient employee, and there's no meaningful difference in prolixity between those two referring expressions. And (8c), 'John fired the employee who has red hair and wears glasses,' is more prolix, but you still don't get a causal inference. What the maxim of Manner does tell us is that (8c) might be odd in a situation where the extra words aren't needed (if there's only one employee with red hair, why go on about the red hair and the glasses?), but that's orthogonal to the existence of a causal inference of the kind we get in (8a).
Another reason for doubting that the maxim of Manner is what's relevant here is that these examples lack the canonical hallmark of Manner-driven implicatures: what Larry Horn calls the division of pragmatic labor. If we compare 'John killed Bill' with 'John caused Bill to die,' those essentially have the same denotation, but you get a division in which the shorter version tends to describe the more typical situation and the longer version the more atypical one. So when I say 'John caused Bill to die,' you'd probably be surprised to learn that John just went up and shot Bill; you get the sense that there may have been indirect causation, or an accidental killing, or something like that, precisely because, if John had shot Bill, the speaker probably would have just said 'John killed Bill.' In our cases you don't have this: you're just talking about competing referring expressions that all denote the same referent, and there is no such characteristic division of the denotational space.
So what about the maxim of Relation? You might be thinking that these are just a kind of relevance implicature, but that doesn't really work either, because restrictive relative clauses, which constrain the denotation of the NP to which they attach, are by definition relevant. So it's perfectly fine to say 'the company fired the manager who was hired in 2002'; that relative clause is fine even though it doesn't give rise to any kind of causal inference. So Relation by itself doesn't explain why you go beyond that and draw a causal inference in a case like (10a). Really, the intuition is that these inferences are not triggered by a Gricean maxim violation; it's our ordinary machinery for recognizing relevance that gives rise to the inference. By the time you could even think of it in terms of triggering the maxim of Relation, you've already identified the relevance relation. It's a more automatic process.
There are a number of other types of pragmatic enrichment that have been discussed in the literature; I'll go through this fairly quickly. From Grice you get a pretty simple picture: hearers interpret sentences, doing a little work on the literal content in terms of fixing reference, interpreting indexicals and tense, and resolving ambiguity, and then everything else is left to implicature. Other researchers have argued that there are additional types of enrichment that go beyond what is literally said but that we wouldn't want to call implicatures. That's where Bach's impliciture comes in, and part of what constitutes an explicature in Relevance Theory. These are cases like (11a), 'The lemonade isn't strong enough': we can't even assign a truth value to that unless we know what it's supposed to be strong enough for. That's called a completion. And there are other cases, like (11b), 'I haven't had breakfast,' which doesn't usually mean 'ever'; it just means 'today.' You can compare that to a sentence like 'I haven't had sex,' which usually means 'ever' and not just 'today', unless, of course, you live in a society where people typically have sex every morning but very rarely have breakfast, in which case presumably the judgments would flip.
There's a lot to be said about each of these, but the crucial thing is that they all constitute developments (expansions or completions) of the logical form of a single utterance, and again they are failure-driven: either the sentence isn't even complete enough to assign a truth value, or it is complete but represents something the speaker wouldn't plausibly want to say, as in the breakfast example, so you have to narrow its denotation. Elicitures don't have that characteristic at all: the sentences are perfectly well formed without the inferences in question, and the inferences aren't triggered by any risk of communicative failure. And they involve an inference that is not the completion of a logical form; it's an additional inference, an additional proposition. 'The company fired the employee who was always late,' and then there's another proposition: the lateness was the cause of the firing. There's a lot more that could be said with respect to other types of enrichment, but I won't; I think you get the picture. So then the question is: where do these elicitures come from?
I'm going to argue that they come from a part of our cognitive apparatus that many of you in this audience will be familiar with, less so for some of the other audiences I've presented this to. It's basically the same machinery that we use to establish that our world is coherent. It's well known that when we interpret our world, we go well beyond what our perceptions give us. If we're working at Walmart or someplace like that, and you see the chronically tardy employee show up late for work, and then a few minutes later you witness him getting fired, you'll probably draw the inference that there's a causal relation between the two. It's defeasible, you could be wrong, but you draw these kinds of inferences anyway. Whereas if you see that tardy employee coming in late again, and a couple of minutes later a customer asks him where the automotive department is, you don't draw a causal relation between those two; they're just two events that happened, and the world is perfectly coherent otherwise.
So we make these kinds of enrichments when we're interpreting a situation. And since we interpret our world that way, it only makes sense that we would make similar kinds of inferences when we understand natural language descriptions of the world. That's why, when you read 'the boss fired the employee who came in late again,' you might draw this inference, and when you read 'a customer asked the employee who came in late again where the automotive department is,' you don't draw a causal inference. So in many ways these inferences are of the most pedestrian sort: they're just the kind of inferences we draw to establish the coherence of our environment. And, as I'll argue, that's a very different kind of process from the more failure-driven processes that underlie the other kinds of pragmatic enrichment.
So what are these cognitive principles? They'll be familiar to a lot of you: they're the same kinds of principles that underlie the establishment of coherence in discourse, between sentences. In (17a), 'The boss fired the employee who came in late again,' it's essentially the same kind of inference that you would draw to establish an Explanation coherence relation for (17b), 'The boss fired the employee. He came in late again,' where you typically infer a causal relation. We've also seen Violated Expectation relations: 'The company fired the manager who had a long history of corporate awards' gives the same inference you get if you break it up between sentences, 'The company fired the manager. He had a long history of corporate awards.'
We've also seen cases of non-causal, more enablement-like relations, as with 'The employee who went to the store bought a bottle of scotch for the office party.' If somebody said that to you, and somebody later asked where the employee got the scotch, you'd probably say 'at the store.' But notice the sentence never says that; it's just an inference you draw to connect the going to the store and the buying of the scotch, just as you would across clauses: 'The employee went to the store. She bought a bottle of scotch for the office party.'
The crucial difference, however, is that when you're establishing coherence between sentences, that's a failure-driven process: language mandates that when you have sentences within the same discourse segment, you have to find some kind of coherence relation between them, or else you'll be left unsatisfied. A discourse like (20b), 'The employee broke his leg. He likes plums,' will probably strike you as an incoherent discourse. You don't get to say, 'I just wanted to tell you two things about the employee, great, moving on.'
Now, you might object and say, 'Wait a second, I think that could be coherent: maybe the employee happened upon a plum tree, tried to climb it to get a plum, and that's how he broke his leg.' But as Hobbs pointed out many years ago, that very objection shows you that you, as the interpreter having to make sense of this, want to search for coherence between the utterances, and you're willing to accommodate a certain amount of context to do it. That's totally different from (20a): 'The employee who likes plums broke his leg' does not send you off on a search for coherence. It just says the employee broke his leg, and 'who likes plums' tells you which one among many others. So it's the same kind of machinery, but (20a) is, so to speak, failure-free: nothing in the sentence explicitly tells you that you have to search for coherence in the way that (20b) does.
So really, what's happening here is just like other kinds of pragmatic enrichment: the speaker is taking advantage of some aspect of her hearer's cognitive apparatus in constructing her utterances. In the case of implicature, again, it's reasoning about rationality and cooperativity: if I'm assigning grades and somebody asks me about the grades in my class, and I say 'some students will get an A,' I'm not being cooperative if it turns out that I gave every student an A, even though 'some students get an A' is literally true when all of them do. You have cases like indirect speech acts, which are all over dialogue, where you have to reason about the plan-based goals of the interlocutors, their beliefs, desires, and intentions, and all that kind of thing. This is the same kind of thing, except that the aspect of the hearer's cognitive apparatus being taken advantage of is a more basic, associative kind of reasoning, the kind that extracts coherence from a temporally extended sequence of eventualities. So basically, we have this machinery for understanding coherence in our world; we use it for understanding coherence across utterances in dialogue and discourse; and then the speaker takes advantage of it in choosing referring expressions within a sentence, to give rise to these inferences even though they're not mandated by anything explicit in the utterance.
So I think elicitures pose a particularly difficult challenge when you're building computational systems, precisely because of this. When we build systems, we think in terms of triggers for interpretation processes: we see an utterance and we have to interpret it; we see a pronoun and we have to search for a referent; we see multiple sentences and we have to find the coherence relations. In the case of elicitures, there's just nothing there saying, 'hey, you have to search for every possible causal relation that could hold between the content of any two constituents.' It's something that arises automatically when you have the cognitive apparatus that we have.

So hopefully, at this point, I've convinced you that elicitures are an important part of extracting the full meaning out of utterances. Now let me switch into experimental mode, with the joint work with Hannah Rohde, and argue that elicitures are an important part of tracking discourse meaning and can ultimately affect the interpretation of downstream linguistic forms. I'm going to make that case with respect to a particular problem: pronoun interpretation.
I think it's safe to say that there has been a common wisdom in the reference literature for decades, which is that there's a unified notion of entity salience, or prominence, that mediates between pronoun production and interpretation: speakers use pronouns to refer to salient referents, and hearers use that same salience to interpret the pronouns. They're mirror images of each other; how could it be any other way? And then the task for the discourse theorist is just to identify the different contributors to entity salience; I've put a very partial list of them on the slide.
In the next fifteen minutes or so, I'm going to try to disabuse you of this idea. The experiments I'm going to describe all involve implicit causality contexts, so let me take a moment to tell you what those are. These are verbs that are very well studied in the psychology literature, and they're said to impute causality to one of their two arguments, such that the imputed causality then affects downstream referential biases. So if you run a little experiment in your lab, or on Mechanical Turk, and ask people to complete the sentence 'Amanda amazes Brittney because she...', and then have three annotators tell you who 'she' refers to in each completion, I can tell you what's going to happen: by and large, the vast majority are going to write something about Amanda; we get to find out why Amanda is amazing. Those are subject-biased implicit causality verbs. Compare that to the second case, 'Amanda detests Brittney because she...': now we're going to hear about Brittney; we get to find out what makes Brittney so detestable. Those are object-biased implicit causality verbs.
Now, a couple of things are worth mentioning here. The experiments in the psycholinguistics literature usually include the 'because,' and of course the 'because' indicates a particular type of coherence relation, an Explanation relation: you're going to hear a cause or a reason in what follows, and that's really what these strong biases are tied to. If you ran a study with just 'Amanda amazes Brittney.' and let people write the next sentence, a couple of things would happen. One is that you'd still get the biases, but they wouldn't be as strong, because you're going to get other coherence relations besides Explanation, and you don't get the same biases if somebody tells you what happened next, or something like that. But the other interesting thing that happens, as Rohde and I showed years ago, is that you get many more Explanation relations in an implicit causality context than for other kinds of verbs. That should make some sense: if I say 'Amanda detests Brittney,' what are you thinking? Why? You want to know why; a reason is what would tell you what's going on between Amanda and Brittney. You're not thinking, 'wow, I need to know what happened next.' So these verbs generate a greater expectation that you're going to get a cause or reason in an IC context. And I'm foreshadowing: that's going to become important a couple of slides from now.
so
to give some background there was this study is very influential
in my thinking by rosemary stevenson and colleagues and nineteen ninety four
where they did set task completion studies vary across a different context types including the
two implicit
and they compared
what happens if you give people a pronoun prompt verses no problem
so in the first case you get my pronoun it's ambiguous between the two then
participants and you see how they assign to run
in the in the three prompt condition you find out to things
you find out who they mention next
and
what form a reference to they choose
do they use a pronoun where they use any
They found two really interesting facts. One is that when you give people the pronoun, you always get more references to the previous subject than when you don't, across all context types. Now, the overall bias might not be to the subject; it might be to the object in an object-biased implicit causality context. But you still get more references to the subject than when you let them choose the referring expression themselves. The second thing is that, again across all context types, there's a strong production tendency: when they refer to the previous subject, they strongly prefer to use a pronoun, and when they refer to the previous non-subject, they prefer to repeat the name.
So that left people puzzled for a little while. If people clearly have this production bias, pronominalize the previous subject and don't pronominalize the previous object, why would you ever get an object-biased interpretation, as you find in object-biased implicit causality contexts? It turns out that's actually not paradoxical at all, once you cast the relationship between interpretation and production in terms of Bayes' rule.
The term on the left is the interpretation problem: the interpreter sees the pronoun and has to figure out who the referent is. The first term in the numerator is the production bias: the speaker knows who she wants to refer to and has to decide whether to use a pronoun or not. Bayes' rule tells us that these two are mirror images of each other. But there's another term in the numerator: the prior, the prior probability that a particular referent is going to get mentioned next, regardless of the linguistic form the speaker chooses to do it with.
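To make the equation being described here concrete, this is a sketch of the standard Bayesian formulation (the notation on the slide may differ slightly):

\[
P(\textit{referent} \mid \textit{pronoun}) \;=\; \frac{P(\textit{pronoun} \mid \textit{referent})\; P(\textit{referent})}{\sum_{r \in \textit{referents}} P(\textit{pronoun} \mid r)\; P(r)}
\]

The left-hand side is the interpretation bias, the first factor in the numerator is the production bias, and P(referent) is the next-mention prior.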
So there's nothing paradoxical about having a production bias that says pronominalize the subject much more than the object, and an interpretation bias that goes to the object, as long as the prior probability of who's going to get mentioned next is weighted strongly enough towards the object, as it is in object-biased implicit causality contexts.
Now, the theory comes in two forms, a weak form and a strong form. The weak form just says that we expect interpretation and production to be related by Bayesian principles. But we posited the stronger form, because all of the evidence we had seen at the time pointed to the conclusion that the types of contextual factors that condition the two terms in the numerator are very different. All the semantic and pragmatic stuff (semantics like verbal implicit causality, pragmatics like coherence relations) seems to affect not pronoun interpretation directly, but the prior: those factors push around your expectations about who is going to get mentioned next. The production bias seems much more basic, based on things like grammatical role, or probably, more accurately, information structure: what's the topic. Pronouns work a lot like Centering Theory basically says: 'hey, I was talking about this before, and I'm still talking about it.' Now, you can see that this makes an extremely counterintuitive prediction, which is that the speaker, in deciding whether to use a pronoun, is ignoring a rich set of semantic and pragmatic biases (the ones conditioning the prior) that the interpreter is nonetheless going to bring to bear in interpreting that pronoun. That seems very odd. But despite its oddness, a number of experiments have provided evidence that this is in fact the case.
So here is an experiment from Hannah Rohde's thesis. It's a three-by-two design: a three-way verb-type comparison (subject-biased implicit causality verbs, object-biased implicit causality verbs, and non-IC verbs) crossed with the prompt manipulation, free prompt versus pronoun prompt. The prediction is that verb type should affect the prior, and we expect the effect on the prior to cascade so as to affect interpretation; but verb type should not affect production. Again, in the free-prompt condition we get to measure two things: we see who they mention next, which is our measurement of the prior, and we see what form of reference they choose, whether they use a pronoun, which gives us the production bias. And then, in the pronoun-prompt condition, we get direct access to their interpretations: given the pronoun, how do they interpret it?
Okay. So we're predicting an effect of verb type on both the prior and pronoun interpretation, and that's exactly what we see: you get the most subject references in the subject-biased IC condition, the fewest in the object-biased IC condition, and the non-IC verbs fall somewhere in between. And you see that the light blue bars, the pronoun-prompt data, are always a little higher than the prior, the dark blue bars. That's the production bias coming in through the production term, tilting everything towards the subject relative to the baseline set by the prior. So that works out. Now, did verb type affect production, that is, whether speakers used pronouns versus names? The answer is no, not at all. The only thing that mattered was grammatical role: lots of pronouns for subjects, not a whole lot for objects. To put a fine point on it: people were no more likely to use a pronoun to refer to the direct object in an object-biased implicit causality context than in a subject-biased one, and no more likely to use a pronoun to refer to the subject in a subject-biased context than in an object-biased one. There is a dissociation between production biases and interpretation biases.
So now let me take the last two parts of the talk and bring them together in one new little experiment. It's a two-by-two: we have the prompt manipulation as before, and we have a manipulation that involves elicitures. You compare 'The boss fired the employee who was hired in 2002' with 'The boss fired the employee who was embezzling money.' Now, most theories of pronoun interpretation, pretty much all of them, I think, don't predict any difference in pronoun biases between those two cases: same subject, same verb, same object; the relative clause is a little different, but it doesn't introduce any new referents, so who cares? But our analysis, the Bayesian analysis, does predict a difference, based on an interconnected chain of referential and coherence-driven dependencies.
gives a crucial slide
what are we expecting that
when you have
the when you have
you know at in the literature
in the relative clock so we call that you split at all
or three condition
right the relative also gives you an explanation
versus the control condition when it doesn't
i told to first that
when you have a these are all gonna be uttered by simplistic causality verbs when
you have an icy context
you're really expecting an explanation to come
we exploit the lot of a
exhalation coherence relations exact
in the explanation or c condition
we are defined explanation
it was in the relative cost
so we predict that you're gonna get fewer explanation coherence relations
after those cases then in the control condition
why give an explanation when the proper already have one
That, in turn, should affect the prior, the next-mention bias. These are object-biased IC verbs: if we have a lot of Explanation relations, we expect a lot of object references; but if you have fewer Explanation relations in the Explanation-RC condition, then you're going to get fewer object mentions, because the object bias is tied to there being an Explanation relation. So we expect an effect on the prior. We also expect an effect of the production bias, as we've seen before: we expect people to produce more pronouns to refer to subjects than to objects. And when you put those two together, at the bottom, both terms, the prior and the likelihood term, should affect interpretation: fewer references to the object, that is, more to the subject, in the Explanation-RC condition, and also in the pronoun-prompt condition as compared to the free-prompt condition.
The crucial thing about this slide is this: it's a little graphical model of the influences on pronoun interpretation, and all the interesting stuff is on the right-hand side, all the stuff that's completely independent of pronominalization. Everything on the right is about predicting the message: who is going to get mentioned next. The most boring part of the slide is the part over here where the pronoun comes into play. Notice that the right-hand part of the apparatus has no way to affect interpretation directly, only indirectly.
So, first prediction: do we get fewer explanations when the relative clause already gives you one? Yes. People still produce some explanations, but not as many as in the control condition: people want to explain why the person hired in 2002 got fired more than they want to explain why the person who was embezzling money got fired. Does that affect the next-mention bias? Yes: as we expected, you get more mentions of the direct object in the control condition than in the Explanation-RC condition. Did the existence of a causal eliciture in the relative clause affect production? Not at all; it's the same pattern we've seen before, and all that mattered was grammatical role. And when you put those two things together, you get the expected interpretation pattern: the existence of the eliciture pushes around the prior (on this slide those are the light blue bars, and I've plotted object references here), and when you give people a pronoun prompt, those bars go down, because the production bias kicks in and biases everything towards subject reference, so you get fewer object references when you give them a pronoun.
Okay. So this idea that production and interpretation are mirror images of each other is clearly not right. And something as subtle as the existence of an eliciture, way up at the top of that chain, cascades to affect several other things and ultimately, down at the bottom, tweaks your biases for how you interpret a pronoun. Quickly, we can also do a little model comparison. Passage completion studies don't rate very highly on the sex-appeal meter in psycholinguistics, but I like doing them because they give us actual fine-grained numerical measurements of biases, and so we can use them to compare different models.
and so we can use that to compare different models so again what we can
do
we can estimate interpretation by using our free prompt condition
we get can measure
really mentioned next that gives us the prior we get to see whether they use
the pronoun are not that gives us the production bias we can plug them into
this equation get interpretation by s
then we can compare that with the actual interpretation by s
there we
c in the pronoun prompt condition
right so we're estimating
we coming up and the estimated bias from the free from condition using this formula
in comparing it to the actual one we find in the pronoun condition
We can compare that with two kinds of competing models that are out there. One is what I've been calling the mirror model: which referent was the speaker most likely to use a pronoun to refer to? We can calculate that by taking the production bias and normalizing it. I've written it this way just to point out that it's essentially Bayes' rule, but without the prior. The other model is Jennifer Arnold's expectancy model. She said: look, what's happening is that you're generating expectations about who is going to get mentioned next, and if you see a pronoun and it matches in gender and number, that tells you, hey, it's the thing you were expecting to get mentioned next. That's essentially just the prior. Now, the prior is already a probability distribution, so P(referent) would have sufficed, but I've written it this way to show you that it's basically Bayes' rule without the production bias.
when you compare the numbers basically the bayesian model when so these in the actual
column or the actual numbers we get
for article percentage of object references
in the problem i'm condition
and then you see three sets of numbers
four where we plug in the frequencies that we get in the free problem condition
into those different equations and you see
the bayesian members of predictions are actually pretty close and have a higher degree of
correlation
we expect all
the other models to have some correlation because
as i just showed you
essentially those models of being combined in the bayesian model but it's the combination of
the two that
that makes the best predictions
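As a rough illustration of the procedure described here (not the actual analysis code or the experiments' data), the following sketch plugs hypothetical free-prompt counts into the three formulas to get predicted interpretation biases:

```python
# A minimal sketch, with hypothetical numbers, of how free-prompt completion
# counts could be turned into predicted interpretation biases under the three
# models discussed in the talk. Not the authors' code or data.

def interpretation_biases(next_mention_counts, p_pronoun_given_ref):
    """next_mention_counts: referent -> count of next mentions (free prompt).
    p_pronoun_given_ref: referent -> proportion of those mentions realized
    as a pronoun (the production bias)."""
    total = sum(next_mention_counts.values())
    prior = {r: c / total for r, c in next_mention_counts.items()}

    # Bayesian model: production bias times prior, renormalized.
    joint = {r: p_pronoun_given_ref[r] * prior[r] for r in prior}
    bayes = {r: v / sum(joint.values()) for r, v in joint.items()}

    # Mirror model: production bias alone, renormalized (no prior).
    mirror = {r: p_pronoun_given_ref[r] / sum(p_pronoun_given_ref.values())
              for r in p_pronoun_given_ref}

    # Expectancy model: just the prior (after gender/number filtering).
    return {"bayes": bayes, "mirror": mirror, "expectancy": prior}


if __name__ == "__main__":
    # Hypothetical object-biased IC context: the object is strongly favored
    # for next mention, but subjects are pronominalized far more often.
    counts = {"subject": 20, "object": 80}
    production = {"subject": 0.8, "object": 0.3}
    for name, model in interpretation_biases(counts, production).items():
        print(name, {r: round(p, 2) for r, p in model.items()})
```

With these made-up numbers, the Bayesian prediction lands between the other two (about 40/60 subject/object), which is the qualitative pattern described above: the pronoun pulls an 80% object prior back toward the subject without erasing it.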
So, to summarize this part of the talk: we see that pronoun interpretation is sensitive to a very subtle, coherence-driven factor where production isn't, which is counterintuitive, but it's exactly the dissociation that the Bayesian model would predict. So, contrary to the common wisdom, there is no unified notion of salience that sits between production and interpretation. There has always been this problem in the pronoun interpretation literature: somewhere in the first paragraph of a paper you read that pronouns refer to salient referents. You ask, okay, well, what are the contributors to salience? Well, go look at a corpus and see what kinds of entities pronouns refer to. But that's completely circular (pronouns refer to the kinds of entities that pronouns refer to), so it doesn't have any meaning. Your notion of salience has to be derived from something that is independent of the choice of referential form, which is exactly what we're trying to predict. So for me, the element closest to a notion of salience here is the next-mention bias, the prior. That's the best measurement we have for salience: who you're expecting to get mentioned next. But as we've seen, pronoun production biases don't align directly with that notion of salience.
Okay. So let me conclude with a few quick slides, because I think there are some lessons for computational work here, ideas that I've wanted to follow up on for a long time but I can never get a student interested enough, so I hope somebody here does instead. I think it's safe to say that if you look at computational work on reference over the last number of years, we've made a lot more progress on the machine learning and modeling side than on the feature engineering side: many new machine learning methods, not a lot in the way of new linguistic features. People still basically use the same three dozen or so features: gender, number, distance, maybe a little grammatical role information, that kind of thing. And for good reason: because we're training these systems in supervised mode, and you can't ask people to annotate more than two or three thousand pronouns, you can never ask questions in your features like 'is this an object-biased implicit causality verb?' You'd never have enough data to do something like that.
Well, the Bayesian model tells you that you don't need that annotated data, because the prior doesn't care about pronouns: all the semantic and pragmatic stuff conditions the prior, and the prior is something you can calculate using coreference, coreference in general, not just pronouns. You can go into data and have your system find cases of coreference that it's really sure about (repeated proper names, definite descriptions with substantial lexical overlap with their antecedents) and pretend that a human went in and said 'that's coreference.' You could get millions of examples like that out of a corpus, and then have a model with very fine-grained features; you might have a hundred thousand or two hundred thousand lexical features in there, implicit-causality-type verbs among them, and be able to model that and get some predictive power out of it. All you need annotated data for is the pronoun-specific part, the production bias, and a couple of thousand pronouns is going to be plenty to learn that people pronominalize subjects the most, and then less and less as you move down the obliqueness hierarchy. It was not at all obvious before that you could take the factors that you learn for coreference in general, using only those high-probability cases of coreference, and then port them directly onto the pronoun interpretation problem.
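Here is a hedged sketch of that bootstrapping idea. Everything in it (the heuristic, the function names, the toy data) is hypothetical; it is just meant to make the proposal concrete: harvest next-mention statistics from high-confidence coreference links in unannotated text, and reserve the small annotated pronoun set for the production bias alone.

```python
# Hypothetical sketch of the proposal above; not an existing system.
from collections import Counter, defaultdict

def high_confidence_link(mention, antecedent):
    """Toy heuristic for 'sure' coreference: a repeated proper name, or a
    definite description with substantial lexical overlap with its antecedent."""
    if mention == antecedent and mention[0].isupper():
        return True
    m, a = set(mention.lower().split()), set(antecedent.lower().split())
    overlap = len(m & a) / max(len(m), 1)
    return mention.lower().startswith("the ") and overlap >= 0.5

def estimate_next_mention_priors(harvested):
    """harvested: (verb, grammatical_role_of_next_mention) pairs collected
    wherever high_confidence_link fires. Returns a lexicalized prior:
    per-verb P(next mention = subject). No pronoun annotation needed."""
    counts = defaultdict(Counter)
    for verb, role in harvested:
        counts[verb][role] += 1
    return {v: c["subject"] / sum(c.values()) for v, c in counts.items()}

if __name__ == "__main__":
    print(high_confidence_link("the embezzling manager", "the manager"))  # True
    toy = [("detests", "object")] * 7 + [("detests", "subject")] * 3 \
        + [("amazes", "subject")] * 8 + [("amazes", "object")] * 2
    print(estimate_next_mention_priors(toy))  # {'detests': 0.3, 'amazes': 0.8}
```

The production bias, that is, how often each grammatical role gets pronominalized, is the only piece that would still need the couple of thousand annotated pronouns mentioned above.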
The situation is entirely analogous to Bayesian models of other kinds of things: machine translation, or, in this case, speech recognition. If you're doing speech recognition with a Bayesian model, you could say: well, we could try to train a model that maps directly from the acoustic signal to words. But we don't do that, because then when somebody says 'to,' you have no idea whether they said 't-o,' 't-w-o,' or 't-o-o.' So instead we reverse it into a production model: given the words, what's the likelihood that the speaker produced that acoustic signal for that word? And then we can plug in a prior, a language model like an n-gram model, that can help tell us whether, in that context, it's 't-o,' 't-w-o,' or 't-o-o.' It's the same idea here: pronouns, just like ambiguous words, are underspecified signals; they place strong constraints on their interpretation, but you need context to fully resolve them.
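As a sketch, the analogy can be written side by side, with w a word string, a the acoustic signal, and r a candidate referent:

\[
\hat{w} \;=\; \arg\max_{w} P(a \mid w)\, P(w)
\qquad\Longleftrightarrow\qquad
\hat{r} \;=\; \arg\max_{r} P(\textit{pronoun} \mid r)\, P(r)
\]

In both cases the likelihood term is a production model and the prior is supplied by context: a language model on the left, the next-mention expectations on the right.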
So the idea is that an efficient language should allow speakers to take advantage of whatever aspects of their interlocutors' cognitive apparatus they can get their hands on. For implicature, that's cooperativity and rationality; for indirect speech acts, it's planning and the satisfaction of goals, and inferring the speaker's intentions; for elicitures, it's a more basic aspect of our cognitive abilities, the inferring of relations having to do with causality, contiguity, and the other basic associative principles.
When we build systems, it's easy to think of language interpretation as a reactive process; that's our overall scheme: I need to interpret, I see a pronoun, I need to do a search. Everything happens when you see the trigger. The Bayesian model, on the other hand, more directly captures what has become a more modern view of interpretation: not as a reactive process, but as one in which interpretation is what happens when your top-down, proactive expectations about the ensuing message come into contact with the bottom-up linguistic evidence provided by the utterances. And I think the case of elicitures really spells out the importance of doing that proactive modeling, of recognizing these kinds of inferences and having that discourse update occur, so that it's ready by the time particular linguistic forms, like pronouns, show up in the input. You don't want to wait until you see a pronoun and only then run around the context trying to figure out whether there's some eliciture you should have drawn. And I will stop there. Thanks.
Thanks, that was very inspiring, and I definitely agree with this kind of approach to these kinds of inferences, and the Bayesian stuff is great. I had a couple of questions. I guess you made a distinction between inter-sentential coherence relations and intra-sentential ones, and I don't think there's really a difference there. I think your sentences (20a) and (20b) weren't really parallel, and exactly the same kinds of coherence issues arise whether it's within one sentence or across two. So if (20a) is 'The employee who likes plums broke his leg,' that's fine; but similarly, 'The employee likes plums and broke his leg' is just as weird as (20b). So the thing is, I think...
Well, let me ask: are you commenting on my characterization of it as inter-sentential versus intra-sentential? I'd agree that's probably not a good term for what I want. Really it's something like intra-clausal versus inter-clausal, because the cases where you have an 'and' in there are still inter-sentential for me in the relevant sense; but once you start saying 'intra-clausal,' well, now relative clauses are clauses too, and everything gets murky. If you put a 'because' in there, or an 'and,' or something like that, I'd still treat those as the inter-clausal case: there, we need our coherence machinery to come along and do its work. With something like 'the boss fired the employee who was late because he likes plums,' you don't get to say, 'okay, there must be a causal relationship, it's been asserted, I'm happy.' No, you need to actually establish the causal relation; you're not happy until you do.
So the crucial point is that in (20a), although an inference does need to happen for the utterance to be felicitous, so that we can pick out which employee we're talking about, it doesn't trigger this search process. Now, you might say: isn't it that very search process at work? The reason for the relative clause in the first place is to identify a particular employee, and whether that identification counts as a coherence relation per se depends on your theory of coherence. But the crucial thing is that you're not off and running here in the way (20b) sends you off and running, trying to figure out how liking plums could relate, causally or otherwise, to breaking a leg; (20a), most of the time, doesn't do that. On the normal reading of the relative clause there's no causal relation there, and that doesn't mean we're confused by any of those utterances. So the question for a theory of pragmatics is: when there is such an inference, why would you ever draw it? And that's what's problematic for just about every type of enrichment out there; the triggers that these different accounts posit (implicature, impliciture, explicature in Relevance Theory, and there are others too, like work on local pragmatic strengthening and things like that) are not the triggers needed to give rise to these inferences.
Let me play devil's advocate a little bit. In your last few slides, about the computational approaches, the suggestion is that we need a corpus, and then we estimate these probabilities for implicit causality verbs, the cases where we have the referring expressions and can actually see the biases. But suppose I just take my corpus of, say, blog stories, go through all of these verbs, and see who gets mentioned next. Maybe I don't need to look for implicit causality verbs at all; maybe I just end up with lexical next-mention probabilities for every verb. Would you be okay with that?
Yes, that's exactly right. If you had enough data, you could calculate a probability for every verb, or every verb-and-participant configuration, so there's no reason to restrict yourself. You know, implicit causality is a very weird kind of concept in that it's really a cover term for a set of verbs that tend to have strong biases of this sort; there is no deeper definition of what an implicit causality verb is. There are consistent subclasses: experiencer-stimulus verbs, for example, like 'annoy,' 'surprise,' and 'detest,' tend to be implicit causality verbs. But there are others, just verbs like 'hit' and things like that, that simply have strong biases and so get called implicit causality. So there's no reason, I think, if you're going to do modeling and you have enough data, to limit yourself to those kinds of verbs, because in fact all verbs have some kind of bias you might want to account for; it's just going to be more meaningful when it's a strong one, in one direction or the other.
And this hits on a real problem in psycholinguistics. In one of the very first experiments we ran, we had these so-called implicit causality verbs, twenty of them (now we have forty), and we calculated what the next-mention biases, the priors, were for them, and they landed at every spot between zero and one: from verbs that really should be considered implicit causality verbs, to ones so far towards the other end that there's no real causality involved, with ones in the middle at every point in between. So in the psycholinguistics literature on pronoun interpretation, people just make up a bunch of examples and say 'there's no pragmatic bias here.' There's no such thing as a sentence that doesn't have biases. If you want me to show you that there's a subject bias, I can pick my verbs and run a study with those twenty that shows it; if you want me to show that there's no such bias, I can run it with a different twenty.
Now, there I'm gaming it, because I know which verbs have which biases, but not based on pronoun interpretation studies, only on the next-mention biases. I can take a verb that has a next-mention prior of 80% to the object, run a pronoun interpretation study, and the pronoun will pull that 80% down towards 50%, and people will say, 'see, there's no bias.' That's exactly what happens with transfer-of-possession verbs: 'John handed the book to Bill. He...' will get you about fifty-fifty; but 'John handed the book to Bill,' without the 'he' there, is 85% to Bill. This is a huge problem in the literature, because everybody treats fifty-fifty between subject and object as the baseline. That's not the baseline. The baseline is the prior. And so there's always confusion: people say pronouns are subject-biased, except when they're not, like with transfer-of-possession verbs, where they come out around fifty-fifty, or when they go towards the object, as with object-biased implicit causality verbs. All of that is wrong. In every context, when you add a pronoun, it contributes a subject bias over the baseline of what the next-mention bias would have been. So it may look like a fifty-fifty bias, but it's actually a strong subject bias, because if you don't give them the pronoun, it's 85% to the object. Does that make sense?
So this is a long-winded way of saying yes. Not only do you want to capture these biases, because they're going to be important for your statistical model, for all verbs and contexts, if you want computational systems that work, but it's also really important for psycholinguistic work. We've been talking about this for a decade, and I still get papers to review every year that say 'here's my experiment' without having controlled for next-mention biases or for coherence relations; none of that stuff is in there. I'm sorry, I'm talking too much.
Just a quick one about whether the mirror image is really there. The way it's cast looks symmetrical, because the hearer's model uses both a production distribution and a next-mention distribution. But isn't one problem that the probability distribution over referents may be different for the hearer than for the speaker? The speaker has her own distribution, so the mirror image isn't quite real: the hearer has to estimate what the speaker's distributions are, and they may not match. Something like that.
Right, so there are two notions of asymmetry here. The one I was talking about is that production and interpretation are really based on different factors. I'm saying that even though the speaker could have a model of the hearer's prior, it isn't coming into her decision about whether to produce a pronoun; that's where the symmetry breaks down. Now, you're pointing out another asymmetry: the hearer doesn't have direct access to the speaker's production biases; he has to estimate her production biases and put that estimate into his interpretation equation, and that estimate could be off too. So when we have these mix-ups, where interlocutors aren't tracking each other's referents, it could be due to either of those asymmetries. It could be that I'm not tracking the discourse right from the speaker's perspective; or it could be that the speaker thinks the discourse is going in one direction while I take it in another, and she's producing pronouns based on her view, and I get messed up because I'm not tracking the prior right, while she's not even using the prior to produce her pronoun to begin with. So it could be either.
Okay, let's take the next question.