This is the work of my PhD student, who cannot currently leave the United States, so I'm presenting our work for her while she finishes her PhD; not a very good situation.

Alright.

So, the motivation for this work is that narrative structures occur across many different kinds of natural language genres: you see them in restaurant reviews, you see them in newspapers. This seems to be because the way humans organize the world is in terms of narrative structure, so this fits in very well with the talk this morning about how people are always looking for coherence; a lot of that coherence can be framed as narrative structure.

And we argue that narrative understanding requires modeling the goals of the protagonist and tracking the outcomes of those goals: whether the goals are being fulfilled or thwarted.

And first-person social media stories are actually full of these expressions of desires and outcome descriptions. For example, here's something from a blog site like LiveJournal, which is very similar to where most of our data comes from. Somebody's telling a story about something that happened at a concert: they dropped something, it was dark, they got out their cellphone to look for it, they spoke a little bit, but it was loud so they couldn't really talk. "I had hoped to ask him to join me for a drink or something, but he left before the end and I didn't see him after that; maybe I'll try missed connections."

So this sentence here, "I had hoped to ask him to join me for a drink or something," shows an expression of a first-person desire. One of the reasons we're interested in first-person stories is that we don't have to deal with coreference: it's quite easy to track the protagonist in a first-person narrative, so we can track the protagonist's goals, which makes the problem a bit more tractable.

So what we do is identify goal and desire expressions in first-person narratives, like that "had hoped to" in the previous example; we have a bunch more, and I'll tell you more about how we get them. Then what we aim to do is infer from the surrounding text whether or not the desire is fulfilled: we want to read the narrative and be able to predict whether the desire is fulfilled or not. In this particular case, the one I showed you, we have the phrase "but he left before the end and I didn't see him after that," which clearly indicates that the desire was unfulfilled.

And as I said, we have a corpus of about nine hundred thousand first-person stories from the blogs domain. These first-person narratives are just rife with desire expressions; you can get as many as you could possibly want out of this kind of data, and they take lots and lots of different forms, like "I wanted to", "I wished to", "I decided to", "I couldn't wait to", "I aimed to", "I arranged to", and "I needed to".

It's true that states can also express desires: if you say something like "I'm hungry," that implies you have a desire to get something to eat. Initially we had a goal that we might be able to do something with states too, but in this paper we decided to restrict ourselves to particular verbs and their past-tense expressions.

Okay, so for the related work: the first paper on this was around 2010, by Ellen Riloff and colleagues, who were trying to implement a computational model of Wendy Lehnert's plot units for story understanding. One of the main things you do in the plot units model is try to track and identify the affect states of the characters. The dataset they used was Aesop's fables, and they manually annotated the fables themselves to examine the different types of affect expressions in narratives. One of the things they claim in this paper (a really interesting paper, if you haven't read it) is that affect states are often not expressed explicitly, like "the character was sad" or "the character was happy"; rather, they are implicated, and you derive them by inference, by tracking the characters' goals. And they claimed in this paper that, even though it's a long-standing AI idea that what you want is to extract people's intentions and whether they're being realized or not, in natural language processing we need to do much more work on tracking goals and their outcomes.

There is also a recent paper by Chaturvedi et al., where she picks up on this idea of tracking expressions of desire and their outcomes. They did this on two corpora that are very different from ours: MCTest, which is a corpus from Microsoft of crowdsourced stories suitable for a machine reading task (the stories are supposed to be understandable by seven-year-olds, so you get desire expressions like "Johnny wanted to be on the baseball team. He went to the park and practised every day."), and passages from Wikipedia, where they tracked desires and you get things like "Lenin wanted to be buried in Moscow, but...". So their data is very different from ours, and when I first heard a presentation of that paper, I thought that our data, which we'd already been working on for several years for narrative understanding, is so much more suitable for and so primed for this particular task that we had to try it on our datasets.

so

so we made a new corpus which is publicly available you can download it from

our corpus page

we have three thousand five hundred it's really high quality corpus where i'm super excited

about it being able to do more stuff with that

with three thousand five hundred first person informal narratives with the annotations you can download

it

In this paper I'll talk about how we model the goals and desires, and about some of the classification models we've built (I don't know why my slides are rendering at the bottom of the screen). We do a feature analysis of which features are actually good for predicting the fulfilment outcome, and we look at the effect of both the prior context and the post context.

OK. We start with a subset of the Spinn3r corpus, which is a publicly available corpus of social media collected for one of the ICWSM challenges, and we restrict ourselves to the subset of the Spinn3r corpus that comes from traditional journalling sites like LiveJournal. You can get quite clean data by restricting yourself to things in the Spinn3r corpus that come from particular blog websites.

Should we use the PowerPoint? It works fine on my machine... I guess it's not going to work right now, so we can continue. Do you think the PDF would be better? No? You can see it okay? Alright.

So: we have a subset of the Spinn3r corpus. We have what we claim is a very systematic, linguistically motivated method to identify desire and goal statements. We collect the context before the goal statement and the context after, up to five utterances before and five utterances after. Then we have a Mechanical Turk task, where we put it out on Mechanical Turk and collect gold-standard labels for the fulfilment status. I actually also asked the Turkers to mark the spans of text that are evidence for fulfilment or non-fulfilment, but in this paper we don't do anything with the evidence.

Okay, so I kind of referred to this before: there are many different linguistic ways to express desires, and one of the things my colleague was struck by in the prior work was that it was limited in terms of the desire expressions it looked at; they just looked at "hoped to", "wished to", and "wanted to". I think that's probably motivated by the fact that the MCTest corpus is very simple, written for seven-year-olds and crowdsourced, so maybe it just didn't have very many expressions of desire in it. But our data is open-domain and very rich: we have complex sentences, complex temporal expressions, all kinds of really great stuff in there, and I really encourage you to have a look at the data.

So what we did was go through FrameNet and pick every frame we thought could possibly contain a verb or state that would express a desire. We made a big list of all of those, then looked at their frequency in the Gigaword corpus to see which ones are most frequent in the English language overall, not just in our data. We picked thirty-seven verbs, constructed their past-tense forms as patterns with regular expressions, and then ran those against our database of nine hundred thousand first-person stories, and we found six hundred thousand stories that contain verbal patterns of desire.
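To make that step concrete, here is a minimal sketch of what the past-tense pattern matching could look like. The verb list and the exact regular expressions are illustrative placeholders; the real set of thirty-seven verbs came from FrameNet frames filtered by Gigaword frequency.

```python
import re

# Illustrative subset of the desire verbs; the real list of 37 verbs was
# derived from FrameNet frames and filtered by Gigaword frequency.
DESIRE_VERBS = ["want", "hope", "wish", "decide", "need", "aim", "arrange"]

# Past-tense, first-person patterns such as "I wanted to" or "I had hoped to".
PATTERNS = [
    re.compile(rf"\bI\s+(?:had\s+)?{verb}e?d\s+to\b", re.IGNORECASE)
    for verb in DESIRE_VERBS
]

def find_desire_expressions(story: str):
    """Return every (matched pattern, character span) pair found in a story."""
    return [(m.group(0), m.span())
            for pattern in PATTERNS
            for m in pattern.finditer(story)]

story = ("It was dark and loud at the concert. "
         "I had hoped to ask him to join me for a drink or something, "
         "but he left before the end and I didn't see him after that.")
print(find_desire_expressions(story))   # -> [('I had hoped to', (37, 51))]
```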

So this is roughly what it looks like: we go five sentences before and five sentences after. The reason we take five sentences before is that there is a claim from oral narrative theory that, in the structure of a narrative, you often foreshadow something that's going to happen; so unlike the previous work, we take the prior context. We have the prior context, the desire expression, and the post context, and our goal is to use the context around the desire expression to try to predict whether the expressed desire is fulfilled.
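As a sketch of that windowing step (assuming sentence splitting has already been done; the function name is just illustrative):

```python
from typing import List, Tuple

def context_window(sentences: List[str], desire_idx: int,
                   k: int = 5) -> Tuple[List[str], str, List[str]]:
    """Split a story into (prior context, desire sentence, post context),
    taking up to k sentences on each side; windows are simply truncated
    at the story boundaries."""
    prior = sentences[max(0, desire_idx - k):desire_idx]
    post = sentences[desire_idx + 1:desire_idx + 1 + k]
    return prior, sentences[desire_idx], post

# Example: an 11-sentence story with the desire expression in the middle.
sents = [f"sentence {i}" for i in range(11)]
prior, desire, post = context_window(sents, desire_idx=5)
assert len(prior) == 5 and len(post) == 5
```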

We sampled from the corpus according to a skewed distribution that matches the whole original corpus, and we put three thousand six hundred eighty samples out for annotation, exhibiting sixteen verbal patterns. We showed the Mechanical Turkers which desire expression they were supposed to annotate, because sometimes a story might have more than one in it, so we showed them the one they were supposed to predict the fulfilment status for. We had three qualified workers per utterance.

this is really annoying

sorry

We asked them to label whether the desire expression was fulfilled and to mark the textual evidence.

We got good agreement on Mechanical Turk. We qualified the workers to make sure they could read English and would pay attention to the task; that's typically what we do when we have a task like this. We put it out to a lot of people, we see who does a good job, and then we go back to the people who have done a good job and say we'll give you this task exclusively, we pay them well, and then they go off and do it for however long it takes, like a week or two. And on the data we put out there, we got agreement on about seventy-five percent of the items, with sixty-seven percent of the desires labelled as fulfilled and the remainder labelled as unfulfilled or unknown from the context.

The next slide shows that the verbal pattern itself heralds the outcome: how you express the desire is often conditioned on whether the desire is actually fulfilled. If you look at "decided to", it kind of implicates that the desire is going to be fulfilled; if you use a word like "hoped to", it implicates that the desire is not going to be fulfilled; but something like "wanted to" or "needed to" is more around fifty-fifty. So there's a prior distribution that is associated with the selection of the verb form.
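As a small illustration, those per-pattern priors can be tabulated directly from the annotations. The field names below are assumptions about the release format, and the three items are dummy examples rather than real counts.

```python
from collections import Counter, defaultdict

# Dummy annotated items; the released corpus has roughly 3,500 of these.
annotations = [
    {"pattern": "decided to", "label": "fulfilled"},
    {"pattern": "hoped to", "label": "unfulfilled"},
    {"pattern": "wanted to", "label": "fulfilled"},
]

label_counts = defaultdict(Counter)
for item in annotations:
    label_counts[item["pattern"]][item["label"]] += 1

for pattern, counts in label_counts.items():
    total = sum(counts.values())
    print(f"{pattern}: P(fulfilled) = {counts['fulfilled'] / total:.2f} (n={total})")
```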

And so we have this dataset that, as I said, you can download. We think it's a really lovely testbed for modelling desires in personal narrative and their fulfilment: it's very open-domain, we have the prior and the post context, and we have pretty reliable annotation. So one of our contributions is just the corpus.

So next I'll talk about the experiments we did. We define feature sets motivated by narrative structure; some of these features were motivated by the previous work by Goyal and Riloff and by Chaturvedi et al. Then we ran different kinds of classification experiments to test whether we can actually predict whether a desire is fulfilled, and we also applied our models to Chaturvedi et al.'s data, which is also publicly available, so we can compare directly how our models work on their data and on our data. All of those datasets are publicly available.

Some of our features come directly from the desire expression. In this example, "Eventually I just decided to speak. I can't even remember what I said. People were very happy and proud of me for saying what I wanted to say," the first feature that's important is the desire verb itself, obviously: whether it's "decided to" or "hoped to" or "wanted to". Then there is what we call the focal word, which is the verb embedded underneath the desire expression; we pick that verb and stem it, so in this case it's "speak". We then look for other words in the context that are related to the focal word: we look for synonyms and antonyms of the focal word and count whether those occur. We also look for the desire subject and its mentions, all the different places where the desire subject, which in our case is always first person, gets mentioned.

We also have discourse features, having to do with whether discourse relations are explicitly stated, classified according to their occurrence in the Penn Discourse Treebank. There is an index in the Penn Discourse Treebank annotation manual that gives you all the surface forms classified under a particular class of discourse relation, so we just take those from there. We use two classes, violated expectation and met expectation, and we keep track of those.

And we have sentiment flow features, based on a connotation lexicon. We look across the passages in the story to see whether the sentiment changes: whether it starts positive and goes to negative, or starts negative and goes to positive. That's another feature we keep track of.
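Here is a minimal sketch of what extracting those feature types might look like, using WordNet for the synonym and antonym lookups. The discourse-marker lists are tiny illustrative subsets, not the full Penn Discourse Treebank inventories, and the sentiment-flow computation is deliberately crude.

```python
from nltk.corpus import wordnet as wn  # assumes the WordNet data has been downloaded

# Tiny illustrative marker sets; the full lists come from the PDTB annotation manual.
VIOLATED_EXPECTATION = {"but", "however", "although", "unfortunately"}
MEET_EXPECTATION = {"so", "therefore", "finally", "indeed"}

def focal_related_words(focal_verb: str):
    """Collect WordNet synonyms and antonyms of the (stemmed) focal verb."""
    synonyms, antonyms = set(), set()
    for synset in wn.synsets(focal_verb, pos=wn.VERB):
        for lemma in synset.lemmas():
            synonyms.add(lemma.name())
            antonyms.update(ant.name() for ant in lemma.antonyms())
    return synonyms, antonyms

def narrative_features(focal_verb: str, context_tokens, sentence_sentiments):
    """Count synonym/antonym and discourse-marker occurrences in the context,
    and record the direction of sentiment change across the passage."""
    synonyms, antonyms = focal_related_words(focal_verb)
    tokens = {t.lower() for t in context_tokens}
    if sentence_sentiments and sentence_sentiments[0] > 0 > sentence_sentiments[-1]:
        flow = "pos_to_neg"
    elif sentence_sentiments and sentence_sentiments[0] < 0 < sentence_sentiments[-1]:
        flow = "neg_to_pos"
    else:
        flow = "flat"
    return {
        "synonym_count": len(tokens & synonyms),
        "antonym_count": len(tokens & antonyms),
        "violated_expectation_markers": len(tokens & VIOLATED_EXPECTATION),
        "meet_expectation_markers": len(tokens & MEET_EXPECTATION),
        "sentiment_flow": flow,
    }
```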

So we have these four types of features motivated by narrative characteristics, and the paper goes into detail about the ablation experiments we do to test which kinds of features matter. We use a neural network architecture, a sequential architecture, to do this (I'm running out of time), and we also compare to plain logistic regression on the data. We have two different approaches for generating the sentence embeddings: pre-trained skip-thought models, where we concatenate the features with the sentence embeddings and use that as the input representation, and a convolutional neural net. On top of that we use a recurrent neural net.

It's a three-layer architecture, and what we do is sequentially go through the prior context and the post context. We also did experiments where we tell the learner whether it is looking at the prior context or the post context; surprisingly to me, it doesn't matter. So, just to remind you, we have eleven sentences: five before and five after the desire expression. At each stage we keep the desire expression in the input; each time we feed in the next sentence of context, we keep the desire expression and then recursively go on to the next one. That's how we keep track of the context.
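The exact layers differ from what the paper describes, but as a rough sketch of that sequential idea (in PyTorch, with a GRU standing in for the recurrent layer and made-up dimensions), keeping the desire-expression embedding in the input at every step looks something like this:

```python
import torch
import torch.nn as nn

class DesireFulfillmentRNN(nn.Module):
    """Sketch: at every step the desire-expression embedding (and the hand-built
    narrative features) are concatenated with the next context-sentence embedding,
    so the recurrent layer never loses track of which desire it is following."""

    def __init__(self, sent_dim: int, feat_dim: int, hidden: int = 128):
        super().__init__()
        self.rnn = nn.GRU(input_size=2 * sent_dim + feat_dim,
                          hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)  # fulfilled vs. unfulfilled

    def forward(self, desire_emb, context_embs, feats):
        # desire_emb: (batch, sent_dim); context_embs: (batch, steps, sent_dim);
        # feats: (batch, feat_dim) narrative features, repeated at every step.
        steps = context_embs.size(1)
        desire_rep = desire_emb.unsqueeze(1).expand(-1, steps, -1)
        feat_rep = feats.unsqueeze(1).expand(-1, steps, -1)
        x = torch.cat([desire_rep, context_embs, feat_rep], dim=-1)
        _, h = self.rnn(x)
        return self.out(h.squeeze(0))

# Made-up sizes: skip-thought-style 4800-d sentence vectors, 20 hand features,
# ten context sentences (five before and five after the desire expression).
model = DesireFulfillmentRNN(sent_dim=4800, feat_dim=20)
logits = model(torch.randn(2, 4800), torch.randn(2, 10, 4800), torch.randn(2, 20))
```

Re-presenting the desire embedding at every step is what lets the recurrent layer relate each new context sentence back to the specific desire being tracked.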

And we did some experiments on a subset of DesireDB which is meant to match more closely the setting that Chaturvedi et al. worked on, containing only the expressions they looked at. I only have two minutes left.

Okay, so we wanted to do these ablation experiments with the different architectures. The first comparison, against bag-of-words and skip-thought alone, shows that having these linguistic features actually matters for performance on this task, not just the embeddings. We get an overall F1 of 0.7 for predicting fulfilment versus non-fulfilment.

We also have results bearing on our theoretically motivated claim that the prior context should matter, not just the subsequent context. This slide shows that having the prior context does indeed improve over having the desire expression alone, and of course if you have the whole context you can do even better.

Then we compare certain individual feature sets: bag of words versus all features versus just the discourse features. Our best result comes from just the discourse features, which in my view is actually kind of disappointing: the discourse features by themselves do better than all the features combined, so you're essentially telling the model to just pay attention to the discourse features. An interesting next step would be to explicitly sample the corpus so that you select items that don't contain the discourse features, so you could see what other features come into play when explicit discourse markers aren't there.

Interestingly, the same features and methods achieve better results on the fulfilled class compared to the unfulfilled class, and we think that's because it's just harder to predict the unfulfilled case: it's more ambiguous, and the human annotators had that problem too. I'm supposed to stop now. Also, really surprisingly to us, we got much better results on Chaturvedi et al.'s dataset than they did themselves.

Okay, so I'll stop there and take questions. I'll leave that slide up. Questions, please? Nobody?

Right, so if you're looking at just verbal patterns for these desire expressions, do you come across non-verbal patterns that might express desire?

There are non-verbal patterns, and you can easily see them: if somebody says "I expected to", then you could also have "my expectation was that...". We did a search against the corpus where we pulled a bunch of those things out, "my expectation", "my goal", "my plan", and also some of those states like hungry, thirsty, tired, whatever, that might indicate some kind of goal, and we just decided to leave those aside for the present; but they're definitely in there. If you're actually interested in those, you could probably find them, and also semantic things like purpose clauses: a lot of these contexts don't actually use the words; you don't say "I want to go somewhere", you just see "in order to go somewhere", so you don't get verbal patterns there.

I'm just wondering how many other kinds of patterns there might be.

I just think there are lots and lots of other patterns. What's really interesting is how frequent "want" is: in our data, "wanted" is the most common of the verbal patterns, the most common expression, and you could do quite a lot if you just looked for "wanted to". But we have all these different ones, and we're also able to show that they have different biases as to whether the goal will be fulfilled or not.

I have one more comment.

Usually, when you're talking about non-fulfilment, that's an indication of expectation, and I wouldn't have thought that the word "decided" generated that expectation; other words probably do, but not "decided".

But "decided to" was unfulfilled some of the time; that's what the table shows, right? We had strong intuitions before we put the data out for annotation, and "decided to" is fulfilled eighty-seven percent of the time, which is what I would expect just looking at it, and it's unfulfilled about nine percent of the time, right?

Okay, it's interesting that you can see a difference there. I had a very strong intuition that a lot of these would be interesting because the outcome would be implicated, so I'm actually quite interested in the fact that around ten percent of the time with "decided to" the desire is actually not fulfilled.

And there are these different cases of non-fulfilment, which we're looking at in our subsequent work. Sometimes the stated goal is not fulfilled because something else comes along and you decide to do something else, so it's not really unfulfilled; you just kind of changed your mind. Like: we wanted to buy a PlayStation, and we went to Best Buy, and the Wiis were on sale, so we came home with a Wii. So maybe the higher-level goal is actually fulfilled, they wanted some kind of entertainment system, but the expressed desire was in fact not fulfilled. Those are maybe about, I don't know, eight percent of the cases out of the ten percent or something.

Okay, since we're running over time, we should move on to the next talk.