Okay, thanks. This is joint work with Alexander Fabbri and Smaranda Muresan.
so
sarcasm detection, or verbal irony detection, has recently become a popular problem, and most sarcasm detection research looks at the utterances — for example, tweets — in isolation.
However, as we know, sarcasm is a complex phenomenon in natural language, and speaker intent is often unclear unless additional context is provided — conversation context, background knowledge about the authors, or the topic of discussion. That is why some researchers have argued that this sort of additional context has to be provided to understand sarcasm better.
That is what triggered our research. We look into basically two research questions. First, does conversation context help in predicting sarcasm? Conversation context is, of course, just one type of context. And second, can we identify what part of the conversation context actually triggers a sarcastic reply?
So, before going into the modeling, let's look at some of the examples we are dealing with in this research. In this tweet, a user is saying "one more reason to feel really great about flying", and then puts the hashtag #sarcasm, which we use as a label.
Now, looking at this tweet in isolation, it is hard to understand why it is sarcastic, right? But at the same time, we see that this tweet was a reply to another tweet — we can see that from the @ symbol — and that tweet in turn was a reply to yet another tweet from another user, which says that plane window shades are open so that people can see. So we can understand that the user is being sarcastic about the experience of flying, and that is why the tweet was posted with the #sarcasm hashtag.
Now let's look at another domain. This is a discussion thread taken from the Internet Argument Corpus. Here, a user is again replying to some other user, being sarcastic about their reading habit, saying something like "oh, you're reading too much into the Bible." If we look into the context, we first see that the context here is a lot longer, because it is taken from a discussion forum post. I am not going into the details of the post, but the person is essentially arguing that the world is not sixty million years old, it is only two thousand years old, and they bring in different arguments. And the second user is just saying "oh, you're reading too much into the Bible" — that is what they are being sarcastic about.
So, the outline of my talk today: first I will discuss related work and some of the state of the art in sarcasm detection, and I will also talk about the data. Then I will go over the first research question — can conversation context help in detecting sarcasm — and then the second question — what part of the context actually triggers the sarcastic reply, and can we identify it. Finally, I will conclude and point to some future work.
As explained before, there has been a surge in sarcasm and verbal irony detection in recent years, and most of the research treats this as a typical binary classification problem — not the kind of structured prediction we do for, say, named entity tagging or relation extraction.
But some other directions of research have also come up recently, and just to briefly mention a few: for instance, Riloff et al. looked into context incongruity as a key characteristic of sarcasm; in our previous work at EMNLP, we looked at sarcasm detection as a word sense disambiguation problem; Mishra et al. have looked into cognitive feature sets, such as eye-tracking features, for sarcasm detection; Schifanella et al. looked into Tumblr posts, where images and text can be combined for sarcasm prediction; and yesterday we saw a poster from Oraby et al. that looked into rhetorical questions, since a lot of sarcastic utterances are actually rhetorical questions.
From the perspective of the role of context, researchers have looked into author context — for instance, modeling an author's previous posts or tweets to understand whether the author tends to be sarcastic. There is also some work on conversation context, notably by Bamman and Smith, but they looked into n-gram-style features, and their performance with and without the context was not much different. Finally, we think that incorporating world knowledge would be really crucial for understanding sarcasm.
At the same time, we think that is a much harder problem. For instance, in this tweet, the sarcasm only comes through if you know some background facts about the situation being described, and this sort of world knowledge is very difficult to include in a system. In our research, we particularly look into the conversation context.
We use two sets of data. The first is discussion forum data, which was partially released by Oraby et al. The structure is that you have a sarcastic response which replies to a context, and the context is also a discussion forum post. Some characteristics of this data: it was annotated at the comment level using crowdsourcing, and it is a balanced set, with training data of close to five thousand posts, each consisting of the context and the response.
We also collected data from Twitter, looking specifically at sarcastic tweets that are replies to a previous tweet. We collected the previous tweet using the reply structure (the @user parameter), and whenever possible we collected the full thread of the dialogue — sometimes one user is sarcastic about another user's tweet that is itself a reply to a third tweet, so we collected the full thread. Here the labels are provided by the authors themselves — that is one difference from the discussion forums, where the labeling was not done by the original poster — and we use the hashtags #sarcasm and #sarcastic to identify which tweets are sarcastic. Again, it is an imbalanced set, with roughly two thousand sarcastic and thirteen thousand non-sarcastic tweets, respectively. In more than thirty percent of the data we had a couple of sentences as context. Also note that tweets are limited in length, which matters for the contextual characteristics here.
Now, looking into the modeling: can conversation context help in sarcasm detection? Our baseline is an SVM with a linear kernel over discrete features, using features that have performed very well in previous sarcasm detection research. We use n-gram features — unigrams, bigrams, and trigrams — and two lexicons: a sentiment lexicon, and LIWC for pragmatic features, especially to capture the alternation of sentiment between the context and the sarcastic post.
Then we used a set of sarcasm markers, which are basically indicators of sarcasm that have been used in a lot of research in linguistics and communication science. For instance, there are morpho-syntactic features, such as the use of various interjections, questions, and exclamation signs. In speech, sarcasm is marked by strong intonation and modulation cues, but when people type natural language they use typographic features instead: quotation marks, different types of emoticons and emojis, and capitalization, such as writing the word "never" in all caps. Finally, a list of intensifiers has also been shown to be a strong feature in sarcasm detection.
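To make the baseline concrete, here is a minimal sketch of an SVM over n-grams plus a few typographic marker counts, assuming scikit-learn. The marker set, the example texts, and all parameters are illustrative — the actual system also used sentiment and LIWC lexicon features, which are omitted here.

```python
# Sketch of an SVM baseline: n-gram features plus typographic sarcasm markers.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

class SarcasmMarkers(BaseEstimator, TransformerMixin):
    """Counts a few typographic markers: quotes, '!'/'?', ALL-CAPS words."""
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        feats = []
        for text in X:
            feats.append([
                text.count('"'),
                text.count('!') + text.count('?'),
                sum(1 for w in text.split() if len(w) > 2 and w.isupper()),
            ])
        return np.array(feats)

pipeline = Pipeline([
    ('features', FeatureUnion([
        ('ngrams', CountVectorizer(ngram_range=(1, 3))),   # uni/bi/trigrams
        ('markers', SarcasmMarkers()),
    ])),
    ('svm', LinearSVC()),
])

# Toy usage (hypothetical examples, not from the corpus; 1 = sarcastic):
X = ["GREAT, another delay!!", "the flight leaves at noon",
     'oh sure, I just "love" waiting here', "see you tomorrow"]
y = [1, 0, 1, 0]
pipeline.fit(X, y)
pred = pipeline.predict(["WONDERFUL, more waiting!!"])
```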
We also experimented with LSTM networks, which are able to learn long-distance dependencies. In our architecture we use two LSTMs: one reads the context and one reads the response. We experimented with attention-based variations. In the first, we used both word-level and sentence-level attention — a hierarchical model. In the second, we kept the word embeddings static and did not apply attention over the words, putting attention only on the sentences of the context, because we also want to answer our second research question — what part of the context helps in identifying the sarcasm — and that is why we apply attention at the sentence level only.
This is a schematic diagram of our architecture with sentence-level attention. On the left-hand side, we have one LSTM to learn the context, and on the right-hand side, one LSTM to learn the response. In each branch there are the sentence embeddings, then the hidden layers, then the attention, and then the resulting vector. We concatenate the final vector representations from the context and the response, and pass the concatenation to the softmax.
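A minimal sketch of this two-branch design, written here in PyTorch (the talk does not commit to a framework, and all dimensions and names below are made up): one LSTM reads the context sentence embeddings with attention over its sentence-level states, one LSTM reads the response, and the two representations are concatenated before the classifier.

```python
# Sketch: two LSTMs (context + response), sentence-level attention on the
# context, concatenation, then a linear layer feeding the softmax.
import torch
import torch.nn as nn

class DualLSTMAttention(nn.Module):
    def __init__(self, emb_dim=100, hidden=64, num_classes=2):
        super().__init__()
        self.ctx_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.resp_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)      # scores each context sentence
        self.out = nn.Linear(2 * hidden, num_classes)

    def forward(self, ctx_sents, resp_words):
        # ctx_sents: (batch, n_sentences, emb_dim), one embedding per sentence
        # resp_words: (batch, n_words, emb_dim)
        ctx_h, _ = self.ctx_lstm(ctx_sents)   # (batch, n_sent, hidden)
        weights = torch.softmax(self.attn(ctx_h).squeeze(-1), dim=1)
        ctx_vec = (weights.unsqueeze(-1) * ctx_h).sum(dim=1)
        _, (resp_last, _) = self.resp_lstm(resp_words)
        logits = self.out(torch.cat([ctx_vec, resp_last.squeeze(0)], dim=-1))
        return logits, weights                # weights reused later for analysis

model = DualLSTMAttention()
logits, attn = model(torch.randn(2, 5, 100), torch.randn(2, 12, 100))
print(logits.shape, attn.shape)  # torch.Size([2, 2]) torch.Size([2, 5])
```

Returning the attention weights alongside the logits is what makes the later trigger-sentence analysis possible.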
Finally, we also experimented with another variation, the conditional LSTM, which was introduced by the DeepMind group for natural language inference tasks such as textual entailment. Here, again, we use two LSTMs, but the response LSTM is conditioned on the representation of the context. This means that the cell state of the response LSTM is initialized with the final state of the context LSTM. This has been shown to be a really strong architecture for natural language inference tasks.
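The conditional encoding idea can be sketched as follows, again in PyTorch with made-up dimensions: the response LSTM starts from the final state of the context LSTM instead of from zeros.

```python
# Sketch of a conditional LSTM: the response encoder is initialized with
# the final (hidden, cell) state of the context encoder.
import torch
import torch.nn as nn

class ConditionalLSTM(nn.Module):
    def __init__(self, emb_dim=100, hidden=64, num_classes=2):
        super().__init__()
        self.ctx_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.resp_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, ctx, resp):
        # Encode the context and keep its final (hidden, cell) state...
        _, (h_ctx, c_ctx) = self.ctx_lstm(ctx)
        # ...then use that state to initialize the response LSTM.
        _, (h_resp, _) = self.resp_lstm(resp, (h_ctx, c_ctx))
        return self.out(h_resp.squeeze(0))

model = ConditionalLSTM()
logits = model(torch.randn(2, 8, 100), torch.randn(2, 12, 100))
```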
Some more details: we split the data in the usual way — eighty percent for training and ten percent for parameter tuning, with the rest held out for testing. We used two sets of word embeddings: for the discussion forums, a standard pre-trained word2vec model of a few hundred dimensions, and for Twitter, a skip-gram model that was trained on tweets.
Now, the results. I will talk about the macro-averaged F1 values here, which let us point out something important; the numbers are reported in more detail in the paper.
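As a reminder, macro-averaged F1 is just the unweighted mean of the per-class F1 scores; a quick check with scikit-learn on toy labels (not our data):

```python
# Macro-averaged F1 = unweighted mean of per-class F1 scores.
from sklearn.metrics import f1_score

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]
per_class = f1_score(y_true, y_pred, average=None)   # F1 for class 0 and 1
macro = f1_score(y_true, y_pred, average='macro')
assert abs(macro - per_class.mean()) < 1e-9
print(per_class, macro)
```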
First, we look at the comparison of the SVM results with the response only versus context plus response. We see that on Twitter there is some improvement from using the context, but not on the discussion forum data. We suspect this might be because the context is often very long in the discussion forums, and an SVM based mostly on n-gram features is not able to learn from that.
Looking at the LSTM variations, the first observation is that using context helps for both LSTM models. At the same time, the conditional LSTM model — where the response is conditioned on the context — performs remarkably well: there is something like a five to six percent improvement in F1 for the discussion forums, and for Twitter the results improve even more. Finally, there are the results from using attention over the sentences of both the response and the context: for Twitter, attending to both context and response performs best, while for the discussion forums it is just comparable to the conditional model. I am not showing the results where we apply attention over both words and sentences, because that performs a little worse.
The reason, we think, could be that we do not have enough training data: when you put attention at both the word and the sentence level, there are too many parameters to model. In the future, we hope that when we have more data — discussion forum data especially — we can experiment in that direction, and that might perform better.
Now we go to the second research question: can we identify what part of the context triggers the sarcastic reply? What we do here, basically, is look at the attention weights and see whether they indicate which part of the context triggers the sarcastic reply.
We evaluated the attention weights with a crowdsourcing experiment on Amazon Mechanical Turk, where we showed Turkers a sarcastic reply and its context from the discussion forums and asked them: can you identify one or more sentences from the context that you think triggered the sarcastic reply? We selected Turkers with high qualifications, so that they would be able to do this task well, and we ran the experiment on eighty-five HITs. We selected only posts where the context length is between three and seven sentences, because if the context were much longer it would be a hard task for the Turkers, and they might not be interested in doing it.
We used majority voting, and interestingly, when we compared it to the attention weights, we found that for more than forty percent of the posts, the sentence that got the highest attention weight is also the one the Turkers picked by majority voting.
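The comparison can be sketched as follows: for each post, take the context sentence with the highest attention weight and check whether it matches the Turkers' majority vote. The data below is made up for illustration.

```python
# Agreement between argmax attention weight and Turker majority voting.
import numpy as np

def majority_vote(turker_choices):
    """turker_choices: list of sentence indices, one per Turker."""
    values, counts = np.unique(turker_choices, return_counts=True)
    return values[counts.argmax()]

def agreement(attention_weights, turker_votes):
    """Fraction of posts where argmax attention == Turker majority vote."""
    hits = 0
    for weights, votes in zip(attention_weights, turker_votes):
        hits += int(np.argmax(weights) == majority_vote(votes))
    return hits / len(attention_weights)

# Two toy posts with four context sentences and five Turkers each:
attn = [np.array([0.1, 0.6, 0.2, 0.1]), np.array([0.5, 0.2, 0.2, 0.1])]
votes = [[1, 1, 3, 1, 2], [2, 2, 0, 2, 2]]
print(agreement(attn, votes))  # 0.5: the first post matches, the second doesn't
```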
If we look at the same example I showed before, about the reading habit and the Bible: on the left-hand side are the sentences, and on the right-hand side we show the heat map. Within the heat map, the left column shows the attention weights and the right column represents the Turkers' majority voting. We see that the first sentence was selected both by the attention weights and by the Turkers; but while the attention does not put much weight on the other sentences, the Turkers still give some weight to the second and fourth sentences.
We also analyzed the attention weights in light of what they actually tell us about sarcasm. In the paper we discuss various characteristics of sarcasm, but in the interest of time I will look at just one issue, known as context incongruity. You can think of context incongruity as a characteristic of sarcasm. With a very simple example: if I say "I love going to the emergency room every weekend", there is a positive sentiment ("love"), but a negative situation (going to the emergency room every weekend). So you can think of this as incongruity, or inconsistency, between the sentiment and the situation. There is a lot of research in linguistics and communication, and also in psychology, where people discuss this context incongruity and consider it one of the very strong characteristics of sarcasm.
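A toy illustration of this notion of incongruity: flag utterances that combine a word from a small positive-sentiment list with one from a negative-situation list. Both word lists are invented for the example and are far too small for real use.

```python
# Toy context-incongruity check: positive sentiment word + negative situation
# word in the same utterance (word lists are illustrative only).
POSITIVE = {"love", "great", "awesome", "enjoy"}
NEGATIVE_SITUATION = {"emergency", "delay", "traffic", "monday"}

def incongruous(utterance):
    words = {w.strip(".,!?").lower() for w in utterance.split()}
    return bool(words & POSITIVE) and bool(words & NEGATIVE_SITUATION)

print(incongruous("I love going to the emergency room every weekend"))  # True
print(incongruous("I love long weekends"))                              # False
```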
In our research, we found that a lot of the time the attention — especially, in the example I am showing, for the sentence-level attention model — actually puts more weight on these incongruent features. For instance, in this example the context is talking about an advertisement, saying it is totally depressing and mediocre, and the response is something like "oh really, got to see what you're doing". We see that "depressed" and "mediocre" actually got more weight, along with "see" in the response, so you can see there is some sort of incongruity there. There are more examples in our paper where we show how this context incongruity appears.
At the same time, we also looked into sarcasm markers, which are basically explicit indicators of sarcasm. We see that the attention puts considerable weight on markers such as various emoticons and emojis, which means the model treats these as strong features. One interesting thing we found is that sometimes the context has an emoticon like a sad face, and then the response has a smiley face; the attention model actually learns that these opposite emoticons may be a strong feature and puts more weight there, which is an interesting observation. Similarly, interjections like "ah" and "oh" also received a lot of attention weight.
One caveat here, however: the classification was not based only on the attention weights. As prior work has discussed on this topic, interpretation based on attention weights has to be treated with care — you have to be a little cautious, because the classifier does not make its decision from these weights alone. So, in interpreting them, we are mostly making observations.
So, to conclude: we found that for sarcasm detection, using contextual information from the dialogue gives better accuracy. We also tried to identify what portion of the context may trigger the sarcasm, and we ran some experiments on interpreting the context to analyze different attributes, such as context incongruity and the other characteristics we discuss in the paper. As future work, we are interested in larger-scale experiments, and we hope more discussion forum data will be released that we can use for more parameter tuning, especially for the word-level and sentence-level attention models.
Also, we saw a poster yesterday on how the comments that reply to a sarcastic post can be used in this kind of analysis, so we are interested in that too. Another thing to observe is that the Twitter data is self-labeled sarcasm — people post the tweets and add the hashtags themselves — whereas in the discussion forums the sarcasm was perceived, because other annotators come in and do the labeling. So we are very interested in analyzing the difference between self-labeled and perceived sarcasm.
Finally, after doing a lot of error analysis, we found that there are specific aspects of conversation, such as humor, that users draw on many times; when we looked at the Mechanical Turk results, the Turkers were actually able to identify those humorous cues that could trigger sarcasm, but our model could not. That is one interesting direction we hope to continue this research in.

Thank you. Any questions?
Question: On the slide about the attention experiment — I think you had one example with a series of tweets, where you compared against the crowdsourced workers. Can you explain that?
For attention? No — this is showing the discussion forums one; we experimented with the Turkers only on the discussion forums. We showed the sarcastic post and also the context, where the context is multiple sentences, and we told the Turkers: we already know this is a sarcastic post, and this is the context that actually triggered the sarcasm — now, can you identify which sentence is most important? We used five Turkers per post, and they voted on the sentences they thought were more important — the first one, the second one, and so on. On the right-hand side there is a sort of heat map showing the magnitude of the voting, and then we compared the majority voting with the attention weights. For this particular example, sentence one actually got the maximum attention weight in our experiment, and at the same time the Turkers also thought that this is the sentence that might be triggering the sarcasm in this response.
Again — so this is the macro-averaged F1. Sorry, the macro-averaged F1 we are showing here is computed over the sarcastic and non-sarcastic categories. We basically take the macro-averaged F1 over both binary classes; in the paper we actually show all categories separately, but in the interest of time I am reporting the macro-average here. And "C+R" means context plus response — C is the context, R the response — as opposed to the response-only model.

All right, if there are no more questions, let's thank the speaker again.