thank you

So, I am going to describe a recent work which is still quite preliminary. It has been carried out in collaboration with a company that is interested in e-commerce, and the scenario we have been working on is conversational agents for e-commerce.

The general idea, the long-term goal of this work, is a kind of conversational agent, a kind of shop assistant, that mediates between users and an e-commerce site when buying products.

So for instance, if the user asks to find a certain kind of product, the expected behavior of this shop assistant is to present the user the relevant products.

This is a task-oriented scenario, and basically it can be approached with the traditional slot-filling approach.

So we have several intents that the system is supposed to recognize, and then the slots; a classifier maps the user input into these categories and provides the brand, the colour and the other properties of the product.

The approach should work on several domains, for instance furniture, food or clothing. A relevant problem is the availability of data in these domains: basically, there are no annotated utterances, that is, no sentences or requests from the user that are annotated with the intent and the properties of the specific domain.

Another relevant factor is that it is usually much easier to find catalogues, or gazetteers, where information about products is present.

Given this overall scenario, in this work we focus on a specific issue: entity recognition, that is, the capacity to recognize this kind of product name within a user utterance.

We based our work on gazetteers, which in this scenario are catalogues: basically, product datasets that we can get from vendors on the web.

The main research question for us is how far we can go without any annotated data, and this is why we call it a zero-shot approach.

A few words about the specific issues of entity names, product names, in this e-commerce scenario. Basically, these are different from traditional named entities: what we have here is what the tradition of information extraction has called nominal entities.

For instance, an entity name may contain connectives, like "black and white t-shirt", where black is a property of the product, and the set of possible names is open-ended. Entity names may contain adjectives, or even proper names; and if you think about an e-commerce site, you know how many product names there can be.

Another very important property for our approach is compositionality: being nominal entities, we may assume that they respect, to some extent, the compositionality principle of the language.

So if we have, for instance in the food domain, "pasta with broccoli", this is a noun plus a prepositional modifier, and we could also add an adjective modifier to it. Then, knowing that "pasta with broccoli" is a valid name, we may infer that "spaghetti with broccoli" is probably also a good entity name, even if it has never been seen before.

This means that our approach should be able to take advantage of compositionality.

Then there may be multiple entities of the same semantic category in the same utterance. This is not the case in traditional task-oriented scenarios like booking flights, where usually there is just one destination at a time; here, instead, the user may well say "I would like to order a salami pizza and a mozzarella pizza": two entities of the same semantic category in the same utterance.

And then there is a strong need for multilinguality, of course, since an e-commerce site typically has to handle several languages.

Okay, so these are our working hypotheses. We would like to train a model for entity recognition based only on gazetteers, that is, product catalogues; then we would like to apply this model in order to label unseen entities in user utterances. The only data we have is the gazetteer, nothing annotated, and we would like to understand how far we can go: that is the working hypothesis.

The main ideas of our approach are the following: take advantage of the compositional nature of product names, so we want to extract as much knowledge as possible from the gazetteer; use as much as possible synthetically generated data, since having no real data coming from users we need to work on synthetically generated data; and be as much as possible language independent.

This is the approach, basically four steps. At the beginning we collect a gazetteer for a certain domain. Then, starting from this gazetteer, we generate both positive and negative examples of what an entity name can be. On the basis of these positive and negative examples we build a classifier, in our case a neural classifier, able to recognize the entities of that specific domain. Having this classifier, a model which is able to discriminate whether a given sequence of tokens is an entity name of a certain domain or not, we apply it to recognize entity names in utterances: we run the classifier on all the sub-sequences of a user utterance, in order to select the best sub-sequences, the ones that do not overlap. I will now go through the four steps with some examples.

First step: collecting a gazetteer for a certain domain. We just scraped the websites of vendors for a number of domains. The underlying assumption here is that scraping a website gives us a list that, most of the time, contains valid entity names; but it is not annotated data, and in particular we do not have any annotation at all. So this is the first step, just collecting the gazetteer.

The second step is to generate positive and negative examples. The positive examples, at least in our initial approach, are quite simple: all the item names in the gazetteer are positive examples; we downloaded them from the vendors' websites, so we trust them. As for the negative data, for each positive example we generate a number of negative examples, following simple rules. For instance, each proper sub-sequence of a positive example is a negative example; and we also take a positive example and add one token, randomly selected from the data, at the first or at the last position. So, if we start from the positive example "black and white t-shirt", the negatives are all the sub-sequences, like "white", "black and white" and so on, but also "black and white t-shirt" preceded by a token randomly selected from the data.
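Just to make these generation rules concrete, here is a minimal sketch of the idea; it is an illustration, not the system's actual code, and the number of random-token negatives per positive is an arbitrary choice:

```python
import random

def proper_subsequences(tokens):
    """All contiguous sub-sequences of a name, excluding the full name itself."""
    n = len(tokens)
    for i in range(n):
        for j in range(i + 1, n + 1):
            if j - i < n:
                yield tokens[i:j]

def generate_examples(gazetteer, vocabulary, random_negatives=2):
    """gazetteer: list of entity names (strings); vocabulary: list of tokens seen in the gazetteer."""
    positives, negatives = [], []
    for name in gazetteer:
        tokens = name.split()
        positives.append(tokens)                        # rule: every catalogue entry is a positive
        negatives.extend(proper_subsequences(tokens))   # rule: proper sub-sequences are negatives
        for _ in range(random_negatives):               # rule: positive plus one random token is a negative
            extra = random.choice(vocabulary)
            negatives.append(random.choice(([extra] + tokens, tokens + [extra])))
    return positives, negatives

# "black and white t-shirt" produces negatives such as ["black"], ["black", "and", "white"],
# ["white", "t-shirt"], and ["<random token>", "black", "and", "white", "t-shirt"].
```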

Notice that in this gazetteer, which we downloaded from the web, there is a lot of noise: we have no control over how the vendors write the names of their products, which can be quite arbitrary.

So, in the second step we generate positives and negatives; then, on the basis of the positives and negatives, we build a model, a neural model: a classifier which is able to say, given a sequence of tokens, whether it is a food name or not, whether it is a furniture name or not.

We used a neural classifier based on the model proposed by Lample and colleagues a couple of years ago, a classical LSTM-based architecture that uses both word embeddings and character embeddings. We added a few handcrafted features that are made available to the classifier, features relative to a certain token: the position of the token, its frequency, its length, the unigram probability of the token; and the only linguistic information that we use is the part of speech of the token, without any disambiguation.
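To give an idea of what such token-level features could look like in practice, here is a small sketch; it is one possible formulation, since the exact feature set and the POS tagger are not spelled out in the talk:

```python
def token_features(tokens, index, unigram_counts, pos_tags):
    """Per-token features: position, frequency, length, unigram probability, part of speech."""
    token = tokens[index]
    total = sum(unigram_counts.values()) or 1            # counts estimated from the gazetteer
    count = unigram_counts.get(token, 0)
    return {
        "relative_position": index / len(tokens),        # where the token sits inside the candidate
        "frequency": count,
        "length": len(token),
        "unigram_probability": count / total,
        "pos": pos_tags[index],                          # part of speech, taken as-is (no disambiguation)
    }
```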

At the end, the classifier says: yes, this sequence of tokens is a furniture name, or no, it is not a furniture name, together with a confidence score. So the model is quite simple.

So now we have this classifier. Remember that our goal is to recognize entity names in sentences, in user requests. Think about this example, a possible request from a user: "I am looking for golden yellow shorts". We run the classifier on all the sub-sequences of this request, and we ask it to say whether each sub-sequence is positive or negative. In this case the positives would be "shorts", "yellow shorts", "golden yellow shorts", while the negatives would be "I am looking for", "a golden", and so on.

Then we rank all the classified sub-sequences on the basis of the confidence score of the neural model, and we select the ones that do not overlap. So in this example we might have "golden yellow shorts", "yellow shorts" and "shorts" among the positives: the last two are discarded because they overlap with the first one, and at the end, from "I am looking for golden yellow shorts", we extract "golden yellow shorts".
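A rough sketch of this last step, scoring every sub-sequence and greedily keeping the non-overlapping spans with the highest confidence; this is a simplification, where `classify` stands in for the neural classifier's confidence score and the threshold is an arbitrary assumption:

```python
def extract_entities(tokens, classify, threshold=0.5):
    """Score every contiguous sub-sequence of the utterance, then keep the best non-overlapping spans."""
    scored = []
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            score = classify(tokens[i:j])                # confidence that the span is an entity name
            if score >= threshold:
                scored.append((score, i, j))
    scored.sort(reverse=True)                            # highest-confidence spans first
    selected, covered = [], set()
    for score, i, j in scored:
        if not covered & set(range(i, j)):               # skip spans overlapping an already chosen one
            selected.append((tokens[i:j], score))
            covered |= set(range(i, j))
    return selected

# For "I am looking for golden yellow shorts", "golden yellow shorts" would be kept,
# while the overlapping "yellow shorts" and "shorts" would be discarded.
```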

So this is the methodology; we wanted to know how well it performs, whether it is promising.

We ran some experiments. As I mentioned, we collected the gazetteers, and we now have datasets for three domains across two languages, English and Italian, with quite different characteristics.

For each gazetteer we report the number of entities, the number of tokens, the average length of the names and its standard deviation. The standard deviation is a kind of index of how complex the names are: the higher it is, the higher, most likely, the complexity of the names in the gazetteer. We also report the type/token ratio: a higher ratio indicates higher lexical variability, so more complexity again. Then we report the proportion of times that the head token of a name appears in the first position of the name; this gives us an indication of how stable the position of the semantic head is, and it is quite different across the gazetteers: for the Italian gazetteers it indicates that the first token of an entity name is usually the head, while this is not the case for English. The last statistic we want to point out is entity overlap: the proportion of entity names that are contained within another entity name. This gives some idea of the compositionality of a gazetteer: the higher this value, the more often we find an entity name inside another, longer name, and the more compositional the gazetteer is.
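All of these statistics can be computed directly from the gazetteer; the following is a small sketch of one way to read the measures just described, where the overlap measure is approximated as a whole-name containment check:

```python
import statistics

def gazetteer_stats(names):
    """names: list of entity names (strings) from one gazetteer."""
    token_lists = [name.split() for name in names]
    lengths = [len(toks) for toks in token_lists]
    all_tokens = [tok for toks in token_lists for tok in toks]
    padded = {f" {name} " for name in names}              # pad with spaces to match whole tokens only
    contained = sum(
        1 for name in names
        if any(f" {name} " != other and f" {name} " in other for other in padded)
    )
    return {
        "entities": len(names),
        "tokens": len(all_tokens),
        "mean_length": statistics.mean(lengths),
        "length_stdev": statistics.pstdev(lengths),        # higher means more variable, more complex names
        "type_token_ratio": len(set(all_tokens)) / len(all_tokens),
        "entity_overlap": contained / len(names),          # proportion of names contained in a longer name
    }
```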

This is the experimental setup. We have six datasets: the three domains times the two languages just mentioned. We split each dataset into training and test; one important point is that no entity name present in the training data is present in the test data. For training, besides the positive entity names we generate negative ones as illustrated before, with a fixed number of negatives for each positive.
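The entity-disjoint split itself is straightforward; a minimal sketch of that constraint, where the test fraction and seed are arbitrary assumptions:

```python
import random

def entity_disjoint_split(entity_names, test_fraction=0.2, seed=0):
    """Split a gazetteer so that no entity name used for training also appears in the test set."""
    unique = sorted(set(entity_names))                    # deduplicate before splitting
    random.Random(seed).shuffle(unique)
    cut = int(len(unique) * test_fraction)
    return unique[cut:], unique[:cut]                     # (train names, test names)
```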

Then the test set. This is important: we do not have any real test data, so the test set is synthetically generated. We start from a set of templates, a bit more than two hundred, both for English and for Italian. The templates correspond to typical e-commerce intents: templates for selecting a product, templates for asking for a description, templates for adding a product to the cart, and so on. Finally, we fill the templates with the names of the entities from the test part of the data.
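A toy illustration of that template-filling step; the template strings below are invented examples standing in for the real intent templates, which are not listed in the talk:

```python
import random

# invented examples of e-commerce intent templates; the real set has roughly 200 per language
TEMPLATES = [
    "I am looking for {entity}",
    "can you describe {entity}?",
    "please add {entity} to my cart",
]

def generate_test_utterances(test_entities, how_many=1000, seed=0):
    """Fill randomly chosen templates with entity names from the test split.
    Each pair keeps the gold entity, so the test annotation comes for free."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(how_many):
        entity = rng.choice(test_entities)
        utterance = rng.choice(TEMPLATES).format(entity=entity)
        pairs.append((utterance, entity))
    return pairs
```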

We have two baselines. The first is a simple rule-based baseline that uses the gazetteer directly: a chunk of a user utterance is recognized as an entity if it is present in the gazetteer.
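Such a rule-based baseline can be as simple as a greedy longest-match lookup against the gazetteer; a sketch of that idea, as one approximation rather than necessarily the exact rule used:

```python
def gazetteer_lookup_baseline(tokens, gazetteer_names):
    """Greedy left-to-right longest match of utterance spans against gazetteer entries."""
    matches, i, n = [], 0, len(tokens)
    while i < n:
        for j in range(n, i, -1):                         # try the longest span starting at i first
            if " ".join(tokens[i:j]) in gazetteer_names:
                matches.append((i, j, " ".join(tokens[i:j])))
                i = j                                     # continue after the matched span
                break
        else:
            i += 1                                        # no match starting here, move on
    return matches
```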

The second baseline is a neural model trained on synthetically generated data: we apply the same methodology we use to generate the test data also to generate training data.

These are the results of our experiments: the two baselines, and our system in the last row. We can see that on all our datasets the system based on the classifier significantly outperforms the two baselines, which I think is already a notable result.

Something more about the domains. Food is the most complex one: as you can imagine, and as it emerges from the gazetteer, it has the highest variability and the highest compositionality, both in Italian and in English, so the results there are the lowest among the three domains. Furniture is the least compositional, so from that point of view it is basically easier; on the other hand it is by far the smallest dataset we have. Clothing is very regular and highly compositional, so here we have good results.

Okay, so to conclude. I have reported some experiments on a zero-shot approach to entity recognition that uses gazetteers as the only source of information. It does not assume any annotated sentences for training, and also the test data has been generated synthetically. We focus on nominal entities because these are the kind of entities used in e-commerce for naming products. The approach tries to take advantage of, and extract as much knowledge as possible from, the gazetteer, in particular thanks to the compositionality of product names. In many respects this is very initial work, and we see quite a lot of room for improvement.

Three activities are ongoing. The first one simply follows from the fact that the state of the art in sequence labeling is improving very quickly; practically daily there are new approaches and new models, and for instance we recently tried a newer model which seems to perform better than the previous one. There is also a lot of room for experimenting with and improving the methodology for generating synthetic data: we experimented with some parameters, for example the ratio of positives to negatives, but there may well be better settings for these parameters. And then, of course, it would be very interesting to integrate annotated data: if we have some data, maybe just a little, a few annotated sentences, to integrate them, and also to integrate the gazetteer-based model with a model trained on synthetically annotated data. So there is a whole line of work here, where the focus is to make these approaches as much as possible domain independent, so that we can move from one domain to another with the same technology, and also language independent.

Thank you.

Yes, sure.

So, the templates are disjoint; both the entities and the templates are disjoint, so we try to separate training from test as much as possible.

[inaudible question from the audience]

Maybe it is a good question, but I do not think I have an answer for the moment. The focus here was to do recognition on basically isolated sentences, so I do not have any results on that. It would probably require considering dialogue context and a broader model of the dialogue system; actually, this work is closer to traditional information extraction than to dialogue processing, so we are not there yet.

Sorry, if I understood the question correctly: no, everything, even the word embeddings, the word vectors, is generated from the gazetteer. That is a good point: we do not use vectors trained on any external corpus; everything is generated from the gazetteer, so this really is the only source of information.
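As a toy illustration of what "generated from the gazetteer" can mean for the word vectors, one could train embeddings on the catalogue entries alone, for example with gensim; this is a sketch under that assumption, not necessarily the embedding method actually used:

```python
from gensim.models import Word2Vec

# Each product name from the gazetteer is treated as one short "sentence";
# no external corpus such as Wikipedia is used.
gazetteer = ["black and white t-shirt", "golden yellow shorts", "pasta with broccoli"]
model = Word2Vec(sentences=[name.split() for name in gazetteer],
                 vector_size=50, window=2, min_count=1, epochs=50)
print(model.wv["shorts"][:5])   # a vector learned only from the gazetteer
```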