0:00:17 | hello everyone, we're ready for the final session of oral presentations of the conference |
---|
0:00:23 | on discourse |
---|
0:00:26 | i'm pleased to be your session chair. the first talk is |
---|
0:00:32 | "SpaceRefNet: |
---|
0:00:34 | a neural approach to spatial reference resolution in a real environment" |
---|
0:00:41 | thank you. good afternoon everybody, so my name is Dmitry and i'm a PhD |
---|
0:00:45 | student here at KTH, and of course as a PhD student you get to read |
---|
0:00:49 | a lot of papers, and every time the thing that you get confused by |
---|
0:00:52 | the most is the title |
---|
0:00:53 | so when presenting mine, i decided to make sure all of you would |
---|
0:00:57 | understand what those words in the title mean |
---|
0:00:59 | so we start backwards, with the phrase "a real environment", referring to the domain we're working in |
---|
0:01:04 | which is pedestrian wayfinding |
---|
0:01:07 | in real cities |
---|
0:01:09 | and when you are in an unfamiliar city, the first thing you do is |
---|
0:01:12 | you take a smartphone and launch something like Google Maps |
---|
0:01:16 | and |
---|
0:01:17 | the way Google Maps typically guides you is the same as it would |
---|
0:01:21 | guide cars: typically it presents you something like turn-by-turn navigation, so |
---|
0:01:25 | you get a bunch of instructions presented to you on the screen, supplemented by a map |
---|
0:01:29 | with a moving marker indicating your position |
---|
0:01:31 | and the instructions can be voiced as well |
---|
0:01:32 | and they would sound like "turn right onto Valhallavägen / route |
---|
0:01:36 | two hundred seventy seven, then go six hundred fifty meters and turn left onto |
---|
0:01:40 | [street name]", which is not exactly the most natural thing you would expect from the |
---|
0:01:45 | system, mainly because of two reasons |
---|
0:01:47 | the first one is that it relies mostly on quantitative data, so on cardinal directions |
---|
0:01:53 | street names and distances, and these are exactly the things that we humans try to |
---|
0:01:57 | avoid when guiding each other; instead we tend to rely more on landmarks, according to |
---|
0:02:03 | previous research, so on salient objects in the vicinity |
---|
0:02:06 | so what we would really like to have here is a shift from turn-by- |
---|
0:02:10 | turn navigation to landmark-by-landmark navigation |
---|
0:02:14 | the second reason is that the wayfinding process is inherently interactive, because it happens in |
---|
0:02:19 | a dialogue between two humans, so we would like to have more interaction |
---|
0:02:24 | from wayfinding systems, which led us to a spoken dialogue system that uses |
---|
0:02:29 | landmark-based navigation for wayfinding |
---|
0:02:32 | it would say instructions like these: "go forward until you see the fountain and the |
---|
0:02:36 | glass building", and if the person gets lost and says "i think i'm |
---|
0:02:40 | lost, but i see a yellow house to my right", the system should still |
---|
0:02:44 | be able to respond with "and what do you possibly see apart from this |
---|
0:02:47 | yellow house?" |
---|
0:02:48 | now, to be able to do that, the basic task is to identify that "the yellow house |
---|
0:02:53 | to my right" is really referring to something, and then being able to |
---|
0:02:57 | find this geographical object, and then |
---|
0:03:00 | process it and respond accordingly, which leads us to the basic task that we |
---|
0:03:04 | need to solve, that of spatial reference resolution, and this is the next phrase in |
---|
0:03:08 | the title that we have |
---|
0:03:10 | so |
---|
0:03:11 | what we're talking about here is referring expressions, so those words that people use |
---|
0:03:15 | when they refer to something |
---|
0:03:17 | like those in pink over here |
---|
0:03:19 | then the objects that are meant by those referring expressions, the geographical objects |
---|
0:03:25 | are called referents, like those with green frames, and what we're interested in here is |
---|
0:03:30 | this: |
---|
0:03:31 | street-level referring expressions. so when you're walking down the street, whatever you see |
---|
0:03:37 | those are the objects we're interested in |
---|
0:03:40 | and then the task of reference resolution is defined simply as resolving referring expressions |
---|
0:03:44 | to referents |
---|
0:03:45 | now some very |
---|
0:03:47 | attentive listeners might say: wait, there is also "that" |
---|
0:03:51 | it is also a referring expression, and indeed, but this is a coreference, so |
---|
0:03:56 | it refers to something that is inside the discourse, that is, to another referring expression |
---|
0:04:00 | whereas in this work we're going to concentrate on exophoric referring expressions, so those |
---|
0:04:03 | referring to something outside the discourse, the geographical objects. and then another problem |
---|
0:04:07 | here is that we can have nested referring expressions |
---|
0:04:10 | so here we have "the glass one over there", which refers to |
---|
0:04:12 | this small shop, and for this particular work we decided to take the maximal |
---|
0:04:16 | referring expressions, so the largest, in case we have nested ones |
---|
0:04:20 | okay, so from these examples it seems like it's pretty easy: you just |
---|
0:04:23 | take the noun phrases in your utterance and say those are the referring expressions |
---|
0:04:26 | is that so? |
---|
0:04:27 | well, not really |
---|
0:04:29 | consider these examples. the first question is: "do you know if there is |
---|
0:04:32 | a subway station near you?", and "a subway station" is a noun phrase, but it's |
---|
0:04:36 | not a referring expression, and the reason is because there is no referent: you're not |
---|
0:04:40 | meaning any specific object, just any kind of subway station; it can be there, it can |
---|
0:04:44 | be not, we don't know. and the same goes for the |
---|
0:04:47 | two examples below |
---|
0:04:49 | then the last phrase is "SpaceRefNet, a neural approach", which is sort of the |
---|
0:04:54 | method we're proposing here, and the word "neural" might hint to you that there will |
---|
0:04:58 | be neural networks; yes, indeed, there will be two |
---|
0:05:00 | and when you're thinking about neural networks, you think that they're really hungry for |
---|
0:05:05 | data, so what data did we use? |
---|
0:05:07 | we went with the dataset called SpaceRef |
---|
0:05:10 | and it was collected by letting ten users walk predefined routes and just basically |
---|
0:05:15 | describe the way, sort of thinking aloud, like "i see a red brick building |
---|
0:05:19 | over there", "i'm going down the steps", and so on and so forth |
---|
0:05:22 | and this way |
---|
0:05:23 | one thousand three hundred and three geographical referring expressions |
---|
0:05:27 | have been collected, which we're going to use for the purposes of this work |
---|
0:05:32 | so now, the problem of spatial reference resolution is being decomposed into three |
---|
0:05:36 | stages. the first one is: once we have a spoken utterance, we want to identify the referring expressions |
---|
0:05:41 | in it, so those words that could potentially refer to something. the second step would |
---|
0:05:45 | be to find potential referents, potential geographical objects, which we call candidates, and the third |
---|
0:05:50 | step would be the resolution itself. so we go |
---|
0:05:52 | bottom to top |
---|
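A minimal sketch of the three-stage decomposition just described, with all three stages stubbed out; the function names are hypothetical, not the authors' code (the real models come later in the talk):

```python
# Hypothetical stubs for the three stages of spatial reference resolution.

def identify_referring_expressions(utterance):
    # Stage 1 (stub): find the words that could potentially refer to something.
    return ["a big red building"] if "building" in utterance else []

def find_visible_candidates(position, heading):
    # Stage 2 (stub): collect the visible geographical objects, the candidates.
    return [{"id": 1, "tags": {"building": "yes"}}]

def is_referent(expression, candidate):
    # Stage 3 (stub): decide whether a candidate matches an expression.
    return "building" in candidate["tags"]

def resolve(utterance, position, heading):
    expressions = identify_referring_expressions(utterance)
    candidates = find_visible_candidates(position, heading)
    return {e: [c["id"] for c in candidates if is_referent(e, c)]
            for e in expressions}

print(resolve("now i see a big red building", (59.35, 18.07), 0.0))
# {'a big red building': [1]}
```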
0:05:55 | so when we were thinking about referring expression identification, we realised that it's actually very similar |
---|
0:06:01 | to named entity recognition, because what you need to do is just find a specific kind |
---|
0:06:05 | of phrase: named entities in one case and referring expressions in the other |
---|
0:06:09 | well, actually named entities can also be referring expressions, so we were thinking |
---|
0:06:14 | okay, then we can maybe borrow, or get inspired by, the methods for named |
---|
0:06:18 | entity recognition |
---|
0:06:19 | and we started by labeling the data in the same way as in NER, with this |
---|
0:06:24 | famous BIO |
---|
0:06:26 | labeling. and here |
---|
0:06:27 | what's special is: if you have a filler word |
---|
0:06:29 | we can label it as O and then still have a referring expression, because this way |
---|
0:06:34 | we can have noncontiguous referring expressions labeled |
---|
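A minimal sketch of the labeling scheme just described, including how an O-labeled filler word keeps a noncontiguous referring expression together (hypothetical helper, not the authors' code):

```python
# Decode B-REF / I-REF / O labels back into referring expressions.

def bio_decode(tokens, labels):
    """Group tokens into referring expressions, skipping O-labeled fillers."""
    expressions, current = [], []
    for token, label in zip(tokens, labels):
        if label == "B-REF":                # a new referring expression starts
            if current:
                expressions.append(current)
            current = [token]
        elif label == "I-REF" and current:  # continue the current expression,
            current.append(token)           # even across an O-labeled filler
        # "O": outside a referring expression (or a filler word), skip it
    if current:
        expressions.append(current)
    return expressions

tokens = ["now", "i", "see", "the", "big", "uh", "red", "building"]
labels = ["O", "O", "O", "B-REF", "I-REF", "O", "I-REF", "I-REF"]
print(bio_decode(tokens, labels))  # [['the', 'big', 'red', 'building']]
```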
0:06:37 | and then we were thinking that the method could also be inspired by |
---|
0:06:42 | the methods for named entity recognition, and yes, in this case it is a neural network, with the |
---|
0:06:46 | architecture to the right. let's go through it and see how it works |
---|
0:06:50 | say we have an utterance, "now i see a big red building". so the first |
---|
0:06:54 | thing we do is pad it to a fixed width, because we're dealing with |
---|
0:06:57 | fixed-width tensors |
---|
0:07:00 | then we feed it in, word by word |
---|
0:07:03 | and then every word, for every word, it's first encoded with a fifty-dimensional GloVe |
---|
0:07:08 | word embedding; those are pre-trained, we have downloaded those, and of course there are out- |
---|
0:07:12 | of-vocabulary cases, and mostly those are Swedish street names in our case, and |
---|
0:07:18 | for those |
---|
0:07:19 | we encode the character-level information using a character-level bidirectional RNN |
---|
0:07:26 | and the reasoning why we're using a bi-RNN is because |
---|
0:07:30 | Swedish street names tend to have these bits at the end, like "-gatan" |
---|
0:07:33 | or "-vägen", which are parts meaning "street" |
---|
0:07:36 | or "road" actually. so we were thinking, we were hoping, that this small bi-RNN can identify those patterns |
---|
0:07:42 | and that way we have some kind of information for these words |
---|
0:07:46 | lacking GloVe vectors |
---|
0:07:47 | so then |
---|
0:07:48 | the final encoding for every word would be |
---|
0:07:51 | the GloVe vector |
---|
0:07:52 | then the hidden state of the forward cell of the small character-level bi- |
---|
0:07:55 | RNN, and the hidden state of the backward cell, and we do that for each word |
---|
0:07:59 | so we get a sort of matrix |
---|
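A minimal sketch of this per-word encoding in PyTorch, with placeholder dimensions and a random stand-in for the downloaded GloVe vectors:

```python
import torch
import torch.nn as nn

CHAR_HID = 16
char_vocab = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyzåäö")}
char_emb = nn.Embedding(len(char_vocab), 8)
char_birnn = nn.GRU(8, CHAR_HID, bidirectional=True, batch_first=True)
fake_glove = {"valhallavägen": torch.randn(50)}  # stand-in for real GloVe

def encode_word(word):
    ids = torch.tensor([[char_vocab[c] for c in word if c in char_vocab]])
    _, h = char_birnn(char_emb(ids))       # h: (2, 1, CHAR_HID)
    fwd, bwd = h[0, 0], h[1, 0]            # forward / backward hidden states
    glove = fake_glove.get(word, torch.zeros(50))  # zeros if out of vocabulary
    return torch.cat([glove, fwd, bwd])    # 50 + 16 + 16 = 82 dimensions

print(encode_word("valhallavägen").shape)  # torch.Size([82])
```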
0:08:01 | and of course, for sentences we want to have sentence-level information there as well |
---|
0:08:05 | so on top there is a larger bi-RNN to account for sentence-level information |
---|
0:08:11 | and |
---|
0:08:12 | at the end we get another matrix |
---|
0:08:14 | which we call the sub-sentence encodings, and |
---|
0:08:19 | the idea here is that for each word we want to account for the information that |
---|
0:08:23 | all the preceding words are giving, and that all the succeeding words are giving |
---|
0:08:27 | so for example |
---|
0:08:28 | for the word "big" |
---|
0:08:30 | we're taking the hidden state of the forward cell |
---|
0:08:34 | that sort of encodes the information of all the preceding words, so "now i see" and the word |
---|
0:08:38 | "big" itself, and |
---|
0:08:40 | also we take the hidden state of the backward cell, that encodes information about all |
---|
0:08:44 | the words |
---|
0:08:45 | from the backward direction, so "big red building" and the number of |
---|
0:08:49 | paddings that we have there |
---|
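A minimal sketch of these sub-sentence encodings, again with placeholder dimensions: a sentence-level bi-RNN runs over the word encodings, and each word's encoding is the forward state (left context up to this word) next to the backward state (this word and everything after it):

```python
import torch
import torch.nn as nn

WORD_DIM, SENT_HID, MAX_LEN = 82, 32, 10
sent_birnn = nn.GRU(WORD_DIM, SENT_HID, bidirectional=True, batch_first=True)

words = torch.randn(1, MAX_LEN, WORD_DIM)   # one padded, fixed-width utterance
out, _ = sent_birnn(words)                  # (1, MAX_LEN, 2 * SENT_HID)

# out[0, t, :SENT_HID] is the forward state at word t (the preceding words),
# out[0, t, SENT_HID:] is the backward state at word t (the succeeding words):
# together they form the sub-sentence encoding of word t.
sub_sentence = out[0]
print(sub_sentence.shape)                   # torch.Size([10, 64])
```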
0:08:52 | why |
---|
0:08:53 | why do we need to do that? |
---|
0:08:55 | so let's consider two examples with the word "green" in them |
---|
0:08:59 | now, if you consider only the preceding words |
---|
0:09:02 | in both cases they're the same: "you can see a" and "you can see a". so |
---|
0:09:06 | when you're deciding whether this word will be part of a referring expression, you have |
---|
0:09:11 | to look at the succeeding words, and in the first case "the station" would hopefully |
---|
0:09:14 | indicate that this is a part of the referring expression, "the green train", and in the |
---|
0:09:17 | second case "departing" would indicate that it's not, hopefully |
---|
0:09:21 | and the same happens in the other direction |
---|
0:09:24 | so "is turning" |
---|
0:09:25 | has the same succeeding words, but the preceding words are different, so we can hopefully |
---|
0:09:30 | label them differently |
---|
0:09:32 | and then |
---|
0:09:33 | the part of the network |
---|
0:09:35 | the part up to that which is getting us the sub-sentence encoding matrix |
---|
0:09:39 | we dub RefNet, and we will be using it later |
---|
0:09:44 | so then we feed |
---|
0:09:46 | this |
---|
0:09:46 | output of RefNet through a fully connected layer followed by a dropout |
---|
0:09:50 | and then we get the final softmax layer that gives us this kind of matrix |
---|
0:09:53 | where for every word we're getting a distribution over the three labels, so B- |
---|
0:09:59 | REF, I-REF and O. then we take the maximum probability, there, you see, the green dots |
---|
0:10:04 | which gives sort of the labeling. so "now", "i" and "see" would get an O, and |
---|
0:10:07 | "a" will then be B-REF, so this is where the referring expression starts, and for the "big red building" |
---|
0:10:10 | we get |
---|
0:10:11 | I-REF, and then all the O's for the padding. so this is how we know |
---|
0:10:14 | that "a big red building" is a referring expression |
---|
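A minimal sketch of this labeling head; the layer sizes and dropout rate are assumptions:

```python
import torch
import torch.nn as nn

SENT_ENC, LABELS = 64, ("B-REF", "I-REF", "O")
head = nn.Sequential(nn.Linear(SENT_ENC, 64), nn.ReLU(),
                     nn.Dropout(0.5), nn.Linear(64, len(LABELS)))
head.eval()  # disable dropout outside of training

sub_sentence = torch.randn(10, SENT_ENC)           # one encoding per word
probs = torch.softmax(head(sub_sentence), dim=-1)  # (10, 3) label distribution
labeling = [LABELS[i] for i in probs.argmax(dim=-1).tolist()]  # max probability
print(labeling)  # e.g. ['O', 'O', 'O', 'B-REF', 'I-REF', 'I-REF', 'O', ...]
```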
0:10:18 | when it comes to evaluation, what do we consider as a positive data point, a positively |
---|
0:10:25 | labeled data point? |
---|
0:10:25 | so we are interested only in those cases where the whole referring expression is labeled, so |
---|
0:10:30 | if only a part of the referring expression is labeled, we say it's not correct |
---|
0:10:33 | like in the second case over there |
---|
0:10:35 | but then we also, mm, noticed that there are cases where you have filler words |
---|
0:10:39 | in between |
---|
0:10:40 | and we label them with an O |
---|
0:10:42 | for the filler words, but the network sometimes tends to put an I-REF there, and |
---|
0:10:46 | that's a pity, because such a wrong example is not directly usable, but it's also sort of |
---|
0:10:50 | usable with post-processing. so we introduced the notion of partial precision and |
---|
0:10:55 | recall |
---|
0:10:56 | so we say a data point is partially correct |
---|
0:10:59 | if the labeled referring expression is partially correct: if it starts at |
---|
0:11:03 | the same word |
---|
0:11:04 | and then it has at most one error, one labeling error, and of course it's |
---|
0:11:08 | got to be longer than two words, because if you start with one word and mislabel it |
---|
0:11:12 | nothing is left |
---|
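A minimal sketch of this partial-correctness check, as far as it can be reconstructed from the talk (hypothetical helper):

```python
def partially_correct(gold, pred):
    """gold, pred: per-word label lists over the span of one gold referring
    expression. Partially correct: same starting word, expression longer than
    two words, and at most one labeling error."""
    if len(gold) <= 2:                  # too short to give partial credit
        return False
    if pred[0] != gold[0]:              # must start at the same word
        return False
    errors = sum(g != p for g, p in zip(gold, pred))
    return errors <= 1                  # at most one labeling error

gold = ["B-REF", "I-REF", "O", "I-REF", "I-REF"]      # filler labeled O
pred = ["B-REF", "I-REF", "I-REF", "I-REF", "I-REF"]  # filler mislabeled
print(partially_correct(gold, pred))                  # True
```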
0:11:13 | right |
---|
0:11:14 | then we have the baseline that we're comparing with, and this is the most natural |
---|
0:11:17 | baseline you can think of: just basically taking noun phrases. you have an utterance |
---|
0:11:20 | you parse it to get all the noun phrases, and you say those are the referring |
---|
0:11:26 | expressions |
---|
0:11:28 | let's see |
---|
0:11:29 | now |
---|
0:11:29 | what results we got. so RefNet |
---|
0:11:32 | performs better than the baseline |
---|
0:11:34 | but this is not the most interesting result: partial precision and recall are |
---|
0:11:38 | much better for RefNet than exact precision and recall, which indicates that |
---|
0:11:41 | probably if we get more data we will get much better exact precision and |
---|
0:11:46 | recall as well, so the whole architecture has potential |
---|
0:11:50 | and the second step is finding the candidates |
---|
0:11:52 | the geographical objects, and for that we use OpenStreetMap, specifically two |
---|
0:11:56 | types of objects in OpenStreetMap: ways, representing mostly streets and buildings, and |
---|
0:12:01 | nodes, representing points of interest, like say a fountain somewhere or a statue |
---|
0:12:08 | now, the way we construct the candidate set: it is all the objects that you |
---|
0:12:12 | can see |
---|
0:12:13 | from the point where you're standing |
---|
0:12:15 | so let's say you're standing over there |
---|
0:12:17 | and then we know the direction that you're walking in, by just taking |
---|
0:12:20 | the |
---|
0:12:21 | coordinates you were at five, ten and fifteen seconds before |
---|
0:12:24 | so then we |
---|
0:12:26 | look |
---|
0:12:27 | in a radius of one hundred meters, from minus one hundred to plus one hundred degrees, and |
---|
0:12:31 | we call those objects visible |
---|
0:12:33 | and so in this case |
---|
0:12:35 | you get |
---|
0:12:36 | those objects over there |
---|
0:12:38 | and on average you have thirty-three such objects in the candidate set |
---|
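A minimal sketch of this candidate-set construction on a flat toy coordinate system in meters (hypothetical geometry helpers; real code would work on latitude/longitude and OpenStreetMap geometries):

```python
import math

def heading(past, now):
    """Walking direction in degrees (0 = north), estimated from an earlier
    position, e.g. where the user was 5, 10 or 15 seconds before."""
    dx, dy = now[0] - past[0], now[1] - past[1]
    return math.degrees(math.atan2(dx, dy)) % 360

def visible(now, walk_dir, obj, radius=100.0, half_cone=100.0):
    """Within 100 meters and within -100..+100 degrees of the walking
    direction, as in the talk."""
    dx, dy = obj[0] - now[0], obj[1] - now[1]
    if math.hypot(dx, dy) > radius:
        return False
    bearing = math.degrees(math.atan2(dx, dy)) % 360
    diff = (bearing - walk_dir + 180) % 360 - 180   # signed angle difference
    return abs(diff) <= half_cone

now, past = (0.0, 0.0), (0.0, -10.0)            # we walked 10 m northwards
objs = [(5.0, 40.0), (0.0, -50.0), (300.0, 0.0)]
d = heading(past, now)
print([o for o in objs if visible(now, d, o)])  # only the object ahead: (5, 40)
```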
0:12:43 | and then each object, we're going to encode it |
---|
0:12:47 | in the following way. so first |
---|
0:12:49 | we have taken four hundred and twenty-seven |
---|
0:12:51 | automatically derived binary features from OpenStreetMap, and the way they were derived |
---|
0:12:55 | is by considering |
---|
0:12:56 | the OpenStreetMap tags over here, both keys and values, and a typical one could |
---|
0:13:00 | be "building" with the value "university" |
---|
0:13:02 | and this would get one of those slots |
---|
0:13:04 | a zero or a one over there. and we also take a normalized distance feature, and then we |
---|
0:13:10 | also take a normalized sweep, with sweep being how much of your visual field this |
---|
0:13:15 | particular object occupies, and we divide it by three hundred and sixty degrees |
---|
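A minimal sketch of this candidate encoding; the slot table below holds only two of the 427 automatically derived (tag, value) pairs, and the exact normalization is an assumption:

```python
TAG_SLOTS = {("building", "university"): 0, ("amenity", "fountain"): 1}
N_SLOTS = 427  # as in the talk; this toy table fills only two of the slots

def encode_candidate(tags, distance_m, sweep_deg, radius_m=100.0):
    features = [0.0] * N_SLOTS
    for pair in tags.items():
        slot = TAG_SLOTS.get(pair)
        if slot is not None:
            features[slot] = 1.0            # binary (tag, value) feature
    features.append(distance_m / radius_m)  # normalized distance
    features.append(sweep_deg / 360.0)      # normalized sweep of visual field
    return features                         # 427 + 2 = 429 dimensions

vec = encode_candidate({"building": "university"}, 42.0, 18.0)
print(len(vec), vec[0], vec[-2:])           # 429 1.0 [0.42, 0.05]
```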
0:13:20 | so this is the second network, as promised |
---|
0:13:23 | it's called SpaceRefNet, and the idea here is that it operates on |
---|
0:13:28 | pairs, so on the pairs of a referring expression and a candidate |
---|
0:13:32 | so |
---|
0:13:33 | for example, we have a referring expression, "the bus station" |
---|
0:13:35 | and we have a candidate set, which is just three objects here, because it's hard to |
---|
0:13:38 | put thirty-three there |
---|
0:13:40 | so |
---|
0:13:41 | it starts by encoding "the bus station" using the RefNet encoder, as we've |
---|
0:13:45 | seen before, and having the sub-sentence encoding matrix, it takes the last |
---|
0:13:49 | hidden state of the forward cell and the first hidden state of the backward cell and concatenates |
---|
0:13:53 | those |
---|
0:13:53 | and this is the representation of your referring expression |
---|
0:13:57 | then it takes each candidate, as we're operating on pairs of referring expression and candidate |
---|
0:14:01 | so the first candidate in this case, represented with those OSM features, distance |
---|
0:14:06 | and sweep, as we've seen just a couple of slides before |
---|
0:14:09 | then we concatenate all of those |
---|
0:14:11 | put it through a fully connected layer, and have a |
---|
0:14:14 | final softmax, or rather a sigmoid prediction, as this is a binary classification, and |
---|
0:14:19 | we have a label |
---|
0:14:20 | between zero and one, or rather zero or one, zero meaning that the referring |
---|
0:14:24 | expression and the candidate do not match, and one meaning that they do. so |
---|
0:14:28 | then |
---|
0:14:28 | resolving one referring expression |
---|
0:14:30 | would involve, on average, thirty-three binary classification problems to solve, and after we've done |
---|
0:14:36 | that, hopefully the first one is being labeled as a referent, as the |
---|
0:14:40 | referent for this referring expression, because it is a bus station |
---|
0:14:44 | and then we do the same thing with the second candidate |
---|
0:14:47 | and hopefully the second and the third get labeled as non-referents |
---|
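A minimal sketch of this pairwise scoring in PyTorch, with placeholder dimensions and an assumed hidden layer:

```python
import torch
import torch.nn as nn

SENT_HID, CAND_DIM = 32, 429
scorer = nn.Sequential(nn.Linear(2 * SENT_HID + CAND_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1), nn.Sigmoid())

sub_sentence = torch.randn(10, 2 * SENT_HID)     # RefNet output for 10 words
expr = torch.cat([sub_sentence[-1, :SENT_HID],   # last forward hidden state
                  sub_sentence[0, SENT_HID:]])   # first backward hidden state

candidates = torch.randn(33, CAND_DIM)           # ~33 candidates on average
scores = torch.stack([scorer(torch.cat([expr, c])) for c in candidates])
matches = scores.squeeze(-1) > 0.5               # one binary decision per pair
print(int(matches.sum()), "candidate(s) labeled as referents")
```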
0:14:54 | and now, what kind of baseline do we have to compare to? |
---|
0:14:57 | that's pretty straightforward, the baseline, the first thing you could think of |
---|
0:15:01 | so you take a referring expression, like "a very nice big park" |
---|
0:15:05 | you split it by spaces, then you lowercase it and remove stopwords, which gives a set |
---|
0:15:08 | of words, like "very", "nice", "big", "park" |
---|
0:15:10 | and then you look at the OpenStreetMap tags for every candidate |
---|
0:15:15 | and if any word from this set that we got in the first step |
---|
0:15:19 | appears in either a tag key or a value |
---|
0:15:22 | we say it's a match |
---|
0:15:23 | otherwise it's not a match |
---|
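A minimal sketch of this baseline; the stopword list is a toy one and the example tags are made up:

```python
STOPWORDS = {"a", "an", "the", "of", "to", "in"}

def baseline_match(expression, tags):
    """Match if any non-stopword of the expression appears in a tag key/value."""
    words = {w for w in expression.lower().split() if w not in STOPWORDS}
    tag_words = set()
    for key, value in tags.items():
        tag_words.update(key.lower().split("_"))
        tag_words.update(str(value).lower().split("_"))
    return bool(words & tag_words)

tags = {"leisure": "park", "name": "Observatorielunden"}
print(baseline_match("a very nice big park", tags))  # True: 'park' matches
print(baseline_match("the bus station", tags))       # False: no overlap
```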
0:15:27 | and these are the results |
---|
0:15:28 | we also compared it with another method previously reported in the literature, called words-as- |
---|
0:15:33 | classifiers, and SpaceRefNet performs better |
---|
0:15:38 | and that |
---|
0:15:40 | brings us to the demo, which is where you stop sleeping, and probably this is a wake-up call for everybody |
---|
0:15:46 | and |
---|
0:15:47 | many things can go wrong, so |
---|
0:15:48 | let's see if it works |
---|
0:15:50 | so |
---|
0:15:52 | the blue dot is able to represent my position, wherever i am |
---|
0:15:56 | so i'll just put myself |
---|
0:15:58 | just near the building where we are |
---|
0:16:01 | namely |
---|
0:16:03 | then |
---|
0:16:05 | i |
---|
0:16:07 | say an utterance, like "i am |
---|
0:16:10 | standing near the university" |
---|
0:16:14 | and after a moment |
---|
0:16:15 | the one in green is the work of RefNet, so we found, in the utterance |
---|
0:16:20 | a referring expression |
---|
0:16:23 | now we take the data from OpenStreetMap |
---|
0:16:30 | these are all, sort of |
---|
0:16:32 | the shapes of the objects that are present in OpenStreetMap |
---|
0:16:36 | now we assume that we're looking north |
---|
0:16:38 | so this is the direction we're sort of |
---|
0:16:40 | going in |
---|
0:16:41 | and |
---|
0:16:42 | now we're trying to resolve |
---|
0:16:45 | the reference |
---|
0:16:47 | yes, so |
---|
0:16:48 | those objects in orange, the orange denotes the candidate |
---|
0:16:53 | set, so these are the objects that have been considered by the system |
---|
0:16:56 | as possible referents, and the one in green denotes the actual referent, as SpaceRefNet |
---|
0:17:01 | thinks |
---|
0:17:02 | and this is the building we're in, exactly |
---|
0:17:05 | so if we move |
---|
0:17:07 | the dot down over here |
---|
0:17:10 | and try to say |
---|
0:17:12 | "i see |
---|
0:17:14 | the fountain |
---|
0:17:16 | in front of me" |
---|
0:17:19 | and do the same trick there |
---|
0:17:22 | we see again, everything in orange is the candidate set |
---|
0:17:26 | and the one in green there is now the actual fountain, so it |
---|
0:17:30 | does not only find the ways, so the buildings, but also the nodes, so the |
---|
0:17:34 | points of interest |
---|
0:17:36 | then if we go a little bit in a different direction |
---|
0:17:40 | and say |
---|
0:17:42 | "i am |
---|
0:17:42 | passing |
---|
0:17:44 | KTH", for example |
---|
0:17:48 | we see that it's capable of also finding multiple referents, because sometimes your |
---|
0:17:53 | referring expression can be ambiguous, so it can be the case that you get more |
---|
0:17:58 | than one referent; it can also be the case that you get no referent at all |
---|
0:18:02 | but |
---|
0:18:03 | then |
---|
0:18:06 | it is of course not perfect |
---|
0:18:08 | because if you have sixty-four percent precision it cannot be perfect, so let's see |
---|
0:18:11 | where it is not perfect |
---|
0:18:14 | if i say something about where i'm standing |
---|
0:18:19 | then the bits |
---|
0:18:22 | colored green, all these things |
---|
0:18:24 | right |
---|
0:18:25 | here |
---|
0:18:30 | so somehow, for some reason, it selects parts of the street |
---|
0:18:34 | so i mean some streets, not all of them; we don't know why yet, but |
---|
0:18:37 | this is also sort of a research question for us |
---|
0:18:40 | to understand why in this case it selected something like eight objects, because the |
---|
0:18:44 | streets are actually not contiguous objects; for some reason, in Open- |
---|
0:18:48 | StreetMap the streets are just |
---|
0:18:50 | stored as bits, sort of just parts of the streets, and not as one contiguous |
---|
0:18:54 | street, which makes |
---|
0:18:56 | our job definitely harder |
---|
0:18:59 | right, then |
---|
0:19:01 | we'll try one more |
---|
0:19:04 | somewhere |
---|
0:19:05 | here |
---|
0:19:08 | here somewhere |
---|
0:19:11 | and we say |
---|
0:19:12 | "i see the church", for example |
---|
0:19:17 | i mean, in some cases it does not actually identify the object, although |
---|
0:19:21 | the church is up there, you can probably see the cross |
---|
0:19:23 | but then |
---|
0:19:25 | even if we come closer |
---|
0:19:30 | it still doesn't work |
---|
0:19:34 | still nothing |
---|
0:19:36 | right, okay, of course it doesn't |
---|
0:19:39 | because |
---|
0:19:39 | when did it work? because it was very picky, for example |
---|
0:19:43 | if you say "i see the church entrance" |
---|
0:19:47 | then it works, number one |
---|
0:19:49 | so it's sort of sensitive and we don't know why yet, but this |
---|
0:19:52 | raises a number of research questions to be addressed in the future. thank you very much, and |
---|
0:19:57 | that's my presentation |
---|
0:20:06 | thank you very much for the very interesting talk, and the |
---|
0:20:10 | wake-up call, so we don't fall asleep again |
---|
0:20:13 | are there questions? |
---|
0:20:20 | thank you for your talk, that was great. i was wondering, in |
---|
0:20:24 | the earlier slides you had an example where the person said "now i see...", and then |
---|
0:20:28 | there was like an explicit reference to the |
---|
0:20:32 | the object or the building |
---|
0:20:34 | and what i was just wondering is: can you handle purely anaphoric references, like |
---|
0:20:38 | if the person had just said "now i see it"? |
---|
0:20:40 | no, it's just exophoric references that we're handling in this paper |
---|
0:20:44 | we consciously excluded anaphoric references, because we think it's a separate problem with a separate dataset |
---|
0:20:49 | okay, great |
---|
0:21:01 | while the microphone is on its way, i'd like to ask a question, going back to the |
---|
0:21:06 | tower |
---|
0:21:07 | on this slide |
---|
0:21:09 | the tower was quite far. what happens, given that you consider the distance? i |
---|
0:21:14 | mean |
---|
0:21:15 | for some object like a church or something |
---|
0:21:19 | the user might say "i see a church" |
---|
0:21:22 | when maybe it's just a really salient landmark that's |
---|
0:21:25 | not at a very short distance |
---|
0:21:27 | that is true. so, as i said previously, the |
---|
0:21:32 | way we sort of define the candidate set is |
---|
0:21:35 | we take a fixed radius of one hundred meters in this case, so if it's |
---|
0:21:39 | really far and the user says that, then if we do not increase the radius |
---|
0:21:42 | it will not be able to track it currently |
---|
0:21:45 | i hope that answers the question |
---|
0:21:48 | and thank you for the talk. in the demo you showed a couple of |
---|
0:21:53 | examples in which, you know, it was "i am standing near the university", and |
---|
0:21:59 | then there was another one with KTH, where |
---|
0:22:03 | these are |
---|
0:22:04 | denoting |
---|
0:22:06 | a large geographical object, right? |
---|
0:22:11 | but then, especially in, for example, the first case, you resolved it to |
---|
0:22:14 | the building we're in, and |
---|
0:22:18 | i'm just wondering if you can speculate how |
---|
0:22:23 | these sorts of references can be... you know, they are really context-dependent, right? so "the |
---|
0:22:29 | university" |
---|
0:22:31 | you identified that building, but actually i am |
---|
0:22:34 | like in the corner of the campus, and i'm near to the whole university in |
---|
0:22:39 | a sense, right? |
---|
0:22:41 | true. so the first thing is, again, our radius is one hundred meters |
---|
0:22:45 | so we cannot get the whole university; that is the first limitation we have. the |
---|
0:22:48 | second is, again, this was more to show the imperfection of the system rather than |
---|
0:22:52 | the perfection. so actually, when you said "i see the university" and it identified |
---|
0:22:56 | this building, it was just one, because we're in this building |
---|
0:22:58 | but really you have also the building on the right-hand side, which it didn't identify |
---|
0:23:02 | and this is more to show that, you know, it's imperfect and has sixty-one |
---|
0:23:05 | percent precision, so we sort of still have a way to go |
---|
0:23:13 | okay, how would i improve it? okay, so, okay, i'm not that |
---|
0:23:18 | good at guessing, i guess. so the obvious thing is to collect more data and try |
---|
0:23:22 | to, try to train the same thing and see if it works |
---|
0:23:25 | and the second, the second thing, it might, it might be |
---|
0:23:29 | you know |
---|
0:23:30 | probably to not take only those objects that are in the immediate vicinity, but maybe the |
---|
0:23:35 | salient ones too, but this will probably be harder, because, you know, computationally it will |
---|
0:23:39 | become infeasible, i guess |
---|
0:23:41 | because, you know, you have to identify which of the objects you |
---|
0:23:44 | you, i mean, you still need to have some notion of visibility, so to identify |
---|
0:23:48 | which of these you can potentially refer to, and for this you have collision computations |
---|
0:23:52 | you have line of sight, and you see if it collides with a specific object, and |
---|
0:23:55 | the obstructions, and all of that |
---|
0:23:59 | i don't know if that answers the question, probably not; it seems like it is |
---|
0:24:02 | a difficult one |
---|
0:24:12 | maybe we can discuss it later, because we're sort of running out of time here, but |
---|
0:24:15 | we can take it later, we have a break, right? |
---|
0:24:17 | okay, thank you |
---|
0:24:19 | thanks. i think that's all the time we have, so let's |
---|
0:24:22 | thank the speaker again |
---|