Well, thank you. So, I will present my work, which has the title "Information-Gain View Planning for Free-Form Object Reconstruction with a 3D Time-of-Flight Camera". This work has been done in collaboration with the German Aerospace Center (DLR), with Simon Kriegel and Stefan Fuchs, and with my supervisors.
Okay, the presentation is divided as follows: first the motivation, what motivated us to do this work; then I will pass to the main algorithm and how it works, more or less how the decision is made, how the information-gain representation works, how we generate the different kinds of view candidates, and what criterion we use in order to choose those viewpoints. Then I will show my results, and I will finally end with my conclusions.
So, the motivation is active view planning. Given an unknown scene, what we try to do is to move our sensor in space in order to get more information and more data of the scene, in order to build a model. Our objective is to do this autonomously: to model an object in 3D. This object can be free-form; it does not have to be of any particular shape. One of our prerequisites is that we do not have any kind of prior information about the scene. There are different methods in the literature that use a 3D model base, or some kind of coarse model, in order to guide the modelling of the object. Our proposal is, instead, to use the information gain in order to decide which views we are going to use to build our model.
So, our main algorithm looks like this; it is divided mainly into four steps. The first one is the data acquisition: we use a 3D time-of-flight camera in order to get a point cloud from the image. Once we extract these images, the second step is to update some internal representations. The principal one is an occupancy grid, a multi-resolution occupancy grid, where the data of the time-of-flight camera get stored: not only the point cloud, but also, statistically, the uncertainty of those points. I will explain each of these steps in detail later. The second representation we update is a mesh representation; from this mesh we can compute the boundaries of the mesh, in order to select some candidate views that follow the boundary of the mesh. Once we have got these views, the main part of this algorithm is to decide, among these views, which one we should choose in order to get more information about the model. That is the next-best-view decision maker, and it gets information from the mesh representation and from the occupancy grid, which is the one that holds all the uncertainty of the model. So now I will quickly show all the steps of the whole pipeline, and then I will go
and explain each step in detail. So, this is the same pipeline that we saw before. We have got an initial pose; we set it anywhere, just looking at the scene; the only prerequisite is that it is looking at the scene. Then we get a view, we update both representations, and then we simulate the candidate views in our occupancy grid in order to see which information gain each one is supposed to provide. Once we have the one that provides the highest information gain, we go to that pose, take a new scan, and extract new candidate views. This is done repeatedly until the algorithm finishes and completes the model: at each iteration the mesh representation provides new views that could be giving more information, the information gain of those views is computed again, and we select one in order to continue modelling the object. A minimal sketch of this loop is shown below.
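As mentioned, here is a minimal Python sketch of that four-step loop. All of the names (`sensor`, `grid`, `mesher`, and their methods) are hypothetical placeholders, not the interfaces of the actual system:

```python
import numpy as np

def nbv_modeling_loop(sensor, grid, mesher, max_views=20):
    """Iteratively select next-best views until no candidate view remains."""
    pose = sensor.initial_pose()  # any pose, as long as it looks at the scene
    for _ in range(max_views):
        cloud = sensor.acquire_point_cloud(pose)     # 1. data acquisition
        grid.update(cloud, pose)                     # 2. occupancy grid + uncertainty
        mesh = mesher.update(cloud, pose)            # 2. mesh representation
        candidates = mesh.extract_candidate_views()  # 3. boundary-based candidates
        if not candidates:
            break                                    # model is closed: done
        # 4. simulate each candidate in the grid, keep the highest gain
        gains = [grid.expected_information_gain(v) for v in candidates]
        pose = candidates[int(np.argmax(gains))]
    return mesh
```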
So, the first step is the data acquisition. As I already said, we use a time-of-flight camera; in this experiment we were using the Mesa Imaging SR4000. It has to be said that it has been calibrated and characterized before use: we did not only calibrate the intrinsic parameters, as normal procedures do, but we also calibrated the depth measurements, the errors caused by amplitude and all the other kinds of errors that these cameras have. But even when we finish this calibration, one of the disadvantages of these cameras is that they still have noise in the depth. So what we do is characterize that: each pixel has a covariance associated with it, depending on the depth it measures; so each pixel has a 3D covariance related to it. For those who do not know about these time-of-flight cameras: they provide intensity images and depth images with a one-to-one pixel correspondence. They have low resolutions, like 176 by 144 pixels, but they provide up to approximately 25 frames per second, so they are fast enough to capture these images.
Once we have the camera pointed at the scene, we get a point cloud, and this point cloud gets integrated into an occupancy grid; this occupancy grid is a multi-resolution occupancy grid. At first, this occupancy grid is filled with nothing, and by "nothing" we understand an unknown area: it is just one box with a high uncertainty. Then, as we keep introducing point clouds into the occupancy grid, the voxels in space get updated with the new measurements, and these new measurements modify the uncertainty inside all these boxes.
Here we have an example of how two hypothetical measurements would be fused if they were taken at ninety degrees of each other. The first row shows the box without any kind of information; then we have, before the update, the two measurements with their covariances; and after updating the model, fusing the uncertainties, we get something like this. The update follows the typical information-fusion formulation.
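As an illustration of that update, here is a minimal sketch assuming the standard information-filter fusion (summing inverse covariances); all the numbers are made up:

```python
import numpy as np

def fuse_covariances(sigma_prior, sigma_meas):
    """Fuse a stored voxel covariance with a new measurement covariance
    using the information-filter update:
        Sigma_post^-1 = Sigma_prior^-1 + Sigma_meas^-1
    """
    info = np.linalg.inv(sigma_prior) + np.linalg.inv(sigma_meas)
    return np.linalg.inv(info)

# Two hypothetical measurements of the same voxel, taken 90 degrees apart:
# each one is precise laterally and noisy along its own viewing direction.
sigma_a = np.diag([1e-4, 1e-4, 1e-2])   # noisy along z (depth of view A)
sigma_b = np.diag([1e-2, 1e-4, 1e-4])   # noisy along x (depth of view B)
sigma_post = fuse_covariances(sigma_a, sigma_b)
print(np.diag(sigma_post))  # small in every direction after fusion
```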
So this alone is just for putting the uncertainty of the measurements inside the model and keeping it. What this additionally gives us is the sensor directionality: each voxel stores its covariance together with a direction, because the covariance has a direction, and usually the depth value of the measurement has a higher uncertainty than the X and Y values. So each voxel stores in which direction the measurement was taken. The good thing is that this allows model refinement: at the end we are able to choose which views give us more information, or reduce more the uncertainty of certain areas.
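For illustration, one way such a direction-dependent pixel covariance could be built looks like this; the noise model and magnitudes are assumptions for the example, since the real values come from the camera characterization:

```python
import numpy as np

def pixel_covariance(ray_dir, sigma_depth, sigma_lateral):
    """Build a 3D covariance for one ToF pixel in the sensor frame.

    ray_dir:       viewing direction of the pixel
    sigma_depth:   std. dev. along the ray (the dominant ToF error)
    sigma_lateral: std. dev. perpendicular to the ray
    """
    d = np.asarray(ray_dir, dtype=float)
    d /= np.linalg.norm(d)
    # build any two unit vectors orthogonal to the ray direction
    u = np.cross(d, [0.0, 0.0, 1.0])
    if np.linalg.norm(u) < 1e-6:          # ray parallel to z: pick another axis
        u = np.cross(d, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(d, u)
    R = np.column_stack([u, v, d])        # ray-aligned orthonormal basis
    S = np.diag([sigma_lateral**2, sigma_lateral**2, sigma_depth**2])
    return R @ S @ R.T                    # rotate into the sensor frame
```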
Once we have updated this representation, what we do is create a mesh, in order to generate candidate views and check how much information gain each would provide. This candidate viewpoint generation is based on the method of Kriegel, presented in 2011. What it does is: it builds a triangle mesh, it detects boundaries of this mesh given certain parameters, like the length of the boundary or the deviation of the curvature along the boundary; then it separates them, and then it grows a region inside the mesh in order to fit a quadratic patch.
The patch is fitted adjacent to the surface from the previous iteration, in order to ensure some overlap between the two views, and then a new view is extracted from each patch. A small sketch of this patch fitting and view extraction is shown below.
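Here is a minimal sketch of that step, assuming a local frame where the patch is a height field z = f(x, y), and using the thirty-centimetre working distance mentioned later; this is a simplified stand-in for Kriegel's method, not the original implementation:

```python
import numpy as np

def fit_quadratic_patch(points):
    """Least-squares fit of z = a*x^2 + b*y^2 + c*x*y + d*x + e*y + f
    to a region of mesh vertices expressed in a local frame."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    A = np.column_stack([x**2, y**2, x * y, x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs  # (a, b, c, d, e, f)

def view_from_patch(coeffs, distance=0.30):
    """Place a candidate view along the patch normal, at the working
    distance of the sensor (0.30 m, matching the calibration range)."""
    a, b, c, d, e, f = coeffs
    n = np.array([-d, -e, 1.0])        # surface normal at the patch origin
    n /= np.linalg.norm(n)
    center = np.array([0.0, 0.0, f])   # patch point at x = y = 0
    return center + distance * n       # camera position, looking along -n
```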
After this, what we do with these new views is simulate them in the occupancy grid, and then we compute the information gain. How this is done is shown in the next slide.
So, now that we have these candidate views extracted from the viewpoint planner, we come back to the occupancy grid and simulate those views as if we were taking them with the sensor. We do ray tracing in order to see in which areas our readings would collide, and, for those readings, what the information gain would be. For each point of the simulated point cloud, we store the covariance, doing the same fusion that we would do with a real measurement, and then we compute the information gain based on this formulation: it is the summation of the logarithms of the traces of the updated covariance matrices.
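As a sketch of one reasonable reading of that criterion, a candidate can be scored by how much it would reduce the log-trace of the voxel covariances; `grid.cast_rays` and `voxel.covariance` are hypothetical interfaces, and `fuse_covariances` is the sketch from before:

```python
import numpy as np

def expected_information_gain(grid, view):
    """Score a candidate view by simulating it in the occupancy grid.

    `grid.cast_rays(view)` is assumed to ray-trace the view and yield,
    for every voxel a simulated reading would hit, the voxel and the
    covariance of that simulated reading.
    """
    gain = 0.0
    for voxel, sim_cov in grid.cast_rays(view):
        before = np.log(np.trace(voxel.covariance))
        after = np.log(np.trace(fuse_covariances(voxel.covariance, sim_cov)))
        gain += before - after   # reduction of log-trace = information gained
    return gain
```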
Okay, so by doing this repeatedly, at the end we get our results. These are the results we obtained: we tested the method on three statues with different free-form shapes. As you can see, we get quite nice and complete models. You can see some areas that have not been filled in or modelled, but that is due to the configuration: the statues were on top of a little chair, and the robot cannot access certain parts. You can also see that some of the models are not very refined in some areas, but that is mainly because of the resolution of the camera; it simply does not have more resolution.
So, to conclude: I presented this new 3D information-gain method for viewpoint selection. Due to its internal representation, its simplicity allows model refinement. What we would like to do in the future is to define, say, which resolution we would like the model to have, or in which parts of the model we would like to have more resolution, in order to try to get a better model. We could even decide, for example, that if an area has a lot of curvature it is an interesting place, and so we would be able to get more refinement in those areas.

That is all, thank you.
[Question from the audience, inaudible.]
Not, like, one thousand views, but I cannot guarantee a certain number of views. And no, it does not really terminate... well, by construction it will, because it will always fit a new view where the model is still open, and at some point it will close the object. But in this case we had to close the model manually, because we had restricted the workspace at the bottom, since the robot could not go lower; as I have shown, in simulation we could complete everything. So I cannot assure a number of views, but I can assure that it will be close to a minimum, because by construction it is building the model incrementally.
[Question from the audience, inaudible.]

Sorry? Ah, yes, actually, yes.
There is a distance constraint: the camera has been calibrated at thirty centimetres, so you cannot move far away from the object; you always keep roughly that distance, because these cameras are quite sensitive to it. And what we assume is that the overlap has to be at least twenty percent of the field of view of the camera; beyond that, the new view follows the angle of the fitted quadratic surface.
[Question from the audience, inaudible.]
By construction, in order to refine the model, you would be getting new views from different places following the same structure. Usually the main uncertainty is in the depth direction: imagine you are measuring just one point; the structure will be the nicest when you take the readings in an orthogonal way, because then you reduce your covariance as much as possible. But beyond that, I will not be able to get higher precision; this is the best refinement that I can get, unless I calibrate the camera better in order to get better views than this.
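That claim, that orthogonal readings reduce the covariance the most, can be checked numerically with the fusion sketch from earlier; the numbers are made up:

```python
import numpy as np

# Compare adding a second reading from the same direction versus an
# orthogonal one, using fuse_covariances from the earlier sketch.
sigma = np.diag([1e-4, 1e-4, 1e-2])                  # noisy along the ray (z)
same = fuse_covariances(sigma, np.diag([1e-4, 1e-4, 1e-2]))
ortho = fuse_covariances(sigma, np.diag([1e-2, 1e-4, 1e-4]))
print(np.trace(same), np.trace(ortho))  # the orthogonal reading shrinks more
```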
Yeah, well, yes, we could consider that.
[Question from the audience, inaudible.]

Okay, it is a method of Stefan Fuchs, and what it does is calibrate these cameras. For each distance they have an offset, and this offset follows a sinusoidal function, so you can fit it, detect it, and correct it. The calibration process uses a normal pattern, like the one we use for intrinsic calibration, but a huge one. We also use different grey scales in the pattern, because with different amplitudes the camera reacts differently, and you get different amplitudes depending on the integration time that you choose. So all these parameters have to be chosen; in this experiment they were chosen for thirty centimetres, and you calibrate the camera for a range around that.
And then, with this pattern, what we do is compute all these functions that minimize the error, by reprojecting a plane in the usual optical way: first you get the intrinsic parameters, then you place the plane in space, and then you measure the depth that you actually get.
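As an illustration of that distance-dependent offset, here is a small sketch of fitting and applying such a sinusoidal correction; the model form, parameters, and data are synthetic and only for the example:

```python
import numpy as np
from scipy.optimize import curve_fit

def wiggling_offset(d, A, omega, phi, c):
    """Distance-dependent ToF depth offset, modelled as a sinusoid."""
    return A * np.sin(omega * d + phi) + c

# Hypothetical calibration data: depth measured by the camera versus the
# ground-truth depth recovered by reprojecting the calibration plane.
measured = np.linspace(0.25, 0.35, 50)
true = measured - (0.004 * np.sin(40.0 * measured) + 0.002)  # synthetic

# Fit the offset model (an initial guess is needed; sinusoid fits are
# non-convex, so we start near the expected period).
params, _ = curve_fit(wiggling_offset, measured, measured - true,
                      p0=[0.01, 40.0, 0.0, 0.0])

def correct_depth(d):
    """Apply the fitted correction to a raw depth reading."""
    return d - wiggling_offset(d, *params)
```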
I do not know if I got it right. Yeah.