Speech Transcript - State-Driven Particle Filter for Multi-Person Tracking

0:00:15	Thank you. The title of this presentation is "A State-Driven Particle Filter for
0:00:22	Multi-Person Tracking". This is joint work with Frédéric Lerasle
0:00:29	and Antonio López.
0:00:32	This is the outline of the presentation. First, an introduction
0:00:36	to the problem of tracking.
0:00:38	Then I will
0:00:41	explain how particle filtering in general is formulated.
0:00:46	uh you have touched and all that yeah
0:00:51	uh
0:00:52	afterwards
0:00:57	it's similar compute its own a computer
0:01:00	this is not this computer this is the laptop
0:01:04	oh
0:01:05	so it's uh
0:01:08	okay thank you
0:01:11	and
0:01:13	Afterwards I will explain our proposal, which is based on particle filtering and which we call
0:01:19	state-driven particle filtering, and finally we present the experimental results and the conclusions.
0:01:25	Well, as most of you may know, multi-person tracking has different applications, for
0:01:30	instance surveillance, driver assistance, robot interaction or multimedia.
0:01:35	Sometimes typical problems such as occlusions,
0:01:43	model drifting or lost-target detection are not addressed by the literature.
0:01:50	People sometimes say "we propose a general algorithm" and evaluate it on
0:01:57	databases in which these occlusions and problems do not happen. So in this case
0:02:03	we have an example of target hijacking: we have a target in the foreground which
0:02:09	occludes one in the background and hijacks the model.
0:02:14	Or here we would have the problem of lost-target detection: a person
0:02:19	occluded by the background or by the scene.
0:02:23	In general, approaches dealing with tracking do not
0:02:29	take these situations into account. Typically they use heuristics which are based
0:02:36	on ad hoc solutions, for example detecting when the targets overlap, or stopping the
0:02:43	model updating when this happens, or deleting the targets when detections are not
0:02:49	associated for a given set of frames, et cetera. There are some specific papers which
0:02:56	really take this problem into account, for example
0:03:00	[unclear], who proposed a post-processing step in order
0:03:08	to relate the interactions between objects, but this is done
0:03:13	as a post-processing of the tracking, and
0:03:19	[unclear], who proposed occlusion-adapted structures.
0:03:26	Our objective is to formalize these problems in a
0:03:31	whole general framework, in order to avoid track hijacking and model drifting, decrease the false
0:03:38	positives and the false negatives, and also the identity switches in the tracks.
0:03:44	How do we do that? Well, our contribution is based on a graph
0:03:50	containing states. The states are formalized in the nodes, and the transitions
0:03:57	between the nodes
0:04:00	are represented by the arcs.
0:04:02	This is the scheme of our proposal. I will go into it later, but
0:04:08	just to
0:04:10	say that one of the
0:04:13	important things about our proposal is that it is generic enough
0:04:17	that other papers, other proposals, specific techniques for occlusions, et cetera, can be
0:04:23	readily included here.
0:04:25	Now I will briefly overview how particle filtering works. Just to say that
0:04:31	particle filtering is a method which is able to follow the
0:04:37	objects in the scene, and we make use of the tracking-by-detection paradigm.
0:04:45	This means that we have detections, which are painted in yellow, we
0:04:49	have the tracks, which are painted in
0:04:52	blue in this case, with the corresponding ID, and a set of particles,
0:04:58	which are the ones that are distributed along the frames in order to search for
0:05:03	the new appearance of the track.
0:05:07	The particle filter has two steps. The first one is the prediction, also known as
0:05:12	the importance function. This is the equation of the importance function,
0:05:18	and here you see the main components of this equation, which will be
0:05:22	used later in our formulation: a detection term, normally based on a human detector, for
0:05:30	example HOG plus SVM, the histogram of oriented gradients and a classifier; the dynamics PDF,
0:05:37	for example in our case we use a random walk, but we could use
0:05:41	other higher-order approaches; and a prior, which is a predefined way to search for
0:05:47	the new occurrences of the track.
0:05:50	The second step is the correction, which is the weighting of the
0:05:53	particles. In this case we use a very simple color model: the colour
0:05:58	histogram of the torso region of the pedestrian.
0:06:03	As you see here, the particles are painted according
0:06:08	to the weight of the colour matching, so the brighter the
0:06:12	particle, the higher the weight.
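The two steps just described, prediction and correction, can be sketched as follows. This is a minimal illustration, not the speaker's implementation: it assumes 2-D particle positions, random-walk dynamics with a hypothetical spread `sigma`, and a hypothetical callback `histogram_at(x, y)` that returns the normalized colour histogram at a position. The Bhattacharyya coefficient and the sharpness constant `lam` are swapped-in assumptions; the talk only says that a simple colour histogram is matched.

```python
import math
import random

def bhattacharyya(p, q):
    """Similarity between two normalized histograms (1.0 means identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def predict(particles, sigma, rng):
    """Prediction with random-walk dynamics: perturb each (x, y) with Gaussian noise."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in particles]

def correct(particles, reference_hist, histogram_at, lam=20.0):
    """Correction: weight each particle by its colour match, then normalize."""
    weights = []
    for x, y in particles:
        similarity = bhattacharyya(reference_hist, histogram_at(x, y))
        weights.append(math.exp(-lam * (1.0 - similarity)))
    total = sum(weights)
    return [w / total for w in weights]

def resample(particles, weights, rng):
    """Draw a new particle set with probability proportional to the weights."""
    return rng.choices(particles, weights=weights, k=len(particles))
```

A particle whose histogram matches the reference gets a larger weight, which corresponds to the "brighter" particles in the slide.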
0:06:16	This is our proposal. We propose
0:06:20	state-based tracking: we have potential, tracked, occluded and lost states, and also the additional dead state,
0:06:28	when we kill the track.
0:06:30	And now we will explain each one of the specific states, not all
0:06:34	the details, but the main components that will highlight why this proposal
0:06:40	is useful.
0:06:42	First, we have a detection in yellow, which we use to
0:06:47	create a new track, which is painted in dashed blue, and we initialize this
0:06:52	track with the state potential.
0:06:56	After a set of frames, if the track has corresponding detections, we update
0:07:02	the state from potential to tracked.
0:07:05	Otherwise, if the track does not have enough corresponding detections in the next frames, we
0:07:11	kill it, so it goes from potential to dead.
0:07:16	Then, in the case that the person is correctly tracked, so
0:07:21	after several frames there is a corresponding detection, we can apply frame by frame the corresponding actions,
0:07:29	specifically the ones of the tracked state. This is an example: there is a set of
0:07:35	actions, for example the weighting of the particles, the initial model used in the
0:07:39	data association between detections and the track, and, for lack
0:07:45	of time, I will just highlight the importance function, which is how we distribute the particles in
0:07:51	the frames. In this case we only use the dynamics, without
0:07:57	taking into account the detections, in order to distribute the particles. So as
0:08:00	you see here, from the original equation we only use one part.
0:08:07	Then, in order to make the transitions
0:08:11	from tracked to occluded and to lost, we propose two conditions which are quite
0:08:16	simple. The first one is the isolation of the tracks, that is, whether two tracks
0:08:22	overlap each other.
0:08:24	In this case we use the PASCAL overlapping criterion,
0:08:29	also known as the Jaccard overlapping criterion, which is quite simple. The other condition is
0:08:36	on the classifier. In this case we use an online classifier which
0:08:44	models the colour of the torso of the person, and
0:08:49	we track it along time. In the
0:08:54	yellow line we see a tracked person, for which the confidence of
0:08:59	the online model is maintained along time.
0:09:03	In red we see the case of an occluded target, in which the
0:09:09	confidence of the color model jumps down. To detect this,
0:09:14	first we filter the signal with a Kalman filter, and then we test for these
0:09:21	jumps with the generalized likelihood ratio test.
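One way the "filter, then test for jumps" idea could look in code is sketched below. It is an assumption-laden illustration rather than the method from the talk: it uses a scalar Kalman filter with a constant-level model, applies the generalized likelihood ratio (GLR) test for a mean shift to the normalized innovations, and the names `kalman_innovations`, `glr_statistic`, `jump_detected` and the values of `q`, `r`, `window` and `threshold` are all hypothetical.

```python
import math

def kalman_innovations(signal, q=1e-4, r=1e-2):
    """Scalar Kalman filter (constant-level model); returns normalized innovations."""
    x, p = signal[0], 1.0
    innovations = []
    for z in signal[1:]:
        p = p + q                 # predict: the level is assumed to stay constant
        s = p + r                 # innovation variance
        nu = z - x                # innovation (measurement minus prediction)
        innovations.append(nu / math.sqrt(s))
        k = p / s                 # Kalman gain
        x = x + k * nu            # update the level estimate
        p = (1.0 - k) * p
    return innovations

def glr_statistic(residuals):
    """GLR statistic for a mean shift of unknown size and onset (unit variance)."""
    n = len(residuals)
    best, tail = 0.0, 0.0
    for k in range(n - 1, -1, -1):   # candidate change times, latest first
        tail += residuals[k]
        best = max(best, tail * tail / (2.0 * (n - k)))
    return best

def jump_detected(confidences, window=10, threshold=5.0):
    """True if the recent confidence values contain a statistically large jump."""
    residuals = kalman_innovations(confidences)
    return glr_statistic(residuals[-window:]) > threshold
```

With a stable confidence signal the innovations stay near zero and the statistic stays below the threshold; a sudden drop, as in the occluded case, produces a run of large innovations and a large statistic.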
0:09:25	So in the case that these conditions are fulfilled, that is, the
0:09:32	person is not isolated and there is a jump in the classifier, we go to the
0:09:38	occluded
0:09:39	state.
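The isolation condition mentioned above can be sketched with the Jaccard (intersection over union) overlap between bounding boxes. This is a minimal sketch under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, and the default `threshold` of 0.5 is the value commonly used with the PASCAL criterion rather than a setting confirmed for this particular test. The function names are hypothetical.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

def jaccard_overlap(a, b):
    """Intersection area divided by union area of two boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0

def is_isolated(track_box, other_boxes, threshold=0.5):
    """A track is isolated if it does not overlap any other track above the threshold."""
    return all(jaccard_overlap(track_box, other) <= threshold
               for other in other_boxes)
```

Combined with a jump in the classifier confidence, `not is_isolated(...)` would correspond to the condition for moving to the occluded state.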
0:09:40	And so on. Here we have additional conditions, from occluded to tracked or
0:09:48	from occluded to the same occluded state, and
0:09:53	I highlight again the importance function, which in this case is based on detections. So
0:09:58	when the person disappears because it is occluded by
0:10:03	another person, we go to the occluded state, and the particles are not
0:10:08	drawn according to the dynamics, as in the importance function of the tracked
0:10:13	state, but are drawn according to the detections around.
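The state-dependent importance function described here can be sketched as a single proposal routine that switches on the track state. This is an illustration under assumptions, not the equation from the slides: particles and detections are 2-D points, the noise is Gaussian with a hypothetical spread `sigma`, and the state is passed as a plain string.

```python
import random

def propose(state, particles, detections, sigma, rng):
    """Draw new particle positions according to the state of the track."""
    if state == "occluded" and detections:
        # Occluded: ignore the old positions and sample around nearby detections.
        proposed = []
        for _ in particles:
            dx, dy = rng.choice(detections)
            proposed.append((dx + rng.gauss(0.0, sigma),
                             dy + rng.gauss(0.0, sigma)))
        return proposed
    # Tracked (and any other state): random-walk dynamics only.
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in particles]
```

For example, `propose("occluded", particles, [(120.0, 80.0)], 5.0, random.Random(0))` would scatter every particle around the single detection, while `"tracked"` would keep them near their previous positions.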
0:10:20	And so on. We have the same conditions with the same tests in
0:10:26	order to make the transitions between the states. In this case we see that
0:10:30	the lost target is occluded by the scene.
0:10:36	And so on.
0:10:38	And finally, if a target has been lost for a given set of
0:10:42	frames, we kill it.
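The state graph walked through above can be summarized as a small transition function. This is a hedged sketch: the five states come from the talk, but the exact transition rules, the argument names and the thresholds `confirm_after`, `min_hits` and `max_lost` are assumptions made for illustration. In particular, sending an isolated track with a confidence jump to the lost state is one reading of "the same conditions with the same tests".

```python
from enum import Enum

class State(Enum):
    POTENTIAL = "potential"
    TRACKED = "tracked"
    OCCLUDED = "occluded"
    LOST = "lost"
    DEAD = "dead"

def next_state(state, frames_in_state, hits, isolated, jump, matched,
               confirm_after=5, min_hits=3, max_lost=30):
    """Return the next state of a track given the tests made on this frame."""
    if state is State.POTENTIAL:
        if frames_in_state < confirm_after:
            return State.POTENTIAL
        # Confirm the track only if it gathered enough detections.
        return State.TRACKED if hits >= min_hits else State.DEAD
    if state is State.TRACKED:
        if jump:
            # Confidence dropped: occluded by a person, or lost behind the scene.
            return State.LOST if isolated else State.OCCLUDED
        return State.TRACKED
    if state is State.OCCLUDED:
        return State.TRACKED if matched else State.OCCLUDED
    if state is State.LOST:
        if matched:
            return State.TRACKED
        return State.DEAD if frames_in_state >= max_lost else State.LOST
    return State.DEAD
```

Keeping the rules in one function makes it easy to swap in a different occlusion or lost-target test without touching the rest of the tracker.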
0:10:45	To sum up, these are the experimental results. We have made
0:10:49	use of three datasets, TUD-Crossing, TUD-Campus and one from PETS 2009,
0:10:55	and compared the original formulation of the particle filter and our
0:11:02	proposed state-based
0:11:05	one, and give several statistics.
0:11:09	The first ones are the multiple object tracking precision and accuracy. In the case
0:11:15	of the precision, it evaluates the overlap between the ground truth and the track
0:11:20	itself, and we see a difference of around plus or minus one percent, but this is
0:11:27	not significant, since we have already detected the person and there is only a
0:11:33	slight displacement of the position of the track with respect to the ground
0:11:39	truth.
0:11:40	The one which is really significant is the accuracy. In this case, with
0:11:45	our proposal we gain about seven percent. Also the false negative rate and the
0:11:50	false positives are decreased with our proposal. And finally the identity
0:11:56	switches, that is, when a track has to be reinitialised and given a
0:12:02	different ID from the one that it had: in our case the number [unclear].
0:12:08	And then I show you some examples.
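The accuracy and precision measures mentioned above are usually computed with the CLEAR MOT definitions, which can be sketched as follows. The function names and the numbers in the usage note are illustrative assumptions and are not results from the talk.

```python
def mota(false_negatives, false_positives, id_switches, num_ground_truth):
    """Multiple object tracking accuracy: 1 minus the total error ratio."""
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth

def motp(overlaps):
    """Multiple object tracking precision: mean overlap of matched track and ground-truth pairs."""
    return sum(overlaps) / len(overlaps)
```

For instance, with hypothetical counts of 10 misses, 5 false positives and 1 identity switch over 100 ground-truth objects, `mota(10, 5, 1, 100)` is 1 - 16/100. The counts are summed over all frames, so a tracker can improve accuracy markedly while precision barely changes, which is the pattern described in the results.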
0:12:12	In this case we have a tracked pedestrian which is isolated, and it works quite
0:12:18	well both with the original approach and with the
0:12:24	state-based approach. We see the potential track which is initialized, then it
0:12:31	evolves to a tracked track. You can see there are several detections, some of them
0:12:38	false positives, which will disappear, but the track is correctly tracked along
0:12:43	time, so there is no problem in this case.
0:12:46	The case in which we really see the differences between the traditional approach
0:12:50	and our proposal is this one, for example. We have an occlusion case: we
0:12:56	have two targets which overlap each other, and in the first case the person
0:13:02	in the front hijacks the person in the back, so the model drifts
0:13:08	and it gets stuck on the background, and finally there is a new track
0:13:13	for the new detection and a new ID for this same person, so we would
0:13:18	have a switch in the ID.
0:13:20	In our case the same happens, the two targets occlude each other, but
0:13:25	the system is able to reinitialize the track with the same ID by detecting,
0:13:31	in the third frame in this case, that there is an occlusion.
0:13:36	And finally the case in which a track has been lost. In this case the
0:13:40	track is occluded by the background, by the scene, not by a person, but
0:13:46	our system is able to detect that it has been lost, and it finally
0:13:52	matches the track that we have lost with a new detection:
0:13:57	the system detects that the colour model matches and it reinitializes the track with the
0:14:03	same ID.
0:14:05	So to sum up, the conclusions are that we have presented a state-
0:14:10	based tracking approach which is able to deal with typical tracking problems, that
0:14:15	is, occlusions, hijacking of the target, model drifting, et cetera. It
0:14:20	gives performance improvements with respect to the traditional non-state-based approach.
0:14:26	It is applicable to different existing approaches, such as occlusion classifiers. Specifically, some classifiers
0:14:35	are used: we used the Kalman filter and the test, but
0:14:40	different state-of-the-art
0:14:42	lost-target detectors could actually be included in our proposal. And especially in
0:14:48	the future work, since this is a [unclear] framework,
0:14:54	we have made use of very simple ingredients. For example, the
0:14:59	colour model for the person is quite simple. So by going to more advanced
0:15:06	
0:15:08	components, for instance appearance models for the tracks, or increasing
0:15:13	the number of particles depending on the state of the person, we think that
0:15:18	it would provide better results.
0:15:23	thank you
0:15:38	yeah
0:15:42	Experimentally there are
0:15:47	as many as the ones you would use for a non-state-based approach, but
0:15:54	multiplied by the number of states.
0:16:02	uh_huh
0:16:03	And then there are the ones related to the transitions. For example, for the isolation
0:16:09	we use a threshold of 0.5, which is typically used
0:16:14	for the overlapping, and for the classifier itself it is also adjusted
0:16:20	by looking experimentally.
0:16:34	No, not at the moment. We have one frame per second,
0:16:41	yeah,
0:16:43	for
0:16:45	the cases with
0:16:48	ten people, more or less. Yeah, ten people.
0:17:02	to become an attempt works
0:17:04	Yeah, in fact it would, because particle filtering allows you to process
0:17:10	the different tracks independently, and since each one of the states allows you
0:17:16	to make use of different parameters depending on the state, you could spend
0:17:22	fewer particles, so less processing, on the tracks that for instance are isolated. So by
0:17:27	accelerating these things, we think that it would be
0:17:31	possible to reach it.
0:17:50	yeah
0:18:00	it is based on the chance writing
0:18:03	or tracking on particle filtering
0:18:07	yeah
0:18:14	yeah i don't know which one but it's based on that
0:18:20	okay fine
0:18:28	a speaker

State-Driven Particle Filter for Multi-Person Tracking

Object Tracking and Identification

Gerónimo David, Lerasle Frédéric, López Manuel Antonio