0:00:15 | Good morning everybody, my name is [name unclear], from |
---|
0:00:20 | the University of Strathclyde, Glasgow, and the title of my work is "DSP embedded |
---|
0:00:26 | smart surveillance sensor with robust SWAD-based tracker". If I speak too fast, just |
---|
0:00:32 | tell me to slow down. The other co-authors of the paper are [name unclear] |
---|
0:00:37 | from Texas Instruments and Professor [name unclear] from the University of Strathclyde. So |
---|
0:00:42 | this is the outline of my presentation. First of all I will give a brief |
---|
0:00:46 | introduction to set the scene, and then I will state the aim and objectives of |
---|
0:00:50 | our work, and after that I will show an overview of the entire system. After |
---|
0:00:56 | that I will talk in detail about the video analytics within the system, |
---|
0:01:02 | and then I will show some results to back up our work. After that I will |
---|
0:01:07 | state the conclusions and show some future work. |
---|
0:01:11 | So video surveillance is monitoring by means of cameras, video cameras. |
---|
0:01:16 | This is convenient because we can use multiple cameras to survey a wide area |
---|
0:01:21 | and we need only one person. As you can see here, we have a person looking |
---|
0:01:25 | at fifteen videos at the same time. |
---|
0:01:28 | Another thing is that we can store the videos for |
---|
0:01:31 | future access. |
---|
0:01:34 | But there is a problem when we have too many videos and the person is |
---|
0:01:38 | sleeping. So what are we going to do? In this case a suspicious individual is walking around but |
---|
0:01:44 | the surveillance operator is sleeping. |
---|
0:01:46 | So the problems are the level of attention, the reaction time and crime prevention. |
---|
0:01:52 | We don't want to use the surveillance footage only for a process, for a |
---|
0:01:56 | trial; what we want is to take action straight away. |
---|
0:02:01 | So video analytics is the semantic analysis of video data through computer systems, using |
---|
0:02:08 | image and video processing techniques. In this case we talk about smart surveillance, because |
---|
0:02:14 | we have the video and we apply smart algorithms to analyze this video. |
---|
0:02:20 | And when we have video analytics, what we want to |
---|
0:02:23 | achieve is to have smart sensors, so we have the video analytics embedded |
---|
0:02:29 | on processors which are then attached to the cameras, so we can create smart |
---|
0:02:34 | units and we can deploy the intelligence at the edge of the network. |
---|
0:02:38 | So when we have multiple smart surveillance sensors, like in this case, |
---|
0:02:43 | we can survey a whole building in real time. |
---|
0:02:49 | We don't need to send all the video streams to the central station, but |
---|
0:02:53 | we just need to send only the relevant information, for example that an object |
---|
0:02:58 | [unclear] or that a person has to be tracked. |
---|
0:03:02 | So the aim of this work is to create a smart surveillance sensor |
---|
0:03:07 | for automatic tracking using a PTZ camera. A PTZ camera is a camera that |
---|
0:03:11 | can pan, tilt and zoom in order to follow some object. |
---|
0:03:16 | And the objectives are to implement these smart algorithms on a DSP board, to have |
---|
0:03:22 | automatic, programmatic control of the PTZ from the board, and to be able to activate |
---|
0:03:28 | and deactivate the tracking algorithm remotely. |
---|
0:03:32 | So this is an overview of the system. In the centre you can see |
---|
0:03:36 | the EVM, which is the DSP board, and the PTZ, which is our |
---|
0:03:40 | camera, and when they are connected together we can process the video streams from |
---|
0:03:45 | the camera on the DSP. In this case we can talk about a smart |
---|
0:03:49 | sensor. |
---|
0:03:51 | For the hardware we use a Texas Instruments DM647 EVM, which is |
---|
0:03:57 | a fixed-point DSP. Then there is also an Ethernet connection between the EVM |
---|
0:04:03 | and the PTZ. The camera is an [model unclear], which can pan three hundred and |
---|
0:04:08 | sixty degrees and tilt [range unclear]. |
---|
0:04:12 | The software is implemented in C, based on [unclear] operating system. |
---|
0:04:17 | We also have an HTTP server on the EVM, so we |
---|
0:04:21 | can send commands to activate and deactivate the algorithm. We run in real time at more than |
---|
0:04:26 | twenty-five frames per second. On the PTZ we have an HTTP server, but |
---|
0:04:31 | this is proprietary, so we don't do anything; we just send commands to drive the |
---|
0:04:35 | PTZ. |
---|
0:04:37 | And this is the video analytics. Basically we acquire the video stream, we |
---|
0:04:42 | de-interleave and decimate, so we have smaller frames |
---|
0:04:46 | to process, and then we apply our tracking algorithm. The result of |
---|
0:04:52 | this tracking algorithm is used to control the PTZ, and then we can |
---|
0:04:57 | send commands to the camera so we can follow the target as it moves in the scene. |
---|
0:05:02 | So this is the video stream, which is interleaved |
---|
0:05:05 | YCbCr. We de-interleave it and we discard the chrominance components |
---|
0:05:11 | and retain only the luminance component, so the algorithm actually |
---|
0:05:16 | works on grey-scale images, |
---|
0:05:18 | and then we decimate so we have smaller frames. |
---|
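The pre-processing step described above can be sketched as follows. This is only an illustration: the talk does not give the byte layout or the decimation factor, so the sketch assumes a 4:2:2 stream with byte order Y0 Cb Y1 Cr and a factor of 2, and the function names are hypothetical.

```python
def luminance_from_interleaved(frame, width, height):
    """Return the Y plane of an interleaved 4:2:2 frame as rows of ints.

    Assumed layout (not stated in the talk): Y0 Cb Y1 Cr, two bytes per pixel.
    """
    if len(frame) != width * height * 2:
        raise ValueError("frame size does not match width * height * 2")
    luma = frame[0::2]  # under this layout every even byte is a luminance sample
    return [list(luma[r * width:(r + 1) * width]) for r in range(height)]


def decimate(image, factor=2):
    """Keep every `factor`-th pixel in both directions (no anti-alias filter)."""
    return [row[::factor] for row in image[::factor]]


# Tiny 4x2 frame: luminance 0, 10, ..., 70 with a constant chroma byte of 128.
frame = bytes(b for y in range(0, 80, 10) for b in (y, 128))
gray = luminance_from_interleaved(frame, width=4, height=2)
small = decimate(gray, factor=2)
```

Dropping samples without filtering is the cheapest form of decimation; whether the original system low-pass filters first is not stated.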
0:05:22 | The tracking algorithm is based on template matching, and we use |
---|
0:05:28 | a sum of weighted absolute differences (SWAD), which is similar to SAD [unclear], and |
---|
0:05:33 | then we have an adaptive update of the template. |
---|
0:05:37 | More details about this algorithm are given in this paper. |
---|
0:05:42 | So starting from a frame, we have a region of interest R_i and |
---|
0:05:47 | we try to find the best match for this template T_i. |
---|
0:05:54 | As you can see here, we have the region of interest, we have the template, and we |
---|
0:05:58 | try to find the best match within this region of interest. So this is |
---|
0:06:01 | the basic concept of template matching. |
---|
0:06:04 | The region of interest is defined as the surrounding area around the best match, so |
---|
0:06:09 | in that case we have R_(i+1), and this is R_i, the initial region |
---|
0:06:14 | of interest. |
---|
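One way to picture how the next region of interest follows the best match is the sketch below. The margin around the match and the clipping to the frame are assumptions made for illustration; the talk only says the region is the area surrounding the best match.

```python
def next_roi(best_x, best_y, tmpl_w, tmpl_h, margin, frame_w, frame_h):
    """Return (left, top, right, bottom) of the next search region.

    (best_x, best_y) is the top-left corner of the best match. The region
    extends `margin` pixels (a hypothetical parameter) beyond the match on
    every side and is clipped to the frame.
    """
    left = max(0, best_x - margin)
    top = max(0, best_y - margin)
    right = min(frame_w, best_x + tmpl_w + margin)
    bottom = min(frame_h, best_y + tmpl_h + margin)
    return left, top, right, bottom


roi_centre = next_roi(10, 5, 16, 16, 8, 64, 48)   # clipped at the top edge
roi_corner = next_roi(50, 40, 16, 16, 8, 64, 48)  # clipped at right and bottom
```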
0:06:16 | So to find this best match we minimize the SWAD coefficient, as you |
---|
0:06:22 | can see here. The SWAD coefficient is basically a sum of weighted absolute differences, where |
---|
0:06:27 | the weighting kernel |
---|
0:06:31 | is a Gaussian kernel. This is because we want to give more weight |
---|
0:06:36 | to the pixels in the centre of the target, because |
---|
0:06:39 | the pixels at the edge of the template might belong to |
---|
0:06:43 | an occluding object or to the background. |
---|
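A minimal sketch of template matching with a Gaussian-weighted sum of absolute differences is given below. It is not the speaker's DSP implementation: the exhaustive search, the kernel normalisation and the `sigma` parameter are assumptions, and all function names are hypothetical.

```python
import math


def gaussian_kernel(width, height, sigma):
    """2-D Gaussian weights centred on the template, normalised to sum to 1."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    k = [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
          for x in range(width)] for y in range(height)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def swad(roi, template, weights, ox, oy):
    """Sum of weighted absolute differences with the template placed at (ox, oy)."""
    return sum(weights[y][x] * abs(roi[oy + y][ox + x] - template[y][x])
               for y in range(len(template)) for x in range(len(template[0])))


def best_match(roi, template, weights):
    """Search every position in the region of interest; return (x, y, score)."""
    th, tw = len(template), len(template[0])
    candidates = ((swad(roi, template, weights, x, y), x, y)
                  for y in range(len(roi) - th + 1)
                  for x in range(len(roi[0]) - tw + 1))
    score, x, y = min(candidates)
    return x, y, score


# A 3x3 template hidden at column 2, row 1 of an otherwise empty 6x6 region.
template = [[10, 20, 10], [20, 90, 20], [10, 20, 10]]
roi = [[0] * 6 for _ in range(6)]
for dy in range(3):
    for dx in range(3):
        roi[1 + dy][2 + dx] = template[dy][dx]
weights = gaussian_kernel(3, 3, sigma=1.0)
found_x, found_y, found_score = best_match(roi, template, weights)
```

Because the weights peak at the centre, a mismatch on the template border costs less than the same mismatch in the middle, which is the behaviour the talk attributes to the Gaussian kernel.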
0:06:48 | So to update the template, once we have found the best match |
---|
0:06:52 | we compute the template for the next frame. We start from the current |
---|
0:06:56 | template, we have the best match, and then we fuse them together using |
---|
0:07:01 | this equation, which is basically an IIR filter, and alpha is the blending factor. |
---|
0:07:09 | So in this way we can incorporate changes of the target in the |
---|
0:07:14 | template, carrying on with the tracking in the next frames. |
---|
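The template update described above can be sketched as a first-order IIR blend. The equation itself is not readable in the transcript, so which term alpha multiplies is an assumption here: alpha is taken as the weight of the best match, and (1 - alpha) as the weight of the current template.

```python
def update_template(template, best, alpha):
    """Blend the current template with the best match, pixel by pixel.

    new = alpha * best + (1 - alpha) * template, with 0 <= alpha <= 1.
    The role assigned to alpha is an assumption, not taken from the talk.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [[alpha * b + (1.0 - alpha) * t for t, b in zip(t_row, b_row)]
            for t_row, b_row in zip(template, best)]


blended = update_template([[100, 100]], [[200, 0]], alpha=0.5)
frozen = update_template([[100, 100]], [[200, 0]], alpha=0.0)
```

With alpha near 0 the template is preserved almost unchanged; with alpha near 1 it follows the latest match closely.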
0:07:19 | So once we have the position of the target we can control the PTZ, which |
---|
0:07:23 | is the camera, and we do this through HTTP requests, a simple HTTP |
---|
0:07:27 | request to the server on the camera. Here you can see some commands for |
---|
0:07:31 | the PTZ. Basically we have the address of the camera, the user |
---|
0:07:36 | name and the command; the command is six bytes |
---|
0:07:40 | sent to the camera, and this is done from the DSP |
---|
0:07:46 | on the board to the camera over the Ethernet network. So to |
---|
0:07:50 | control the PTZ and tell it to move up or |
---|
0:07:55 | down, basically we detect if the best match is in the top |
---|
0:08:00 | region, the bottom region, the left or the right region. Basically the idea is that if the best |
---|
0:08:05 | match is near the edge, it is likely that the target is going out |
---|
0:08:10 | of the field of view, so we send the command and drive the PTZ either |
---|
0:08:13 | to move up or down, left or right. So in this way we are able |
---|
0:08:17 | to control the PTZ and follow the target. |
---|
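The edge test that decides when to move the camera can be sketched as below. The width of the border band is a hypothetical parameter, and the returned strings are placeholders: the actual command bytes and the camera's HTTP interface are not given in the talk and are not reproduced here.

```python
def ptz_moves(match_x, match_y, match_w, match_h, frame_w, frame_h, border):
    """Return the set of moves suggested by the best-match position.

    A move is suggested whenever the match box enters the `border`-pixel
    band along the corresponding edge of the frame.
    """
    moves = set()
    if match_x < border:
        moves.add("left")
    if match_x + match_w > frame_w - border:
        moves.add("right")
    if match_y < border:
        moves.add("up")
    if match_y + match_h > frame_h - border:
        moves.add("down")
    return moves


centred = ptz_moves(150, 100, 40, 40, 352, 288, border=32)
top_right = ptz_moves(300, 10, 40, 40, 352, 288, border=32)
```

Each returned direction would then be mapped to one request sent to the camera.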
0:08:22 | So these are some frames from the memory of the DSP. As you can see, |
---|
0:08:27 | the black box is the region of interest, the other box is the target, |
---|
0:08:33 | that is the best match, and at the top left-hand side you can see the |
---|
0:08:37 | template for the current frame. As you can see, the target is moving, |
---|
0:08:43 | and at the top you can see the template incorporating the |
---|
0:08:45 | changes, so we can always find the best match. |
---|
0:08:51 | And for the results we use accuracy and precision. Basically we have a position given |
---|
0:08:57 | by the tracker and the position from the ground truth, and we compute the |
---|
0:09:01 | accuracy as the average error, and the precision is the standard deviation. |
---|
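A sketch of how such figures could be computed from tracked and ground-truth positions follows. The talk does not define the error measure, so Euclidean distance in pixels and the population standard deviation are assumptions.

```python
import math
import statistics


def accuracy_and_precision(tracked, ground_truth):
    """Return (mean error, standard deviation of the error) in pixels.

    The error per frame is taken as the Euclidean distance between the
    tracked position and the ground-truth position (an assumption).
    """
    errors = [math.hypot(tx - gx, ty - gy)
              for (tx, ty), (gx, gy) in zip(tracked, ground_truth)]
    return statistics.mean(errors), statistics.pstdev(errors)


tracked = [(0, 0), (3, 4), (6, 8)]
truth = [(0, 0), (0, 0), (0, 0)]
acc, prec = accuracy_and_precision(tracked, truth)
```

Under this definition lower values are better for both figures.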
0:09:06 | We applied the algorithm, with MATLAB implementations, to four sequences. |
---|
0:09:14 | For the first sequence you can see that basically all the trackers follow the |
---|
0:09:20 | target, but the SAD and NCC (the NCC is the normalized cross-correlation) |
---|
0:09:27 | perform worse because they are affected by the pixels |
---|
0:09:33 | at the edges of the template. As you can see, in the |
---|
0:09:36 | middle [unclear], and that is when the person in the video |
---|
0:09:40 | [unclear]. |
---|
0:09:43 | In the next sequence [name unclear] you can see that the normalized cross-correlation and the |
---|
0:09:48 | SAD diverge, so it means that they lose the target, while the mean shift |
---|
0:09:52 | and the SWAD can still follow the target. |
---|
0:09:56 | And these are the other two sequences. Again we see that the normalized cross-correlation and the |
---|
0:10:00 | SAD, in the first graph, diverge, so again they lose the target, |
---|
0:10:07 | while in this case, the last example, |
---|
0:10:11 | we have that the mean shift also loses the target. So basically |
---|
0:10:15 | the SWAD-based tracker performs better than the SAD, NCC and MS |
---|
0:10:20 | in these sequences. |
---|
0:10:23 | And here there is a summary table. We can see that |
---|
0:10:26 | the accuracy value of the SWAD is always lower than for all the other trackers |
---|
0:10:30 | in all the sequences, and the precision value is usually lower too. So this proves once |
---|
0:10:36 | again that we have good performance with our tracker. |
---|
0:10:40 | For the execution time: this algorithm is implemented on the DSP on the board, and |
---|
0:10:48 | the main SWAD block takes seven milliseconds, with an overall |
---|
0:10:54 | [figure unclear] milliseconds per frame. So basically it is less than forty milliseconds, and we run at much more |
---|
0:10:59 | than twenty-five frames per second, so we achieve our aim, which is real |
---|
0:11:03 | time. This efficiency is obtained through intrinsics, which are C functions that |
---|
0:11:11 | are implemented for a particular architecture. In this case we have |
---|
0:11:17 | the DSP fixed-point architecture, so we use for instance _subabs4 and |
---|
0:11:22 | _dotpu4, which work on groups of four bytes, or four pixels. So basically |
---|
0:11:28 | in one cycle of the SWAD matching block we analyze |
---|
0:11:35 | four pixels, so we basically cut down the computation by four. |
---|
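The idea of handling four pixels per operation can be emulated in plain Python as below. This is only an illustration of packed 8-bit arithmetic: `pack4`, `sub_abs4`, `dot_u4` and `swad4` are hypothetical helper names, not the DSP intrinsics themselves, and integer 8-bit weights are assumed because the target is a fixed-point processor.

```python
def pack4(p0, p1, p2, p3):
    """Pack four 8-bit values into one 32-bit word, p0 in the low byte."""
    return p0 | (p1 << 8) | (p2 << 16) | (p3 << 24)


def bytes_of(word):
    """Unpack a 32-bit word into its four bytes, low byte first."""
    return [(word >> shift) & 0xFF for shift in (0, 8, 16, 24)]


def sub_abs4(a, b):
    """Byte-wise absolute difference of two packed words, result packed."""
    return pack4(*(abs(x - y) for x, y in zip(bytes_of(a), bytes_of(b))))


def dot_u4(a, b):
    """Sum of the four byte-by-byte products of two packed words."""
    return sum(x * y for x, y in zip(bytes_of(a), bytes_of(b)))


def swad4(region_word, template_word, weight_word):
    """Weighted absolute difference of four pixels using two packed operations."""
    return dot_u4(sub_abs4(region_word, template_word), weight_word)


region = pack4(10, 200, 30, 40)
template = pack4(12, 190, 30, 50)
weights = pack4(1, 2, 3, 4)
partial = swad4(region, template, weights)  # 2*1 + 10*2 + 0*3 + 10*4
```

On hardware with such packed instructions, each of the two steps would be a single operation on a 32-bit register, which is where the factor of four mentioned in the talk comes from.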
0:11:41 | As an example, the non-optimized implementation of the same algorithm takes sixty-three |
---|
0:11:46 | milliseconds, which is nine times more. |
---|
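The timing figures quoted in the talk can be checked with a few lines of arithmetic. The variable names are illustrative; only the numbers 25, 7 and 63 come from the talk.

```python
def frame_budget_ms(frames_per_second):
    """Time available per frame, in milliseconds."""
    return 1000.0 / frames_per_second


budget_ms = frame_budget_ms(25)          # time per frame at 25 frames per second
optimised_ms = 7.0                       # SWAD block with intrinsics (quoted in the talk)
plain_ms = 63.0                          # non-optimized version (quoted in the talk)
speedup = plain_ms / optimised_ms        # ratio between the two versions
fits_budget = optimised_ms < budget_ms   # the optimized block fits in one frame period
plain_fits = plain_ms < budget_ms        # the plain version alone would not
```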
0:11:50 | So this is a working example of our system. You can see the board and |
---|
0:11:54 | the PTZ: the video stream from the PTZ goes into |
---|
0:11:58 | the board, and the board analyses the video stream and drives the PTZ to follow the target. |
---|
0:12:03 | This is the video. |
---|
0:12:06 | So this is taken from a remote display, the remote viewer. |
---|
0:12:12 | As you can see, the PTZ camera is moving to follow the target |
---|
0:12:18 | as the target moves left to right, |
---|
0:12:20 | closer to or far away from the camera. |
---|
0:12:25 | Even as [unclear] the camera, the algorithm is still able to track the |
---|
0:12:28 | target and control the PTZ, so we always have the target in the |
---|
0:12:32 | field of view. |
---|
0:12:40 | So in conclusion, I presented a DSP embedded smart surveillance sensor using |
---|
0:12:47 | a PTZ camera to follow the target as it tries to move out of |
---|
0:12:52 | the field of view. The DSP is the DM647 fixed-point DSP, |
---|
0:12:57 | and the tracker we use is the SWAD-based tracker. The results show high |
---|
0:13:02 | accuracy and precision under partial occlusion. |
---|
0:13:08 | So for future work we will try to include also complete occlusion handling. |
---|
0:13:14 | uh C |
---|
0:13:16 | The technique [unclear] in this paper, just published. So here you have a |
---|
0:13:21 | video with the SWAD-based tracker without occlusion handling: you can see that |
---|
0:13:25 | the tracker loses the target as it becomes occluded, while with the new occlusion handling technique |
---|
0:13:34 | we are able to recover the target as it comes out of the occlusion. |
---|
0:13:38 | So for future work we will try to implement also this feature on the board. |
---|
0:13:47 | So this concludes my presentation. Thank you for listening, and if you have any questions |
---|
0:13:52 | [unclear]. |
---|
0:14:17 | right |
---|
0:14:19 | uh_huh |
---|
0:14:21 | At the moment we don't use the zoom feature of the camera, so |
---|
0:14:27 | yes, when the target moves closer to the camera the size |
---|
0:14:31 | of the target is larger on the screen. We don't adjust for it, for simplicity, |
---|
0:14:37 | but as you can see this is solved |
---|
0:14:41 | basically with the template update. |
---|
0:14:43 | See here, the target is smaller, so we can [unclear], |
---|
0:14:50 | and then it gets closer to the camera, so we can incorporate the |
---|
0:14:53 | changes of the target in the template. But at the moment we don't |
---|
0:14:58 | adjust the size of the target; that's another thing to do in the future. |
---|
0:15:11 | okay |
---|
0:15:18 | right |
---|
0:15:22 | Right. This tracker is not for face tracking or any particular object; |
---|
0:15:29 | it's generic target tracking, so it works as long as the target |
---|
0:15:34 | has a good texture, |
---|
0:15:36 | okay, so you can discriminate the target from the background. So here we start |
---|
0:15:40 | from the face as an example, and then as he moves closer to the camera |
---|
0:15:45 | obviously the face is too big for the template and it gets [unclear] my |
---|
0:15:49 | neck. |
---|
0:15:50 | So the tracker is generic, for any object, not only for faces. |
---|
0:16:09 | yeah |
---|
0:16:11 | well |
---|
0:16:17 | a |
---|
0:16:19 | right |
---|
0:16:21 | You mean for the future work I mentioned? Yes, okay. |
---|
0:16:25 | In this paper, to deal with complete occlusion, basically what |
---|
0:16:32 | we do is, when the target goes under occlusion, we don't update |
---|
0:16:36 | the whole template at the same time with the same weight, but we have different weights |
---|
0:16:41 | for all the pixels in the template. So when it goes under occlusion we |
---|
0:16:46 | don't update the side that is occluded, we update only the visible one. |
---|
0:16:50 | And eventually, when you see you are updating only a few pixels on one side, that |
---|
0:16:55 | means the target is going to be fully occluded, so in the |
---|
0:16:59 | next few frames it is occluded. |
---|
0:17:02 | In that case you don't update any more and you say the target is occluded, and |
---|
0:17:06 | then, as you have not updated the occluded side, when |
---|
0:17:11 | it comes out of the occlusion the template is preserved, so again |
---|
0:17:15 | you can find the best match for your target. |
---|
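The per-pixel update described in this answer can be pictured with the simplified sketch below. It replaces the per-pixel weights mentioned by the speaker with a binary mask, and it takes the mask as an input because the talk does not say how occluded pixels are detected; the function names and the `min_fraction` threshold are hypothetical.

```python
def masked_update(template, best, alpha, occluded):
    """Blend the best match into the template only where `occluded` is False."""
    return [[t if occ else alpha * b + (1.0 - alpha) * t
             for t, b, occ in zip(t_row, b_row, o_row)]
            for t_row, b_row, o_row in zip(template, best, occluded)]


def updated_fraction(occluded):
    """Fraction of template pixels that are still being updated."""
    flags = [occ for row in occluded for occ in row]
    return 1.0 - sum(flags) / len(flags)


def target_is_occluded(occluded, min_fraction):
    """Declare full occlusion when too few pixels remain updatable."""
    return updated_fraction(occluded) < min_fraction


template = [[100, 100, 100, 100]]
best = [[120, 120, 0, 0]]            # right half covered by another object
mask = [[False, False, True, True]]  # assumed to come from an occlusion detector
new_template = masked_update(template, best, 0.5, mask)
still_visible = updated_fraction(mask)
```

The occluded half of the template keeps its old values, so it is still usable for matching once the target reappears.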
0:17:33 | yeah |
---|
0:17:35 | Yeah, it can be adapted for [unclear]. |
---|
0:17:43 | Yeah, well, as usual in surveillance you have three components: the detection algorithm, |
---|
0:17:49 | the tracking algorithm, then the [unclear] or something like that. This is only the tracking |
---|
0:17:54 | algorithm. |
---|
0:17:55 | To select the target you can either do it manually or use an |
---|
0:18:00 | automatic algorithm. |
---|
0:18:02 | Usually in surveillance systems you have a person driving the PTZ, |
---|
0:18:09 | trying to find something, and then [unclear] the PTZ |
---|
0:18:14 | on the target, |
---|
0:18:16 | and then you apply this algorithm to track. |
---|
0:18:35 | she |
---|
0:18:39 | right |
---|
0:18:41 | Okay, the template: |
---|
0:18:44 | the thing about it is that it depends on the blending factor. |
---|
0:18:48 | Yeah. |
---|
0:18:50 | In this case, as we process at more than twenty-five frames per second, we |
---|
0:18:55 | give [unclear] weight to the previous |
---|
0:18:58 | template and to the best match, but you can choose, in |
---|
0:19:03 | a real application, to which one you want to give more weight. So |
---|
0:19:08 | if you want to have [unclear], |
---|
0:19:11 | you want to preserve your template, then you would give more weight to your |
---|
0:19:14 | previous template. |
---|
0:19:16 | Okay, if you want to adapt very fast, then you will give more weight |
---|
0:19:21 | to the best match; in that case, for example, you give [unclear] |
---|
0:19:25 | and seventy percent to the best match, so you are able to incorporate the changes in |
---|
0:19:29 | the template faster. |
---|