0:00:13So I am from the University of California, Santa Barbara,
0:00:17and I will present spectral coefficient-wise optimal recursive end-to-end distortion estimation for
0:00:24subpixel motion-compensated video coding.
0:00:26This is work done with [inaudible].
0:00:32So video coders employ temporal motion-compensated prediction to exploit the temporal redundancy and achieve high coding efficiency.
0:00:45So the issue is that when the video is subject to channel distortion, like packet loss, it is degraded due to error propagation.
0:00:56So many error-resilience techniques, like multiple description coding, scalable video coding, or unequal error protection, have been introduced to mitigate this effect of error propagation.
0:01:12And the basic principle of all of them is that they really sacrifice some coding efficiency for transmission robustness.
0:01:20And so a crucial requirement, on the other hand, is to estimate the actual end-to-end distortion, which is then incorporated in the rate-distortion framework to optimize the coding decisions.
0:01:34So we keep using the term end-to-end distortion, but the formal definition is written on this slide.
0:01:43So the basic idea is, let's say we use this x, with subscript n and superscript i, to denote pixel i in frame n, and this is the original value of this pixel.
0:01:56And we use x-hat to denote the encoder's reconstruction, which is only subject to compression.
0:02:04And then we use x-tilde to denote the decoder's reconstruction, which is subject to packet loss, concealment, and error propagation.
0:02:15So the end-to-end distortion is really given by the difference between this decoder reconstruction and this original value.
0:02:28So it accounts for lossy compression, the channel loss, concealment, and so on.
0:02:37But the problem is, we do the encoding process at the encoder side, and this x-tilde is really unknown to the encoder, due to the fact that packet loss is really a random event with respect to the encoder.
0:02:57The recursive optimal per-pixel estimate, ROPE, is a pretty well-known approach to estimate such end-to-end distortion.
0:03:06The basic idea is, the end-to-end distortion of a pixel can be formulated as this, and by decomposing it into three terms, we see that it is really a linear combination of the first and second moments of the decoder's reconstruction.
0:03:25ROPE treats the decoder reconstruction of each pixel as a random variable, then recursively computes the first and second moments of the decoder-reconstructed pixels.
0:03:34And then, given the first and second moments of the decoder reconstruction, we obtain the expected end-to-end distortion at the encoder side, which is then incorporated into the rate-distortion optimization framework.
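The three-term decomposition mentioned here can be sketched in a few lines of Python. This is only an illustration: the function name and the toy two-outcome distribution are assumptions made for the example and do not come from the talk.

```python
def expected_distortion(x, m1, m2):
    """E[(x - x_tilde)^2] expanded into three terms.

    x  : original pixel value (known to the encoder)
    m1 : first moment  E[x_tilde]   of the decoder reconstruction
    m2 : second moment E[x_tilde^2] of the decoder reconstruction
    """
    return x * x - 2.0 * x * m1 + m2


# Hypothetical decoder outcomes as (probability, reconstructed value).
outcomes = [(0.9, 100.0), (0.1, 80.0)]
x = 102.0

m1 = sum(p * v for p, v in outcomes)
m2 = sum(p * v * v for p, v in outcomes)

from_moments = expected_distortion(x, m1, m2)
brute_force = sum(p * (x - v) ** 2 for p, v in outcomes)
print(from_moments, brute_force)
```

The two printed values agree, which is why tracking only the two moments per pixel is enough to evaluate the expected distortion at the encoder.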
0:03:51The recursion explicitly accounts for all of the operations involved, like coding and concealment, and also the channel statistics.
0:04:03ROPE has had extensions, and these extensions have been integrated into methods for error-resilient coding.
0:04:12So here are the update equations of ROPE.
0:04:16The basic idea is that, given the first and second moments of the reference pixel, we can recursively compute the first and second moments of the current pixel, which then gives us the expected end-to-end distortion.
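The update equations themselves are on the slide and not in the transcript. As a hedged sketch, the per-pixel recursion in its commonly cited form looks like the following, assuming (my simplification) that a lost packet is concealed by copying the co-located pixel from the previous frame. The function and argument names are hypothetical.

```python
def rope_intra(p, xhat, prev_m1, prev_m2):
    """Moments of an intra-coded pixel under packet loss rate p.

    xhat             : encoder reconstruction of the pixel
    prev_m1, prev_m2 : moments of the co-located pixel in the previous
                       frame (used for concealment; an assumption here)
    """
    m1 = (1.0 - p) * xhat + p * prev_m1
    m2 = (1.0 - p) * xhat ** 2 + p * prev_m2
    return m1, m2


def rope_inter(p, resid, ref_m1, ref_m2, prev_m1, prev_m2):
    """Moments of an inter-coded pixel predicted from ONE reference pixel.

    resid          : quantized prediction residual (known to the encoder)
    ref_m1, ref_m2 : moments of the single reference pixel
    """
    m1 = (1.0 - p) * (resid + ref_m1) + p * prev_m1
    m2 = (1.0 - p) * (resid ** 2 + 2.0 * resid * ref_m1 + ref_m2) + p * prev_m2
    return m1, m2


# Toy check with a deterministic reference (value 100) and a deterministic
# concealment source (value 90): the pixel is 102 w.p. 0.9 and 90 w.p. 0.1.
m1, m2 = rope_inter(0.1, 2.0, 100.0, 100.0 ** 2, 90.0, 90.0 ** 2)
print(m1, m2)
```

Note that the inter update only needs the moments of a single reference pixel, which is the property the next part of the talk identifies as a limitation.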
0:04:33So everything seems fine, but there is a limitation: the ROPE algorithm computes the end-to-end distortion of each pixel that is predicted from a single pixel in the previously reconstructed frame.
0:04:48But many video coders now employ subpixel motion compensation, which uses multiple reconstructed pixels in the prior frame to predict the signal.
0:04:58Such prediction is usually modeled in the form of a linear combination, which results in the so-called cross-correlation issue.
0:05:08And it was quickly realized at that time that an accurate estimation of the cross-correlation terms would require on the order of N-squared computation and memory units, where we use this big N to denote the total number of pixels in a frame.
0:05:30So this really rules out the computation of such an optimal end-to-end distortion estimate.
0:05:37I will give an example to show how this cross-correlation term emerges in our update recursion.
0:05:46So let's consider bilinear prediction, which is just the average of two reconstructed pixels X and Y.
0:05:56So now we need the first and second moments of the interpolated pixel.
0:06:01The first moment is fine: it is just a linear combination of the first moments of the reconstructed pixels.
0:06:09But for the second moment, beyond the two second moments of the reconstructed pixels, there is an additional term, E of XY, which is the cross-correlation, and we don't have it in ROPE.
0:06:24Many prior works have [inaudible].
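To make the extra term concrete, here is a small Python sketch of the moments of the averaged pixel Z = (X + Y) / 2. The joint distribution of X and Y is a made-up toy example, not data from the talk.

```python
def average_moments(ex, ey, ex2, ey2, exy):
    """First and second moments of Z = (X + Y) / 2.

    The second moment needs E[XY], which per-pixel tracking of
    E[X], E[X^2], E[Y], E[Y^2] alone does not provide.
    """
    m1 = 0.5 * (ex + ey)
    m2 = 0.25 * (ex2 + ey2 + 2.0 * exy)
    return m1, m2


# Hypothetical joint outcomes as (probability, X value, Y value).
joint = [(0.5, 100.0, 100.0), (0.3, 100.0, 80.0), (0.2, 80.0, 80.0)]

ex = sum(p * x for p, x, _ in joint)
ey = sum(p * y for p, _, y in joint)
ex2 = sum(p * x * x for p, x, _ in joint)
ey2 = sum(p * y * y for p, _, y in joint)
exy = sum(p * x * y for p, x, y in joint)

_, m2_exact = average_moments(ex, ey, ex2, ey2, exy)
_, m2_independent = average_moments(ex, ey, ex2, ey2, ex * ey)  # ignores correlation
m2_brute = sum(p * (0.5 * (x + y)) ** 2 for p, x, y in joint)
print(m2_exact, m2_independent, m2_brute)
```

With the true E[XY] the formula matches the brute-force value; replacing it by E[X]E[Y] does not, which is the estimation error the cross-correlation issue refers to.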
0:06:30So to overcome this issue, one idea is to just approximate this cross-correlation by its maximum, which is provided by the second moments of the two marginals.
0:06:48Alternatively, rather than modeling the cross-correlation directly, we can come up with a correlation coefficient as a function of the distance.
0:06:57The d of X, Y is really the spatial distance between the two pixels X and Y, and this correlation coefficient is modeled as exponentially decreasing with this distance.
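The two approximations described here can be sketched as follows. Both functions are my reading of the spoken description: the first uses the Cauchy-Schwarz maximum of E[XY], the second applies an exponentially decaying correlation coefficient to the centered variables. The decay rate `alpha` and all names are hypothetical.

```python
import math


def cross_upper_bound(ex2, ey2):
    """Largest possible E[XY] given the two marginal second moments."""
    return math.sqrt(ex2 * ey2)


def cross_from_distance(ex, ey, ex2, ey2, distance, alpha=0.5):
    """E[XY] from a correlation coefficient rho = exp(-alpha * distance).

    alpha is an assumed model parameter, not a value from the talk.
    """
    var_x = max(ex2 - ex * ex, 0.0)
    var_y = max(ey2 - ey * ey, 0.0)
    rho = math.exp(-alpha * distance)
    return ex * ey + rho * math.sqrt(var_x * var_y)


# Toy marginal moments: E[X]=96, E[X^2]=9280, E[Y]=90, E[Y^2]=8200.
bound = cross_upper_bound(9280.0, 8200.0)
near = cross_from_distance(96.0, 90.0, 9280.0, 8200.0, distance=1.0)
far = cross_from_distance(96.0, 90.0, 9280.0, 8200.0, distance=4.0)
print(bound, near, far)
```

Under this model the estimate falls from near the bound toward E[X]E[Y] as the two pixels move apart.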
0:07:14In this work we propose an alternative perspective, in the transform domain.
0:07:20So we know that the mean squared error is really invariant under a unitary transformation.
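A quick numerical check of this invariance, using an orthonormal 4-point DCT-II written out by hand. The sample values are arbitrary and the helper names are illustrative.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matvec(matrix, vec):
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]


original = [52.0, 55.0, 61.0, 66.0]   # arbitrary pixel values
decoded = [50.0, 57.0, 60.0, 70.0]    # arbitrary distorted version

D = dct_matrix(4)
pixel_error = sum((a - b) ** 2 for a, b in zip(original, decoded))
coeff_error = sum((a - b) ** 2
                  for a, b in zip(matvec(D, original), matvec(D, decoded)))
print(pixel_error, coeff_error)
```

Because the squared error is the same in both domains, summing per-coefficient distortion estimates over a block gives the same total as summing per-pixel estimates.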
0:07:25And we propose the spectral coefficient-wise optimal recursive estimate, which we call SCORE, to estimate this end-to-end distortion in the transform domain.
0:07:37It provides a per-transform-coefficient estimate of the end-to-end distortion, instead of a per-pixel one.
0:07:45And analogous to ROPE, it involves recursive computation of the first and second moments of the transform coefficients of the on-grid blocks.
0:07:56I should mention here that it is really good at accounting for coding operations that are performed in the transform domain,
0:08:06but in this work we are exclusively focusing on more accurate end-to-end distortion estimation for subpixel motion-compensated coding.
0:08:16Let's just put it in notation.
0:08:19The basic idea: we have the original value of the transform coefficient with index k in frame n,
0:08:28and the encoder's reconstruction is once again x-hat, and the decoder's reconstruction is x-tilde.
0:08:34But x-tilde is a random variable with respect to the encoder.
0:08:40And the expected end-to-end distortion can be formulated as this, and we see that it again depends on the first and second moments of the transform coefficient.
0:08:53For intra mode, the update is pretty much the same as in ROPE.
0:08:59So the idea is: with probability one minus p, the packet that contains this transform coefficient arrives at the decoder,
0:09:09and when it arrives, the decoder will reproduce exactly the same reconstruction as the encoder, so that's why we're using x-hat, which is the encoder's reconstruction.
0:09:22And with probability p the packet will be lost, and then the decoder's concealment will be invoked.
0:09:29And we know that the concealed pixels, or transform coefficients, are really random variables, so that's why we are using x-tilde here to generate the first moment of the transform coefficient.
0:09:43And the same idea applies to the second moment.
0:09:47The main computational burden is in the inter recursion, where the motion-compensated reference block is not necessarily on the grid. In fact, it is very possible that it is off the grid.
0:10:06And recall that we only have the first and second moments of the transform coefficients of the on-grid blocks.
0:10:14That is why we are using u, with index k, here.
0:10:23So we really need to generate the first and second moments of the off-grid block from those of the on-grid blocks.
0:10:31But let's just assume that we know the moments of this reference block. Now we can run our update recursion.
0:10:39With probability one minus p we have the packet, which contains the residual and the motion vector.
0:10:47So we have the residual, which is exactly the same as at the encoder, but the reference is a random variable, due to prior packet loss and error propagation.
0:10:59And with probability p we have concealment.
0:11:02The same thing holds for the second moment.
0:11:04Now let's consider how to generate the first and second moments of the reference block.
0:11:14In general, the reference block, which is the red one, is off the grid [inaudible].
0:11:25So in order to generate a four-by-four reference block, the encoder needs to reference up to [inaudible], due to the six-tap FIR filter used for interpolation in H.264.
0:11:43So the transformation, which is typically the DCT, is a simple linear transformation.
0:11:49There exists a set of constants such that the transform coefficients of the off-grid block are really a linear combination of those of the on-grid blocks.
0:12:02And then for the first moment, it is just a linear combination of known quantities, which is exact and tractable.
0:12:15But for the second moment, it seems that we are getting back to this cross-correlation issue: we need to generate this cross-correlation, which we don't know.
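The existence of such constants follows from linearity, and a one-dimensional Python sketch can show it. Everything specific here is an assumption for illustration: blocks of four samples, a half-sample bilinear interpolator instead of the H.264 six-tap filter, and hypothetical names such as `combination_constants`.

```python
import math

N = 4  # 1-D block length, chosen only for illustration


def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


D = dct_matrix(N)
Dt = [list(col) for col in zip(*D)]  # transpose = inverse for an orthonormal D


def combination_constants(weights):
    """Split an N x 2N pixel-domain interpolation matrix into two N x N
    matrices acting on the DCT coefficients of two adjacent on-grid blocks."""
    w0 = [row[:N] for row in weights]
    w1 = [row[N:] for row in weights]
    return matmul(matmul(D, w0), Dt), matmul(matmul(D, w1), Dt)


# Off-grid reference block starting 1.5 samples into the first on-grid block.
weights = [[0.0] * (2 * N) for _ in range(N)]
for i in range(N):
    weights[i][i + 1] = 0.5
    weights[i][i + 2] = 0.5

A0, A1 = combination_constants(weights)

# Toy first moments of the decoder-reconstructed pixels in the two blocks.
mean_pixels0 = [100.0, 102.0, 98.0, 97.0]
mean_pixels1 = [95.0, 99.0, 101.0, 103.0]
mean_coeffs0 = matvec(D, mean_pixels0)
mean_coeffs1 = matvec(D, mean_pixels1)

# First moment of the reference coefficients, computed in the transform domain.
ref_mean = [a + b for a, b in zip(matvec(A0, mean_coeffs0),
                                  matvec(A1, mean_coeffs1))]
# Same quantity computed by interpolating pixels first, then transforming.
direct = matvec(D, matvec(weights, mean_pixels0 + mean_pixels1))
print(ref_mean)
print(direct)
```

The first moment carries through exactly because expectation is linear. The second moment of the same linear combination expands into products of pairs of on-grid coefficients, which is where the cross-correlation terms reappear.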
0:12:26But the major advantage of doing everything in the transform domain is that the spatial transform has already removed the correlation between transform coefficients.
0:12:41ah
0:12:42specifically
0:12:43where tree
0:12:44this is a cost on the into two categories
0:12:47the first one
0:12:49a
0:12:50to transform coefficients
0:12:51a different frequency but in the same now
0:12:54that content i was in the same packet
0:12:57so that by either simultaneously
0:12:59or or or that the clutter
0:13:01so we use X that are to uh
0:13:04to represent
0:13:05the you call we the packet is received
0:13:09and X F E
0:13:10then
0:13:10decoder can assume in the packet is not
0:13:13so
0:13:14first
0:13:15moment of these
0:13:16a reconstruction and consume
0:13:19are really
0:13:20accurate and a tractable
0:13:22And the cross-correlation can be formulated as this: with probability one minus p the packet arrives, and then the decoder will reproduce the two reconstructions,
0:13:35and with probability p the packet is lost, and the decoder will generate the two concealments.
0:13:43So this is really the exact cross-correlation.
0:13:46But let's say that we do not know this cross-correlation; we simply approximate it as the product of the two marginal first moments, appealing to the decorrelation property in the transform domain.
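A minimal sketch of this first category, under assumptions of mine: the received reconstructions are treated as values known to the encoder (as they would be for an intra-coded block), and the unknown cross term of the two concealed coefficients is replaced by the product of their first moments. The function name and the numbers are hypothetical.

```python
def same_block_cross_moment(p, recon_k, recon_l, conceal_m1_k, conceal_m1_l):
    """Approximate E[x_tilde_k * x_tilde_l] for two coefficients that
    travel in the same packet and are therefore received or lost together.

    p                  : packet loss rate
    recon_k, recon_l   : encoder reconstructions (assumed known values)
    conceal_m1_k, _l   : first moments of the concealed coefficients
    """
    received = recon_k * recon_l             # both arrive together
    lost = conceal_m1_k * conceal_m1_l       # decorrelation approximation
    return (1.0 - p) * received + p * lost


p = 0.1
cross = same_block_cross_moment(p, 10.0, 4.0, 8.0, 3.0)

# For comparison: multiplying the two marginal means ignores that the
# coefficients share one packet.
mean_k = (1.0 - p) * 10.0 + p * 8.0
mean_l = (1.0 - p) * 4.0 + p * 3.0
print(cross, mean_k * mean_l)
```

The two numbers differ because the shared packet couples the two coefficients even when the concealed values themselves are treated as uncorrelated.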
0:14:05And the second category is for inter-block correlation.
0:14:10So recall the energy compaction property of the DCT: we say the inter-block correlation is really dominated by that between the DC components, which is assumed to be unity,
0:14:25and all other AC coefficients we treat as uncorrelated.
0:14:31We note that more refined models are possible, but our experiments show that such treatment provides fairly accurate estimates for most natural video.
0:14:44uh overview of the procedures of this up recursion
0:14:48so given
0:14:49first and second moments of
0:14:51oh we uh are transform
0:14:53coefficients of we well in from a my one
0:14:57a a condition are the packet a ride or not
0:15:00with then
0:15:01update
0:15:02the moments for frame and uh you frame and
0:15:05so a first at five where the reference block base
0:15:08and then
0:15:09a a like this
0:15:10first and second moments of the reference point
0:15:13using a linear combination of the known
0:15:15a moments
0:15:17and the we compute
0:15:18the
0:15:19a a and
0:15:20moments
0:15:21a a helping to
0:15:22in or inter mode
0:15:23so either
0:15:24oh holding model
0:15:27So here are some results on estimation accuracy.
0:15:30So we first try the full-pixel setting, where ROPE is known to provide the optimal end-to-end distortion estimate.
0:15:39So in this picture the black one is our simulation, where we average over something like fifty to a hundred packet loss realizations,
0:15:52and the second curve is the end-to-end distortion provided by ROPE, which is known to be optimal,
0:15:59and the red one, which can hardly be seen, is the estimate provided by SCORE.
0:16:08We see that in the full-pixel setting SCORE is really a close approximation to ROPE, which is optimal.
0:16:20And then we go to the subpixel setting, where for ROPE we make some modifications to accommodate such prediction.
0:16:30So we use the Cauchy-Schwarz approximation and the correlation coefficient model, respectively, to generate the variants of ROPE.
0:16:41And the two blue curves are the estimates provided by those two methods,
0:16:47the black one is the simulation, averaging over many packet loss realizations,
0:16:54and the red one is the end-to-end distortion estimated by SCORE.
0:16:59We see that SCORE really closely tracks the end-to-end distortion in the setting of subpixel motion-compensated video coding.
0:17:06So in conclusion, we proposed a spectral coefficient-wise optimal recursive estimate to estimate the end-to-end distortion in the spatial transform domain.
0:17:19It exploits the decorrelation property of the spatial transform, as well as the energy compaction property, to closely approximate the cross-correlation,
0:17:29and provides a more accurate estimate of the end-to-end distortion in the setting of subpixel motion compensation.
0:17:34And we also note that SCORE can be shown to subsume within it the original ROPE.
0:17:43Yeah.
0:17:49So we have time for one question.
0:18:06You mean [inaudible]? Yeah.
0:18:10Yes, so this is a good question.
0:18:12So the issue is that SCORE really incurs more computational cost compared to ROPE.
0:18:19But the thing is, if you see this, back here,
0:18:24this set of constant coefficients, this small one here, which is really a spatial-location-dependent parameter:
0:18:34not all of them are equally important. The usual case is that most of them are really negligible; their values are really small.
0:18:47Yeah, so there are fast algorithms to simplify this computational complexity, but in this work we were focusing on the optimality.
0:19:03Okay, thank you.