0:00:17graph to everyone
0:00:19uh
0:00:19i only a low of i them you have i i them yeah in kind issue sure that when you
0:00:23an inverse
0:00:26so the title of my presentation is cost-sensitive stacking for audio tag annotation and retrieval
0:00:35so here is an
0:00:37example from one famous
0:00:39music tagging website
0:00:42Last.fm
0:00:44here we show the
0:00:46sound track and its
0:00:48associated tags
0:00:51and these social tags
0:00:53provide rich information for
0:00:55audio analysis
0:00:57so for example we can train some classifiers using
0:01:01uh
0:01:01using the audio
0:01:04uh
0:01:05and the tags associated with the audio
0:01:10so in this paper we focus on two important kinds of information, the first one
0:01:15is the tag count
0:01:16which means the number of different users who have annotated this tag
0:01:21in this example a larger font
0:01:25indicates a higher tag count
0:01:28and the second kind of information is tag correlation
0:01:32as we know
0:01:34some tags often co-occur
0:01:39so we propose cost-sensitive stacking for exploiting tag count and tag
0:01:44correlation
0:01:45jointly
0:01:49okay we first introduce
0:01:51two music information retrieval tasks
0:01:54the first is
0:01:55audio annotation, that is, given an audio clip
0:01:59we can make predictions using some tag classifiers
0:02:04so we would know
0:02:05which tags this audio clip is associated with
0:02:10and the second one is
0:02:12tag-based audio retrieval
0:02:15given a tag as a
0:02:17query
0:02:18we can obtain prediction scores using the
0:02:22tag classifier
0:02:23and then we will have a ranked list
0:02:26for the query
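
As a minimal sketch of the two tasks just introduced, the block below assumes that some trained tag classifiers have already produced a score for every (clip, tag) pair. The score table, the clip names, and the helper names `annotate` and `retrieve` are hypothetical and are not taken from the talk.

```python
# Hypothetical prediction scores: scores[clip][tag], higher means more likely.
scores = {
    "clip_1": {"rock": 0.9, "jazz": 0.1, "piano": 0.3},
    "clip_2": {"rock": 0.2, "jazz": 0.8, "piano": 0.7},
    "clip_3": {"rock": 0.6, "jazz": 0.4, "piano": 0.1},
}


def annotate(scores, clip):
    """Audio annotation: rank all tags for one clip, best first."""
    tag_scores = scores[clip]
    return sorted(tag_scores, key=tag_scores.get, reverse=True)


def retrieve(scores, tag):
    """Tag-based retrieval: rank all clips for one query tag, best first."""
    return sorted(scores, key=lambda clip: scores[clip][tag], reverse=True)


print(annotate(scores, "clip_2"))
print(retrieve(scores, "rock"))
```

Both tasks read the same score table, one along the clip axis and one along the tag axis.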
0:02:30okay
0:02:31so the first interesting thing is the tag count
0:02:34uh
0:02:35as we know, social tags are assigned
0:02:39by people with varying levels of musical knowledge
0:02:43so they inevitably
0:02:45contain noisy information
0:02:49so we think that tag count information should be considered in automatic
0:02:55tagging because the count
0:02:57reflects
0:02:58the confidence degree of the tag
0:03:01and a tag with a higher count
0:03:02is more reliable
0:03:07here we conduct an
0:03:09experiment
0:03:10we select some high-count tags
0:03:14and some low-count tags
0:03:16and we compare the prediction performance
0:03:21according to the false negative rate
0:03:25as we can see
0:03:26the false negative rate on the high-count tags
0:03:30is much
0:03:31lower than on the low-count tags
0:03:33okay
0:03:34so we believe that this is evidence that
0:03:38high-count tags are more reliable
0:03:45okay we try to convince you with
0:03:47this example, here we show the sound track let it be and its
0:03:52associated tags
0:03:54and the high-count tags
0:03:58include
0:04:01sixties
0:04:02beatles
0:04:03british
0:04:03classic rock
0:04:05oldies
0:04:07so we believe that
0:04:08these are more reliable and important tags
0:04:15so in some previous work
0:04:19the tag count is transformed into one or zero by using a threshold
0:04:24and then a binary classifier is
0:04:27trained for each tag to make predictions
0:04:30but
0:04:30this may have some problems
0:04:35the first is that
0:04:36the tag count information is lost
0:04:38a tag assigned twice is treated
0:04:41in the same way as a tag
0:04:44assigned a hundred times
0:04:46the second problem is that
0:04:48it is hard to determine the threshold
0:04:52and the third
0:04:54the third problem is that
0:04:55there will be ambiguity in the class membership
0:05:00of the instances near the threshold
0:05:02for example if we set the tag count threshold to ten
0:05:07so an instance
0:05:08whose tag count is
0:05:10ten will be considered as a positive example
0:05:14but
0:05:15if its tag count is nine it will be considered as a negative example
0:05:21we think that this is strange
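
The thresholding problems can be made concrete with a small sketch. The tag counts below are made-up values for a single tag, and `binarize` is a hypothetical helper, not code from the talk.

```python
# Hypothetical tag counts for one tag over five clips.
tag_counts = {"clip_a": 100, "clip_b": 10, "clip_c": 9, "clip_d": 2, "clip_e": 0}


def binarize(counts, threshold):
    """Label a clip 1 if its tag count reaches the threshold, else 0."""
    return {clip: int(count >= threshold) for clip, count in counts.items()}


labels = binarize(tag_counts, threshold=10)
print(labels)
```

With a threshold of ten, the clip tagged 100 times and the clip tagged 10 times receive the same label, while the clips tagged 10 and 9 times end up in opposite classes.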
0:05:27so our question is
0:05:29how to use the tag count information for audio tag annotation and retrieval
0:05:35and our answer is cost-sensitive learning with the tag count as
0:05:39cost
0:05:43so in cost-sensitive learning we are given a training
0:05:46set
0:05:47of x y and c
0:05:49x is the feature vector
0:05:51and y is
0:05:52the class label and c is
0:05:55the
0:05:56misclassification cost of this example
0:06:01the goal of
0:06:03uh
0:06:03cost-sensitive learning is to learn a classifier
0:06:06which minimizes the expected
0:06:09cost on unseen instances
0:06:12and it is a more general setup
0:06:14of the
0:06:15traditional classification problem
0:06:20so in our application our goal is to
0:06:24minimize
0:06:25the misclassified tag count for audio tag annotation
0:06:29and retrieval
0:06:31so if
0:06:31one hundred users annotate an audio clip as rock
0:06:36but the
0:06:37classifier
0:06:38gives a false negative
0:06:40then the cost is one hundred
0:06:43so cost-sensitive learning
0:06:46will pay more attention to the reliable and important tags
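
A minimal sketch of the cost being minimized, assuming each training example is a triple (x, y, c) with label y in {+1, -1} and the tag count as the cost c of a positive example. The cost of 1 for the negative example, the feature values, and the two toy classifiers are assumptions made only for illustration.

```python
def misclassified_tag_count(examples, predict):
    """Sum the costs c of all (x, y, c) examples that predict() gets wrong."""
    return sum(c for x, y, c in examples if predict(x) != y)


# Hypothetical examples for one tag: (features, label, cost).
examples = [
    ([0.9], +1, 100),  # tagged by 100 users
    ([0.4], +1, 2),    # tagged by 2 users
    ([0.1], -1, 1),    # not tagged, cost of 1 assumed here
]


def always_negative(x):
    return -1


def threshold_half(x):
    return 1 if x[0] > 0.5 else -1


print(misclassified_tag_count(examples, always_negative))
print(misclassified_tag_count(examples, threshold_half))
```

Both classifiers make mistakes, but missing the clip tagged by 100 users dominates the total, which is why a learner minimizing this quantity concentrates on the high-count tags.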
0:06:52and
0:06:53so we have
0:06:55we have exploited two cost-sensitive binary classifiers
0:06:59the first one is the cost-sensitive support vector machine
0:07:03in the support vector machine the training error term
0:07:07will be associated
0:07:11with a cost
0:07:12c
0:07:13i
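
One common way to write a cost-sensitive support vector machine is to scale each instance's hinge loss by its cost c_i. The sketch below only evaluates that objective for a given linear model, it does not train one. The function name, the parameter `C`, and the data are assumptions, and the exact formulation used in the talk may differ.

```python
def cost_sensitive_svm_objective(w, b, examples, C=1.0):
    """0.5 * ||w||^2 + C * sum_i c_i * max(0, 1 - y_i * (w . x_i + b))."""
    regularizer = 0.5 * sum(w_j * w_j for w_j in w)
    loss = 0.0
    for x, y, c in examples:
        margin = y * (sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
        # The hinge loss of each instance is weighted by its own cost c.
        loss += c * max(0.0, 1.0 - margin)
    return regularizer + C * loss


# Hypothetical (features, label, cost) triples.
examples = [([2.0], +1, 100), ([0.5], +1, 100), ([-0.5], -1, 1)]
objective = cost_sensitive_svm_objective([1.0], 0.0, examples)
print(objective)
```

The second and third examples violate the margin by the same amount, yet the second contributes 100 times more to the objective because of its cost.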
0:07:16and the second cost-sensitive classifier
0:07:19is cost-sensitive AdaBoost
0:07:21so here we show the weight update
0:07:24the instance weight update rule
0:07:27in AdaBoost
0:07:29and uh
0:07:30the weight update of an instance will be proportional to the cost of this instance
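
Several cost-sensitive AdaBoost variants exist, and the transcript does not say which one is on the slide. The sketch below is a simplified assumption: the usual AdaBoost exponential update with each instance's new weight multiplied by its cost before normalization. All names and numbers are hypothetical.

```python
import math


def cost_sensitive_update(weights, costs, labels, predictions):
    """One boosting round: return (new_weights, alpha)."""
    # Weighted error of the weak learner under the current distribution.
    err = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0.0 < err < 1.0:
        raise ValueError("weighted error must be strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Standard exponential update, scaled by each instance's cost.
    raw = [c * w * math.exp(-alpha * y * h)
           for w, c, y, h in zip(weights, costs, labels, predictions)]
    total = sum(raw)
    return [r / total for r in raw], alpha


weights = [0.2] * 5
costs = [100, 2, 1, 1, 1]
labels = [1, 1, -1, -1, 1]
predictions = [-1, -1, -1, -1, 1]  # the first two instances are misclassified
new_weights, alpha = cost_sensitive_update(weights, costs, labels, predictions)
print(new_weights, alpha)
```

The two misclassified instances start with equal weight, so after the update their weights differ exactly by the ratio of their costs, and the next weak learner is pushed toward the high-cost one.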
0:07:41okay
0:07:41the second
0:07:43important kind of information is tag correlation
0:07:47in some previous work the tag annotation task is
0:07:51separated into several
0:07:53binary classification problems
0:07:55so uh
0:07:57this assumes that the tags are independent
0:08:02so
0:08:03the tag correlation information is lost
0:08:06for example we know that hip hop and rap often co-occur
0:08:11for
0:08:11for example in
0:08:13our database
0:08:15uh
0:08:16we count that
0:08:17they co-occur
0:08:19one hundred and sixty times
0:08:22and there are only seventy and
0:08:25thirty six times that
0:08:27they occur
0:08:28alone
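
Counts of this kind can be obtained with a few lines of code. The sketch below uses a tiny made-up list of tag sets, not the database from the talk, and `cooccurrence` is a hypothetical helper.

```python
def cooccurrence(clip_tags, tag_a, tag_b):
    """Return (both, only_a, only_b) counts over a list of tag sets."""
    both = only_a = only_b = 0
    for tags in clip_tags:
        has_a, has_b = tag_a in tags, tag_b in tags
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
    return both, only_a, only_b


# Hypothetical annotations, one set of tags per clip.
clip_tags = [
    {"hip hop", "rap"},
    {"hip hop", "rap", "male"},
    {"rap"},
    {"rock"},
    {"hip hop"},
    {"hip hop", "rap"},
]
both, only_hip_hop, only_rap = cooccurrence(clip_tags, "hip hop", "rap")
print(both, only_hip_hop, only_rap)
```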
0:08:30so we propose cost-sensitive stacking to exploit
0:08:34tag count and correlation information
0:08:36jointly
0:08:39so uh
0:08:41here we show how cost-sensitive stacking works
0:08:47in the first stage we train a cost-sensitive tag classifier
0:08:53for each tag
0:08:55and
0:08:56the stacking classifiers use the outputs of all tag classifiers
0:09:01as
0:09:01their
0:09:02inputs
0:09:04and we use a linear classifier as the stacking classifier
0:09:08so if we learn
0:09:11uh so we can learn the
0:09:14weights w here
0:09:16and if
0:09:17w i j is greater than zero
0:09:20then it means
0:09:22tag
0:09:22j is positively correlated to tag i
0:09:27so
0:09:28the tag correlation information can be
0:09:31exploited by the stacking classifier
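
The second stage can be sketched as a linear combination of the first-stage outputs, one row of weights per tag. In the talk the weights are learned by the stacking classifier. Here the matrix `W`, the biases, and the first-stage scores are hand-picked hypothetical values, chosen only to show how a positive w_ij lets tag j support tag i.

```python
def stack(first_stage, W, b):
    """Second-stage score for tag i: sum_j W[i][j] * first_stage[j] + b[i]."""
    return [
        sum(w_ij * s_j for w_ij, s_j in zip(W_i, first_stage)) + b_i
        for W_i, b_i in zip(W, b)
    ]


tags = ["hip hop", "rap", "piano"]
# Hypothetical weights: hip hop and rap support each other, piano opposes both.
W = [
    [1.0, 0.5, -0.2],
    [0.5, 1.0, -0.2],
    [-0.1, -0.1, 1.0],
]
b = [0.0, 0.0, 0.0]

first_stage = [0.2, 2.0, -1.0]  # weak hip hop score, strong rap score
second_stage = stack(first_stage, W, b)
print(dict(zip(tags, second_stage)))
```

The weak first-stage score for hip hop is raised in the second stage because the strongly predicted rap tag carries a positive weight toward it.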
0:09:36okay here we describe our experimental setup
0:09:42so uh
0:09:43our baseline is our winning method
0:09:46in the MIREX
0:09:47two thousand nine audio tagging task
0:09:50this method does not use cost-sensitive learning
0:09:53and
0:09:54only uses binary classifiers
0:09:57and our
0:09:57experiments basically follow the MIREX
0:10:02two thousand nine setup
0:10:04we use the same forty five tags
0:10:07and we download the audio clips from
0:10:10the MajorMiner website
0:10:12which is a web-based music game
0:10:15we have more details
0:10:17in the paper
0:10:19and
0:10:20the parameters can be
0:10:22selected based on inner cross-validation on the training data
0:10:26and we repeat
0:10:28cross-validation one hundred times
0:10:32okay here we show our experimental results
0:10:35audio annotation is evaluated by the AUC per clip
0:10:41that
0:10:41that is
0:10:42given a clip we want the correct
0:10:46tags to be ranked higher
0:10:49and audio retrieval is
0:10:52evaluated per tag
0:10:53using AUC and F-measure
0:10:56that is given a tag we want the
0:10:59correct instances to be ranked higher
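
AUC measures how often a correct item is ranked above an incorrect one. The sketch below is a direct pairwise implementation with ties counted as one half. The scores and labels are made up, and the function is a generic illustration rather than the evaluation script used in the experiments.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y != 1]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Per-clip AUC: scores of four tags for one clip, 1 marks a correct tag.
clip_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
# Per-tag AUC: scores of four clips for one tag, 1 marks a relevant clip.
tag_auc = auc([0.7, 0.6, 0.2, 0.1], [1, 1, 0, 0])
print(clip_auc, tag_auc)
```

The same function serves both tasks. Only the axis along which scores are collected changes, across tags for annotation and across clips for retrieval.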
0:11:03so we have used different classifiers
0:11:07in the first
0:11:08stage
0:11:09including AdaBoost and SVM
0:11:12and the ensemble is a combination of these two
0:11:16these two classifiers
0:11:18and we have compared four methods
0:11:21the first is our MIREX baseline and the second one is
0:11:27uh
0:11:28cost-sensitive learning only
0:11:31the third is
0:11:32stacking only and the fourth is our proposed
0:11:35cost-sensitive stacking
0:11:38as we can see
0:11:40in all cases cost-sensitive stacking performs
0:11:44better
0:11:45than
0:11:46the other methods
0:11:52and
0:11:53using stacking only and
0:11:55cost-sensitive learning only
0:11:58will be better than
0:12:01our MIREX baseline
0:12:07okay so our conclusion is
0:12:09uh
0:12:10tag count and tag correlation are two important kinds of information for social tag prediction
0:12:16on multimedia data
0:12:19and we have first formulated the
0:12:22audio tag prediction task as a cost-sensitive classification problem
0:12:27to minimize
0:12:28the misclassified tag count
0:12:32and we have then formulated the task as a cost-sensitive multi-label classification problem
0:12:39and proposed
0:12:40cost-sensitive stacking to exploit
0:12:43tag count and correlation information jointly
0:12:47and the experiment
0:12:49experimental results show that the new approach
0:12:53outperforms our MIREX two thousand nine winning method
0:12:59so here we have submitted a journal paper so please see the journal
0:13:06version of this paper for
0:13:08more details and the extension work
0:13:12of this idea
0:13:15okay thank you
0:13:37uh
0:13:39yeah I have tried two methods, the first
0:13:43method is to transform the
0:13:46outputs of the SVM and AdaBoost into probabilities and then average the
0:13:50probabilities
0:13:52and
0:13:52uh
0:13:53the second method is to transform the prediction scores into rankings
0:13:59and the final decision uses the
0:14:01average ranking
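
The two ways of combining the classifiers can be sketched as follows. The answer does not say how scores are mapped to probabilities, so the logistic function used here is only a stand-in assumption. The scores and all function names are hypothetical.

```python
import math


def to_probability(score):
    """Stand-in mapping from a raw score to (0, 1); a logistic function is assumed."""
    return 1.0 / (1.0 + math.exp(-score))


def average_probability(svm_scores, ada_scores):
    """Method 1: convert both outputs to probabilities, then average them."""
    return [(to_probability(s) + to_probability(a)) / 2.0
            for s, a in zip(svm_scores, ada_scores)]


def to_ranks(scores):
    """Rank 1 is the highest score; ties are broken by original order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def average_rank(svm_scores, ada_scores):
    """Method 2: convert both outputs to ranks, then average them (lower is better)."""
    return [(r1 + r2) / 2.0
            for r1, r2 in zip(to_ranks(svm_scores), to_ranks(ada_scores))]


svm_scores = [2.0, -1.0, 0.5]
ada_scores = [0.3, 0.1, 0.9]
avg_prob = average_probability(svm_scores, ada_scores)
avg_rank = average_rank(svm_scores, ada_scores)
print(avg_prob, avg_rank)
```

Rank averaging ignores the scale of each classifier's output, which helps when the two score ranges are not directly comparable.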
0:14:09thank you