0:00:15 | first i will give a quick overview of i-vectors |
---|
0:00:19 | after that i will outline some of the methods for handling the uncertainty of the i-vector estimates caused by the limited duration of recordings |
---|
0:00:34 | then i will describe a simple preprocessing weighting scheme which uses duration information as a measure of i-vector reliability |
---|
0:00:49 | then i will describe some experiments and the results |
---|
0:00:54 | followed by concluding remarks |
---|
0:01:00 | in theory each decision should be made dependent on the amount of data available |
---|
0:01:07 | and the same should hold also in the case of speaker recognition since |
---|
0:01:13 | we usually have recordings of different lengths |
---|
0:01:19 | in practice this is usually not the case, mainly for practical reasons, since taking the uncertainty into account increases the complexity of the model and the computational complexity |
---|
0:01:34 | and also the gain in performance can be not so significant, especially if the recordings are sufficiently long |
---|
0:01:49 | in the case of the i-vector challenge, the i-vectors were extracted from recordings of different lengths |
---|
0:02:00 | and the duration follows a log-normal distribution, which suggests that we should see some improvement if the duration information is taken into account |
---|
0:02:18 | the i-vector is defined as the MAP point estimate of the hidden variable of a factor analysis model |
---|
0:02:24 | and it serves as a compact representation of a speech utterance |
---|
0:02:31 | the posterior covariance encodes the uncertainty of the i-vector estimate, which is caused by the limited duration of the recordings |
---|
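For reference, the posterior mentioned here has a standard closed form in the factor-analysis model; the notation below is my own, not from the talk ($T$ is the total-variability matrix, $\Sigma$ the UBM covariance, $N$ and $\tilde{F}$ the zeroth- and centred first-order Baum-Welch statistics):

```latex
\operatorname{Cov}(\phi \mid \mathcal{X}) = \left( I + T^{\top} \Sigma^{-1} N\, T \right)^{-1},
\qquad
\hat{\phi} = \operatorname{Cov}(\phi \mid \mathcal{X})\, T^{\top} \Sigma^{-1} \tilde{F}
```

Since $N$ grows with the amount of speech, shorter recordings give a larger posterior covariance, which is exactly the duration-induced uncertainty the talk refers to.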
0:02:46 | usually the uncertainty is discarded when comparing i-vectors, for example in the PLDA model |
---|
0:02:59 | nevertheless, some solutions have been proposed for taking the uncertainty into account, for example PLDA with uncertainty propagation |
---|
0:03:11 | where an additional noise term, which explicitly models the duration variability, is added to the model |
---|
0:03:22 | another one is score calibration using duration as a quality measure |
---|
0:03:28 | and yet another is called i-vector scaling, where the length normalisation is modified to account for the uncertainty of i-vectors |
---|
0:03:43 | those solutions, however, are not directly applicable, or at least not easily applicable, in the context of the i-vector challenge |
---|
0:03:53 | since the data needed for reconstructing the posterior covariance is not available |
---|
0:04:01 | and also there is no development data that could be used for |
---|
0:04:08 | optimising the calibration parameters |
---|
0:04:12 | so, is there another possibility to use duration information as a measure of i-vector reliability? |
---|
0:04:27 | prior to comparison, the i-vectors are usually preprocessed |
---|
0:04:37 | among the more common preprocessing methods are PCA, LDA and within-class covariance normalization, in which the basic step is to calculate the mean and the covariance matrix |
---|
0:04:55 | in those calculations we implicitly assume that all i-vectors are equally reliable |
---|
0:05:07 | to account for the difference in reliability of the i-vectors, we proposed a simple weighting scheme in which the contribution of each i-vector is multiplied by its corresponding duration |
---|
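The weighting scheme just described can be sketched as follows. This is a minimal sketch, not the authors' actual code; interpreting "contribution multiplied by duration" as duration-weighted mean and covariance estimates is my assumption:

```python
import numpy as np

def weighted_mean_cov(ivecs, durations):
    # ivecs:     (n, d) array, one i-vector per row
    # durations: (n,) array of utterance durations (unit is irrelevant after normalisation)
    w = np.asarray(durations, dtype=float)
    w = w / w.sum()                           # normalise the weights
    mu = w @ ivecs                            # duration-weighted mean
    centred = ivecs - mu
    cov = (w[:, None] * centred).T @ centred  # duration-weighted covariance
    return mu, cov
```

With equal durations this reduces to the ordinary (biased) mean and covariance, so the weighted preprocessing is a strict generalisation of the standard one.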
0:05:29 | so, to verify the soundness of the proposed idea, we implemented the baseline system, in which we compared the standard PCA with the weighted version of PCA |
---|
0:05:49 | the results showed that the weighted version of PCA produced slightly better results than the standard one |
---|
0:06:01 | we also wanted to try within-class covariance normalisation, but in order to apply it we need labeled data, which was not available in the challenge |
---|
0:06:21 | so we needed to perform unsupervised clustering; we experimented with different clustering algorithms, but in the end we selected k-means with cosine distance and four thousand clusters |
---|
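A minimal sketch of k-means with cosine distance (spherical k-means). The farthest-point initialisation here is my own choice for determinism, not necessarily what was used in the experiments:

```python
import numpy as np

def spherical_kmeans(X, k, n_iter=20):
    # normalise to unit length so cosine similarity becomes a dot product
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # farthest-point initialisation: greedily pick points far from those chosen
    idx = [0]
    for _ in range(k - 1):
        sims = X @ X[idx].T
        idx.append(int(sims.max(axis=1).argmin()))
    centroids = X[idx].copy()
    for _ in range(n_iter):
        labels = (X @ centroids.T).argmax(axis=1)  # assign to most similar centroid
        for j in range(k):
            members = X[labels == j]
            if len(members):
                c = members.sum(axis=0)
                centroids[j] = c / np.linalg.norm(c)  # renormalise the mean
    return labels, centroids
```

Renormalising each centroid keeps it on the unit sphere, which is what makes the dot product a valid cosine-similarity assignment in the next iteration.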
0:06:42 | unfortunately, the results were worse for within-class covariance normalization than for PCA, but at least the weighted version was slightly ahead of the standard one |
---|
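Within-class covariance normalization itself can be sketched as below; this is a standard formulation in which the cluster labels from the previous step play the role of speaker labels, and averaging the per-class covariances uniformly is an assumption on my part:

```python
import numpy as np

def wccn(ivecs, labels):
    # returns the WCCN projection matrix B; apply it as x' = B.T @ x
    # (equivalently, rows: X_projected = X @ B)
    classes = np.unique(labels)
    d = ivecs.shape[1]
    W = np.zeros((d, d))
    for c in classes:
        Xc = ivecs[labels == c]
        Xc = Xc - Xc.mean(axis=0)
        W += Xc.T @ Xc / len(Xc)     # per-class (biased) covariance
    W /= len(classes)                # average within-class covariance
    # Cholesky factor B with B @ B.T = W^{-1}
    return np.linalg.cholesky(np.linalg.inv(W))
```

After projection the average within-class covariance becomes the identity, which is the whitening effect WCCN is meant to provide.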
0:07:00 | we also tried several different classifiers, and the best results were achieved with logistic regression, but only after removing the length normalisation from the processing pipeline |
---|
0:07:17 | in that case, within-class covariance normalisation gave better results than PCA, and again the weighted version scored better than the standard one |
---|
0:07:35 | we tried to further improve the results by additional fine-tuning |
---|
0:07:42 | so we added the duration as an additional feature of the i-vectors, and we excluded clusters with a small score |
---|
0:07:52 | we reversed the roles of target and test i-vectors, and we fine-tuned the hyper-parameters of the logistic regression |
---|
0:08:04 | with this fine-tuning we were able to improve the previous result a little bit more, so this was also our best submitted result |
---|
0:08:20 | to conclude, we presented a simple preprocessing weighting scheme which uses duration information as a measure of i-vector reliability |
---|
0:08:33 | we achieved quite reasonable success with the clustering in the case of within-class covariance normalization, but nearly no success with clustering in the case of LDA, which suggests that LDA is more susceptible to labeling errors |
---|
0:08:56 | and as a last remark, we found out that length normalization does not help logistic regression |
---|
0:09:03 | thank you |
---|
0:09:21 | okay |
---|
0:09:31 | these are just empirical results, but maybe somebody can comment on that, i don't know |
---|
0:09:40 | nicole |
---|
0:09:46 | but we got the same results as with logistic regression |
---|
0:10:06 | did you iterate the clustering, or was it just one clustering stage? |
---|
0:10:11 | we tried different things, also to iterate the clustering, but it didn't succeed |
---|
0:10:26 | this was also set experimentally; we chose four thousand clusters because we didn't get any improvements by changing it |
---|