0:00:15 | the next presentation is a |
---|
0:00:17 | dual one, by two people from the NFI, who are both at the institute and both |
---|
0:00:21 | in the same room, |
---|
0:00:22 | working on more or less the same problem but with different approaches |
---|
0:00:27 | so we're going to talk about a database which may be relevant to forensic work |
---|
0:00:33 | the basic paradigm |
---|
0:00:36 | in forensic speaker recognition, or speaker comparison you might say, has, I think, been |
---|
0:00:44 | put forward by a few others earlier, but I summarize it here in a formula |
---|
0:00:49 | for people that like formulas |
---|
0:00:52 | so basically, what the judge or the jury wants, or should want, and what we have |
---|
0:00:57 | to tell them, is the posterior odds about the |
---|
0:01:03 | claim that the defendant is guilty or not |
---|
0:01:05 | and that can actually be factorized into |
---|
0:01:09 | two factors: |
---|
0:01:11 | the likelihood ratio, which is the first factor on the right-hand side, and the |
---|
0:01:16 | prior odds, and the idea is that the |
---|
0:01:19 | prior odds are determined by, |
---|
0:01:21 | well, somehow they have to be determined, and we say it's the court's |
---|
0:01:27 | job to do that, but they will be influenced by lots of other things and |
---|
0:01:31 | circumstances, which might include other evidence, evidence which is not related to the |
---|
0:01:38 | speech |
---|
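As a sketch of the formula being summarized here (notation assumed: E is the speech evidence, H_p and H_d the prosecution and defence hypotheses), the factorization described above is:

```latex
% Bayes' rule in odds form: posterior odds = likelihood ratio * prior odds.
% The expert reports the likelihood ratio; the prior odds are the court's job.
\frac{P(H_p \mid E)}{P(H_d \mid E)}
  = \underbrace{\frac{P(E \mid H_p)}{P(E \mid H_d)}}_{\text{likelihood ratio}}
  \times
  \underbrace{\frac{P(H_p)}{P(H_d)}}_{\text{prior odds}}
```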
0:01:40 | so this is just to give you an idea of what the framework is |
---|
0:01:43 | and there's a connection to the stuff we do in NIST evaluations; most people are maybe |
---|
0:01:49 | more familiar with the NIST questions, and I've summarized that here |
---|
0:01:53 | namely, in the forensic case you might say the judge or the jury will want |
---|
0:01:58 | to decide that the defendant is guilty if those posterior odds |
---|
0:02:03 | are higher than some threshold of reasonable doubt |
---|
0:02:06 | whatever the reasonable doubt should be, it's related to the |
---|
0:02:10 | cost function, you might say |
---|
0:02:13 | and in nist |
---|
0:02:15 | it is quite similar, except that there we work with the likelihood ratio itself, which |
---|
0:02:19 | is just the ratio of the posterior and the prior odds |
---|
0:02:23 | and that should be bigger than some threshold, and the threshold only depends on the |
---|
0:02:27 | cost function; or if you want to include the priors on the right-hand side, that is also possible |
---|
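A hedged sketch of the decision rule being compared here, in the usual NIST-style notation (assumed: C_miss and C_fa are the error costs, P_tar the target prior): the likelihood ratio is thresholded, and the threshold depends only on the cost function and the prior:

```latex
% Decide "same speaker" when the LR exceeds the Bayes threshold; moving the
% priors to the right-hand side, as mentioned, gives the second factor.
\mathrm{LR} = \frac{\text{posterior odds}}{\text{prior odds}} > \theta,
\qquad
\theta = \frac{C_{\mathrm{fa}}}{C_{\mathrm{miss}}}
         \cdot \frac{1 - P_{\mathrm{tar}}}{P_{\mathrm{tar}}}
```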
0:02:32 | and |
---|
0:02:33 | if you have well-calibrated likelihood ratios, then your threshold will be ideal and you'll |
---|
0:02:38 | be at the |
---|
0:02:40 | point we know as minimum DCF. So that's the relation between likelihood ratios in |
---|
0:02:47 | forensic cases and likelihood ratios in the NIST case; the differences |
---|
0:02:51 | are, |
---|
0:02:52 | we'd say, |
---|
0:02:54 | that it's more about the circumstances, because everything is dependent on |
---|
0:02:59 | the information, and in a forensic case that really is the case: you have these |
---|
0:03:03 | weird samples that Joe showed us, and you don't have to give an expression or |
---|
0:03:08 | an error rate estimate and performance criteria for a general, |
---|
0:03:12 | a general case, or an average over many comparisons; no, you have to do it for this |
---|
0:03:17 | particular case |
---|
0:03:19 | so |
---|
0:03:21 | our approach would be: |
---|
0:03:23 | we need data which is similar to the case |
---|
0:03:26 | so what we've been doing, our approach to dealing with a specific case, is |
---|
0:03:31 | to make a database with lots of annotation, so that we can more or |
---|
0:03:35 | less select a sub-database |
---|
0:03:38 | which is as similar to the case as we can get |
---|
0:03:44 | and the database will also allow us to test whether circumstances, |
---|
0:03:50 | for instance language differences, actually matter, since we have all that information |
---|
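A minimal sketch of what selecting such a sub-database could look like, assuming a flat metadata table; the column names and values are illustrative assumptions, not the actual annotation scheme:

```python
# Hypothetical sketch: select a case-matched sub-database from the
# annotated metadata. Field names/values are assumed for illustration.
import pandas as pd

meta = pd.read_csv("metadata.csv")  # one row per recording (assumed layout)

case_like = meta[
    (meta["language"] == "Turkish")               # match the case language
    & (meta["channel"] == "telephone")            # match the recording channel
    & (meta["noise"].isin(["none", "a little"]))  # perceptual noise label
]
print(f"{len(case_like)} recordings match the case conditions")
```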
0:03:57 | so |
---|
0:03:58 | this is where we move on to the next speaker |
---|
0:04:02 | and it's gonna be tricky |
---|
0:04:06 | for this work |
---|
0:04:08 | right |
---|
0:04:12 | yes, so I'm David van der Vloed, also known as David the second |
---|
0:04:18 | or the first, depending on your perspective, of course |
---|
0:04:22 | I'll be talking |
---|
0:04:24 | to you about |
---|
0:04:25 | the database itself: how we created it, which metadata is included, |
---|
0:04:31 | and so on, |
---|
0:04:32 | so you get a sense of |
---|
0:04:35 | which restrictions there are |
---|
0:04:38 | using real data, some of the metadata is just uncontrollable, and some of it |
---|
0:04:44 | is |
---|
0:04:47 | there you go |
---|
0:04:49 | here's just a |
---|
0:04:50 | short overview |
---|
0:04:54 | out |
---|
0:04:56 | the thing to note here is that it's similar in setup to other databases that |
---|
0:05:00 | use real data |
---|
0:05:01 | such as the NFI database that you may know of, of ten |
---|
0:05:04 | years ago, and perhaps there are others that I don't know of because |
---|
0:05:09 | they're all secretive, or that I don't know of because I didn't find them |
---|
0:05:15 | but this one has six hundred speakers, so I hope |
---|
0:05:19 | it can be a contribution to the field |
---|
0:05:23 | what we want to do with it is mainly validation: we want to use automatic speaker recognition |
---|
0:05:29 | in casework; obviously we don't do that yet, and we need validation research, and |
---|
0:05:34 | I know there are people using ASR in casework |
---|
0:05:39 | and I feel, |
---|
0:05:40 | perhaps being a bit conservative, that I need realistic data for calibration, and |
---|
0:05:46 | otherwise |
---|
0:05:47 | there will be no real improvement, because the improvement in using ASR over, |
---|
0:05:53 | or next to, the human |
---|
0:05:55 | approach would, to me, be that you can actually measure reliability, and for |
---|
0:06:01 | that you really need realistic data |
---|
0:06:05 | so |
---|
0:06:07 | it's not our own data; we're not |
---|
0:06:10 | really the formal owner. The owner is the prosecution, and they gave us permission to |
---|
0:06:15 | collect the data from the police intercept data, and this has some restrictions; I'm sure |
---|
0:06:22 | the first question after this presentation will be a question regarding availability |
---|
0:06:27 | I'm happy to cooperate, I think, but it's not entirely in our hands |
---|
0:06:32 | and we only got permission under strict conditions, so we had to |
---|
0:06:37 | anonymize the data, so we have listened through all the data and nulled out |
---|
0:06:43 | names and |
---|
0:06:45 | stuff like that, and |
---|
0:06:51 | so |
---|
0:06:51 | what did we do? We received a lot of data |
---|
0:06:55 | it's all intercepted telephone conversations, and they're in stereo, so generally it's true that one speaker is |
---|
0:07:02 | in one channel and the other speaker in the other channel |
---|
0:07:07 | that's not always the case, because there's just so much data and a lot of cables |
---|
0:07:10 | there |
---|
0:07:11 | we split the stereo files in half and we uploaded them into the database, and |
---|
0:07:19 | this is the raw material; this is some hundred thousand audio files we get this |
---|
0:07:25 | way |
---|
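As a rough sketch of that splitting step (the file layout and audio library here are assumptions, not the actual pipeline):

```python
# Hypothetical sketch: split each stereo intercept recording into two
# mono files, one per channel, i.e. one per conversation side.
from pathlib import Path
import soundfile as sf  # assumed audio I/O library

def split_stereo(in_path: Path, out_dir: Path) -> None:
    audio, rate = sf.read(in_path)          # audio has shape (samples, 2)
    if audio.ndim != 2 or audio.shape[1] != 2:
        return                              # skip files that aren't stereo
    for ch, side in enumerate(("A", "B")):  # one speaker per channel
        sf.write(out_dir / f"{in_path.stem}_{side}.wav", audio[:, ch], rate)

for wav in Path("raw_intercepts").glob("*.wav"):  # assumed input folder
    split_stereo(wav, Path("split_mono"))
```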
0:07:26 | and |
---|
0:07:30 | we had some metadata to go with it |
---|
0:07:34 | just some general things |
---|
0:07:37 | and I want to stress that it's really realistic data, so it's actually |
---|
0:07:42 | been really intercepted |
---|
0:07:45 | I stress the point |
---|
0:07:47 | here |
---|
0:07:51 | which also means that a lot of the speakers in the database don't know they're recorded |
---|
0:07:57 | which, as you can imagine, |
---|
0:08:00 | is a major point in the privacy and the permissions that we got |
---|
0:08:04 | that we got |
---|
0:08:07 | to |
---|
0:08:08 | compile the database and to |
---|
0:08:11 | distribute it or not, and |
---|
0:08:13 | okay |
---|
0:08:15 | so it took some, |
---|
0:08:18 | it took some processing, which took about two years; we had the |
---|
0:08:22 | chance to hire people to null out |
---|
0:08:25 | the personal information, like names and addresses |
---|
0:08:29 | and to actually isolate the speakers, which is |
---|
0:08:34 | the most important part of the job, because we just got |
---|
0:08:38 | a whole, |
---|
0:08:41 | a big, big pile of audio files |
---|
0:08:45 | and they could use |
---|
0:08:49 | the telephone number to listen in to the files, and then they had to decide for |
---|
0:08:53 | themselves: okay, this is John, I think, and this is John's uncle, and okay, I |
---|
0:08:57 | get to know the people revolving around a telephone number |
---|
0:09:01 | and this is how a speaker ID was created, just |
---|
0:09:06 | through listening, through the telephone number, through the content of the audio |
---|
0:09:12 | and they added the metadata as they went |
---|
0:09:17 | so these people, native speakers we call them, |
---|
0:09:21 | they isolated these speakers, and |
---|
0:09:24 | they, |
---|
0:09:26 | well, whenever doubts arose, the recording was excluded |
---|
0:09:29 | but still, this is not a hundred percent: there could be twin brothers somewhere, one using |
---|
0:09:34 | the phone of his twin brother, and then there is some confusion |
---|
0:09:40 | so |
---|
0:09:41 | we like to call it ground truth by proxy; I mean, you can be quite sure |
---|
0:09:46 | but never a hundred percent sure about speaker ID |
---|
0:09:50 | and another thing |
---|
0:09:53 | they first chose the best ten recordings per speaker; then we lowered it to |
---|
0:09:57 | five, because we were concerned that the number of speakers would be too low in the |
---|
0:10:01 | end |
---|
0:10:01 | and |
---|
0:10:03 | they were instructed to select as |
---|
0:10:06 | diverse recordings as possible: if you take five recordings, all from the same day, |
---|
0:10:11 | talking to the same person, that's a little less interesting than |
---|
0:10:14 | three recordings of that type and two recordings |
---|
0:10:17 | with some whispering or a car or anything |
---|
0:10:24 | and those aims of five and ten |
---|
0:10:26 | weren't always met; perhaps it's my management capabilities, I don't know, but |
---|
0:10:30 | it just varies a lot: the mode is still five, so most speakers have five |
---|
0:10:36 | recordings, but there's even one with one hundred and thirty-three recordings |
---|
0:10:41 | which is kind of interesting in itself, but |
---|
0:10:45 | it just varies a lot |
---|
0:10:49 | so this is the anonymization; it just means listening through it and |
---|
0:10:55 | assessing and deciding what information could be traced to a real person, and that's just |
---|
0:11:01 | nulled out, so there are whole stretches of samples there that are just nulled, |
---|
0:11:06 | which aren't labelled |
---|
0:11:07 | so |
---|
0:11:08 | sometimes you just have to guess whether somebody didn't say anything, or |
---|
0:11:13 | somebody |
---|
0:11:15 | said something that's just nulled out |
---|
0:11:20 | and when in doubt, it's just: |
---|
0:11:22 | leave it out; so when they were in doubt whether this is really |
---|
0:11:27 | personal information, they just left it out |
---|
0:11:31 | okay, and these people also entered their own metadata, and of the metadata |
---|
0:11:35 | the single most important property is of course speaker ID |
---|
0:11:39 | and then all these other things, but these are all perceptual metadata, that is, |
---|
0:11:43 | assigned on the basis of listening |
---|
0:11:45 | so |
---|
0:11:48 | sometimes |
---|
0:11:49 | there are some |
---|
0:11:52 | subjective measures there, like amount of noise: they could choose between none, |
---|
0:11:57 | a little, and quite a lot |
---|
0:11:59 | but of course there was more than one |
---|
0:12:01 | native speaker doing this job, so it |
---|
0:12:06 | depends a bit on the person what they find quite a lot of |
---|
0:12:09 | noise, and we tried to regulate it, but it's never perfect; it's one of |
---|
0:12:13 | the quirks of listening, that subjectivity in judging this kind of metadata |
---|
0:12:23 | okay, this was the end of the job for them |
---|
0:12:27 | and as post-processing we |
---|
0:12:31 | anonymized the metadata, of course |
---|
0:12:33 | and |
---|
0:12:35 | the next step is something |
---|
0:12:38 | I pretend here that the database is all finished, but we're actually still working on the |
---|
0:12:43 | second step, which is to make a clean version, because for it to |
---|
0:12:47 | be comparable to |
---|
0:12:49 | forensic casework |
---|
0:12:51 | you want to leave out all the |
---|
0:12:53 | background stuff: background speakers, music, and such, like you would do with real case |
---|
0:13:00 | recordings. So I'm labelling all the parts where there are those kinds of background noises, so we |
---|
0:13:06 | have a |
---|
0:13:07 | dirty version and a clean version of the same database in the end |
---|
0:13:12 | here we go |
---|
0:13:14 | so this is the database in numbers, just a few to |
---|
0:13:18 | see, and as you can see there's a two-to-one ratio for male and female |
---|
0:13:22 | we had the native speakers prioritise males in the database, because |
---|
0:13:27 | males are |
---|
0:13:29 | way more frequent in casework than females |
---|
0:13:32 | and i'm sorry to say |
---|
0:13:34 | or perhaps not |
---|
0:13:37 | and just some |
---|
0:13:38 | some statistics |
---|
0:13:41 | this is interesting: the Dutch language landscape is not strictly monolingual; there are still quite |
---|
0:13:49 | sizeable minorities in Holland, mainly of Moroccan and Turkish descent |
---|
0:13:54 | which means we have some multilingual speakers, and |
---|
0:14:00 | they come in different flavours: there are speakers that |
---|
0:14:04 | speak a mix of Turkish, for instance, and Dutch in the same conversation, and there are |
---|
0:14:09 | speakers that will use Dutch in some conversations and Turkish in other conversations, and |
---|
0:14:16 | so we have |
---|
0:14:17 | quite some possibility to do cross-language research with this, and the first experiment |
---|
0:14:22 | that will be presented is of that type |
---|
0:14:27 | and there are some English recordings, but don't get your hopes up: there are |
---|
0:14:31 | only six speakers, and |
---|
0:14:34 | most of them are not native, and |
---|
0:14:37 | their English is like the English that I'm speaking now, but with their own |
---|
0:14:41 | accent |
---|
0:14:45 | so, the number of recordings: |
---|
0:14:50 | up to the one hundred and thirty-three recordings for the largest speaker, who |
---|
0:14:54 | is a |
---|
0:14:55 | big criminal in Holland |
---|
0:14:58 | of course I'm not allowed to |
---|
0:15:00 | say who it is |
---|
0:15:01 | and then the numbers in terms of trials, same-source trials and different-source trials, |
---|
0:15:06 | but I must admit the different-source trials are also cross-gender and cross-everything, |
---|
0:15:10 | so |
---|
0:15:12 | the actual usable number of different-source trials is probably a little lower |
---|
0:15:20 | this is the duration, to give you a sense of duration |
---|
0:15:24 | the |
---|
0:15:26 | the pink bars are actually the gross durations, and the blue bars are |
---|
0:15:32 | after speech activity detection |
---|
0:15:35 | and there are some |
---|
0:15:37 | unexplainable |
---|
0:15:38 | things there in the pink distribution |
---|
0:15:41 | for which I don't have an explanation |
---|
0:15:44 | you can see that the minimum duration for telephone conversations to actually make it into the |
---|
0:15:51 | database is thirty seconds, because below that there are just a lot of dial tones |
---|
0:15:56 | without an answer, and other |
---|
0:15:59 | rubbish |
---|
0:16:00 | and the maximum that I told the native speakers they could use was a |
---|
0:16:04 | conversation of ten minutes, because otherwise it would be too much work for just one |
---|
0:16:09 | recording |
---|
0:16:10 | however, my management capabilities |
---|
0:16:13 | failed me a bit, |
---|
0:16:15 | so there are still some recordings over six hundred seconds; they should have been excluded |
---|
0:16:20 | but are still there |
---|
0:16:23 | okay |
---|
0:16:25 | well, like I said, we intend to use it for validation research, and that divides |
---|
0:16:30 | into two different types: there's general validation, just |
---|
0:16:34 | choosing which algorithm is best for us, for our |
---|
0:16:39 | casework, |
---|
0:16:41 | and which calibration method; I'm happy to see that Niko is going to talk about different calibration types |
---|
0:16:47 | which will be |
---|
0:16:49 | applicable soon, hopefully |
---|
0:16:53 | and there's also, and that's more relevant, case-specific validation; that's |
---|
0:16:58 | all the variation that Mister Campbell talked about; that's real, there really |
---|
0:17:05 | is no |
---|
0:17:08 | case without something special to it, perhaps not as extreme as the examples we heard, |
---|
0:17:12 | but there's always something, which |
---|
0:17:16 | to me means that you need case-specific validation. So for every case you will |
---|
0:17:21 | have to define which data is representative for my unknown sample |
---|
0:17:26 | and which data is representative for my reference sample |
---|
0:17:30 | and |
---|
0:17:32 | this means |
---|
0:17:36 | we need data, basically, and |
---|
0:17:38 | this is why we did it |
---|
0:17:41 | I hope that our database will reflect a lot of cases |
---|
0:17:46 | of course this is only intercept data |
---|
0:17:49 | and the real monkey business, with screaming and yelling and running, is probably not |
---|
0:17:54 | in there |
---|
0:17:55 | so |
---|
0:17:57 | that will restrict which cases you can do |
---|
0:18:01 | so there are two solutions to broaden |
---|
0:18:05 | the type of cases you can do: one, find more data, and |
---|
0:18:10 | two, wait for you guys to have made an algorithm and find out that some |
---|
0:18:15 | conditions don't matter anymore |
---|
0:18:17 | so you can |
---|
0:18:19 | then choose the evaluation data a little less strictly |
---|
0:18:28 | of course i'm talking faster than the slides |
---|
0:18:31 | so with this database, I'm thinking of the mass of trials: you always try to find a |
---|
0:18:36 | lot of same-source trials and different-source trials, so those two score distributions will |
---|
0:18:40 | be |
---|
0:18:41 | well filled |
---|
0:18:43 | I still have five minutes left and David still has a |
---|
0:18:45 | part, so I'll |
---|
0:18:47 | skip to the last remark |
---|
0:18:51 | please contact me about the availability, but it will be hard, because |
---|
0:18:54 | it's not our own data and it's very sensitive, |
---|
0:18:58 | being six hundred intercepted speakers |
---|
0:19:00 | but you'll find my email address on the presentation, so please contact me |
---|
0:19:05 | okay |
---|
0:19:10 | ah, this was kind of |
---|
0:19:13 | expected |
---|
0:19:14 | we're running a bit late now |
---|
0:19:18 | right, selection |
---|
0:19:22 | so |
---|
0:19:23 | I'll be talking about exactly what's on the screen, if you can see anything from where you are |
---|
0:19:29 | so we did an experiment: we split, |
---|
0:19:31 | we take |
---|
0:19:33 | ten percent of the data set of the whole database |
---|
0:19:37 | so this is pretty preliminary, I might say, but the idea is: can we do some |
---|
0:19:41 | experiments, speaker recognition experiments, and see which influences |
---|
0:19:46 | are important |
---|
0:19:50 | this is some more motivation; I'll |
---|
0:19:53 | go on and tell you what we did. So we looked at the Turkish speakers, who are |
---|
0:19:58 | either speaking Turkish, or |
---|
0:20:01 | Dutch with a Turkish accent, or a mix of Dutch and |
---|
0:20:04 | Turkish |
---|
0:20:06 | and then here's a slide about |
---|
0:20:09 | the problem of this skewness of the availability, of the number of |
---|
0:20:16 | segments per speaker, and how to deal with that |
---|
0:20:19 | and I had a nice paraphrasing joke about the well-known quote, |
---|
0:20:26 | which I paraphrase as: some speakers are more equal than other speakers |
---|
0:20:32 | how to deal with the different amounts of trials |
---|
0:20:36 | George actually had a solution, |
---|
0:20:39 | an old solution, |
---|
0:20:42 | which indicated that you make a DET curve per speaker pair, basically; |
---|
0:20:46 | we implemented that, and you see the influence of that; you can read more about it in |
---|
0:20:51 | the paper |
---|
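As a hedged illustration of the underlying idea, here is one simple weighting variant (not necessarily the per-speaker-pair DET method referred to above): give every speaker pair the same total weight, however many trials it contributes:

```python
# Toy sketch: weight each trial by 1/(number of trials for its speaker
# pair), so a pair with 100 trials counts no more than a pair with one.
from collections import Counter

trials = [("s1", "s2", 0.7), ("s1", "s2", 0.9), ("s3", "s4", 0.1)]  # toy scores

def pair(a: str, b: str) -> tuple[str, str]:
    return (min(a, b), max(a, b))

counts = Counter(pair(a, b) for a, b, _ in trials)
weighted = [(a, b, s, 1.0 / counts[pair(a, b)]) for a, b, s in trials]
# error rates would then be computed from these weights, not raw counts
```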
0:20:55 | let me quickly go to |
---|
0:20:57 | the |
---|
0:21:00 | effect of, |
---|
0:21:01 | say, |
---|
0:21:02 | the speaker population for a commercial speaker recognition system. So here we used a commercial speaker |
---|
0:21:09 | recognition system that can do some calibration, and what you see here |
---|
0:21:14 | is that if you give the |
---|
0:21:17 | recognition system some additional material, so those were forty-five speakers outside the test database |
---|
0:21:24 | used here, you can go |
---|
0:21:27 | from very badly calibrated, the top line with Cllrs above one, which is just |
---|
0:21:32 | useless, to the lower line, where you see that the Cllr, which is a |
---|
0:21:37 | measure of calibration and discrimination, is actually quite close to the minimum attainable Cllr |
---|
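For reference, the Cllr mentioned here is commonly defined over the same-source (ss) and different-source (ds) trial likelihood ratios as:

```latex
% Cllr: a measure of both discrimination and calibration of the LRs;
% well-calibrated LRs bring Cllr close to the minimum attainable value.
C_{\mathrm{llr}} = \frac{1}{2} \left(
    \frac{1}{N_{ss}} \sum_{i \in ss} \log_2\!\left(1 + \frac{1}{\mathrm{LR}_i}\right)
  + \frac{1}{N_{ds}} \sum_{j \in ds} \log_2\!\left(1 + \mathrm{LR}_j\right)
\right)
```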
0:21:46 | so this shows that the system that we used, and you can read in the paper |
---|
0:21:50 | more about the system, |
---|
0:21:53 | did work, and the DET curves are more or less the same, so this is really |
---|
0:21:57 | a case where the reference population only matters for calibration |
---|
0:22:02 | but I wasn't going to tell you about this at a more detailed level; |
---|
0:22:06 | this is just an answer to one reviewer: well, what about the distributions? Well, here are the |
---|
0:22:10 | distributions |
---|
0:22:12 | they're also in the paper |
---|
0:22:15 | this is my final slide already |
---|
0:22:17 | so I'm working towards finishing in time |
---|
0:22:22 | you see a number of figures |
---|
0:22:26 | showing different tests |
---|
0:22:28 | that you can do with the database. So out of this ten percent we took |
---|
0:22:31 | only the Turkish speakers |
---|
0:22:33 | we first looked at: what if train and test are both Turkish? And you see several |
---|
0:22:38 | performance measures and some statistics about the number of trials |
---|
0:22:44 | and the next thing you can do is: what if both train and test |
---|
0:22:49 | samples, |
---|
0:22:51 | or trace and reference you might say, |
---|
0:22:54 | are both Dutch-speaking but with a Turkish accent |
---|
0:22:59 | and the numbers actually, |
---|
0:23:01 | they vary a bit, so the equal error rate goes up, but whether |
---|
0:23:05 | that's significant I don't know; there aren't that many speakers in our |
---|
0:23:08 | subset |
---|
0:23:10 | but it might be more interesting to look at the fourth line where |
---|
0:23:16 | what is indicated there is that we |
---|
0:23:18 | train with |
---|
0:23:20 | speakers talking Turkish and test with speakers speaking Dutch but with a Turkish accent, or |
---|
0:23:26 | the other way around, so these two cases |
---|
0:23:32 | and then the most |
---|
0:23:36 | interesting thing there, I think, is that nothing happens, or not so much happens, with |
---|
0:23:40 | the equal error rate; it stays more or less in the same ballpark, fifteen point eight |
---|
0:23:44 | percent |
---|
0:23:45 | as in the first two lines, but the calibration |
---|
0:23:50 | suffers; but it doesn't really suffer that much, it is actually comparable to what happens |
---|
0:23:56 | with the |
---|
0:23:59 | Turkish speakers speaking |
---|
0:24:01 | Dutch, so |
---|
0:24:03 | there, |
---|
0:24:04 | from this data, it looks like calibration is suffering from speakers speaking a second |
---|
0:24:09 | language |
---|
0:24:12 | but |
---|
0:24:14 | the cross-language effect is modest in |
---|
0:24:16 | making things worse. I'd like to |
---|
0:24:21 | show you the figures for you to look at yourselves |
---|
0:24:26 | I think the general conclusion from the figures is that there's quite a lot of variability, |
---|
0:24:29 | so things depend a lot on how you set up experiments; at least this data |
---|
0:24:35 | allows you to do those kinds of experiments |
---|
0:24:40 | but I would like to conclude more with the general idea of this |
---|
0:24:43 | work of collecting the database, which is that, |
---|
0:24:47 | first of all, I hope to have shown that |
---|
0:24:50 | this kind of data is necessary, both for answering questions like: what are the error |
---|
0:24:55 | rates of the methods for this particular case, under one of |
---|
0:24:59 | these specific outward conditions, |
---|
0:25:02 | but you could also use the data, and this is not shown in this work: |
---|
0:25:05 | you can use the data to actually make a case-specific calibration. So once |
---|
0:25:10 | you know which |
---|
0:25:12 | factors are influential and which are not, then you can select, |
---|
0:25:17 | make a selection according to those factors, |
---|
0:25:20 | and use that for case-specific calibration |
---|
0:25:24 | and I have very briefly shown an experiment there with the cross- |
---|
0:25:31 | language data |
---|
0:25:33 | okay, this is it |
---|
0:25:48 | thank you for the talks, the two Davids. I would like to |
---|
0:25:52 | ask you about the level of precision of your tagging in the metadata |
---|
0:25:56 | and the reason I'm asking is because in Australia we have a very similar multilingual |
---|
0:26:01 | context, with waves of immigrants from certain countries when there are troubles; for example, |
---|
0:26:07 | whenever there's a war in Lebanon we have the immigrant waves speaking |
---|
0:26:13 | English with an Arabic accent |
---|
0:26:15 | now firstly, that level of accentedness varies, as we all know |
---|
0:26:20 | and secondly, after twenty-five years we have a second generation of native speakers |
---|
0:26:26 | who don't speak English with an accent but speak their own idiolect of English, often |
---|
0:26:31 | as native speakers. Do you account for that kind of difference? |
---|
0:26:37 | yes, we have not only a language field |
---|
0:26:40 | but also a nativeness field, so it will be annotated whether this is a native |
---|
0:26:46 | speaker, or quite a good non-native speaker, or a bad non-native speaker of the language |
---|
0:26:51 | spoken |
---|
0:26:52 | and this idiolect, this ethnolect, sociolect is the linguistic term, |
---|
0:26:58 | would count as native |
---|
0:26:59 | if it's second or third generation |
---|
0:27:03 | yes, definitely |
---|
0:27:07 | and was each speaker confined to one handset, |
---|
0:27:13 | or did you have speakers that went across handsets? |
---|
0:27:21 | we have, um, the majority of speakers |
---|
0:27:25 | are known by telephone number, so |
---|
0:27:29 | it's from the same number, but there are also some speakers that are using one phone |
---|
0:27:34 | in one recording and, for instance, a landline in another recording |
---|
0:27:41 | I don't have the numbers |
---|
0:28:15 | in the experiments that I showed you, the non-target conditions are always the same as the target condition |
---|
0:28:20 | so |
---|
0:28:22 | actually, I'd make the argument that it doesn't make any sense to have different |
---|
0:28:26 | conditions there, because the conditions, this is sort of the conditioning |
---|
0:28:30 | in the likelihood ratio, so you shouldn't |
---|
0:28:33 | have the numerator with different conditions than the denominator; I don't believe that makes |
---|
0:28:39 | too much sense |
---|
0:28:49 | well, I would like to put a question to both speakers, |
---|
0:28:53 | the first part for David the second |
---|
0:28:56 | my feeling is that what the calibration shows is exactly the limit of speaker recognition |
---|
0:29:04 | as long as we are not able, with our systems, to detect whether we need |
---|
0:29:09 | new data |
---|
0:29:11 | to do recalibration |
---|
0:29:13 | if we can't do that, calibration is just the limit of the performance of the system |
---|
0:29:23 | I'm not sure if I entirely answer your question, but I |
---|
0:29:27 | agree that we can test whether particular conditions, like noise or whatever, |
---|
0:29:32 | make a difference or not; we should be able to do that with this data |
---|
0:29:37 | but if there is a new condition, |
---|
0:29:40 | there can always be a new condition, and |
---|
0:29:43 | if we don't know whether it's of influence, we can't say whether we actually have |
---|
0:29:47 | matching data; I agree that that's just further work |
---|
0:29:51 | we should work on automatic detection |
---|
0:29:54 | of the case where we are |
---|
0:29:59 | in an unknown condition |
---|
0:30:01 | if we are not able to describe a condition, factor by factor, and to |
---|
0:30:05 | decide, and to give to the user, |
---|
0:30:07 | the probability of being compliant with the training set, |
---|
0:30:12 | we cannot use our system in forensic conditions, because, as Joe said, there is |
---|
0:30:18 | a huge amount, and we know that, a huge amount of conditions in forensics |
---|
0:30:22 | so the way I would approach it is not by doing everything completely automatically, but actually |
---|
0:30:27 | having the forensic expert listen to the data, and that's no problem as it is a |
---|
0:30:31 | limited amount of data |
---|
0:30:32 | and the forensic expert can say something sensible, like: well, this is very much like |
---|
0:30:36 | what I've heard before, or: well, there is an enormous buzz, or there is an enormous |
---|
0:30:41 | hiss there that I haven't seen before, so I |
---|
0:30:44 | think we should not leave it all to the system |
---|
0:30:50 | David, I agree with you completely: we need the human, the human expert, for |
---|
0:30:54 | that, at least to begin with, but we need to feed the human expert with |
---|
0:30:59 | information about the system |
---|
0:31:01 | and at the moment, as long as we don't know exactly |
---|
0:31:05 | which information our system is using, we can't explain to the human expert how to define what |
---|
0:31:13 | is an |
---|
0:31:15 | unknown condition for the system and what is not. It's not enough; you have some very interesting labels |
---|
0:31:20 | here, but as |
---|
0:31:24 | Michael was saying in the previous question, we always have some question about language: what is |
---|
0:31:31 | exactly the definition of the language, of the idiolect; what is the definition of the conditions, |
---|
0:31:36 | of the distance, |
---|
0:31:38 | is it in the conditions? |
---|
0:31:40 | and we should work with our automatic systems to determine the sensitivity of our system |
---|
0:31:45 | to each of these |
---|
0:31:46 | factors |
---|
0:31:47 | knowing that the human expert could use the human brain to give a probability |
---|
0:31:54 | and I hope we will, we will do that, but |
---|
0:32:03 | just a quick clarification: the underlying assumption, I think, that most people are thinking about |
---|
0:32:08 | is that when it's intercepted, it's a telephone call, but I think in a |
---|
0:32:12 | lot of forensic applications you have a confidential informant wearing a body-worn microphone |
---|
0:32:18 | and in those cases, with the microphone, you have lots of issues with the clothing rubbing against |
---|
0:32:23 | the microphone. So could you just say whether the audio that you have is all |
---|
0:32:28 | telephone calls |
---|
0:32:30 | or not, |
---|
0:32:31 | and do you have any plans to explore the body-worn type of scenarios? Because |
---|
0:32:35 | that is actually a real challenge as well |
---|
0:32:40 | this is only telephone speech, and we are planning to expand our data |
---|
0:32:47 | collection to, |
---|
0:32:49 | yes, more of this mismatch, but in Holland intercepted telephone speech is really the majority |
---|
0:32:56 | of the data, so we're covering quite a lot already |
---|
0:32:59 | but those kinds of circumstances, a parked car or anything, |
---|
0:33:05 | we need that data, that's true |
---|