Good morning, ladies and gentlemen,
welcome to the third day of your Odyssey Workshop.
Out of fifty-one papers, twenty-seven have been presented over the last two days, and we have another twenty-four to go, if I'm doing the calculation right. Yesterday the papers were mainly on i-vectors, so we can say yesterday was the i-vector day. Today, apart from one paper, there are two major sessions: one is language recognition evaluation, and the other is features for speaker recognition.
My name is Ambikairajah, I'm from the University of New South Wales in Sydney, Australia.
I have the pleasure of introducing to you our plenary speaker for today, Doctor Alvin Martin.
Alvin will speak about the NIST speaker recognition evaluation plan for two thousand twelve and
beyond.
He has coordinated the NIST series of evaluations since nineteen ninety-six in the areas of speaker recognition and language and dialect recognition. His evaluation work has involved the collection, selection and preprocessing of data, writing the evaluation plans, evaluating the results, coordinating the workshops, and many more tasks. He served as a mathematician in the Multimodal Information Group at NIST from nineteen ninety-one to two thousand eleven.
Alvin holds a Ph.D. degree in mathematics from Yale University. Please join me in welcoming Doctor Alvin Martin.
Okay! Thank you for that introduction, and thank you for the invitation to give this talk. I'm here to talk about the speaker evaluations. As you know, I have retired from NIST, though I remain associated with NIST for this workshop; however, I am here independently, so I'm responsible for everything I say and no one else is, and the opinions are all my own. I don't think I'm subject to any restrictions, but I'll keep an eye on the clock. Okay.
I'll stay closer to this. An outline of the topics I hope to cover: I'm going to talk about some early history, the things that preceded the current series of evaluations, and the things that happened during the early years of the evaluations, giving a kind of history of the evaluations and, in part, of past Odysseys. I should note my debt to Doug Reynolds, who gave a similar talk on these matters four years ago in Stellenbosch; I will update one of the slides that he presented there. I'm going to say some things from the point of view of an evaluation organiser about evaluation organisation, something about performance factors to look at, something about metrics, which we've already talked about at this workshop, and something about measuring progress over time. And when we talk about the future, I'll cover the SRE twelve evaluation process currently going on, which will take place at the end of this year, and then say something about what might happen after this year.
For the early history, there are a few things I would mention. One thing that fed interest in speaker recognition evaluation was the success of speech recognition evaluation back in the eighties and the early nineties. NIST was very much involved in that, and it showed the benefits of independent evaluation on common data sets; I'll show a slide of that in a minute.
I will mention the collection of various early corpora that were appropriate for speaker recognition: TIMIT, KING and YOHO, but most especially Switchboard. Switchboard was a multi-purpose corpus collected around nineteen ninety-one, and one of the purposes that they had in mind was speaker recognition: it collected conversations from a large number of speakers, with multiple conversations for each speaker. Its success led to the later collection of Switchboard-2 and similar corpora. And in fact, in the aftermath of Switchboard, the Linguistic Data Consortium was created in nineteen ninety-two with the purpose of supporting further speech and also text collections in the United States. Then came the first Odyssey, although it wasn't called Odyssey: it was Martigny in nineteen ninety-four, followed by several others; I will show pictures and make a few remarks on those. And there were early NIST evaluations. We date the current series of speaker evaluations from nineteen ninety-six, but there were evaluations in ninety-two and ninety-five. There was a DARPA program evaluation at several sites in ninety-two, and in ninety-five there was a preliminary evaluation that used Switchboard-1 data with six sites. But in these earlier evaluations the emphasis was rather on speaker identification, on closed-set rather than the open-set recognition that we've come to know in the current series of evaluations.
So here's this favourite slide on speech recognition, the benchmark test history. The word error rate is on the vertical scale, a logarithmic scale, starting from nineteen eighty-eight, and this shows the best system performance for various evaluations and various conditions in successive years, or the years when evaluations were held. The point, of course, is the big fall in error rates when multiple sites participated on common corpora: with reasonably fixed conditions we could see that progress was evident, especially in the early series shown here. This became the model for the evaluation cycle of research, data collection, evaluation and demonstrated progress, and it gave inspiration to other evaluations, in particular speaker recognition.
Okay, so now let's take a walk down memory lane. The first workshop of this series was Martigny in nineteen ninety-four. It was called the Workshop on Automatic Speaker Recognition, Identification and Verification, and it was the very first of this series. It was reasonably well attended, but not as well as this one. There were various presentations, but there were many different corpora and many different performance measures, and it was very difficult to make meaningful comparisons. I show here one of the papers that is of interest from the NIST evaluation point of view: a paper on public databases for speaker recognition and verification was given there.
Another of the early ones was Avignon, nineteen ninety-eight. Speaker Recognition and its Commercial and Forensic Applications is what it was called, also known as RLA2C from the French title. One observation about the talks there is that TIMIT was a preferred corpus, though for many it was too clean and too easy a corpus; I remember Doug commenting that he didn't want to listen anymore to papers that described results on TIMIT. It was also characterized by sometimes bitter debate over forensics and how good a job forensic experts could do at speaker recognition. There were several NIST speaker evaluation related papers, actually three of them that were combined into a paper in Speech Communication. Of the three presentations, perhaps the most memorable was the one by George Doddington, who told us all how to do speaker recognition evaluation. This was a talk that laid out various principles, and most of those principles have been kept and followed in our evaluation series. It included a discussion of the golden rule of thirty.
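As a reminder, here is a minimal sketch of the rule of thirty as it is usually stated (my own restatement, not taken verbatim from Doddington's talk): to be about ninety percent confident that the true error rate lies within plus or minus thirty percent of the observed rate, you need at least thirty errors of each kind.

```latex
% Sketch, assuming independent trials with true error probability p:
% the error count k ~ Binomial(n, p), so the relative standard deviation of
% the estimate \hat{p} = k/n is
\frac{\sigma_{\hat p}}{p} = \sqrt{\frac{1-p}{np}} \approx \frac{1}{\sqrt{np}} \quad (p \ll 1)
% Requiring the ~90% interval (about \pm 1.645\,\sigma) to stay within \pm 30\% of p:
\frac{1.645}{\sqrt{np}} \le 0.3 \;\Longrightarrow\; np \ge \left(\frac{1.645}{0.3}\right)^2 \approx 30
% i.e. roughly thirty expected errors are needed.
```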
Crete, two thousand one. In two thousand one it took the official name, A Speaker Odyssey: The Speaker Recognition Workshop; that was the first official Odyssey. It was characterized by more emphasis on evaluation: there was an evaluation track, which NIST was involved with. One of the presentations, the NIST presentation, which I think I gave, covered the history of the NIST evaluations up to that point, and I will actually show a slide from there later on. Another key presentation was one by several people from the Department of Defense: Phonetic, Idiolectal and Acoustic Speaker Recognition. These were ideas that were being pursued at the time and that were influencing the course of research at that point; I think George had a lot to do with that, and he had the paper on idiolectal techniques as well.
Toledo, in two thousand four, I think was really where Odyssey came of age. It was well attended; I think it probably remains the most highly attended of the Odysseys. It was the first Odyssey at which we had the NIST SRE workshop held in conjunction at the same location; that was to be repeated in Puerto Rico in two thousand six and in Brno in two thousand ten. It was also the first Odyssey to include language recognition. It had two notable keynotes on forensic recognition, a topic debated earlier in Avignon; these were two excellent, well received talks. And since then Odyssey has been an established biennial event that's been held every two years. At that one there was a presentation, which I think Mark Przybocki and I gave, called The Speaker Recognition Evaluation Chronicles; it was to be reprised about two years later in Puerto Rico. So, Odyssey has marched on.
Two thousand six was in Puerto Rico; I found, incredibly, a picture of it. Two thousand eight was Stellenbosch, hosted by Niko. Twenty ten, two years ago, we were in Brno; this is the logo, designed by Honza's children. And now we're here in Singapore,
and I think
before we finish this workshop we will hear about plans for Odyssey in twenty fourteen.
Okay! Let's move on to talk about organisation.
Let's think about evaluation from the point of view of the organisation responsible for organising the evaluations. The questions are: which task are we going to do, what are the key principles, and what were some of the milestones, which I'll take directly from the different evaluations. And I'll talk about participation.
So which speaker recognition problem? These are research evaluations, but what is the application environment in mind? Well, we know what we have done, but it wasn't necessarily obvious before we started. It could have been access control, the important commercial application; that might have formed the model. That raises the question of text-independent versus text-dependent: for some problems, I think, we should do text-dependent, and in access control the prior probability of the target tends to be high. There are forensic applications that could theoretically have been the model, or there's speaker spotting, which of course is the way we went. Inherently in speaker spotting the prior probability of the target is low, and it's text-independent. Well, in ninety-six, and we'll look at the ninety-six evaluation plan, it was decided that the NIST evaluations would concentrate on speaker spotting, emphasising the low false alarm area of the performance curve.
Some of the principles have been these: speaker spotting is our primary task; we are research oriented, application inspired but aimed at research; NIST traditionally, with some exceptions, doesn't do product testing, and the evaluations exist to advance the technology. We adopted the principle that we are going to pool across target speakers: people had to produce scores that work independently of the target speaker, rather than having a performance curve for every speaker and then just averaging the performance curves. And we emphasize the low false alarm rate region. Both scores and decisions were required, and in that context, as Niko suggested and as George is going to talk about tomorrow, calibration matters; it is part of the problem to address.
Some basics. Our evaluations have been open to all willing participants, to anyone who, you know, follows the rules: they could get the data, run all the trials, and come to the workshop. We are research oriented, and we have tried to discourage commercialised competition; we don't want people saying in advertisements that they topped the NIST evaluation. Each evaluation is governed by an evaluation plan that specifies all the rules and all the details of the evaluation; we'll look at one. Each evaluation is followed by a workshop. These workshops are limited to participants plus interested government organizations; every site or team that participates is expected to be represented, and at them we talk meaningfully about the evaluated systems. The evaluation datasets are subsequently published, made publicly available by the LDC. That remains the aim, and it remains the case that the SRE oh-eight data is currently available; in particular, sites getting started in research are able to obtain it. Typically we'd like to have not the most recent eval but the next most recent one, in this case oh-eight, available publicly. Probably next year SRE ten will be made available, and hopefully LRE oh-nine, to mention a language eval, will soon become available.
Okay. We have this web page for the speaker evals, a list of past speaker evals, and for each year you can click through and get the information on the evaluation for that year, starting in nineteen ninety-seven. For some reason the nineteen ninety-six evaluation plan had been lost, but I asked Craig to search for it and he found it, so I hope it will get put up. What went into that first evaluation plan of the current series? We said the emphasis would be on issues of handset variation and test segment duration, and the traditional goals were stated: to drive the technology forward, to measure the state of the art, and to find the most promising approaches.
The task has been detection of a hypothesized speaker in a segment of conversational telephone speech; that's been expanded, of course, in recent years. Interestingly, and you may be surprised to see this, the stated research objective was: given an overall ten percent miss rate for target speakers, minimize the overall false alarm rate. That is actually what we said in ninety-six. It is not what we emphasized in the years since, until this past year, when, as you heard, it was made the official metric of the BEST evaluation; Craig is going to talk about the BEST evaluation tomorrow. So in that sense we have come full circle.
But the plan also mentions that performance is expressed in terms of the detection cost function, and researchers were to minimize DCF. It also specified a research objective that I never emphasized, and that I don't think we'd achieve: uniform performance across all target speakers. There have been some investigations of classes of speakers, sometimes attributed to Doddington, of different types of speakers with different levels of difficulty.
So again, the task is: given a target speaker and a test segment, decide whether the hypothesis that the speaker is in the segment is true or false. We measured performance in two related ways: detection performance from the decisions, and detection performance characterized by the ROC, the word used at the time.
Here is the DCF formula we're all familiar with. We have the parameters: the cost of a miss, which was then set to ten, the cost of a false alarm, set to one, and the prior probability of the target, set to point zero one. In this old plan we also computed the DCF for a range of target priors, in a sense a promise we return to in the current evaluation.
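For reference, the detection cost as it is usually written in the NIST evaluation plans, with the parameter values just quoted (a restatement, not copied from the slide):

```latex
C_{\mathrm{Det}} = C_{\mathrm{Miss}} \cdot P_{\mathrm{Miss|Target}} \cdot P_{\mathrm{Target}}
                 + C_{\mathrm{FA}} \cdot P_{\mathrm{FA|NonTarget}} \cdot (1 - P_{\mathrm{Target}}),
\qquad
C_{\mathrm{Miss}} = 10,\; C_{\mathrm{FA}} = 1,\; P_{\mathrm{Target}} = 0.01
```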
Here we say our ROC will be constructed by pooling decision scores; these scores will then be sorted and plotted on PROC plots. PROCs are ROCs plotted on normal probability plots. So that was, in nineteen ninety-six, the term for what we now all refer to as DET plots. We talked about various conditions, results by duration and by other factors, and the task also required explicit decisions. And the scores of multiple target speakers are pooled before plotting the PROCs, which requires score normalization across speakers. That was the key emphasis that was new in the ninety-six evaluation.
Now we use the term DET curve, following the nineteen ninety-seven Eurospeech paper that introduced the term DET curve, for detection error tradeoff. I think George had a role in choosing that name; George was one person involved, and another, whom you may know, is Tom Crystal, in encouraging the use of this kind of curve, which linearizes the performance curves assuming normal score distributions. I was surprised to find that there's a Wikipedia page for DET plots; this is the page, showing the linearizing effect.
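As an illustration of what such a plot involves, here is a minimal sketch (my own, not code from the talk) that pools target and non-target scores, sweeps a threshold, and maps the error rates through the inverse normal CDF, which is what produces the straight-line behaviour when the score distributions are roughly normal. The function and variable names are mine.

```python
# Minimal DET-curve sketch (illustrative only).
# Assumes target and non-target scores have already been pooled across all
# speakers, which is why scores must be comparable (normalized) across speakers.
import numpy as np
from scipy.stats import norm

def det_curve(target_scores, nontarget_scores):
    """Return (p_miss, p_fa) swept over all score thresholds."""
    scores = np.concatenate([target_scores, nontarget_scores])
    labels = np.concatenate([np.ones(len(target_scores)),
                             np.zeros(len(nontarget_scores))])
    labels = labels[np.argsort(scores)]             # ascending score order
    n_tar, n_non = len(target_scores), len(nontarget_scores)
    # Threshold just above the i-th sorted score: scores[0..i] are rejected.
    misses = np.cumsum(labels)                      # targets rejected so far
    false_alarms = n_non - np.cumsum(1 - labels)    # non-targets still accepted
    return misses / n_tar, false_alarms / n_non

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic, roughly normal score distributions => near-straight DET line.
    tar = rng.normal(2.0, 1.0, 5000)
    non = rng.normal(0.0, 1.0, 50000)
    p_miss, p_fa = det_curve(tar, non)
    eps = 1e-6
    x = norm.ppf(np.clip(p_fa, eps, 1 - eps))       # false-alarm axis (probit)
    y = norm.ppf(np.clip(p_miss, eps, 1 - eps))     # miss axis (probit)
    print("EER approx:", p_miss[np.argmin(np.abs(p_miss - p_fa))])
```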
Okay, now let's talk about milestones. These are the ones I've settled on; others might choose different ones. We noted the earlier evaluations in ninety-two and ninety-five; the first in the current series was in ninety-six. Two thousand was the first time we had a language other than English: we used the AHUMADA Spanish data along with other data. Two thousand one, rather late given that we were in the United States, was the first evaluation with cellular phone data. In two thousand one we also started providing ASR transcripts, errorful transcripts. We had a kind of limited forensic evaluation using a small FBI database in two thousand two. Also in two thousand two there was the SuperSID workshop, one of the Johns Hopkins summer workshop projects; it followed the SRE and helped to advance the technology, and there have been other Baltimore workshops that followed up on speaker recognition; many people here participated. Two thousand five had the first multiple languages and bilingual speakers in the eval, and also the first microphone recordings of telephone calls, and therefore included some cross-channel trials. Interview data, as with the Mixer corpora, came in two thousand eight and was used again in two thousand ten. Two thousand ten involved the new DCF, a cost function stressing an even lower false alarm rate; a little more about that later. Also around two thousand ten a lot of things were coming in: we had been collecting high and low vocal effort data, and also some data to look at aging. Two thousand ten also featured HASR, the Human Assisted Speaker Recognition evaluation, a small test set that invited systems involving humans as well as automatic systems. Twenty eleven was BEST, where we had a broad range of test conditions, including added noise and reverb; Craig will be telling you about that tomorrow. Twenty twelve is going to involve target speakers defined beforehand.
Participation has grown. These numbers are all a little fuzzy in terms of what counts as a site and what counts as a team, but they are the ones that Doug used a few years ago, and I have updated them: fifty-eight in twenty ten. Doug at MIT provided, back when we were still doing physical notebooks, the cover pictures for the workshop notebooks. One thing to note, for understandable reasons I guess, is the big increase in participation after two thousand one. And the point I should make is that handling scores of participating sites becomes a management problem: it's a lot more work doing an evaluation with fifty-eight participants than with one dozen participants. And handling the scores of the participants, that is, handling the trial scores of all these scores of participants, is work too.
So this is one of Doug's cover slides, from two thousand four, showing the logos of all the sites, and in the centre is a DET curve for the condition of primary interest, the common condition, across systems. Here is another from two thousand six. Thanks to Doug for those efforts.
So here it is, the graph. Ninety-two and ninety-five were outside the series and had a limited number of participants. Twenty eleven was the BEST evaluation, which was also limited to a very few participants. Otherwise you can see the trend, particularly the growth after two thousand one, up to the fifty-eight in twenty ten. For the twenty twelve evaluation, registration is open, it has been open over the summer, the last count I had was thirty-eight, and I expect that's going to grow.
So this is a slide from the two thousand one presentation at Odyssey that described the evaluations up to that point. In the center is the number of target speakers and trials: the first, ninety-six, evaluation on Switchboard-1 had forty speakers, each with really a lot of conversations, and one of the trends in the later evals was toward more speakers, up to eight hundred by two thousand. In each case we defined a primary condition, whether we based that on the number of handsets in training, or whether we emphasized same or different phone number trials. We were looking at the issue of electret versus carbon button handsets, which was a big issue in the days of landline phones. So this specifies the primary conditions and evaluation features for those early evaluations.
Here is an attempt, without putting in numbers, to update some of that for the evaluations after two thousand one. We ended up calling the primary condition a common condition, one that everyone would have in their trials, and that is what appears in the official charts that we evaluate first, with all the other conditions alongside. When we introduced different languages, the common condition involved English only, and similarly for the kinds of handsets, so we knew how well a comparison would hold up. On the right you see some of the other features that came in anew: cellular data was added, multilingual data came in two thousand five, in two thousand six we had some microphone tests, and then things only got more complicated in the most recent evaluations.
In terms of common conditions, in two thousand eight we had eight common conditions, in two thousand ten we had nine, and two thousand twelve has five. So in oh-eight we contrasted English and bilingual speech, and contrasted interview and conversational telephone speech. In two thousand ten we were contrasting telephone channels, interview and conversational speech, and high, low and normal vocal effort. In two thousand twelve we get interview tests without noise or with added noise, and conversational telephone tests with added noise or collected in a noisy environment. Two thousand eight and ten involved interview speech collected over multiple microphone channels. Two thousand ten, of course, added high and low vocal effort, and aging with the Greybeard corpus; two thousand ten also introduced HASR. Two thousand twelve offers, most notably, target speakers specified in advance.
So, something about performance factors. I'll try not to say too much on this, but in terms of what we've looked at over the years, we've tried to look at demographic factors like sex: in general, with exceptions, performance has been a bit better on male speakers than on female. Early on we would look at age, and George more recently has done a study of age in a recent evaluation; he may say something about that tomorrow. Education we haven't looked into much. One very interesting thing in the early evaluations was to look at the mean pitch of people's test segments and training data. You can split the non-target trials by whether the two speakers' mean pitch is similar or not close, and look at the difference; and, even more interesting, you can look at target trials, where the mean pitch of the same person in test and training was or wasn't similar, and that mattered seriously as well.
Speaking style: conversational telephone versus interview, particularly; a lot of data has been collected on that. Vocal effort, more recently, with questions about how to define vocal effort and how to collect it. Aging, with the Greybeard corpus built from Switchboard speakers; with limited time, collecting such data is difficult. These are the intrinsic factors, related to the speaker. The other category, extrinsic factors, relates to the collection: microphone or telephone channel; for the telephone channel, landline, cellular, and VOIP, which is something we want to work on; in earlier times, carbon versus electret telephone handset types; the various microphones in the recent evaluations, with matched or mismatched microphones; placement of the microphone relative to the speaker; and background noise and room reverberation, which Craig will talk about tomorrow in the context of BEST.
And finally, parametric factors: the duration of training and test, and also the number of training segments, the training sessions. Evaluations that had eight sessions of training for telephone speech showed that this could greatly improve performance. We carried along, for many years, ten seconds as the short duration condition, but there has also been an increase in duration; especially in twenty twelve we're going to have lots of sessions and a lot of duration in training, and perhaps the emphasis, more than before, is on seeing the effects of multiple sessions and more data in the evaluation. English, of course, has been the predominant language, but several of the evaluations have included a variety of other languages, and one of the hopes is that performance will be as good in every language as in English. We have suspected that the reason overall performance has been better in English is the regularity and greater quantity of the data available in English. Cross-language trials are a separate challenge.
Okay, the metrics. I'll mention equal error rate: it is with us, it's part of our lives, in a sense. I have tried to discourage it, but it is easy to understand and in some ways needs the least amount of data. But, you know, it doesn't deal with calibration issues, and basically the operating point of equal error rate is not the operating point of applications: the prior probability of a target may be high or may be low, but it is not really equal. Decision cost has been our main bread and butter; we'll hear more about that. CLLR has been championed by Niko, and we talked about it on Monday. And we've talked about just looking at the false alarm rate at a fixed miss rate, which we returned to in BEST. So, you all know the decision cost function; it's a weighted sum with the specified parameters.
We normalize it by the cost of the best system that has no intelligence but simply always decides yes or always decides no, so that such a system scores one. The parameters that were mentioned in ninety-six remained the parameters from ninety-six through two thousand eight. In twenty ten we changed them for the main conditions, for the core and extended tests: the cost of a miss is one, the cost of a false alarm is one, and the target prior is point zero zero one.
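In the usual normalized form (a restatement, with the old and new parameter values as just described; the twenty ten figures are as I recall them from the SRE10 plan):

```latex
C_{\mathrm{Norm}} = \frac{C_{\mathrm{Det}}}
  {\min\bigl(C_{\mathrm{Miss}}\,P_{\mathrm{Target}},\; C_{\mathrm{FA}}\,(1-P_{\mathrm{Target}})\bigr)},
\qquad
\begin{aligned}
&\text{1996--2008: } C_{\mathrm{Miss}}=10,\ C_{\mathrm{FA}}=1,\ P_{\mathrm{Target}}=0.01,\\
&\text{2010: } C_{\mathrm{Miss}}=1,\ C_{\mathrm{FA}}=1,\ P_{\mathrm{Target}}=0.001 .
\end{aligned}
```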
That new operating point was meant to be a driving force, and a lot of people were upset; there was scepticism about whether good systems could be built for it. I think the outcome has been relatively satisfactory; I think people feel that they developed good systems for it.
Niko talked about CLLR, and he noted that George suggested limiting CLLR to the low false alarm region; CLLR itself covers a broad range of operating points. Fixed miss rate, as we said, has its roots in ninety-six but is used again in twenty twelve. It's practical for applications, and it may be viewed as the cost of listening to false alarms. For some conditions systems are really good and a ten percent miss rate may not be the right point; maybe a one percent miss rate is appropriate.
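For reference, CLLR as it is usually defined (a standard restatement, not taken from the talk), where the log-likelihood ratio for a trial is written as a lowercase ell; a system that outputs zero everywhere scores one bit, and lower is better:

```latex
C_{\mathrm{llr}} = \frac{1}{2}\left[
  \frac{1}{N_{\mathrm{tar}}}\sum_{i \in \mathrm{tar}} \log_2\!\bigl(1+e^{-\ell_i}\bigr)
  + \frac{1}{N_{\mathrm{non}}}\sum_{j \in \mathrm{non}} \log_2\!\bigl(1+e^{\ell_j}\bigr)
\right]
```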
Measuring progress: how do we do that? It's always difficult to assure test set comparability; even if you're collecting data the same way as before, is it really an equal test? Well, we encourage participants in the evaluations to run their prior systems, both old and new systems, on the new data, which gives us some measure. But even more, it's been a problem of changing technologies: in ninety-six landline phones predominated and we dealt with carbon and electret; now the world is largely cellular and we need to explore VOIP as the new channel. So the technology keeps changing, and with progress we keep making the test harder. We always want to add new evaluation conditions, new bells and whistles.
More channel types, more speaking styles, more languages; the size of the evaluation data increases. In two thousand eleven we explored externally added noise and reverb, and the noise will continue this year. So Doug attempted, in two thousand eight, to look at this: to follow fixed conditions over the course of years and look at the best system. And here is an updated version of his slide, showing, for more or less fixed conditions, the logarithm of the DCF, I believe, and how things have gone. The numbers go up to two thousand six; with the added data on the right, two thousand eight showed some continued progress on various test conditions. Then in twenty ten we threw in the new measure, and that really messes things up: the numbers went up, but they're not directly comparable. This is the current state of our history slide tracking progress.
So let's, you know, turn to the future: SRE twelve. The target speakers, for the most part, are specified in advance. They are speakers from recent past evaluations, I think something on the order of two thousand potential target speakers. So sites can know about these targets, they have all the data, and they can develop their systems to take advantage of that; all prior speech is available for training. There will be some new target speakers with training data provided at evaluation time; that's one check on the effect of providing the targets in advance. We also have test segments that will include non-target speakers. That is the big change for twenty twelve. Also, new interview speech will be provided, as was mentioned yesterday, in sixteen-bit linear PCM.
Some of the test phone calls are going to be collected specifically in noisy environments, and moreover we're going to have artificial noise, added noise, as was done in BEST, on some test segments; another challenge for this community. But will this be an effectively easier task because we define the targets in advance? It makes the trials partially closed-set: you are allowed to know not only about the one target but about these two thousand other targets; will that make a difference? We have open workshops where the participants debate these things, and last December this got debated: how much will this change the systems, and will it make the problem too easy? We could have conditions where people are asked to assume that every test segment speaker is one of the known targets, making things fully closed-set, or to assume no information about targets other than the one of the actual trial. Clearly speaker identification has been done in the past, so people do this, and their results will provide a basis for comparison; this is what's to be investigated, to be seen, in SRE twelve. In terms of metrics, log-likelihood ratios are now required, and since we're doing that, no hard decisions are asked for.
In terms of the primary metric, we could, you know, just use the DCF of twenty ten, but Niko pointed out that you're not really required to calibrate your log-likelihood ratios if they're only used at one operating point. So, to require calibration and stability, we're actually going to have two DCFs and take the average of them. Also, CLLR is an alternative, the Cllr-M10 that Niko referred to, which limits the CLLR to the trials in the high miss rate, low false alarm region.
So here is the formula for the cost function. We have three parameters, but we can work with just the one parameter beta, and the cost function is the simple average of the two, where the target priors are either point zero zero one, as in twenty ten, or point zero one. That will be the official metric.
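A sketch of that averaged metric as I understand it from the SRE twelve plan (with the miss and false alarm costs both equal to one, so the single parameter is beta, the ratio of non-target to target prior):

```latex
C_{\mathrm{Norm}}(P_{\mathrm{Target}}) = P_{\mathrm{Miss}}
  + \frac{1-P_{\mathrm{Target}}}{P_{\mathrm{Target}}}\, P_{\mathrm{FA}},
\qquad
C_{\mathrm{Primary}} = \tfrac{1}{2}\bigl[\, C_{\mathrm{Norm}}(0.001) + C_{\mathrm{Norm}}(0.01) \,\bigr]
```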
And finally, what does the future hold? That, of course, none of us knows. But the outcome of twenty twelve will determine whether this whole idea of prespecified targets is an effective one, one that doesn't make the problem too easy; now we're going to see. Artificially added noise will be included, and added noise and reverb may be part of the future.
HASR will be repeated in twelve. HASR ten had two tests, of fifteen or a hundred and fifty trials; HASR twelve will have twenty or two hundred. Anyone, you know, those with forensic interests, but anyone interested in involving human-assisted systems, is invited to participate in HASR twelve; I would like to get more participation this year. And otherwise things just get bigger: fifty or more participating sites, and data volumes now getting up to terabytes.
The biggest evaluation so far will be this year, in twenty twelve: because all the prior data is provided and will be run against the test data, the number of segments is in the hundreds of thousands and the number of trials is going to be in the millions, tens of millions, even hundreds of millions for the optional full set of trials. So you may well see the schedule moving to an every-three-years one, but the details really need to be worked out a lot more. I don't know, but I think that's where I'll finish.
Discussion.
[Audience question, partly inaudible, about the test segments per speaker and what is known about the speakers.]
Well, I didn't say that they follow a normal curve.
Right, so LDC has an agreement with the sponsors who support the LRE and SRE evaluations that we will keep the most recent evaluation set blind, hold it back from publication in the general LDC catalog until a new data set has been created. So part of the timing of the publication of those eval sets is the requirement to have a new blind set, which is the current evaluation set. We can raise that issue with them and give them the feedback we are getting. Right, so the SRE twelve eval set is just being finished now; as soon as that's finalised, SRE ten will be put into the queue for publication. It is sort of a rolling cycle. You'll have to ask the sponsor about that; I can't speak to their motivation, only that we're contractually obligated to delay the publication as discussed.
Right, well, LDC is also balancing the needs of the consortium as a whole, and so we are staging publications in the catalog, balancing a number of factors; speaker recognition and language recognition are just some of the communities that we support. I hear your concern; we can certainly raise this issue with the sponsors and see if there's anything we can provide. But at this point, I think this is the strategy that stands.