Well, the main thing I'm grateful for
is the award and this wonderful medal. It's an
amazing honor.
And it's particularly pleasing to me because I love this community. I love the Interspeech
community and the Interspeech conferences.
Some people in the audience, I don't know who ??, but she knows that
I'm particularly proud of my ISCA,
previously ESCA, membership number being thirty.
And here is a list of the conferences in the Interspeech series, starting
with the predecessor of the first Eurospeech, which was the meeting in Edinburgh in
1988.
All of the Eurospeech conferences, and all of the ICSLP conferences,
and since 2000 the Interspeech conferences;
the ones ?? in red are the ones I was actually at.
And there are another four where you'll find my name in the program as
co-author or committee member or area chair.
And so that leaves only three of them.
You see, I had nothing to do with Geneva, Pittsburgh and Budapest.
I have actually been to
Pittsburgh and I've been to Geneva.
Pity about Budapest.
Such a lovely city, and I'll probably never get the chance. I missed it in
1999.
However, I love these conferences,
and it's the interdisciplinary nature that I particularly
appreciate.
You heard from the introduction that ...
well, interdisciplinarity is at the heart of psycholinguistics;
we are an interdisciplinary undertaking.
But I loved the idea from the beginning of bringing all the speech communities together
in a single organization and
a single conference series.
And I think the founding fathers of the organisations, the founding
members of Eurospeech,
quite a broad team there,
and the founding
father, or founding fellow, because we
never quite knew who it was, of ICSLP, that was Hiroya Fujisaki.
These people were visionaries,
and the continuing success of this conference series is a tribute
to their vision.
Back in the 1980s and early 90s.
And that's why I'm very proud to be part of this
community, this interdisciplinary community,
and
I love the conferences and I'm just tremendously grateful
for the award of this medal, so thank you very much to everybody
involved.
So
back to my title slide.
I'm afraid it's a little messy;
there are all my affiliations on it. Tanja
already mentioned most of them. You would think, wouldn't you, that
the various people involved would at least have chosen the same shade of red.
But down on the right-hand side is my primary affiliation at the moment,
the MARCS Institute at the University of Western Sydney. My previous European
affiliations, at which I still have an emeritus position, are at the bottom left.
And the upper row of logos there
I want to call your attention to for a practical reason.
So on the right is the Centre of Excellence for the Dynamics of
Language, which is
an enormous grant actually; it's the big prize in the Australian
grant landscape,
and it's going to run for
seven years. It's just started. In fact, if I'm
not mistaken, today is actually the first
day of its operation. It was just awarded, we've been setting it up
over the last six months, and it's starting off today.
And it's a grant worth some 28 million Australian dollars over seven years.
And on the left of that is another big grant
running in the Netherlands; it's been going for about a year
and a half now,
Language in Interaction,
and that's a similar kind of undertaking, again 27 million euros
over a period of ten years.
And it is remarkable
that two government research councils, on different sides of
the world, more or less simultaneously saw that it was really important to put some serious
funding
into language research, speech and language research.
Okay, now the practical reason that I wanted to draw
your attention to these two is that they both have websites.
And if you have
bright undergraduates looking for a PhD place
at the moment, please go to the Language in Interaction website, where every
six months for at least the next six years there will be a
bunch of new PhD positions advertised.
We are looking worldwide for bright PhD
candidates. It's being run mainly as a training
ground, so there are mainly PhD positions on this grant.
And on the right, if you know somebody who's looking for a postdoc position:
in the Centre of Excellence we are about to advertise a very large number of postdoctoral positions.
Many of them require a linguistics background,
but please go and look at that website
too, if you or your students or anybody you know
is looking for such a position.
Okay.
On to my title, Learning about speech. Why did I choose that?
As Tanja
rubbed in,
there weren't many topics that I could have chosen.
In choosing this one
I was guided by first looking at the abstracts for the other keynote
talks in this conference.
And I discovered that there is a theme:
two of the others actually have learning in the title.
And all of them address some
form of learning about speech, and I thought, well okay,
it would be really useful,
in the spirit of encouraging interdisciplinary communication and integration across the various
Interspeech areas,
if I took
the same kind of
general theme
and started by
sketching what I think are
some of the most important basic attributes
of human learning about speech. Namely:
That it starts at
the very earliest possible moment,
no kidding;
I will illustrate that in a second.
That it
actually shapes the
processing, it engineers the
algorithms that
are going on in your brain;
that is, the speech you learn about
sets up the processing that you're going to be using for the rest of your
life. This
was also foreshadowed in what Tanja just told you about me.
And that it never stops; it never stops learning.
Okay
so onto
the first part of that.
So let's listen to something.
Warning: you won't be able to understand it.
Well, at least I hope not.
Okay, I see several people in the audience
making ...
movements to show that they have understood what was going on.
Because what we know now is that
infants start learning about speech as soon as their auditory system
is functional.
And the auditory system becomes functional in the third trimester of pregnancy.
That is to say,
for the last three months before you are born you are already listening
to speech,
and
when
a baby is born
the baby already shows a preference for the native language over other languages. Now, you
can't tell the difference between individual languages; for instance, it's known that you can't tell
the difference between Dutch and English on the day you're born.
But if you were exposed to an environment speaking one of those languages,
you have a preference for that kind of language.
So what did you think
was in that audio that I just played? I mean, what did it sound like?
Speech, right? But
what else could you tell ... What language was that?
Do you have any idea?
What language might that have been?
Was it Chinese?
I think this is an easy question for you guys, come on.
Well, were they speaking Chinese in that? No!
Sorry?
Yeah, it was English, it was Canadian English actually. So
the point is, you can't tell, and the baby can,
before birth.
It's a recording taken by a Canadian team which made the
recording in the womb,
at about eight and a half to nine
months of pregnancy, right? So you can put a little microphone in.
And let's not think
too much about that.
You can actually make a recording within the womb, and that's the kind of
audio that you get. So that kind of audio is
presented to babies before they're even born, and that's why
they're born with a preference, with knowing something about the general shape
of the language. So you can tell that it's a stress-based language, right?
That was a stress-based language you were listening to.
So.
Learning about speech starts as early as possible.
We also know now, and this is another thing that many people in this audience will know, that
infant speech perception is actually one of the most rapidly
growing areas in speech research at the moment.
When I set up
a lab 15 years ago in the Netherlands, it was the first modern
infant speech perception lab in the Netherlands; now there are half a dozen.
And PhD students who graduate in this topic have no trouble finding a position. Everybody in
the U.S. is hiring; every psychology and linguistics
department wants to have somebody doing infant speech perception at the moment.
Good job.
Good place
for students to get into.
But the recent explosion of research in this area
has meant that we've actually overturned some of the initial ideas that we had.
So we now know that
infant speech learning is really grounded in social communication. It's
the social interactions with the caregivers that
actually motivate
the child to continue learning.
We also know that
we don't teach individual words to babies;
in this very early period they're mainly exposed to continuous speech input, and they learn
from it.
And we know that vocabulary and phonology are constructed together.
It was first thought, because of the results that we had, that you had to
learn the phoneme repertoire of your language first, and only then could you start building a vocabulary.
Well,
successful building of a vocabulary is slow, but nevertheless the very first
access to meaning can now be shown
to occur as early as the very first access to
sound contrasts.
And the latest finding, also from my colleagues in Sydney, concerns the
kind of speech
called Motherese, the special way you talk to babies.
You know, you see a baby and you start talking in a special way, and
it turns out
that part of this is under the infant's control; it's the infant who
actually
elicits this kind of speech by
responding positively to it, and who
also trains
caregivers to start doing one kind of
speech with enhanced phoneme contrasts, then to stop doing that later and start doing
individual words, and so on. So that's all under the babies' control.
So what we
tried to do in the lab that I set up in
Nijmegen, the Netherlands, some fifteen years ago was to
use the electrophysiological techniques of the brain sciences,
event-related potentials,
to look at
the signature of word recognition in the infant brain; that's what we were looking for.
We decided to go
and look for what word recognition looks like
in an infant's brain.
And we found it.
So here's an infant in our lab.
So sweet, right?
You don't have to stick the electrodes on their heads
separately; we just have a little cap,
and they're quite happy to wear a little cap.
And so
what we usually do is
familiarize them with speech, which could be words in isolation or it could be
sentences,
and then we
continue
playing some
speech, as it might be
continuous sentences, containing
the words that they've already heard or containing some other words.
Okay?
And what we find is a particular kind of response; this is the word recognition response,
a negative
response to familiarized words compared to the
unfamiliarized words.
It's in the left side of the brain,
and this is word onset here, right,
and you'll see the response comes about
half a second after
word onset.
So this is the word recognition effect that you can see
in an infant's brain.
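For the engineers in the audience, here is a minimal sketch of how an effect like this could be quantified; the numbers, channel indices and time window below are invented for illustration, not our actual data or analysis pipeline.

```python
import numpy as np

# Hypothetical epoched EEG data: trials x channels x samples,
# time-locked to word onset, sampled at 500 Hz (all values made up).
rng = np.random.default_rng(0)
sr = 500                                  # samples per second
times = np.arange(-0.2, 1.0, 1 / sr)      # -200 ms to 1000 ms around word onset
n_channels = 32
left_channels = [0, 2, 4, 6]              # illustrative left-hemisphere electrode indices

familiar = rng.normal(0, 5, (40, n_channels, times.size))    # familiarized words
unfamiliar = rng.normal(0, 5, (40, n_channels, times.size))  # unfamiliarized words

# Mean amplitude in a window around half a second after word onset.
window = (times >= 0.4) & (times <= 0.6)

def mean_amplitude(epochs):
    erp = epochs.mean(axis=0)             # average over trials -> channels x samples
    return erp[left_channels][:, window].mean()

effect = mean_amplitude(familiar) - mean_amplitude(unfamiliar)
# A negative value here would correspond to the negative-going
# familiarity (word recognition) response described above.
print(f"familiarity effect over left electrodes: {effect:.2f} microvolts")
```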
So we know,
as I said, that in the first year of life
infants mainly hear
continuous speech.
Okay, so are they able to learn words from continuous
speech? In this experiment
we only used continuous speech.
And this was with ten-month-old infants. Now, they don't understand any of
this, and you don't have to
understand it either; it's in Dutch.
It's just to show what the materials were like. So in a particular trial
you'd have
eight different
sentences, and all the sentences have one word in common,
and this is the word 'drummer', which happens to be the same in English, right?
And then later on you switch to hearing four
sentences.
And the trick is that of course all of these things occur in pairs, so
for every infant
that hears eight sentences with 'drummer',
there's going to be another
infant that's going to hear eight sentences with 'fakir'.
Okay.
So then you have two of each of these sentences, and what you expect is that
you get a more
negative response to whichever word you have actually
already heard,
and that's exactly what you find. This one has just been published, as you see.
And so what we have is proof that
just exposing
infants to a word in a continuous speech
context is enough for them to recognize that same word form.
And remember, they don't understand anything at ten months,
right; they are not understanding anything about it. They're pulling
words out of continuous speech
at this early age.
Okay, now, given the fact that
the input to infants is mainly continuous speech,
it is of course vital that they can do this, right? And another
important finding that has come from this series of
experiments using the infant word recognition effect
is that
it
predicts
your later
language performance
as a child,
right? So if you're showing
that negative-going response that I've just talked about already
at seven months, which is very early,
if it's a nice big effect that you get, a big difference,
and if it's a nice clean
response in the brain,
then ...
For instance, I've sorted here
two
groups of infants:
those which had a negative response at the age of seven months,
and those which in the same experiment did not have a negative response.
And at age three,
look at their comprehension scores, their sentence production scores, their vocabulary size scores.
The blue guys, the ones who showed that segmentation, that word recognition effect in continuous
speech
at age seven months, are already
performing
much better. So it's vital for your
later development of
speech and language competence.
Here is an actual
participant-by-participant correlation
between the size of the response and the later scores.
Remember that we're looking at a negative response, so
the bigger it is, the further down it goes here, right? The more negative it is,
the bigger your scores:
the number of words you know at age one, or the number of words
you can speak
at age two. Both correlate significantly, so this is really important.
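Again purely as an illustration, a participant-by-participant correlation of this kind is just a Pearson correlation between the response amplitude and the later vocabulary score; the values below are invented, not our data.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-infant values (not the real data):
# ERP familiarity effect at 7 months (microvolts; more negative = larger effect)
erp_effect = np.array([-3.1, -0.4, -2.2, 0.5, -1.8, -2.9, -0.1, -1.2])
# Number of words produced at age two (parental-report vocabulary score)
vocab_age2 = np.array([310, 120, 260, 90, 210, 330, 150, 180])

r, p = pearsonr(erp_effect, vocab_age2)
# A negative r means: the more negative the infant response,
# the larger the later vocabulary, as in the pattern described above.
print(f"r = {r:.2f}, p = {p:.3f}")
```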
Okay, so starting early,
listening just to real continuous speech,
and
recognizing that what it consists of is
recurring
items that you can pull out of that speech signal and store for later use:
that is
setting up a vocabulary,
and starting early on that
really launches your
language skills.
And we're currently working on just how long
that effect lasts.
So the second
major topic
that I want to talk about is how learning shapes processing.
You'll know already from Tanja's introduction that this has actually been the
guiding
theme of my research for the last ...
well, I don't think we're going to count how many years it is now;
for a long time.
And I could easily stand here and talk for the whole hour about this topic
alone, or I could talk for a month about this topic alone, but I'm not going
to. I am going to take one particular,
really cool,
very small
example of how it works.
So the point is that
the way you actually deal with the speech signal,
the actual processes that you apply,
are different
depending on the language you
grew up speaking, your primary language, right? So those of you out there
for whom English is not your primary
language are going to have different
processes going on
in your head
than what I have.
Okay,
now
I'm going to take this really tiny
form of processing. Take a fricative sound, right, s or f.
Now these are pretty simple sounds.
How do we recognize, how do we identify,
such a sound?
For these fricatives, do we just
analyze
the frication noise,
which is different for sss and fff?
You can hear the difference:
sss has high-frequency energy, right?
fff is lower.
Or do we also analyze the surrounding
information in the vowels? There is always transitional information in the speech
signal between sounds. So are we using this in identifying s and f?
Well,
maybe we shouldn't, because s and f are
tremendously common
sounds across languages, and their pronunciation is very similar across languages, so we would
probably expect them to be processed in much the same way across languages.
But we can actually test whether
vowel information is used, in the following way.
You ask:
is it going to be harder
to identify a particular sound,
and this works for any sound, right, but now we are talking about s and f,
if you insert it into a context that was originally produced with another sound?
Okay.
So in the experiment I'm going to tell you about,
your task is just to detect a sound, which might be s or f in
this experiment,
okay?
And it's nonsense you're listening to, so:
dokubapi, pekida, tikufa.
Your task would then be to press the button when you hear f,
as in the sound of
f in tikufa.
And the crucial thing is that every one of those target
sounds is going to come from another recording, every one of them,
and it's either going to be another
recording which originally had the same sound.
The f in tikufa is either going to have come from another utterance of tikufa,
or the tiku_a carrier is going to come from
tikusa
and have the f put into it, right? So you're going to have
mismatching vowel cues if it was originally tikusa,
and congruent vowel cues if it was another utterance of tikufa.
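For anyone who wants the splicing manipulation spelled out, here is a minimal sketch; the file names and splice times below are made up purely for illustration.

```python
import numpy as np
import soundfile as sf

def splice_fricative(context_wav, donor_wav, context_cut, donor_frica, out_wav):
    """Insert the fricative excised from donor_wav into the carrier of context_wav.

    context_cut = (start_s, end_s) of the fricative to remove from the carrier item
    donor_frica = (start_s, end_s) of the fricative to take from the donor item
    Times are in seconds; all file names and boundaries here are invented.
    """
    context, sr = sf.read(context_wav)
    donor, sr2 = sf.read(donor_wav)
    assert sr == sr2, "sampling rates must match"

    c0, c1 = (int(t * sr) for t in context_cut)
    d0, d1 = (int(t * sr) for t in donor_frica)

    # Carrier with its own fricative removed, plus the donor fricative in its place.
    spliced = np.concatenate([context[:c0], donor[d0:d1], context[c1:]])
    sf.write(out_wav, spliced, sr)

# e.g. build a 'tikufa' whose [f] came from another token of 'tikufa'
# (identity splice), or whose carrier was originally 'tikusa'
# (cross splice, mismatching vowel cues):
# splice_fricative("tikufa_1.wav", "tikufa_2.wav", (0.42, 0.55), (0.41, 0.54), "identity.wav")
# splice_fricative("tikusa_1.wav", "tikufa_2.wav", (0.43, 0.57), (0.41, 0.54), "cross.wav")
```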
Now, some of you who teach speech science may recognise
this experiment, because it's a very old experiment,
right?
Anybody recognise it?
It was originally published in 1958, right? A really old experiment,
first done with American English,
and the result was very surprising, because what
was found was different for f and s,
right?
In the case of f,
if the tiku_a carrier was originally tikusa,
that is, if you put the f
into a different context, it was much harder to detect it,
whereas if you did it with the s there was zero effect
of the cross-splicing. No effect whatsoever for s,
but a big effect for f.
So listeners were using vowel context for f, but they weren't using it for
s, right? So this
just seemed like a bit of a puzzle at the time. But you know, this is from 1958;
these old results have been
in the textbooks for years, you know. It's in the textbooks.
And the explanation was, well, you know, it's the high-frequency energy in s
that makes it clearer;
you don't need to listen to anything else, the vowels, you can identify
s on the frication noise
alone, but f is not so clear, so you need something else.
Wrong.
As you will see.
So I'm going to tell you about some thesis work of my student A. Wagner,
from a few years ago.
She first replicated this experiment. What I'm going to plot up here is
the cross-splicing effect
for f minus the effect for s;
right, so,
you know there's
a bigger effect for f
than there is for s, we just saw that, right?
And so she replicated that. The original one was American English; she did it
with British English and got exactly the same
effect, so a
huge effect for f and very little effect for s.
So the size of the effect for f is bigger.
And she did it in Spanish and got exactly the same result,
okay.
So it's looking good for the original hypothesis, right?
And then she did it in Dutch.
Nothing.
In fact there was no effect for either s or f in Dutch
or in Italian, she did it in Italian,
or in German, she did it in German.
So, okay.
Audience response time again, right? Oh, I missed that;
I didn't tell you one crucial bit of information here.
The Spanish listeners were in Madrid,
so this is Castilian Spanish.
So think now:
what do English
and Castilian Spanish have
that Dutch and
German and
Italian,
Chinese or whatever languages don't have?
You're good, you're really good.
That's right.
So here's the thing: the original explanation,
?? that s is clearer,
accounts for the results for English and Spanish, but doesn't account for the results for
Dutch and
Italian and German, right? But there is
the explanation that
you need extra information for f
because it's so like θ, right? Because f and θ are about the most confusable
phonemes in any phoneme repertoire,
as the confusion matrices for English certainly show us.
So you need the extra information for f just because there is another sound in
your phoneme repertoire which it's confusable with.
But how do you test that explanation?
Well,
you need ...
now, I'm not going to ask you to guess what's coming
up, right, because you know it if you are looking at the slide.
But you need a language
which has a lot of different s-like sounds, right?
Because then the effect should reverse
if you find a language with a lot of other sounds like s,
and yes, Polish is such a language.
Then what you should find in that cross-splicing experiment is that
you get a big effect
of mismatching vowel cues for s
and nothing much for f, if you don't also have θ in the
language.
And that's exactly what you find in Polish.
Very nice result. How cool is that, to overturn the textbooks in your PhD?
So,
we listen to different sources of information in different
languages, right? We learn to process the signal differently,
even though s and f are really articulated much the same across languages. But in Spanish
and English you
have a fricative that resembles f, and in Polish
you have fricatives that resemble s, so you have to pay
extra attention to the surrounding ...
well, it helps to pay extra attention to the surrounding
speech information to identify them.
The information that surrounds
intervocalic
consonants is always going to be there. There is always information in the vowel, which
you only use
if it helps you.
Okay
onto the third
point that I want to make.
Learning about speech
never stops.
Even if we were only to speak one language,
even if we knew every word of that language, so we didn't have to learn
any new words,
even if we always heard speech spoken in clean conditions
there would still be learning to be done, especially whenever we meet a new
talker, which we can do every day. Especially at a conference.
When we do meet new talkers, we adapt quickly.
That's one of the most robust findings in human speech recognition, right? We have no problem walking into
a shop
and engaging in a conversation with somebody behind the counter we have never spoken to before.
And this kind of talker adaptation also begins very early
in infancy
and it continues through
life.
So, as I already said,
you know about
particular talkers: you can tell your
mother's speech from other
talkers at birth.
There are experiments that people do at birth, right; I mean, literally within
the first couple of hours after an infant is born, in some labs they are
presenting them with speech and seeing
if they show a preference. And they show a preference by sucking
harder: there's a pacifier with a transducer in it,
and sucking keeps the speech signal going, and you find
that infants will suck longer to hear their own mother's voice than other voices.
But when can they tell the difference between
unfamiliar
talkers? So if you have new talkers, when can an infant
tell whether
they're the same or not?
Well, you can test discrimination easily
in infants, right,
and it's a habituation test method that we use.
So what you do is you have the baby sitting on
the caretaker's, the mother's, lap.
And the mother is listening to something else, right, you bring in a music tape or something,
so the mother
can't hear what the baby is hearing.
And
the baby is hearing speech coming over
loudspeakers
and is looking at a pattern on the screen,
and if they look away the speech will stop,
right.
Sorry.
What happens is you
play them
a repeating
stimulus of some kind. In this experiment that I'm going to talk about, the repeating stimulus is just
some sentences that they wouldn't understand,
being spoken by
three different speakers, interchanging: one speaker will say
a sentence, and the next one will say a couple of sentences, and the first
one will say a couple of sentences
again, and the third speaker also says a sentence. These are just sentences that the babies can't
actually understand.
These babies are actually seven months old, younger than the baby in the picture there.
And so the
stimulus keeps repeating and the infant keeps listening, right.
And the stimulus keeps repeating,
and the infant keeps listening,
and the stimulus keeps repeating,
and eventually the baby gets bored and looks away, right.
And at that point
you change the input,
right.
And then you want to know, and that's the way you test discrimination: does the
baby look back, right,
look back at the screen and perk up,
okay, and continue to look at
the screen and thereby keep the speech going?
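To make the logic of that procedure concrete, here is a minimal sketch with invented looking times and a simplified, textbook-style habituation criterion, not the exact parameters of our study.

```python
def habituated(looking_times, window=3, criterion=0.5):
    """True once the mean of the last `window` trials falls below
    `criterion` times the mean of the first `window` trials.
    (A common simplified habituation rule; details vary by lab.)"""
    if len(looking_times) < 2 * window:
        return False
    baseline = sum(looking_times[:window]) / window
    recent = sum(looking_times[-window:]) / window
    return recent < criterion * baseline

# Invented looking times (seconds per trial) while the three familiar
# voices keep talking: attention gradually drops off ...
familiarization = [18.0, 16.5, 17.0, 12.0, 9.5, 7.0, 6.5]
assert habituated(familiarization)

# ... then the fourth, novel voice comes in.  If looking time jumps back
# up relative to the last familiarization trials, the infant has noticed
# the change, i.e. discriminated the new talker.
test_trials = [14.0, 13.5]
recovered = sum(test_trials) / len(test_trials) > sum(familiarization[-3:]) / 3
print("dishabituation (noticed the new voice):", recovered)
```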
Well,
so
these were seven-month-olds, as I said, so really they don't understand anything,
no words yet.
Maybe they recognise their own name, that's about it.
And we have
three different voices, three different
young women
with reasonably similar voices,
talking away and saying sentences that are, you know, way beyond seven-month-olds' comprehension,
like: Artists are attracted to life in the capital.
And then at the point at which the infant
loses attention, you bring in a fourth voice,
a new voice, and the question is: does the infant notice?
Okay.
So these are Dutch babies. This was run in Nijmegen.
And yes, they do.
They really do notice the difference, right.
As long as it's in Dutch.
We also did the experiment with four people talking in Japanese,
and four people talking Italian,
and there was no significant
discrimination in those cases. So it's only in the native language, right, that is to
say the
language of the environment that they have been exposed to.
So this is important, because it's not
whether the speech is understood that matters here, it's whether the sound is familiar, because what
infants are doing between six and nine months is
building up their knowledge of the phonology of
their language and building up their first
store of words.
And this is important. Some of you probably know the literature from forensic
speech science on this, and you know that
if you're trying to do a voice lineup and pick a speaker you heard in
a criminal context or something, and that speaker is speaking a language you don't know very
well,
you're much poorer at making a judgement than if they're speaking
your native language.
And this appears to be based on exactly the same
basic phonological
adjustment that
we see happening in the first year of life.
And we can do it in the laboratory. We can show adaptation to
new talkers
and strange speech sounds
in a perceptual learning experiment that we first
ran about eleven years ago,
and which has been replicated in many languages and in many labs around the world since.
And in this paradigm what we do is start with a learning phase, right.
Now there are many different kinds of things you can do in this learning phase,
but one of them is
to ask people to decide, as they're listening to individual
tokens, is this a real word or not?
Right.
And that's called a lexical decision task, right.
So here's somebody doing lexical decision, and they're hearing:
cushion,
astopa, fireplace; fireplace, yes, that's a word; magnify, yes;
heno, no, that's not a word; devilish, yes; defa, no, that's not a word; and
so on, just going through pressing the button.
Yes, no, yes, no and so on.
Right.
Now the crucial thing in this experiment
is that we're changing one of the sounds,
okay.
And we're going to stick with s and f here, just to keep things simple,
but again we've done it with a lot of different sounds.
So, for instance, take a
sound that is halfway between s and f:
we
create a sound along a continuum between s and f that's halfway in between, in
the middle,
and we stick it on the end of a word which would have been giraffe,
but then it sounds like
this.
No, like here.
Can you hear that it's a blend of f and s?
And there are a dozen other words in the experiment
which should all have an f in them,
and
if they had an s they would be non-words. So we expose
a group of people to learning that
the way this speaker says f
is this strange thing which is a bit more s-like.
Meanwhile there's another group
doing the same experiment,
right,
and they're hearing things like this.
That's exactly the same sound at the end of what should be horse.
Right, so they have been trained
to
hear that particular strange sound and identify it as s,
where the other group identifies it as
f, right.
And then you do a standard phoneme categorization experiment,
right, where what everybody hears is exactly the same continuum;
some of the steps are better s and some of them are better f,
but none of them is a really good s.
The point is that
you make a
categorization function out of an experiment like that, right, which goes from one
of those sounds to the other.
Under normal conditions you would get
a baseline categorization function like the one shown up there;
but if your f category was expanded
you might get that function, and if your s category was expanded you might get
that function, okay. So
that's what we're going to look at
as a result of
our experiment, in which one group of people expanded their f category and another
group of people
expanded their s category, and that's exactly what you get,
right.
Completely different functions for identical continua,
right.
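To make the idea of a boundary shift concrete, here is a minimal sketch that fits a logistic categorization function to each group and compares the 50% crossover points; the response proportions below are invented, not our data.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Proportion of 'f' responses as a function of continuum step;
    x0 is the category boundary (50% point), k the slope."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.arange(1, 8)  # 7-step continuum from clearly [s] to clearly [f]

# Invented group data: proportion of 'f' responses at each step.
heard_ambiguous_as_f = np.array([0.02, 0.05, 0.20, 0.55, 0.85, 0.97, 1.00])
heard_ambiguous_as_s = np.array([0.00, 0.02, 0.08, 0.25, 0.60, 0.90, 0.99])

(b_f, _), _ = curve_fit(logistic, steps, heard_ambiguous_as_f, p0=[4.0, 1.0])
(b_s, _), _ = curve_fit(logistic, steps, heard_ambiguous_as_s, p0=[4.0, 1.0])

# The group trained to hear the ambiguous fricative as 'f' should have an
# expanded /f/ category, i.e. a boundary shifted toward the [s] end.
print(f"boundary, 'f'-trained group: {b_f:.2f}")
print(f"boundary, 's'-trained group: {b_s:.2f}")
```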
Okay, so we exposed these people to a changed sound in just a few words.
We had
up to twenty words in our experiments, but people have been
tested on many fewer words, and obviously
in real life adaptation to a new talker probably works with one
occurrence.
And it only works if you can work out what the sound was
supposed to be, right, that is, with real words: if we did
the same thing with non-words there's no significant shift; those are both exactly
equivalent to the baseline function.
So that's basically what we're doing.
Adapting to talkers we just met by adapting our phoneme boundaries
especially for them.
Now this, as I've already said,
has spawned a huge number of follow-up experiments, not only in our lab.
We know that it generalizes across the vocabulary; you don't have to
have the same sound in a similar
context.
We know that lots of different kinds of exposure
can bring about the adaptation;
it doesn't have to be a lexical decision task, you don't have to be making any decision
about the word,
you can just have passive exposure, and you can have
nonsense words if the phonotactic
constraints force you to
choose one particular sound.
And we know that it's pretty much speaker-specific,
that is, at the least the adjustment is bigger for the speaker you actually heard.
And we've done it across many different languages, and I brought along some results
from Mandarin, because Mandarin gives us something really beautiful.
Namely, that you can do the same
adjustment, the same
experiment, with segments and with tones, right;
different kinds of speech sounds, as I said, not just
the same segments that I used in that
example. But here they are again, f and s, in Mandarin. Same result.
Right.
Very new data.
And there is the result when you do it with tone one and tone two
in Mandarin, exactly the same way: make an ambiguous stimulus halfway between tone one and
tone two,
and you get the same adjustment.
You can use this
kind of adaptation
effectively in a second language, which is good.
At least,
in this experiment by colleagues of mine in
Nijmegen, using the same Dutch input as with the Dutch listeners, you get
exactly the same shift, right.
And German students, now, German and Dutch are very close languages, and German students come
to
study in the Netherlands, in Nijmegen. Imagine this, those of
you who've gone to study in another
country, you know, one which doesn't speak your L1, your first language:
they take a course in Dutch for five weeks,
and at the end of those five weeks
they just go into the lectures,
which are in Dutch,
and they're treated just like anybody else.
So that's how long it takes
to get up to speed.
If you're German, that's how long it takes to get up to speed with
Dutch, okay.
So, not surprisingly,
a huge effect, the same effect, in the same
experiment
with German students in the Netherlands. I have to say that
this is my current research, one of my current research projects,
and the news isn't a hundred percent good on this
topic after all, because I brought along some data
which is actually just from a couple of weeks ago, we've only just got it in,
and this is
adaptation in two languages
in the same individuals. Now, you've just seen that graph.
That's the Mandarin listeners doing the task in Mandarin.
And what I'm trying to do in one of my current projects
is look at the processing
of different languages by the same person,
right, because I want to track down
what the source of native-language listening advantages is in
various different contexts. So what I'm trying to do now is look at the
same people
doing the same kind of task,
which might be listening in noise, it might be perceptual learning for speakers, and so
on,
in their different languages.
So here are the same Mandarin listeners
doing the English experiment.
Not so good.
So
it looks ...
well, these were tested in China, so
they are not in an immersion situation; it is their second language and they are living
in their
L1 environment. So that's not quite
as hopeful as
the previous
study. However, one thing we know
about this adaptation to talkers: we've already seen that discrimination
between talkers is something that even seven-month-old listeners can do, so what about
this kind of
lexically based adaptation to strange pronunciations? We decided to test this in children,
with whom you can't really use a
lexical decision experiment, because you can't really ask kids; they don't know a lot of
words.
So we did a picture verification experiment with them.
A giraffe, and the one on the right is a platypus, right.
So the first one ends with an f and the second
one ends with an s; we're doing the s/f thing again.
And then we had a name continuum for our
phoneme categorization. Again, you don't want to be asking young kids to
decide whether they're hearing f or s, it's not a natural
task, but if you teach them that the guy on the left is called Fimpy
and the guy on the right is called Simpy,
and then you give them something that's halfway between Fimpy and Simpy, right,
then you can
get a phoneme categorization experiment. We first of all had to validate
the task with adults; needless to say,
the adults could just press a button,
they didn't have to point to the character and so on.
But we get the same shift again for the adults,
and we get it with twelve-year-olds, and we get it with six-year-olds.
An important difference between twelve-year-olds
and six-year-olds is that twelve-year-olds can read already,
and six-year-olds can't read.
And there is a certain school of thought that believes
that you get phoneme categories from reading. But you don't get phoneme categories from reading,
you have
your phoneme categories in place very early in life.
So that's exactly the same effect; as I say, very early in life, even at age
six,
you're using your perceptual learning to
understand new talkers.
And I think I saw ?? over there, so I'm going to show some
of ?? data ... yes, there you are.
This is some of the work showing that
this kind of perceptual learning goes on throughout life. I brought this particular
result, which is again with s and f, and was presented at Interspeech in 2012,
so I
hope you were all there and you all heard it, actually,
but they also have a
2013 paper with a
different phoneme continuum, which I urge you to look at as well.
So
even when you're losing your hearing you're still doing this perceptual learning
and adapting to
new talkers; learning about new talkers is just
something that human listeners do
throughout
the lifespan.
So that brings me
to my final slide.
So this has been a
quick
tour through some highlights of some really important issues in human learning about speech.
Namely, that it starts as early as it possibly can,
that it actually trains up the nature of the processes,
and that it never actually stops.
So,
when I was doing this I thought, well, actually, you know,
I love these conferences because they're
interdisciplinary, because we get to talk about the same topic
from different viewpoints. So what would I actually say,
after
preparing this talk,
is the
biggest difference you could put your finger on between human learning about speech and
machine learning about speech?
So I have been talking about this during the week, and I'll give you
that question to take to all the other keynotes and think about too.
But
if you'd say, you know, it starts at the earliest possible moment, well, I mean,
so would a good machine
learning algorithm, right? I mean,
it shapes the processing, it actually changes the algorithms that you're using; that's not the
usual way,
because in programming a
machine learning system we usually start with the algorithm, right?
You don't actually change the algorithm
as a result of the input, but you could. I mean,
there's no logical reason why that can't be done, I think.
And 'never stops', I mean, that's not the difference, is it? No, that's not
a difference; you can run
any machine learning algorithm as long as you like.
I think buried in one of my very early slides is
something which is crucially important,
and that is the social reward,
which we now know to be a really important factor in early human
learning about speech. You can think of humans
as machines that really
want to
learn about speech. I'd be very happy to talk about this
at any time
during the rest of this week
or at any other time
too, and I thank you very much for your attention.
Hi, and fascinating talk,
so a quick question. Your boundaries, the ??. Do they change as a function
of the adjacent vowels? So 'far' versus
'fa', 'sa' versus 'fa'. ??
We've always used ... whatever was the constant
context.
So you're talking about the perceptual learning experiments?
The last set of experiments, right? We've always tried to use a
varying context, so I can't answer that question. If we had used only 'a' ...
or, hang on,
we did use a constant context in the non-word experiment with
phonotactic constraints, but that was different in many other ways, so,
no, I can't answer that question. But
there is a somewhat tangential
answer, some information from another lab,
which has shown that people can learn
in this way,
a dialect feature
that is only
applied in a certain context.
So
the answer would be yes. People would be sensitive to that if it was consistent,
yes.
Tanja?
There are two in the same row.
Caroline.
Have you found any sex-specific differences in the infants' responses?
Have we found sex-specific differences in the infants' responses? There are some
sex-specific differences,
but we have not found them in
this speech
segmentation, in the word recognition in continuous speech. We've actually always looked
and never found a significant difference between boys and girls.
That was a short one. So are there any other questions?
With respect to the
negative responses
to the words
that you used there,
that were presented in the experiment,
and
that
at age three the children were...
Right.
The size of the negative-going brain potential, right?
Is that ...
would you say that could be good for
detecting pathology?
Yes.
Definitely, and the person whose name you saw on the slides as first author, Caroline
Junge,
is actually starting a new
personal career development award project in Amsterdam,
sorry, in Utrecht, where she will actually look at that.
Okay, so, thank you so much again for delivering this wonderful keynote, and
congratulations again on being our ISCA medallist. I am happy that you're around, so people
can catch our medallist over
the whole duration of the Interspeech conference. Thank you, Anne.