Representation Learning to Classify and Detect Adversarial Attacks against Speaker and Speech Recognition Systems
(3 minutes introduction)

Jesús Villalba (Johns Hopkins University, USA), Sonal Joshi (Johns Hopkins University, USA), Piotr Żelasko (Johns Hopkins University, USA), Najim Dehak (Johns Hopkins University, USA)

Adversarial attacks have become a major threat to machine learning applications. There is growing interest in studying these attacks in the audio domain, e.g., against speech and speaker recognition, and in finding defenses against them. In this work, we focus on using representation learning to classify and detect attacks with respect to the attack algorithm, threat model, or signal-to-adversarial-noise ratio. We found that common attacks in the literature can be classified with accuracies as high as 90%. Moreover, representations trained to classify attacks against speaker identification can also be used to classify attacks against speaker verification and speech recognition. We also tested an attack verification task, where we need to decide whether two speech utterances contain the same attack. We observed that our models did not generalize well to attack algorithms not included in the training of the attack representation model. Motivated by this, we evaluated an unknown attack detection task. We were able to detect unknown attacks with equal error rates of about 19%, which is promising.

Related Recordings

Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing
(3 minutes introduction)

Tomi Kinnunen, Andreas Nautsch, Md. Sahidullah, Nicholas Evans, Xin Wang, Massimiliano Todisco, Héctor Delgado, Junichi Yamagishi, Kong Aik Lee

An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems
(3 minutes introduction)

You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan

InterSpeech 2021
