Adapter-based Debiasing of Pre-trained Speech Recognition Models for Low-Resource Languages

Fine-tuning multilingual pre-trained speech models for automatic speech recognition (ASR) has improved recognition accuracy for many languages, including low-resource languages (LRLs). However, studies have shown that these models carry intrinsic biases that propagate into ASR systems during fine-tuning. If left unmitigated, the resulting ASR systems may be biased against certain groups of speakers on the basis of demographic characteristics such as gender, age, dialect, and region, leading to inequitable access to speech-based technologies. In this research, we propose a novel fine-tuning strategy that leverages knowledge sharing and transfer to minimize bias propagation when pre-trained speech models are fine-tuned for ASR of LRLs. The method integrates small neural networks, referred to as debiasing adapters, with task adapters through a fusion mechanism inside the pre-trained speech model. In addition, we will investigate mitigating bias with meta-adapters: adapters that exploit meta-knowledge learned across multiple tasks via meta-learning for fast and efficient adaptation to target LRLs. We propose to investigate gender, non-native speaker, and dialectal bias in ASR for languages spoken in Zambia, namely Bemba, Nyanja, Tonga, and Lozi. We hypothesize that the proposed mitigation strategy will minimize bias propagation in pre-trained speech models and enhance fairness in ASR models for LRLs.
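To make the proposed fusion mechanism concrete, the sketch below shows, in PyTorch, one way a task adapter and a debiasing adapter could be combined through attention-based fusion inside a layer of a frozen pre-trained speech encoder. This is a minimal illustration under stated assumptions: the class names, bottleneck size, and AdapterFusion-style attention are illustrative choices, not the exact design described in the abstract.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Small residual bottleneck adapter (Houlsby-style); an assumed design."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Down-project, nonlinearity, up-project, residual connection.
        return x + self.up(self.act(self.down(x)))

class AdapterFusion(nn.Module):
    """Attention-based fusion over adapter outputs (task + debiasing),
    in the spirit of AdapterFusion; assumed, not the authors' exact mechanism."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, layer_out: torch.Tensor,
                adapter_outs: list[torch.Tensor]) -> torch.Tensor:
        # adapter_outs: list of (batch, time, hidden) tensors, one per adapter.
        stacked = torch.stack(adapter_outs, dim=2)            # (B, T, A, H)
        q = self.query(layer_out).unsqueeze(2)                # (B, T, 1, H)
        k = self.key(stacked)                                 # (B, T, A, H)
        scores = (q * k).sum(-1) / stacked.size(-1) ** 0.5    # (B, T, A)
        weights = scores.softmax(dim=-1).unsqueeze(-1)        # (B, T, A, 1)
        return (weights * stacked).sum(dim=2)                 # (B, T, H)

# Hypothetical wiring for one frozen encoder layer (dimensions assumed):
hidden_dim = 768
task_adapter = BottleneckAdapter(hidden_dim)    # learns ASR for the target LRL
debias_adapter = BottleneckAdapter(hidden_dim)  # trained to reduce, e.g., gender bias
fusion = AdapterFusion(hidden_dim)

layer_out = torch.randn(4, 100, hidden_dim)     # dummy encoder-layer output
fused = fusion(layer_out, [task_adapter(layer_out), debias_adapter(layer_out)])
```

In a setup like this, the pre-trained backbone stays frozen and only the adapters and the fusion weights are trained, so the fusion can learn, per time step, how much to rely on the task adapter versus the debiasing adapter.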