Development of a Speech-to-Text (STT) System for the Breton Language

ZEMOURI, Ouassim

Full metadata record

DC Field	Value	Language
dc.contributor.author	ZEMOURI, Ouassim	-
dc.date.accessioned	2026-06-23T11:51:08Z	-
dc.date.available	2026-06-23T11:51:08Z	-
dc.date.issued	2025	-
dc.identifier.uri	https://repository.esi-sba.dz/jspui/handle/123456789/851	-
dc.description.abstract	This thesis explores Automatic Speech Recognition (ASR) for Breton, a low-resource language with significant dialectal variation. We evaluate several ASR models, including OpenAI’s Whisper family with a focus on Whisper-Large, on two datasets: Mozilla Common Voice 21 and La Banque Sonore des Dialectes Bretons (BSDB). Experiments were run with and without text cleaning, using Word Error Rate (WER) and Character Error Rate (CER) as evaluation metrics. Because resource limitations made full fine-tuning of Whisper challenging, we used Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA. The result is a fine-tuned Whisper-Small model that demonstrates the effectiveness of PEFT for under-resourced languages. This work underlines the potential of modern ASR models and efficient adaptation techniques to improve speech recognition for the Breton language, and offers insights applicable to other low-resource languages.	en_US
dc.language.iso	en	en_US
dc.subject	Deep Learning	en_US
dc.subject	Automatic Speech Recognition	en_US
dc.subject	Breton Language	en_US
dc.subject	Model Evaluation	en_US
dc.subject	Low-resource Languages	en_US
dc.subject	Data Cleaning	en_US
dc.title	Development of a Speech-to-Text (STT) System for the Breton Language	en_US
dc.type	Thesis	en_US
Appears in Collections:	Master
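
The abstract names WER and CER as the evaluation metrics and says experiments were run with and without text cleaning. As a minimal sketch of what that comparison involves, the Python below computes both metrics from a Levenshtein edit distance. It is not the thesis's evaluation code: the `clean` rules (lowercasing, dropping punctuation but keeping apostrophes, collapsing whitespace) and the sample sentences are assumptions made for illustration only.

```python
import re
import string

# Assumption: keep apostrophes, since Breton spelling uses them (e.g. "c'h").
PUNCT = string.punctuation.replace("'", "")


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def clean(text):
    """Hypothetical cleaning step: lowercase, strip punctuation, collapse spaces."""
    text = text.lower().translate(str.maketrans("", "", PUNCT))
    return re.sub(r"\s+", " ", text).strip()


def wer(ref, hyp):
    """Word Error Rate: word-level edits divided by reference word count.

    Assumes a non-empty reference.
    """
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def cer(ref, hyp):
    """Character Error Rate: character-level edits divided by reference length.

    Assumes a non-empty reference.
    """
    return edit_distance(ref, hyp) / len(ref)


# Illustrative reference/hypothesis pair (not taken from either dataset).
reference = "Demat, mont a ra mat?"
hypothesis = "demat mont a ra mad"

wer_raw = wer(reference, hypothesis)
wer_clean = wer(clean(reference), clean(hypothesis))
cer_clean = cer(clean(reference), clean(hypothesis))
```

In this sketch, casing and punctuation mismatches count as word errors on the raw text, so the same hypothesis scores a lower WER once both strings are cleaned. That is why reporting results with and without cleaning gives two different views of the same model output.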