Design And Implementation Of A Ubiquitous Framework For Pronunciation Learning

2022

Doctoral Thesis

ASJP

Computer Science

Université Badji Mokhtar - Annaba

Dendani Bilal

Abstract: Handheld and wearable devices have greatly increased the use of speech-enabled interfaces and promoted ubiquitous learning applications, which aim to be accessible anywhere and at any time. Computer-assisted language learning (CALL) applications in particular have grown rapidly. Pronunciation learning is a challenging task in ubiquitous environments because speech signals are easily corrupted by background noise, coding errors, or channel disturbances, and the original speech must be recovered from the corrupted version before it can be assessed reliably. Many real-world speech samples are available for that purpose. Pronunciation assessment, in turn, is the core component of any computer-assisted pronunciation learning (CAPL) system, since it provides the feedback students need to improve. Such applications require annotated and rated nonnative speech data, but such corpora are rarely available, especially for low-resource languages such as Arabic. This thesis aims to develop an Arabic pronunciation learning system for ubiquitous environments despite the scarcity of dedicated corpora. Its contribution is twofold. First, in the absence of a dedicated corpus, an unsupervised two-step approach is adopted for speech enhancement: an overcomplete deep autoencoder (OAE) is trained on noisy/noisy pairs to produce enhanced speech, and a denoising deep autoencoder is then trained in a supervised way, leveraging the previous stage. The results showed an improvement in word error rate (WER) of about 4.48% on a mobile Arabic corpus, along with significant improvements in speech quality and intelligibility of 0.835 and 0.06, respectively.
The second contribution addresses the scarcity of nonnative Arabic speech corpora dedicated to computer-assisted pronunciation training (CAPT). Inspired by the success of deep learning, we propose to detect abnormal pronunciation in an unsupervised manner using two deep learning algorithms trained solely on correct pronunciations. Experimental results on two Arabic corpora demonstrated the potential of the proposed approach to distinguish between good and bad pronunciations. Additional experiments that used audio augmentation techniques to expand the training dataset confirmed the effectiveness of the proposed method.
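
The abstract reports its speech-enhancement result as a change in word error rate (WER). As a reading aid, the following is a minimal sketch of how WER is conventionally computed: the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function names `edit_distance` and `wer` and the toy English transcripts are illustrative assumptions and are not taken from the thesis. The abstract also does not say whether the reported 4.48% is an absolute or a relative change, so the sketch computes both without choosing between them.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


# Toy transcripts (hypothetical, for illustration only).
reference = "the cat sat on the mat"
noisy_hypothesis = "the cat sit on mat"      # recognizer output on noisy speech
enhanced_hypothesis = "the cat sat on mat"   # recognizer output after enhancement

wer_noisy = wer(reference, noisy_hypothesis)
wer_enhanced = wer(reference, enhanced_hypothesis)

# Two common ways of reporting an improvement.
absolute_gain = wer_noisy - wer_enhanced
relative_gain = absolute_gain / wer_noisy
print(f"WER noisy={wer_noisy:.3f} enhanced={wer_enhanced:.3f}")
print(f"absolute gain={absolute_gain:.3f} relative gain={relative_gain:.3f}")
```

Splitting on whitespace is a simplification. An actual evaluation on Arabic speech would likely need text normalization before scoring, for example a consistent treatment of diacritics, which this sketch does not attempt.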
