Detection and Tracking of Targets in Real-Time Images. Case: Arabic Visual Speech Recognition System blidaavs10 (Lipreading)
2024
Doctoral Thesis

Université Saad Dahleb - Blida

Baaloul, Ali

Abstract: Automatic visual speech recognition (AVSR) techniques are increasingly prevalent in domains such as manufacturing, public use, and multimedia devices, making visual speech recognition (VSR) a promising technology for improving communication accessibility for people with hearing impairments. However, most existing VSR systems are designed for languages such as English, leaving a gap for Arabic, which is spoken by over 400 million people worldwide and has distinctive linguistic and phonetic characteristics. This thesis presents a novel framework for Arabic visual speech recognition that aims to close this gap and serve the Arabic-speaking hearing-impaired community. The framework uses deep learning techniques such as YOLO, Convolutional Neural Networks (CNNs), and Vision Transformers (ViT) for robust mouth detection and recognition, extracting the visual features needed to transcribe Arabic speech from visual cues accurately and efficiently. It also relies on a specialized Arabic dataset, curated to capture the diversity and complexity of the Arabic language, which serves as a benchmark for training and evaluating the VSR models and supports their robustness and reliability in real-world applications. Experimental results show that the proposed framework achieves promising performance and handles a range of linguistic and phonetic variations of Arabic, opening up possibilities for wider real-world applications.
This research contributes to advancing Arabic visual speech recognition technology, enriching the VSR landscape and fostering greater inclusivity in communication for Arabic speakers.
