[article]
Title : |
Audio-Visual Automatic Speech Recognition Towards Education for Disabilities |
Document type : |
Printed and/or digital text |
Authors : |
Saswati DEBNATH, Author ; Pinki ROY, Author ; Suyel NAMASUDRA, Author ; Ruben Gonzalez CRESPO, Author |
Article on page(s) : |
p.3581-3594 |
Languages : |
English (eng) |
Decimal index : |
PER Periodicals |
Abstract : |
Education is a fundamental right that enriches everyone's life. However, physically challenged people are often debarred from the general and advanced education system. An Audio-Visual Automatic Speech Recognition (AV-ASR) based system can improve the education of physically challenged people by providing hands-free computing: they can communicate with the learning system through AV-ASR. However, it is challenging to trace the lips correctly for the visual modality. Thus, this paper addresses the appearance-based visual feature along with the co-occurrence statistical measure for visual speech recognition. Local Binary Pattern-Three Orthogonal Planes (LBP-TOP) and Grey-Level Co-occurrence Matrix (GLCM) are proposed for visual speech information. The experimental results show that the proposed system achieves 76.60 % accuracy for visual speech and 96.00 % accuracy for audio speech recognition. |
Online : |
https://doi.org/10.1007/s10803-022-05654-4 |
Permalink : |
https://www.cra-rhone-alpes.org/cid/opac_css/index.php?lvl=notice_display&id=511 |
in Journal of Autism and Developmental Disorders > 53-9 (September 2023) . - p.3581-3594