Recurrent Neural Networks for Decoding Lip Read Speech
Conference paper
Fenghour, S, Chen, D and Xiao, P (2019). Recurrent Neural Networks for Decoding Lip Read Speech. 2019 8th International Conference on Software and Information Engineering (ICSIE 2019). Cairo 09 - 12 Apr 2019
Authors | Fenghour, S, Chen, D and Xiao, P |
---|---|
Type | Conference paper |
Abstract | The success of automated lip reading has been constrained by the inability to distinguish between homopheme words, which are words that have different characters yet produce the same lip movements (e.g. "time" and "some"), despite being intrinsically different. One word can often have different phonemes (units of sound) that produce exactly the same viseme, the visual equivalent of a phoneme. Through the use of a Long Short-Term Memory network with word embeddings, we can distinguish between homopheme words, that is, words that produce identical lip movements. The neural network architecture achieved a character accuracy rate of 77.1% and a word accuracy rate of 72.2%. |
Year | 2019 |
Accepted author manuscript | License: CC BY 4.0; File access level: Open |
Publication dates | |
09 Apr 2019 | |
Publication process dates | |
Deposited | 20 Mar 2019 |
Accepted | 18 Mar 2019 |
Permalink - https://openresearch.lsbu.ac.uk/item/866z8
Download files
Accepted author manuscript
2019 03 14 ICSIE_2019_paper_30.pdf
License: CC BY 4.0
File access level: Open
398 total views
235 total downloads
5 views this month
4 downloads this month