Lip Reading Sentences Using Deep Learning with Only Visual Cues

Journal article


Fenghour, S., Chen, D., Guo, K. and Xiao, P. (2020). Lip Reading Sentences Using Deep Learning with Only Visual Cues. IEEE Access.
Authors: Fenghour, S., Chen, D., Guo, K. and Xiao, P.
Abstract

In this paper, a neural network-based lip reading system is proposed. The system is lexicon-free and uses purely visual cues. With only a limited number of visemes as classes to recognise, the system is designed to lip read sentences covering a wide range of vocabulary and to recognise words that may not have been included in system training. The system has been tested on the challenging BBC Lip Reading Sentences 2 (LRS2) benchmark dataset. Experiments with videos of varying illumination have shown that the proposed model is robust to varying levels of lighting. Compared with state-of-the-art works in lip reading sentences, the system achieves significantly improved performance, with a 15% lower word error rate. The main contributions of this paper are: 1) the classification of visemes in continuous speech using a specially designed transformer with a unique topology; 2) the use of visemes as a classification schema for lip reading sentences; and 3) the conversion of visemes to words using perplexity analysis. All the contributions serve to enhance the accuracy of lip reading sentences. The paper also provides an essential survey of the research area.
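
The abstract reports its headline result as a word error rate. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition (substitutions, deletions and insertions divided by the number of reference words), computed with a word-level edit distance. The function name `word_error_rate` and the example sentences are illustrative assumptions; this is not the authors' evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.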

Keywords: Deep learning; Lip reading; Neural networks; Perplexity analysis; Speech recognition
Year: 2020
Journal: IEEE Access
Publisher: Institute of Electrical and Electronics Engineers
ISSN: 2169-3536
Publication process dates
Accepted: 19 Nov 2020
Deposited: 21 Nov 2020
Additional information

© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Permalink: https://openresearch.lsbu.ac.uk/item/8v4xv

Download files


Accepted author manuscript
2020 11 10 LRS_VC_revised.pdf
License: CC BY 4.0
File access level: Open


Related outputs

The Development of a Skin Image Analysis Tool by Using Machine Learning Algorithms
Xiao, P., Zhang, Xu, Pan, Wei, Ou, Xiang, Bontozoglou, C., Chirikhina, E. and Chen, D. (2020). The Development of a Skin Image Analysis Tool by Using Machine Learning Algorithms. Cosmetics. 7 (3), p. e67. https://doi.org/10.3390/cosmetics7030067
Deep Learning Causal Attributions of Breast Cancer
Chen, D, Hajderanj, L, Mallet, S, Camenen, P, Li, B, Ren, H and Zhao, E (2020). Deep Learning Causal Attributions of Breast Cancer. Computing 2021. London 15 - 16 Jul 2021 The Science and Information (SAI) Organization.
UAV Maneuvering Target Tracking in Uncertain Environments based on Deep Reinforcement Learning and Meta-learning
Li, B., Gan, Z., Chen, D. and Aleksandrovich, D.S. (2020). UAV Maneuvering Target Tracking in Uncertain Environments based on Deep Reinforcement Learning and Meta-learning. Remote Sensing.
Single- and Multi-Distribution Dimensionality Reduction Approaches for a Better Data Structure Capturing
Hajderanj, L., Chen, D., Grisan, E. and Dudley-McEvoy, S (2020). Single- and Multi-Distribution Dimensionality Reduction Approaches for a Better Data Structure Capturing. IEEE Access.
Learning Bayesian Networks based on Order Graph with Ancestral Constraints
Wang, Z., Gao, X., Tian, X., Yang, Y. and Chen, D. (2020). Learning Bayesian Networks based on Order Graph with Ancestral Constraints. Knowledge-Based Systems. https://doi.org/10.1016/j.knosys.2020.106515
In Vivo Assessment of Water Content, Trans-Epidermial Water Loss and Thickness in Human Facial Skin
Chirikhina, E., Chirikhin, A., Xiao, P., Dewsbury-Ennis, S. and Bianconi, F. (2020). In Vivo Assessment of Water Content, Trans-Epidermial Water Loss and Thickness in Human Facial Skin. Applied Sciences. 10 (17), p. e6139. https://doi.org/10.3390/app10176139
An Adaptive Task Scheduling Method for Networked UAV Combat Cloud System Based on Virtual Machine and Task Migration
Li, B., Liang, S., Tian, L., Chen, D. and Zhang, M. (2020). An Adaptive Task Scheduling Method for Networked UAV Combat Cloud System Based on Virtual Machine and Task Migration. Mathematical Problems in Engineering. p. 5391479. https://doi.org/10.1155/2020/5391479
An adaptive dwell time scheduling model for phased array radar based on three-way decision
Li, B., Tian, L., Chen, D. and Liang, S. (2020). An adaptive dwell time scheduling model for phased array radar based on three-way decision. Journal of Systems Engineering and Electronics. pp. 500-509. https://doi.org/10.23919/JSEE.2020.000030
Advances in Radiometry Research
Xiao, P. (2019). Advances in Radiometry Research. Nova Science Publishers, Inc.
Practical Java Programming for IoT, AI, and Blockchain
Xiao, P. (2019). Practical Java Programming for IoT, AI, and Blockchain. John Wiley & Sons.
Intelligent aircraft maneuvering decision based on CNN
Li, B, Liang, S, Tian, L and Chen, D (2019). Intelligent aircraft maneuvering decision based on CNN. The 3rd International Conference on Computer Science and Application Engineering. Sanya, China 22 - 24 Oct 2019 ACM Press. https://doi.org/10.1145/3331453.3362046
Intelligent Attitude Control of Aircraft Based on LSTM
Li, B, Gao, P, Li, X and Chen, D (2019). Intelligent Attitude Control of Aircraft Based on LSTM. 3rd International Conference on Artificial Intelligence Applications and Technologies. Beijing, China 01 - 03 Aug 2019 IOP Publishing. https://doi.org/10.1088/1757-899X/646/1/012013
Applications of Capacitive Imaging in Human Skin Texture and Hair Analysis
Bontozoglou, C. and Xiao, P. (2019). Applications of Capacitive Imaging in Human Skin Texture and Hair Analysis. MDPI Applied Sciences. 10 (1), p. 256. https://doi.org/10.3390/app10010256
A Task Scheduling Algorithm for Phased Array Radar Based on Dynamic Three-way Decision
Li, B., Tian, L., Chen, D. and Han, Y. (2019). A Task Scheduling Algorithm for Phased Array Radar Based on Dynamic Three-way Decision. Sensors. 20 (1). https://doi.org/10.3390/s20010153
Intelligent Flight Control of Combat Aircraft Based on Autoencoder
Li, B., Gao, P., Liang, S. and Chen, D. (2019). Intelligent Flight Control of Combat Aircraft Based on Autoencoder. 2019 The 4th International Conference on Robotics, Control and Automation. GuangZhou 26 - 28 Jul 2019 https://doi.org/10.1145/3351180.3351210
Skin Capacitive Imaging Analysis Using Deep Learning GoogLeNet
Zhang, X., Pan, W., Bontozoglou, C., Chirikhina, E., Chen, D. and Xiao, P. (2019). Skin Capacitive Imaging Analysis Using Deep Learning GoogLeNet. Computing Conference 2020. London, UK 16 - 17 Jul 2019 Springer.
FRS: A Simple Knowledge Graph Embedding Model for Entity Prediction
Wang, L.F., Lu, X., Jiang, Z., Zhang, Z., Li, R., Zhao, M. and Chen, D. (2019). FRS: A Simple Knowledge Graph Embedding Model for Entity Prediction. Mathematical Biosciences and Engineering. 16 (6), pp. 7789-7807. https://doi.org/10.3934/mbe.2019391
Predicting Customer Profitability Dynamically over Time: An Experimental Comparative Study
Chen, D., Guo, K. and Li, B. (2019). Predicting Customer Profitability Dynamically over Time: An Experimental Comparative Study. 24th Iberoamerican Congress on Pattern Recognition (CIARP 2019). Havana, Cuba 28 - 31 Oct 2019 https://doi.org/10.1007/978-3-030-33904-3_16
In Vivo Human Hair Hydration Measurements by Using Opto-Thermal Radiometry
Bontozoglou, C, Zhang, X, Patel, A, Lane, ME and Xiao, P (2019). In Vivo Human Hair Hydration Measurements by Using Opto-Thermal Radiometry. International Journal of Thermophysics. 40 (2). https://doi.org/10.1007/s10765-018-2477-x
Learning Bayesian Networks using the Constrained Maximum a Posteriori Probability Method
Yang, Y, Gao, X, Guo, Z and Chen, D (2019). Learning Bayesian Networks using the Constrained Maximum a Posteriori Probability Method. Pattern Recognition. 91, pp. 123-134. https://doi.org/10.1016/j.patcog.2019.02.006
Micro-relief analysis with skin capacitive imaging
Bontozoglou, C, Zhang, X and Xiao, P (2019). Micro-relief analysis with skin capacitive imaging. Skin Research and Technology. 25 (2), pp. 165-170. https://doi.org/10.1111/srt.12628
Epsilon Interactive Virtual User Manual (VUM)
Al Hashimi, O. and Xiao, P (2019). Epsilon Interactive Virtual User Manual (VUM). 2018 International Conference on Computing, Electronics and Communications Engineering, ICCECE 2018. Southend-on-Sea 16 - 17 Aug 2018 pp. 138-143 https://doi.org/10.1109/iCCECOME.2018.8658872
Learning Bayesian network parameters via minimax algorithm
Gao, X, Gao, G, Ren, H, Chen, D and He, C (2019). Learning Bayesian network parameters via minimax algorithm. International Journal of Approximate Reasoning. 108, pp. 62-75. https://doi.org/10.1016/j.ijar.2019.03.001
Improving Prediction Accuracy of Breast Cancer Survivability and Diabetes Diagnosis via RBF Networks trained with EKF models
Adegoke, V, Chen, D and Banissi, E (2019). Improving Prediction Accuracy of Breast Cancer Survivability and Diabetes Diagnosis via RBF Networks trained with EKF models. International Journal of Computer Information Systems and Industrial Management.
A New Supervised t-SNE with Dissimilarity Measure for Effective Data Visualization and Classification
Hajderanj, L, Weheliye, I and Chen, D (2019). A New Supervised t-SNE with Dissimilarity Measure for Effective Data Visualization and Classification. 2019 8th International Conference on Software and Information Engineering. Cairo 09 - 12 Apr 2019
Recurrent Neural Networks for Decoding Lip Read Speech
Fenghour, S, Chen, D and Xiao, P (2019). Recurrent Neural Networks for Decoding Lip Read Speech. 2019 8th International Conference on Software and Information Engineering (ICSIE 2019). Cairo 09 - 12 Apr 2019
Decoder-Encoder LSTM for Lip Reading
Fenghour, S, Chen, D and Xiao, P (2019). Decoder-Encoder LSTM for Lip Reading. 2019 8th International Conference on Software and Information Engineering (ICSIE 2019). Cairo, Egypt 09 - 12 Apr 2019
Enhancing Ensemble Prediction Accuracy of Breast Cancer Survivability and Diabetes Diagnostic using optimized EKF-RBFN trained prototypes, The 10th International Conference on Soft Computing and Pattern Recognition
Adegoke, V, Chen, D, Banissi, E and Barikzai, S (2019). Enhancing Ensemble Prediction Accuracy of Breast Cancer Survivability and Diabetes Diagnostic using optimized EKF-RBFN trained prototypes, The 10th International Conference on Soft Computing and Pattern Recognition. The 10th International Conference on Soft Computing and Pattern Recognition. Porto, Portugal 13 - 15 Dec 2018
Distributed deep networks based on Bagging-Down SGD algorithm
Qin, C, Gao, X and Chen, D (2019). Distributed deep networks based on Bagging-Down SGD algorithm. Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics. 41 (5), pp. 1021-1027. https://doi.org/10.3969/j.issn.1001-506X.2019.05.13
Towards automated cost analysis, benchmarking and estimating in construction: a machine learning approach
Chen, D, Hajderanj, L and Fiske, J (2019). Towards automated cost analysis, benchmarking and estimating in construction: a machine learning approach. 13th Multi Conference on Computer Science and Information Systems (MCCSIS). Porto, Portugal 16 - 18 Jul 2019
Design of a voice control 6DoF grasping robotic arm based on ultrasonic sensor, computer vision and Alexa voice assistance
Wang, Z, Chen, D and Xiao, P (2019). Design of a voice control 6DoF grasping robotic arm based on ultrasonic sensor, computer vision and Alexa voice assistance. International Conference on Information Technology in Medicine and Education. Qingdao, China 23 - 25 Aug 2019 IEEE. https://doi.org/10.1109/ITME.2019.00150
Capacitive Imaging for Skin Characterizations and Solvent Penetration Measurements
Zhang, X., Bontozoglou, C., Chirikhina, E., Lane, M. and Xiao, P. (2018). Capacitive Imaging for Skin Characterizations and Solvent Penetration Measurements. MDPI Cosmetics. 5 (3), p. 52. https://doi.org/10.3390/cosmetics5030052
Developing a web based interactive 3D virtual environment for novel skin measurement instruments
Xiao, P and Al Hashimi, O. (2018). Developing a web based interactive 3D virtual environment for novel skin measurement instruments. 2018 Advances in Science and Engineering Technology International Conferences (ASET). Dubai, Sharjah, Abu Dhabi, United Arab Emirates 06 - 07 Feb 2018 London South Bank University. https://doi.org/10.1109/ICASET.2018.8376823
In-Vivo Skin Capacitive Image Classification Using AlexNet Convolution Neural Network
Zhang, X, Pan, W and Xiao, P (2018). In-Vivo Skin Capacitive Image Classification Using AlexNet Convolution Neural Network. 2018 3rd International Conference on Image, Vision and Computing (ICIVC 2018). Chongqing, China 27 - 29 Jun 2018 Institute of Electrical and Electronics Engineers (IEEE). pp. 439-443 https://doi.org/10.1109/ICIVC.2018.8492860
Educational Network Bandwidth Analysis and Prediction
Oumar, O A, Dyllon, S and Xiao, P (2018). Educational Network Bandwidth Analysis and Prediction. The 14th International Conference on Machine Learning and Data Mining (MLDM'2018) July 14-19, 2018. New York, USA 14 - 19 Jul 2018
Visual analytics in the public sector: An analysis on diversities and similarities of London’s wards
Chen, D, Sanz, BM and Zhao, E (2018). Visual analytics in the public sector: An analysis on diversities and similarities of London’s wards. International Conference on Big Data Analytics, Data Mining and Computational Intelligence 2018 (BigDaCI 2018). Madrid, Spain 18 - 20 Jul 2018 Bigdaci.
Educational Bandwidth Traffic Prediction using Non-Linear Autoregressive Neural Networks
Oumar, O A, Dyllon, S, Xiao, P and Hong, T (2018). Educational Bandwidth Traffic Prediction using Non-Linear Autoregressive Neural Networks. The 21st International Conference on Climbing and Walking Robots and the Support Technologies for Mobile Machines - CLAWAR 2018. Panama 10 - 12 Sep 2018
Contour Mapping for Speaker-Independent Lip Reading System
Fenghour, S, Chen, D and Xiao, P (2018). Contour Mapping for Speaker-Independent Lip Reading System. The 11th International Conference on Machine Vision (ICMV 2018). Munich, Germany 01 - 03 Nov 2018
Building an online interactive 3D virtual world for aquaflux and epsilon
Al Hashimi, O. and Xiao, P (2018). Building an online interactive 3D virtual world for aquaflux and epsilon. Advances in Science, Technology and Engineering Systems. 3 (6), pp. 501-514. https://doi.org/10.25046/aj030659
In vivo skin capacitive imaging analysis by using grey level co-occurrence matrix (GLCM).
Ou, X, Pan, W and Xiao, P (2013). In vivo skin capacitive imaging analysis by using grey level co-occurrence matrix (GLCM). International Journal of Pharmaceutics. 460 (1-2), pp. 28-32. https://doi.org/10.1016/j.ijpharm.2013.10.024
The occlusion effects in capacitive contact imaging for in vivo skin damage assessments.
Pan, W, Zhang, X, Lane, ME and Xiao, P (2015). The occlusion effects in capacitive contact imaging for in vivo skin damage assessments. International Journal of Cosmetic Science. 37 (4), pp. 395-400. https://doi.org/10.1111/ics.12209
On the use of skin texture features for gender recognition: An experimental evaluation
Bianconi, F, Smeraldi, F, Abdollahyan, M and Xiao, P (2017). On the use of skin texture features for gender recognition: An experimental evaluation. 6th International Conference on Image Processing Theory Tools and Applications (IPTA), 2016. Oulu, Finland 12 - 15 Dec 2016 IEEE. https://doi.org/10.1109/IPTA.2016.7821018
Learning Bayesian Network Parameters from a Small Data Set: A Further Constrained Qualitatively Maximum a Posteriori Method
Guo, Zhi-gao, Gao, Xiao-guang, Hao, Ren, Yang, Yu, Di, Ruo-hai and Chen, D (2017). Learning Bayesian Network Parameters from a Small Data Set: A Further Constrained Qualitatively Maximum a Posteriori Method. International Journal of Approximate Reasoning. 91 (Dec), pp. 22-35. https://doi.org/10.1016/j.ijar.2017.08.009
Feature Extraction and Labelling Large Data Sets Using Deep Learning
Chen, D (2017). Feature Extraction and Labelling Large Data Sets Using Deep Learning. RESEARCHER LINK: Smart Technology for Fighting Virus Epidemics & Bioinformatics. Recife, Pernambuco, Brazil 10 - 13 Sep 2017
Prediction of Breast Cancer Survivability using Ensemble Algorithms
Adegoke, V, Chen, D, Banissi, E and Barikzai, S (2017). Prediction of Breast Cancer Survivability using Ensemble Algorithms. International Conference on Smart System and Technologies 2017 (SST 2017). Osijek, Croatia 18 - 20 Oct 2017
Predictive Ensemble Modelling: An Experimental Comparison of Boosting Implementation Methods
Adegoke, V, Chen, D, Barikzai, S and Banissi, E (2017). Predictive Ensemble Modelling: An Experimental Comparison of Boosting Implementation Methods. 2017 European Modelling Symposium (EMS). Manchester 20 - 21 Nov 2017
Membrane solvent penetration measurements using contact imaging
Xiao, P, Abdalghafor, H and Lane, ME (2013). Membrane solvent penetration measurements using contact imaging. in: Chilcott, R and Brain, K (ed.) Advances in Dermatological Sciences London, UK Royal Society of Chemistry.
Photothermal Radiometry for Skin Research
Xiao, P (2016). Photothermal Radiometry for Skin Research. Cosmetics. 3 (1), p. 10. https://doi.org/10.3390/cosmetics3010010
Making Better Use of Big Data
Chen, D (2016). Making Better Use of Big Data. LSBU Enterprise Count Event, March 2016. London South Bank University 18 - 18 Mar 2016 London South Bank University.
Capacitive Contact Imaging For Skin Characterization
Xiao, P, Zhang, X and Bontozoglou, C (2016). Capacitive Contact Imaging For Skin Characterization. Perspectives in Percutaneous Penetration 2016. La Grande Motte, France 29 Mar - 01 Apr 2016
Skin Image Retrieval Using Gabor Wavelet Texture Feature
Ou, X, Pan, W, Zhang, X and Xiao, P (2016). Skin Image Retrieval Using Gabor Wavelet Texture Feature. International Journal of Cosmetic Science. 38 (6), pp. 607-614. https://doi.org/10.1111/ics.12332
Hair Water Content and Water Holding Capacity Measurements
Xiao, P, Bontozoglou, C, Ciortea, LI and Imhof, RE (2016). Hair Water Content and Water Holding Capacity Measurements. 7th International Bi-Annual Conference on Applied Hair Science. Red Bank, NJ, USA 08 - 09 Jun 2016
Capacitive Imaging For Skin Characterization and Solvent Penetration
Xiao, P, Zhang, X and Bontozoglou, C (2016). Capacitive Imaging For Skin Characterization and Solvent Penetration. Skin Forum 2016 Annual Meeting. London
Big Data Analytics In The Public Sector: A Case Study Of NEET Analysis For The London Boroughs
Chen, D, Asaolu, B and Qin, C (2016). Big Data Analytics In The Public Sector: A Case Study Of NEET Analysis For The London Boroughs. International Conference on Big Data Analytics, Data Mining and Computational Intelligence. Funchal, Madeira, Portugal 02 - 04 Jul 2016
On Distributed Deep Network for Processing Large-Scale Sets of Complex Data
Qin, C, Gao, X and Chen, D (2016). On Distributed Deep Network for Processing Large-Scale Sets of Complex Data. 2016 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC). Hangzhou, China. 27 - 28 Aug 2016 Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/IHMSC.2016.55
On Distributed Deep Network for Processing Large-Scale Sets of Complex Data
Chen, D (2016). On Distributed Deep Network for Processing Large-Scale Sets of Complex Data. 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC). Hangzhou, China 27 - 28 Aug 2016
Personal identification based on skin texture features from the forearm and multi-modal imaging
Bianconi, F, Chirikhina, E, Smeraldi, F, Bontozoglou, C and Xiao, P (2016). Personal identification based on skin texture features from the forearm and multi-modal imaging. Skin Research and Technology. 23 (3), pp. 392-398. https://doi.org/10.1111/srt.12348
A Bayesian Approach to Learn Bayesian Networks Using Data and Constraints
Gao, X, Yu, Y, Zhi-gao, G and Chen, D (2016). A Bayesian Approach to Learn Bayesian Networks Using Data and Constraints. 23rd International Conference on Pattern Recognition (ICPR 2016). Cancún, México 04 - 08 Dec 2016 Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/ICPR.2016.7900204
Big Data Analytics System for Fact/Data-driven Decision Making
Chen, D (2015). Big Data Analytics System for Fact/Data-driven Decision Making. The Royal Statistical Society, Business and Industry Section. London, UK 18 Nov 2015 Royal Statistical Society.
Multi-Location Clinical Trials: Do TEWL Readings Change With Altitude?
Kramer, G, Xiao, P, Crowther, J and Imhof, RE (2015). Multi-Location Clinical Trials: Do TEWL Readings Change With Altitude? Scientific meeting & technology showcase. New York 10 - 11 Dec 2015 Society of Cosmetic Chemists.
Determining Key (Predictor) Modules for Early Identification of Students At-Risk
Chen, D and Elliott, G (2013). Determining Key (Predictor) Modules for Early Identification of Students At-Risk. International Conference on Advanced Information Engineering and Education Science (ICAIEES 2013). Beijing, China 19 - 20 Dec 2013 Atlantis Press. https://doi.org/10.2991/icaiees-13.2013.22
Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining
Chen, D (2012). Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining. Journal of Database Marketing and Customer Strategy Management. 19 (3), pp. 197-208. https://doi.org/10.1057/dbm.2012.17