[1]
T. Li, and F. Shen, "Automatic segmentation of Chinese mandarin speech into syllable-like", 2015 International Conference on Asian Language Processing (IALP), 2015, pp. 57-60
[2]
A. Pradhan, A. Shanmugam, A. Prakash, K. Veezhinathan, and H. Murthy, "A syllable based statistical text to speech system", In 21st European Signal Processing Conference, 2014, pp. 1-5
[4]
L. Lu, X. Zhang, and X. Xu, "Fusion of face and visual speech information for identity verification", In 2017 IEEE International Symposium on Intelligent Signal Processing and Communication Systems, 2017, pp. 502-506
[6]
V.J. Alcazar, A.N. Maulana, R.O. Mortega, and M.J. Samonte, "Speech-to-visual approach e-learning systems for the deaf", In 2016 11th International Conference on Computer Science and Education (ICCSE), 2016, pp. 239-243
[9]
Y. Mroueh, E. Marcheret, and V. Goel, "Deep multimodal learning for audio-visual speech recognition", In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015, pp. 2130-2134
[10]
J.C. Hou, S.S. Wang, Y.H. Lai, J.C. Lin, Y. Tsao, H.W. Chang, and H.M. Wang, "Audio-visual speech enhancement using deep neural networks", In 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017, pp. 1-6
[11]
W. Feng, N. Guan, Y. Li, X. Zhang, and Z. Luo, "Audio visual speech recognition with multimodal recurrent neural networks", In 2017 International Joint Conference on Neural Networks, 2017, pp. 681-688
[12]
M. Karthikadevi, and K.G. Srinivasagan, "The development of syllable-based text to speech system for Tamil language", In 2014 International Conference on Recent Trends in Information Technology, 2014, pp. 1-6
[13]
J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A.Y. Ng, "Multimodal deep learning", In Proceedings of the 28th International Conference on Machine Learning, 2011, pp. 689-696
[15]
B. Sabzalian, and V. Abolghasemi, "Iterative weighted non-smooth non-negative matrix factorization for face recognition", Int. J. Eng., vol. 31, no. 10, pp. 1698-1707, 2018.
[18]
A.Z. Frisky, C.Y. Wang, A. Santoso, and J.C. Wang, "Lip-based visual speech recognition system", 2015 International Carnahan Conference on Security Technology (ICCST), 2016, pp. 315-319
[20]
J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A.Y. Ng, "Multimodal deep learning", In 2011 Proceedings of the 28th International Conference on Machine Learning (ICML), 2011, pp. 689-696
[21]
N. Srivastava, and R. Salakhutdinov, "Multimodal learning with deep boltzmann machines", Adv. Neural Inf. Process. Syst., vol. 1, p. 2, 2012.
[22]
M. Zimmermann, M.M. Ghazi, H.K. Ekenel, and J.P. Thiran, "Visual speech recognition using PCA networks and LSTMs in a tandem GMM-HMM system", In Asian Conference on Computer Vision, 2016, pp. 264-276
[25]
S. Petridis, Z. Li, and M. Pantic, "End-to-end visual speech recognition with LSTMs", In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017, pp. 2592-2596
[26]
B. Shillingford, Y. Assael, M.W. Hoffman, T. Paine, C. Hughes, U. Prabhu, H. Liao, H. Sak, K. Rao, L. Bennett, and M. Mulville, "Large-scale visual speech recognition", arXiv preprint arXiv:1807.05162, 2018.
[28]
F. Yaghmaee, "Robust fuzzy content based regularization technique in super resolution imaging", Int. J. Eng., vol. 29, no. 6, pp. 769-777, 2016.
[30]
I. Jarraya, S. Werda, and W. Mahdi, "Lip tracking using particle filter and geometric model for visual speech recognition", International Conference on Signal Processing and Multimedia Applications, 2016, pp. 172-179
[32]
P. Bratoszewski, G. Szwoch, and A. Czyzewski, "Comparison of acoustic and visual voice activity detection for noisy speech recognition", In 2016 Signal Processing: Algorithms, Architectures, Arrangements, and Applications, 2016, pp. 287-291