Multimodal Human Computer Interaction with Context Dependent Input Modality Suggestion and Dynamic Input Ambiguity Resolution

© 2021 by IJETT Journal
Volume-69 Issue-5
Year of Publication : 2021
Authors : N. S. Sreekanth, N.K. Narayanan
DOI : 10.14445/22315381/IJETT-V69I5P222

How to Cite?

N. S. Sreekanth, N.K. Narayanan, "Multimodal Human Computer Interaction with Context Dependent Input Modality Suggestion and Dynamic Input Ambiguity Resolution," International Journal of Engineering Trends and Technology, vol. 69, no. 5, pp. 147-151, 2021. Crossref, https://doi.org/10.14445/22315381/IJETT-V69I5P222

Abstract
This paper reports a novel approach for the enhanced implementation of a practical multimodal interface system with a context-based input modality suggestion algorithm and a dynamic input error correction (ambiguity resolution) algorithm. The context-based input modality suggestion algorithm prompts the user to switch to an alternate modality in adverse environments. The dynamic input error correction module helps the user correct omissions or resolve ambiguity in the originally communicated message by asking the user for clarification. If the user provides input corresponding to the reported error, the system completes the operation without requiring a fresh start. The Tricolor Finite State Transducer (T-FST) introduced in this paper analyzes the semantics of the communicated multimodal message. The strategy adopted for grammar definition gives users a wider operational space for interacting with the computer system. The T-FST based message understanding module emphasizes completing the desired operation rather than recognizing each and every signal from the input channels. The proposed architecture is tested with a standard set of operations used for basic human computer interaction.
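
The T-FST construction itself is developed in the body of the paper; as a rough illustration of the slot-filling style of multimodal understanding summarized above, the following Python sketch shows how tokens arriving from different modalities can fill the slots of a single command frame, with a missing slot triggering a targeted clarification request instead of a fresh start. All names, token sets, and the two-slot frame are illustrative assumptions, not the authors' T-FST.

# Minimal sketch only: hypothetical vocabularies and a two-slot command frame,
# not the Tricolor FST defined in the paper.
from dataclasses import dataclass, field
from typing import Optional

ACTIONS = {"open", "delete", "copy"}    # hypothetical action vocabulary
TARGETS = {"file", "folder", "window"}  # hypothetical target vocabulary

@dataclass
class Command:
    action: Optional[str] = None
    target: Optional[str] = None

@dataclass
class MultimodalParser:
    """Accumulates tokens from any input channel into one command frame."""
    frame: Command = field(default_factory=Command)

    def feed(self, token: str, modality: str) -> None:
        # Any modality may fill any slot; a later token resolves an earlier gap.
        if token in ACTIONS:
            self.frame.action = token
        elif token in TARGETS:
            self.frame.target = token
        # Unrecognized tokens are ignored: the aim is completing the
        # operation, not decoding every signal on every channel.

    def missing_slot(self) -> Optional[str]:
        # Report the first unfilled slot so a dialogue layer can ask
        # the user a targeted clarification question.
        if self.frame.action is None:
            return "action"
        if self.frame.target is None:
            return "target"
        return None

parser = MultimodalParser()
parser.feed("open", modality="speech")    # spoken verb fills the action slot
print(parser.missing_slot())              # -> "target": ask for clarification
parser.feed("file", modality="gesture")   # pointing gesture supplies the target
assert parser.missing_slot() is None      # command complete, no fresh start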

Keywords
Human Computer Interaction, Speech Recognition, Gesture Recognition, Multimodal Interaction, Man Machine Interaction.

References
[1] Markku Turunen, Jaakko Hakulinen, Anssi Kainulainen, Aleksi Melto, Topi Hurtig, Design of a Rich Multimodal Interface for Mobile Spoken Route Guidance, Proceedings of Interspeech 2007 - Eurospeech, (2007) 2193-2196.
[2] Michael Johnston, Srinivas Bangalore, Gunaranjan Vasireddy, Amanda Stent, Patrick Ehlen, Marilyn Walker, Steve Whittaker, Preetam Maloor, MATCH: An Architecture for Multimodal Dialogue Systems, Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, (2002) 376-383.
[3] M. Johnston, S. Bangalore, Finite-state Multimodal Integration and Understanding, Journal of Natural Language Engineering, Cambridge University Press, 11(2) (2005) 159-187.
[4] S. Dupont, J. Luettin, Audio-visual speech modeling for continuous speech recognition, IEEE Transactions on Multimedia, 2(3) (2000).
[5] R. Gutierrez-Osuna, P.K. Kakumanu, A. Esposito, O.N. Garcia, A. Bojorquez, J.L. Castillo, I. Rudomin, Speech-driven facial animation with realistic dynamics, IEEE Transactions on Multimedia, 7(1) (2005) 33-42.
[6] Tsuhan Chen, R.R. Rao, Audio-visual integration in multimodal communication, Proceedings of the IEEE, 86(5) (1998).
[7] D. Nguyen, D. Halupka, P. Aarabi, A. Sheikholeslami, Real-time face detection and lip feature extraction using field-programmable gate arrays, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 36(4) (2006) 902-912.
[8] Arianna D'Ulizia, Fernando Ferri, Patrizia Grifoni, A Learning Algorithm for Multimodal Grammar Inference, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 41(6) (2011) 1495-1510.
[9] Arianna D'Ulizia, Fernando Ferri, Patrizia Grifoni, Generating Multimodal Grammars for Multimodal Dialogue Processing, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 40(6) (2010).
[10] Nicu Sebe, Multimodal interfaces: Challenges and perspectives, Journal of Ambient Intelligence and Smart Environments, 1 (2009) 19-26.
[11] S.L. Oviatt, Mutual disambiguation of recognition errors in a multimodal architecture, Proceedings of ACM CHI, (1999).
[12] Angeliki Metallinou, Martin Wollmer, Athanasios Katsamanis, Florian Eyben, Bjorn Schuller, Shrikanth Narayanan, Context-Sensitive Learning for Enhanced Audiovisual Emotion Classification, IEEE Transactions on Affective Computing, 3(2) (2012) 184-198.
[13] Loic Kessous, Ginevra Castellano, George Caridakis, Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis, Journal of Multimodal User Interfaces, Springer, 3 (2010) 33-48.
[14] Bruno Dumas, Denis Lalanne, Sharon Oviatt, Multimodal Interfaces: A Survey of Principles, Models and Frameworks, Human Machine Interaction: Research Results of the MMI Program, LNCS 5440 (2009) 3-26.
[15] F. Leishman, V. Monfort, O. Horn, G. Bourhis, Driving Assistance by Deictic Control for a Smart Wheelchair: The Assessment Issue, IEEE Transactions on Human-Machine Systems, 44(1) (2014) 66-77.
[16] W3C Multimodal Interaction Framework - http://www.w3.org/TR/mmi-framework/
[17] Bogdan Ionescu, Didier Coquin, Patrick Lambert, Vasile Buzuloiu, Dynamic Hand Gesture Recognition Using the Skeleton of the Hand, EURASIP Journal on Applied Signal Processing, 2005(13) (2005) 2101-2109.
[18] Ming-Hsuan Yang, N. Ahuja, M. Tabb, Extraction of 2D motion trajectories and its application to hand gesture recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(8) (2002) 1061-1074.
[19] E. Ohn-Bar, M.M. Trivedi, Hand Gesture Recognition in Real Time for Automotive Interfaces: A Multimodal Vision-Based Approach and Evaluations, IEEE Transactions on Intelligent Transportation Systems, 15(6) (2014) 2368-2377.
[20] Hung Yuen, A chain coding approach for real-time recognition of on-line handwritten characters, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-96), 6 (1996).
[21] R. Plamondon, S.N. Srihari, On-line and off-line handwriting recognition: a comprehensive survey, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1) (2000) 63-84.
[22] C.C. Tappert, C.Y. Suen, T. Wakahara, The state of the art in on-line handwriting recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(8) (1990) 787-808.
[23] (2021) CMU Sphinx: Building Applications with Sphinx 4 - Website http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4
[24] A. Potamianos, E. Ammicht, E. Fosler-Lussier, Modality tracking in the multimodal Bell Labs Communicator, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU '03), (2003).
[25] Manolis Perakakis, A. Potamianos, A Study in Efficiency and Modality Usage in Multimodal Form Filling Systems, IEEE Transactions on Audio, Speech, and Language Processing, 16(6) (2008) 1194-1206.
[26] N.S. Sreekanth et al., Multimodal Interface for Effective Man Machine Interaction, Media Convergence Handbook, Artur Lugmayr and Cinzia Dal Zotto (Eds.), Springer, Berlin, Heidelberg, Germany, (2016) 261-281.
[27] N.S. Sreekanth, Enhanced Malayalam Speech Recognition Using Multimodal Techniques, Doctoral Thesis, Kannur University, Kerala, Mar 2017.