

AKM Mahbubur Rahman

Eyelock LLC, USA

Title: Application of Computer Vision and Machine Learning in assistive technologies. EmoAssist: an example


Abstract

Disabilities related to congenital blindness, vision loss, or partial sight disturb not only one's physical body but also the trajectory of one's social interactions, owing to the lack of perception of a partner's facial behavior, head pose, and body movements. It is well documented in the literature that sight loss can lead to depression, loneliness, and anxiety. Complex emotional states are recognized by sighted people disproportionately by processing visual cues from the eye and mouth regions of the face. For instance, social communication with eye-to-eye contact provides information about concentration, confidence, and engagement. Smiles are universally recognized as signs of pleasure and welcome. In contrast, looking away for a long time is perceived as a lack of concentration, a break of engagement, or boredom. Visually impaired people, however, have no access to these cues from the eye and mouth regions, and this non-verbal information is unlikely to be communicated through the voice. Moreover, if the interlocutor is silent (listening), a blind individual has no clue about the interlocutor's mental state. The scenario becomes even more complex when a group of people that includes visually impaired persons interacts in a discussion, a debate, etc.

The inability to perceive emotions and epistemic states can be mitigated by a Computer Vision and Machine Learning based assistive-technology solution capable of processing facial behavior, head pose, facial expressions, and physiological signals in real time. A practical and portable system is desired that would predict valence-arousal-dominance (VAD) dimensions as well as facial events from the interlocutor's facial behavior and head pose in natural environments (for instance, a conversation in a building corridor, asking questions of a stranger in the street, or discussing topics of interest on a university campus). Building social assistive technologies using computer vision and machine learning techniques is still relatively new and unexplored, and it poses complex research challenges. These challenges have nevertheless been overcome to implement a robust system for real-world deployment. The research challenges are identified and divided into three categories: a) system and face-tracker related challenges, b) classification and prediction related challenges, and c) deployment related issues.

This paper presents the design and implementation of EmoAssist, a smartphone-based system to assist in dyadic conversations. The main goal of the system is to give people who are blind or visually impaired access to more non-verbal communication cues. The key functionalities of the system are to predict behavioral expressions (such as a yawn, a closed-lip smile, an open-lip smile, looking away, sleepiness, etc.) and 3-D affective dimensions (valence, arousal, and dominance) from visual cues in order to provide the appropriate auditory feedback or response. A number of challenges related to data-communication protocols, efficient tracking of the face, modeling of behavioral expressions and affective dimensions, the feedback mechanism, and system integration were addressed to build an effective and functional system. In addition, orientation-sensor information from the smartphone was used to correct image alignment and improve robustness in real-world use.
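A minimal sketch of how such a per-frame prediction pipeline could be organized is given below. It assumes OpenCV and NumPy are available; the tracker and model objects (face_tracker, vad_model, expression_model) and the helper names are hypothetical placeholders, not EmoAssist's actual implementation.

# Illustrative per-frame pipeline: orientation correction, face tracking,
# then regression of VAD dimensions and classification of behavioral expressions.
# All object and function names here are hypothetical, not EmoAssist's API.
import cv2
import numpy as np

def correct_orientation(frame, roll_degrees):
    """Rotate the frame so the face appears upright, using the roll angle
    reported by the phone's orientation sensor (assumed to be in degrees)."""
    h, w = frame.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), roll_degrees, 1.0)
    return cv2.warpAffine(frame, m, (w, h))

def process_frame(frame, roll_degrees, face_tracker, vad_model, expression_model):
    """Return ((valence, arousal, dominance), expression_label) for one frame,
    or None if no face is tracked."""
    upright = correct_orientation(frame, roll_degrees)
    landmarks = face_tracker.track(upright)              # facial landmarks + head pose
    if landmarks is None:
        return None
    features = np.asarray(landmarks, dtype=np.float32).ravel()[None, :]
    vad = vad_model.predict(features)[0]                 # regression -> 3-D affect
    expression = expression_model.predict(features)[0]   # e.g. "open-lip smile"
    return tuple(vad), expression

In the deployed system, the predicted affect and expression would drive the auditory feedback given to the user; that step is omitted from this sketch.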
Empirical studies show that EmoAssist can predict affective dimensions with acceptable accuracy (maximum correlation coefficients of 0.76 for valence, 0.78 for arousal, and 0.76 for dominance) in natural dyadic conversation. The overall minimum and maximum response times are 64.61 ms and 128.22 ms, respectively. Integrating the sensor information to correct for orientation improved the accuracy of recognizing behavioral expressions by 16% on average. A usability study with ten blind people in social interaction shows that EmoAssist is highly acceptable, with an average acceptability rating of 6.0 on a Likert scale (where 1 and 7 are the lowest and highest possible ratings, respectively).
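As a hedged illustration of the kind of evaluation reported above, the snippet below computes a Pearson correlation coefficient between predicted and annotated valence values and summarizes response times; the numbers are made-up placeholders, not the study's data.

import numpy as np

# Placeholder data, purely for illustration (not the study's measurements).
predicted_valence = np.array([0.20, 0.50, -0.10, 0.70, 0.30])
annotated_valence = np.array([0.25, 0.45, 0.00, 0.65, 0.40])
response_times_ms = np.array([70.4, 95.1, 128.2, 64.6, 110.3])

# Pearson correlation coefficient between predictions and annotations.
r = np.corrcoef(predicted_valence, annotated_valence)[0, 1]
print(f"valence correlation: {r:.2f}")
print(f"response time: min {response_times_ms.min():.2f} ms, "
      f"max {response_times_ms.max():.2f} ms")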