Abstract

Driver drowsiness is one of the most common causes of road accidents worldwide, and detecting it reliably requires solutions that generalize to real-world conditions. In this paper we develop a deep-learning system that detects drowsiness robustly across different drivers, environments, and sensor types. The system, the Multimodal Attention Network (MMAN), fuses information from eye and head movement, heart-rate and breathing patterns, and vehicle-dynamics signals. A gradient-reversal layer makes MMAN domain-adaptive, so that its learned features do not depend on differences between datasets and drivers, while attention modules make the model interpretable by emphasizing the most informative modalities. We evaluated MMAN on three public datasets (NTHU, UTA, and YawDD) and a new Middle Eastern Drowsiness Dataset (MEDD) that we collected, using subject-independent and leave-one-domain-out protocols. MMAN achieved an average accuracy of approximately 95% and a macro-F1 score of 0.94, outperforming unimodal and static-fusion baselines by 3-8%. On-board testing confirmed that MMAN runs in real time, with an average inference latency of approximately 40 ms per frame, making it suitable for in-vehicle use. These findings demonstrate that combining multimodal attention with domain adaptation can yield an interpretable, fair, and deployable driver-monitoring system that helps make vehicles smarter and safer.
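To make the attention-based fusion and gradient-reversal ideas concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: the module names, embedding dimension, and the three-modality/four-domain setup are illustrative assumptions. It shows how per-modality embeddings can be fused with learned attention weights (exposing modality importance for interpretability) and how a gradient-reversal layer feeds a domain classifier to encourage domain-invariant features.

```python
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; multiplies gradients by -lambda on the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class AttentionFusion(nn.Module):
    """Scores each modality embedding and returns their softmax-weighted sum."""

    def __init__(self, embed_dim: int):
        super().__init__()
        self.score = nn.Linear(embed_dim, 1)

    def forward(self, feats):
        # feats: (batch, num_modalities, embed_dim)
        weights = torch.softmax(self.score(feats), dim=1)   # (batch, M, 1)
        fused = (weights * feats).sum(dim=1)                 # (batch, embed_dim)
        return fused, weights.squeeze(-1)                    # weights reveal modality importance


class DrowsinessModel(nn.Module):
    """Illustrative sketch: attention fusion plus a domain head behind gradient reversal."""

    def __init__(self, embed_dim: int = 128, num_domains: int = 4):
        super().__init__()
        self.fusion = AttentionFusion(embed_dim)
        self.drowsy_head = nn.Linear(embed_dim, 2)           # alert vs. drowsy
        self.domain_head = nn.Linear(embed_dim, num_domains)  # dataset/driver domain

    def forward(self, modality_feats, lambd: float = 1.0):
        fused, attn = self.fusion(modality_feats)
        y_drowsy = self.drowsy_head(fused)
        # The domain head trains normally, but the shared features receive
        # reversed gradients, pushing them toward domain invariance.
        y_domain = self.domain_head(GradReverse.apply(fused, lambd))
        return y_drowsy, y_domain, attn


# Usage sketch: a batch of 8 samples, each with 3 modality embeddings of size 128.
model = DrowsinessModel()
feats = torch.randn(8, 3, 128)
y_drowsy, y_domain, attn = model(feats)
```

In training, the drowsiness loss and the domain loss would be summed; because of the reversed gradients, minimizing the combined loss makes the fused representation both predictive of drowsiness and uninformative about the source domain, which is the behavior the abstract attributes to MMAN's domain adaptation.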
