PriMonitor: An adaptive tuning privacy-preserving approach for multimodal emotion detection

2 February 2024 | Lihua Yin¹ · Sixin Lin¹ · Zhe Sun¹ · Simin Wang¹,² · Ran Li¹ · Yuanyuan He³
PriMonitor is an adaptive tuning privacy-preserving approach for multimodal emotion detection. With the rise of edge computing and the Internet of Vehicles (IoV), deep learning-based driver assistance applications have become increasingly popular, and multimodal emotion detection systems have been integrated into them to enhance driving safety. However, the use of in-vehicle cameras and microphones raises concerns about the extensive collection of drivers' private data. Applying privacy-preserving techniques to a single modality is insufficient to prevent privacy re-identification when that modality is combined with others. PriMonitor addresses these challenges with a generalized randomized-response-based differential privacy method that improves the speed and data availability of text privacy protection while ensuring privacy preservation across multiple modalities. To determine suitable weight assignments within a given privacy budget, pre-aggregator and iterative mechanisms are introduced. PriMonitor effectively mitigates privacy re-identification caused by modal correlation while maintaining high accuracy in multimodal models. Experimental results validate the efficiency and competitiveness of the approach.

Keywords: Privacy preserving, Multimodal learning, Edge computing, Internet of Vehicles

The rapid development of smart vehicles and high-speed 5G communication technologies has led to the emergence of edge computing services and applications for the Internet of Vehicles. Multimodal emotion detection plays a crucial role in identifying drivers' emotional instability during autonomous driving scenarios. According to a 2019 study by the American Automobile Association (AAA) Foundation for Traffic Safety, 80% of drivers confessed to experiencing significant anger while driving, with 8 million drivers reporting instances of extreme road rage. Consequently, numerous automobile manufacturers have developed Driver Monitoring Systems (DMS) to discern drivers' emotions.
Recognizing drivers' emotional states enables measures that enhance driving safety, such as deploying intelligent assisted driving to temporarily assume control of the vehicle. As illustrated in Figure 1, multimodal emotion detection in a DMS gathers diverse, emotion-rich data from in-vehicle cameras and microphones. These data are processed at edge nodes, where efficient features are extracted, and the various emotions are identified on cloud servers. The resulting analysis, including labeled angry emotions, is essential for enabling safety assistance functions.

Studying users' emotions through multimodal data has become a prominent research area, with artificial intelligence playing a crucial role in understanding human behavior. Researchers are working to improve the usability of multimodal emotion analysis by extending it to continuous scenarios, establishing connections between modalities, and overcoming the limitations of partial modality data. However, these methods may inadvertently reveal drivers' private information, including identity, gender, age, and speech content: even when one modality is protected, privacy can be re-identified through its combination with other, unprotected modalities. Most existing privacy-preserving methods focus on a single modality rather than multiple modalities. The adversarial training-based approach treats the deep learning task and the defense against privacy attacks as optimization targets within.
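To make the abstract's two ingredients concrete, the following is a minimal illustrative sketch, not PriMonitor's actual algorithm: classic k-ary randomized response, which satisfies ε-local differential privacy, combined with a weighted split of a total privacy budget across modalities (justified by sequential composition, under which per-modality budgets sum to the total). The function names and the weight values are assumptions chosen for illustration.

```python
import math
import random

def randomized_response(true_value, domain, epsilon):
    """k-ary randomized response: report the true value with probability
    p = e^eps / (e^eps + k - 1); otherwise report a uniformly random
    other value from the domain. Satisfies eps-local differential privacy."""
    k = len(domain)
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p_keep:
        return true_value
    others = [v for v in domain if v != true_value]
    return random.choice(others)

def split_budget(total_epsilon, weights):
    """Split a total privacy budget across modalities in proportion to the
    given weights; by sequential composition the per-modality budgets
    sum back to the total."""
    total_weight = sum(weights.values())
    return {m: total_epsilon * w / total_weight for m, w in weights.items()}

# Hypothetical per-modality weights (the paper tunes these adaptively
# via its pre-aggregator and iterative mechanisms).
budgets = split_budget(3.0, {"text": 0.5, "audio": 0.3, "video": 0.2})
emotions = ["neutral", "happy", "angry", "sad"]
noisy = randomized_response("angry", emotions, budgets["text"])
```

A smaller per-modality ε yields stronger protection but noisier reports; the weight assignment therefore trades protection against downstream model accuracy, which is the tension the paper's iterative tuning is designed to resolve.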