Hypergraph-based Multi-View Action Recognition using Event Cameras

28 Mar 2024 | Yue Gao, Senior Member, IEEE, Jiaxuan Lu, Siqi Li, Yipeng Li, Shaoyi Du, Member, IEEE
This paper introduces HyperMV, a multi-view event-based action recognition framework that addresses the challenges of information deficit and semantic misalignment in multi-view event data. HyperMV converts discrete event streams into frame-like representations and extracts view-related features with a shared convolutional network. It then constructs a multi-view hypergraph neural network in which temporal segments serve as vertices, and rule-based and KNN-based strategies create hyperedges that capture relationships across viewpoints and temporal positions. A vertex-attention hypergraph propagation mechanism is introduced for enhanced feature fusion.

The paper also presents THU$^{MV-EACT}$-50, the largest multi-view event-based action dataset, comprising 50 actions captured from 6 viewpoints and surpassing existing datasets by more than tenfold. Experimental results show that HyperMV significantly outperforms baselines in both cross-subject and cross-view scenarios and also exceeds the state of the art in frame-based multi-view action recognition. The framework effectively fuses features from different viewpoints and temporal segments by leveraging high-order associations between viewpoint and temporal features, and the THU$^{MV-EACT}$-50 dataset provides a valuable resource for evaluating multi-view event-based action recognition algorithms.

The paper also discusses related work on frame-based and event-based action recognition, graph and hypergraph neural networks, and action recognition datasets. The method is evaluated on two datasets, DHP19 and THU$^{MV-EACT}$-50, showing significant improvements in cross-subject and cross-view scenarios and demonstrating the effectiveness of the proposed framework for multi-view event-based action recognition.
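To make two of the building blocks above concrete, the sketch below shows (a) accumulating a DAVIS-style (t, x, y, polarity) event stream into per-segment frame-like representations and (b) building KNN-based hyperedges over segment features. This is a minimal NumPy illustration under assumed conventions (sensor size, segment count, function names); it is not the paper's exact representation or hypergraph construction.

```python
import numpy as np

def events_to_frames(events, sensor_size=(260, 346), num_segments=8):
    """Accumulate (t, x, y, polarity) events into per-segment, 2-channel
    count images. Generic event-frame accumulation, assumed for illustration."""
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = (events[:, 3] > 0).astype(int)  # map polarity to {0, 1}
    # Split the recording into equal-duration temporal segments.
    edges = np.linspace(t.min(), t.max(), num_segments + 1)
    seg = np.clip(np.searchsorted(edges, t, side="right") - 1, 0, num_segments - 1)
    frames = np.zeros((num_segments, 2, *sensor_size), dtype=np.float32)
    # Count ON/OFF events per pixel within each segment.
    np.add.at(frames, (seg, p, y, x), 1.0)
    return frames

def knn_hyperedges(segment_features, k=4):
    """KNN-based hyperedges: each vertex (a view-segment feature vector)
    spawns one hyperedge containing itself and its k nearest neighbours.
    Returns an incidence matrix H of shape (num_vertices, num_hyperedges)."""
    n = segment_features.shape[0]
    dists = np.linalg.norm(
        segment_features[:, None, :] - segment_features[None, :, :], axis=-1
    )
    H = np.zeros((n, n), dtype=np.float32)
    for v in range(n):
        neighbours = np.argsort(dists[v])[: k + 1]  # includes the vertex itself
        H[neighbours, v] = 1.0
    return H

if __name__ == "__main__":
    # Toy usage: random events and random per-segment features.
    rng = np.random.default_rng(0)
    events = np.column_stack([
        np.sort(rng.uniform(0, 1.0, 10000)),      # timestamps
        rng.integers(0, 346, 10000),              # x
        rng.integers(0, 260, 10000),              # y
        rng.choice([-1, 1], 10000),               # polarity
    ])
    frames = events_to_frames(events)             # (8, 2, 260, 346)
    feats = rng.normal(size=(6 * 8, 128))         # 6 views x 8 segments
    H = knn_hyperedges(feats)                      # (48, 48) incidence matrix
    print(frames.shape, H.shape)
```

In this sketch the rule-based hyperedges (e.g., grouping all segments of one viewpoint, or the same segment index across viewpoints) would simply be additional columns appended to the KNN incidence matrix before hypergraph propagation.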