20 Mar 2024 | Weiyi Lv1*, Yuhang Huang2*, Ning Zhang3, Ruei-Sung Lin3, Mei Han3, Dan Zeng1†
DiffMOT is a real-time multiple object tracker designed to handle non-linear motion patterns, which are common in complex scenarios such as dance and sports. The tracker introduces a novel Decoupled Diffusion-based Motion Predictor (D²MP) that models the entire distribution of motion patterns and predicts each object's motion from its historical trajectory. D²MP optimizes the diffusion process to use fewer sampling steps, achieving real-time performance at 22.7 FPS. On the DanceTrack and SportsMOT datasets, DiffMOT outperforms state-of-the-art (SOTA) trackers, reaching 62.3% and 76.2% HOTA, respectively. Quantitative comparisons and qualitative visualizations demonstrate the tracker's effectiveness, showing that it handles non-linear motion better than traditional Kalman Filter-based predictors. DiffMOT also generalizes well: it can be applied to new scenarios without retraining.
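To make the contrast with linear motion models concrete, the following toy sketch measures how far a constant-velocity extrapolation drifts on a straight track versus a curved one. It is not D²MP and not a full Kalman Filter; it only illustrates the constant-velocity assumption that Kalman Filter-based trackers commonly rely on. The function names and both trajectories are hypothetical and chosen purely for illustration.

```python
import math


def constant_velocity_predict(prev, curr):
    """Extrapolate the next position assuming the last displacement repeats."""
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])


def mean_prediction_error(track):
    """Average Euclidean error of one-step constant-velocity predictions."""
    errors = []
    for i in range(2, len(track)):
        pred = constant_velocity_predict(track[i - 2], track[i - 1])
        errors.append(math.dist(pred, track[i]))
    return sum(errors) / len(errors)


# Hypothetical toy trajectories (object centers in pixels), not from any dataset.
linear_track = [(10.0 * t, 5.0 * t) for t in range(20)]
circular_track = [
    (100.0 * math.cos(0.5 * t), 100.0 * math.sin(0.5 * t)) for t in range(20)
]

linear_error = mean_prediction_error(linear_track)
circular_error = mean_prediction_error(circular_track)
print(f"linear: {linear_error:.3f}  circular: {circular_error:.3f}")
```

On the straight track the extrapolation is exact, while on the circular track every prediction misses by a fixed offset that grows with the curvature and the step size. This is the kind of error a learned, non-linear motion predictor is meant to reduce.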