Scene Adaptive Sparse Transformer (SAST) is proposed for event-based object detection. SAST enables window-token co-sparsification, which significantly enhances fault tolerance and reduces computational overhead. It leverages novel scoring and selection modules, together with Masked Sparse Window Self-Attention (MS-WSA), to achieve scene-aware adaptability: the sparsity level is dynamically optimized according to scene complexity, maintaining a remarkable balance between performance and computational cost. By adaptively attending only to important objects and tuning sparsity to scene complexity, SAST achieves high performance at low computational cost. Evaluation results show that SAST outperforms all other dense and sparse networks on two large-scale event-based object detection datasets (1Mpx and Gen1), demonstrating strong scene adaptability and efficiency.
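The sketch below illustrates the general idea of window-token co-sparsification followed by a masked window self-attention pass. It is a minimal approximation, not the authors' implementation: the scoring heads (`window_score`, `token_score`), the sigmoid thresholding with `tau`, and all hyperparameters are assumptions introduced for illustration.

```python
# Minimal sketch (assumed design, not the SAST reference code):
# tokens are grouped into windows; a window score and a token score decide
# which tokens participate in window self-attention, so the effective
# sparsity varies with scene complexity.
import torch
import torch.nn as nn


class SparseWindowAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Scalar importance score per window and per token (hypothetical heads).
        self.window_score = nn.Linear(dim, 1)
        self.token_score = nn.Linear(dim, 1)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
        # x: (num_windows, tokens_per_window, dim), tokens already windowed.
        w_score = torch.sigmoid(self.window_score(x.mean(dim=1)))  # (W, 1)
        t_score = torch.sigmoid(self.token_score(x))               # (W, T, 1)

        # Co-sparsification: keep a token only if both its window and the
        # token itself score above tau; busier scenes keep more tokens.
        keep = (w_score.unsqueeze(1) > tau) & (t_score > tau)      # (W, T, 1)
        key_padding_mask = ~keep.squeeze(-1)                       # True = ignore

        # Skip windows where every token was pruned (avoids attending
        # over an empty key set), letting them pass through unchanged.
        valid = ~key_padding_mask.all(dim=1)
        out = x.clone()
        if valid.any():
            xv = x[valid]
            attn_out, _ = self.attn(
                xv, xv, xv, key_padding_mask=key_padding_mask[valid]
            )
            # Pruned query positions bypass attention via the residual only.
            attn_out = attn_out.masked_fill(
                key_padding_mask[valid].unsqueeze(-1), 0.0
            )
            out[valid] = out[valid] + attn_out
        return out


if __name__ == "__main__":
    layer = SparseWindowAttention(dim=32)
    events = torch.randn(8, 16, 32)   # 8 windows of 16 tokens each
    print(layer(events).shape)        # torch.Size([8, 16, 32])
```

In this sketch the threshold `tau` is fixed, whereas the scene-adaptive behavior described above would let the number of windows and tokens that survive pruning vary with event density, so computation scales with scene complexity rather than with input resolution alone.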