The paper introduces the Scene Adaptive Sparse Transformer (SAST) for event-based object detection, addressing the high computational cost of Transformer-based approaches while maintaining low power consumption. SAST enables window-token co-sparsification, enhancing fault tolerance and reducing computational overhead. It features novel scoring and selection modules, along with Masked Sparse Window Self-Attention (MS-WSA), which dynamically adjusts sparsity levels to scene complexity. Experimental results on the 1Mpx and Gen1 datasets demonstrate SAST's superior performance and efficiency compared to other dense and sparse networks. The method's adaptability is further validated through ablation studies and visualizations, showing its ability to focus on important objects and tune sparsity levels accordingly.
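The core idea behind MS-WSA, attending only among tokens that a learned scoring module deems informative, can be illustrated with a toy sketch. This is not the paper's implementation: the function name, the top-k selection rule, and the single-head attention are illustrative assumptions; the actual method scores and sparsifies both windows and tokens with learned modules.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_sparse_window_attention(tokens, scores, keep_ratio=0.5):
    """Toy sketch of sparse self-attention within one window.

    Only tokens whose score ranks in the top `keep_ratio` fraction
    participate in attention; the rest are passed through unchanged
    (standing in for the masking of unselected tokens).
    """
    n, d = tokens.shape
    k = max(1, int(n * keep_ratio))          # number of tokens to keep
    keep = np.argsort(scores)[-k:]           # indices of selected tokens
    sel = tokens[keep]                       # (k, d) selected subset
    attn = softmax(sel @ sel.T / np.sqrt(d)) # attention only over selection
    out = tokens.copy()
    out[keep] = attn @ sel                   # update selected tokens only
    return out, keep
```

Because attention cost scales quadratically with the number of participating tokens, keeping only a score-selected subset per window is what lets the overall computation shrink in simple scenes and grow in complex ones.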