FoleyCrafter is a novel framework designed to generate high-quality, synchronized sound effects for silent videos. It addresses the limitations of existing methods by integrating a pre-trained text-to-audio model with two key components: a semantic adapter and a temporal controller. The semantic adapter uses parallel cross-attention layers to condition audio generation on video features, so that the generated sound effects are semantically relevant to the visual content. The temporal controller, which comprises an onset detector and a timestamp-based adapter, improves audio-video synchronization. FoleyCrafter is trained using video-audio pairs and text prompts, allowing for controllable and diverse audio generation. Extensive experiments on standard benchmarks demonstrate its effectiveness in both semantic alignment and temporal synchronization, where it outperforms state-of-the-art methods. The framework is available at <https://github.com/open-mmlab/FoleyCrafter>.
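To make the idea of parallel cross-attention concrete, the following is a minimal sketch and not FoleyCrafter's actual implementation. It assumes that the audio latents attend to the text tokens and to the video tokens in two separate branches, and that the two branch outputs are summed with a scalar weight. The function names, the `video_weight` parameter, and the use of identity projections in place of learned query, key, and value projections are all illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Scaled dot-product attention of `queries` over `context`.

    Assumption: identity projections, so keys and values are both `context`.
    A real adapter would apply learned projection matrices first.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, context)) for d in range(dim)]
        )
    return outputs


def parallel_cross_attention(latents, text_tokens, video_tokens, video_weight=1.0):
    """Sum a text branch and a video branch computed from the same queries.

    `video_weight` is a hypothetical knob controlling how strongly the
    video features influence the result; 0.0 disables the video branch.
    """
    text_out = cross_attention(latents, text_tokens)
    video_out = cross_attention(latents, video_tokens)
    return [
        [t + video_weight * v for t, v in zip(t_row, v_row)]
        for t_row, v_row in zip(text_out, video_out)
    ]
```

In this sketch, setting `video_weight=0.0` recovers the text-only branch, which illustrates why a parallel design can add video conditioning without altering the behavior of the pre-trained text-to-audio path.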