This paper addresses the challenge of learning scene segmentation that can generalize well to foggy scenes, particularly in safety-critical applications like autonomous driving. Existing methods typically require both annotated clear and foggy images to train a curriculum domain adaptation model, but they can only generalize to the specific foggy domains seen during training. To overcome this limitation, the authors propose a novel bi-directional wavelet guidance (BWG) mechanism that does not require any foggy images in the training stage and can generalize to any unseen foggy scenes. The BWG mechanism enhances content representation, decorrelates urban scene styles, and decorrelates fog styles by using the Haar wavelet transformation to separate low-frequency and high-frequency components. The low-frequency components are focused on content enhancement, while the high-frequency components are shifted to decorrelate urban and fog styles. This approach is integrated into existing mask-level Transformer segmentation pipelines. Extensive experiments on four foggy-scene segmentation datasets show that the proposed method significantly outperforms existing domain generalized and curriculum domain adaptation methods. The source code is available at https://github.com/BiQiWHU/BWG.
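The low-/high-frequency separation described above rests on the standard Haar wavelet transform. As a minimal, framework-free sketch (not the authors' implementation; the averaging normalization and the function name `haar_dwt2` are illustrative choices), a one-level 2D Haar decomposition of a feature map into a low-frequency approximation band (LL, coarse content) and three high-frequency detail bands (LH, HL, HH, edges and texture/style cues) can be written as:

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar wavelet transform of a 2D array x
    (height and width must be even).

    Returns (ll, lh, hl, hh): the low-frequency approximation
    and the three high-frequency detail sub-bands, each at
    half the spatial resolution of x.
    """
    # Pairwise averages / differences along rows
    lo_r = (x[0::2, :] + x[1::2, :]) / 2.0
    hi_r = (x[0::2, :] - x[1::2, :]) / 2.0
    # Then along columns
    ll = (lo_r[:, 0::2] + lo_r[:, 1::2]) / 2.0  # low frequency: coarse content
    lh = (lo_r[:, 0::2] - lo_r[:, 1::2]) / 2.0  # high frequency: horizontal detail
    hl = (hi_r[:, 0::2] + hi_r[:, 1::2]) / 2.0  # high frequency: vertical detail
    hh = (hi_r[:, 0::2] - hi_r[:, 1::2]) / 2.0  # high frequency: diagonal detail
    return ll, lh, hl, hh

x = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(x)
print(ll.shape)  # (2, 2)
```

In the paper's framing, content enhancement would operate on `ll`, while style decorrelation (urban scene and fog styles) would manipulate the high-frequency bands before the inverse transform reassembles the feature map.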