StyleShot is a novel style transfer method that produces high-quality stylized images without test-time style-tuning. It is built on Stable Diffusion and adds two components: a style-aware encoder and a content-fusion encoder. The style-aware encoder extracts rich, expressive style embeddings from reference images; it is trained with a decoupling strategy to extract multi-level patch embeddings. The content-fusion encoder improves the integration of content and style. StyleShot is trained on StyleGallery, a style-balanced dataset covering a diverse range of image styles, and is evaluated on StyleBench, a benchmark of 73 distinct styles across 490 reference images. Experimental results show that StyleShot outperforms existing state-of-the-art methods in both text-driven and image-driven style transfer. It captures a wide range of styles, including 3D, flat, abstract, and fine-grained styles, and transfers them effectively to content images.
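To make the idea of multi-level patch embeddings more concrete, the following is a minimal sketch that splits a reference image into patches at several sizes and computes one feature per patch. It is not StyleShot's implementation: the function names, the patch sizes (2, 4, 8), and the mean-intensity "feature" are all assumptions chosen for illustration, and a real style-aware encoder would replace the mean with learned embeddings.

```python
from statistics import fmean


def split_into_patches(image, patch_size):
    """Split a 2D grid (list of rows) into non-overlapping square patches.

    Patches are returned in row-major order; edge rows and columns that
    do not fill a whole patch are dropped.
    """
    rows = len(image) // patch_size
    cols = len(image[0]) // patch_size
    patches = []
    for r in range(rows):
        for c in range(cols):
            patch = [
                row[c * patch_size:(c + 1) * patch_size]
                for row in image[r * patch_size:(r + 1) * patch_size]
            ]
            patches.append(patch)
    return patches


def multi_level_patch_features(image, patch_sizes=(2, 4, 8)):
    """Toy stand-in for an encoder: one mean-intensity value per patch.

    The patch sizes are hypothetical; a learned encoder would map each
    patch to an embedding vector instead of a single number.
    """
    return {
        size: [
            fmean(value for row in patch for value in row)
            for patch in split_into_patches(image, size)
        ]
        for size in patch_sizes
    }


# 8x8 grayscale gradient used as a stand-in reference image.
reference = [[x + y for x in range(8)] for y in range(8)]
features = multi_level_patch_features(reference)
```

Small patches yield many local features (fine texture and stroke detail), while large patches yield a few global ones (overall tone and layout), which is the intuition behind combining several levels.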