6 Nov 2017 | Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
This paper revisits the fast stylization method introduced by Ulyanov et al. (2016) and demonstrates that a small change in the generator architecture, replacing batch normalization with instance normalization, significantly improves the quality of the generated images. The instance normalization layer is applied at both training and test time, which removes instance-specific contrast information from the content image and thereby simplifies the generation process. The modified generator produces images comparable in quality to those of the slower optimization-based method of Gatys et al. (2016) while running in real time on standard GPU hardware. The authors also provide experimental results showing that both generator architectures, that of Ulyanov et al. (2016) and that of Johnson et al. (2016), benefit from this change, with the residual architecture of Johnson et al. (2016) being slightly more efficient.
The paper concludes by suggesting further experimentation with similar ideas for image discrimination tasks.
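The difference between the two normalization schemes can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the `(N, C, H, W)` tensor layout, the `eps` value, and the omission of learnable scale and shift parameters are assumptions made here for brevity. Batch normalization pools statistics over the whole batch, whereas instance normalization computes them separately for each image and channel, so two copies of the same image at different contrast levels map to nearly the same output.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each channel with statistics pooled over batch and spatial dims."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    """Normalize each channel of each image with its own spatial statistics."""
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# Hypothetical toy input: the same random "image" at low and high contrast.
rng = np.random.default_rng(0)
base = rng.standard_normal((1, 3, 8, 8))
x = np.concatenate([0.5 * base, 2.0 * base], axis=0)  # shape (2, 3, 8, 8)

y_in = instance_norm(x)
y_bn = batch_norm(x)

# Instance normalization removes the per-image contrast difference;
# batch normalization, using shared statistics, does not.
same_under_in = np.allclose(y_in[0], y_in[1], atol=1e-3)
same_under_bn = np.allclose(y_bn[0], y_bn[1], atol=1e-3)
print(same_under_in, same_under_bn)
```

In this toy setup the two contrast variants coincide after instance normalization but remain distinct after batch normalization, which is the contrast-removal property the summary describes.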