9 Feb 2017 | Vincent Dumoulin & Jonathon Shlens & Manjunath Kudlur
This paper presents a learned representation for artistic style, demonstrating that a single, scalable deep network can parsimoniously capture the styles of a diversity of paintings. The network uses conditional instance normalization to learn many styles simultaneously: each style image is reduced to a point in an embedding space, namely a per-style pair of scaling and shifting parameters (γ and β) applied after normalizing a layer's activations. The approach is comparable to single-purpose style transfer networks, both qualitatively and in its convergence properties, while remaining far more compact.

Because styles live in a shared embedding space, they can be combined in novel ways not previously observed: convex combinations of the learned γ and β parameters blend even very distinct painting styles into new pastiches. The trained network also generalizes across painting styles, so incorporating a new style is efficient, since only its γ and β parameters need to be learned. The results suggest a learned representation of artistic style flexible enough to capture a diversity of the painted world, and the paper discusses the implications for future research into the computation performed by style transfer networks and the representation of images more generally.
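The conditional instance normalization at the heart of the method can be sketched minimally as follows. This is an illustrative NumPy version, not the authors' implementation; the function names, shapes, and the `eps` constant are assumptions for the sketch.

```python
import numpy as np

def conditional_instance_norm(x, gamma, beta, eps=1e-5):
    """Instance-normalize one image's activations, then scale and shift
    with the parameters of the chosen style (a minimal sketch).

    x:     (C, H, W) feature map for a single image
    gamma: (C,) learned per-channel scale for this style
    beta:  (C,) learned per-channel shift for this style
    """
    # Normalize each channel over its spatial dimensions.
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)
    # The style enters only through gamma and beta.
    return gamma[:, None, None] * x_norm + beta[:, None, None]

def blend_styles(x, gamma1, beta1, gamma2, beta2, alpha):
    """Blend two styles via a convex combination of their
    learned gamma and beta parameters, as described above."""
    gamma = alpha * gamma1 + (1.0 - alpha) * gamma2
    beta = alpha * beta1 + (1.0 - alpha) * beta2
    return conditional_instance_norm(x, gamma, beta)
```

The key design point this sketch illustrates is parsimony: all styles share every convolutional weight, and a style occupies only the 2C numbers in its (γ, β) pair, which is why adding a new style is cheap and why interpolating those pairs interpolates between styles.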