This paper introduces a new class of loss functions, called Deep Perceptual Similarity Metrics (DeePSiM), for image generation tasks. Traditional losses based on pixel-space distance often lead to over-smoothed results, since they average over the many equally plausible high-frequency details an image could contain. DeePSiM instead computes distances between image features extracted by deep neural networks, which better reflect perceptual similarity, and combines this feature-space similarity with adversarial training to produce sharper, more natural images.
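To make the core idea concrete, here is a minimal sketch of a feature-space distance of the kind DeePSiM uses, contrasted with a plain pixel-space loss. The choice of AlexNet as the comparator, the specific layer, and the mean-squared distance are assumptions for illustration; the paper experiments with several comparators and layers.

```python
import torch
import torchvision.models as models

# Fixed comparator network: a pretrained feature extractor whose activations
# define the perceptual distance. AlexNet is an illustrative choice here.
alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
comparator = alexnet.features.eval()          # convolutional layers only
for p in comparator.parameters():
    p.requires_grad_(False)                   # the comparator is never trained

def feature_loss(generated, target):
    """Squared L2 distance between deep features instead of raw pixels.
    Inputs are assumed to be ImageNet-normalized NCHW tensors."""
    return ((comparator(generated) - comparator(target)) ** 2).mean()

def pixel_loss(generated, target):
    """Plain pixel-space loss: minimizing this alone tends to produce the
    over-smoothed results the paper argues against."""
    return ((generated - target) ** 2).mean()
```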
The paper demonstrates three applications: training autoencoders, modifying variational autoencoders, and inverting deep convolutional networks. In all cases, the generated images are sharp and resemble natural images. The approach trains the models with a combination of feature loss, adversarial loss, and pixel-space loss. The feature loss is computed using a comparator network, while the adversarial loss is derived from a discriminator network that distinguishes generated images from real ones. The pixel-space loss helps stabilize training.
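The following sketch shows how the three terms might be combined into a single generator objective. Here G, D, and C stand for the generator, discriminator, and fixed comparator (e.g. the feature extractor above), and the lambda weights are hypothetical placeholders, not the paper's tuned values.

```python
import torch
import torch.nn.functional as F

# Assumed relative weights for the three loss terms (illustrative only).
lambda_feat, lambda_adv, lambda_img = 1.0, 0.01, 0.1

def generator_loss(G, D, C, inputs, targets):
    generated = G(inputs)
    # Feature loss: match deep features of the fixed comparator network.
    l_feat = ((C(generated) - C(targets)) ** 2).mean()
    # Adversarial loss: push the discriminator to label generated images real.
    logits = D(generated)
    l_adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # Pixel-space loss: a small weight on image distance stabilizes training.
    l_img = ((generated - targets) ** 2).mean()
    return lambda_feat * l_feat + lambda_adv * l_adv + lambda_img * l_img
```

The discriminator would be trained in alternation with the generator, as in standard adversarial setups; only the generator-side objective is sketched here.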
The paper also compares the proposed method with existing approaches, showing that DeePSiM outperforms traditional loss functions in terms of image quality and perceptual similarity. The method is applied to various image generation tasks, including image compression, generative modeling, and feature inversion. The results show that the method preserves fine details and produces realistic images, even when images are reconstructed from low-dimensional representations.
The paper also discusses related work, including other image generation methods and perceptual similarity metrics, and highlights the advantage of deep learned feature representations over traditional hand-designed metrics. The authors conclude that DeePSiM is a promising, broadly applicable approach, offering better results than traditional losses across tasks such as image compression, generative modeling, and feature inversion.