Artificial Intelligence keeps developing and outperforms humans in more and more areas. But art is beyond its reach, isn't it? Creating art supposedly requires a sensibility that only humans have, and not even all of them! Well, don't be so sure anymore…
Many techniques are now being developed, and they are becoming more and more impressive at automating the creative process. Use cases range from the automatic generation of music to the creation of unique artwork from a person's heartbeats (Project: Payed by heartbeats). Among these techniques, one lets you apply the unique style of your favorite painter to one of your photos: this is Neural Style Transfer.
AI can have different roles in the creation of art. It can be a technical helper, or it can act as a muse that stimulates the artist's creativity. But for Style Transfer, the computer acts as a copyist of famous artists, able to create new works in recognizable styles on its own.
The principle of this algorithm is to apply the style of one image to another image. Very often, the style of a painting is applied to a photograph, for example the characteristic style of a Mondrian composition to your souvenir photo of the Eiffel Tower. Beyond the artistic and creative side, there are also numerous scientific applications. Let's go through the main lines of how it works!
Basically, Style Transfer produces an image that preserves the content of one image while applying the style of another. As you can see, the first challenge is to answer the question: how do we extract the style or the content of an image?
As the name suggests, Neural Style Transfer relies on Deep Learning, which is based on a stack of layers of artificial neurons (loosely inspired by the human brain). These layers are trained on many input images to recognize the characteristics of an image. The deeper the layer, the more complex and specific the features it responds to. A low-level layer only represents simple features such as edges, while details such as the shape of the eyes are spotted by a deeper layer of neurons.
Generally, for images, these neural networks are used for classification, or in other words image recognition. The networks in question are Convolutional Neural Networks. In short, each layer is computed from the previous one by particular mathematical operations, the main one being convolution.
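To make the idea of convolution a little more concrete, here is a minimal sketch in Python with NumPy. The image, the filter values and the function name conv2d are all made up for illustration: a real network learns its filters during training and uses an optimized library rather than explicit loops. As in most Deep Learning libraries, the filter is applied without being flipped.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 5x5 image: dark on the left (0), bright on the right (1).
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# Hand-written vertical-edge filter (illustrative; a trained network learns its own).
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

response = conv2d(image, kernel)
print(response)
```

The response is zero where the window only sees the flat dark area and positive wherever it straddles the dark-to-bright boundary, which is the kind of edge feature a low-level layer picks up.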
But let's skip this technical aspect and get back to what interests us: Style Transfer! What we want is to be able to recognize the style or the content of an image. To do this, we just need to use a neural network that is already trained, i.e. that is already capable of recognizing different characteristics of an image at different depths, and to extract the characteristics recognized by some layers of this network. In particular we can extract the style of an image and its content.
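The mechanism of "reading" what some layers recognize can be sketched as follows. This is only a toy under strong assumptions: the three small layers with random weights stand in for a real pre-trained network (in practice a pre-trained convolutional network such as VGG is commonly used), and extract_features is a hypothetical helper, not a library function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for frozen, pre-trained weights (random here, purely for illustration).
weights = [
    rng.normal(size=(8, 16)),  # layer 0: shallow
    rng.normal(size=(8, 8)),   # layer 1
    rng.normal(size=(4, 8)),   # layer 2: deeper
]

def extract_features(x, weights, wanted):
    """Run a forward pass and keep the activations of the layers in `wanted`."""
    feats = {}
    for idx, w in enumerate(weights):
        x = np.maximum(w @ x, 0.0)  # linear layer followed by ReLU
        if idx in wanted:
            feats[idx] = x
    return feats

image = rng.normal(size=16)  # toy flattened "image"
feats = extract_features(image, weights, wanted={0, 2})
for idx, activation in feats.items():
    print(f"layer {idx}: {activation.shape[0]} activations")
```

The weights are never modified: the network is only used as a fixed measuring instrument, and the activations of the chosen layers serve as the description of the image.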
Once we know how to access the style and content of an image, we can ask ourselves the following question: how can we generate an image that contains both the content of one and the style of the other?
Everything is based on the minimization of two distances: one that describes the difference between the content of two images and another that describes the difference in style. Obviously, we cannot always minimize both distances at the same time, so we minimize the sum of the distances weighted by their relative importance. We classically use the mean squared error for the content, and a less common error based on Gram matrices for the style. Finally, we use classical Machine Learning methods (backpropagation and gradient descent) on a base image to minimize the distances to the initial content and style images respectively.
That's it! You can now associate any photo with a work of art and observe the result in a few seconds! (Give it a try: https://deepai.org/machine-learning-model/fast-style-transfer )
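The two distances and their weighted sum can be sketched in a few lines of Python with NumPy. Several things here are assumptions made for illustration: the feature maps are random arrays of shape (channels, positions) rather than outputs of a real network, the weights ALPHA and BETA and the step size are arbitrary, and the gradient descent updates the features directly, whereas the real algorithm backpropagates through the network down to the pixels of the base image.

```python
import numpy as np

rng = np.random.default_rng(0)
content_feats = rng.normal(size=(3, 16))  # toy features of the content image
style_feats = rng.normal(size=(3, 16))    # toy features of the style image
ALPHA, BETA = 1.0, 1.0                    # illustrative relative importance

def gram_matrix(feats):
    """Correlations between channels, averaged over positions."""
    return feats @ feats.T / feats.shape[1]

def content_loss(feats, target):
    """Mean squared error between two feature maps."""
    return np.mean((feats - target) ** 2)

def style_loss(feats, style):
    """Mean squared error between the two Gram matrices."""
    return np.mean((gram_matrix(feats) - gram_matrix(style)) ** 2)

def total_loss(feats):
    return (ALPHA * content_loss(feats, content_feats)
            + BETA * style_loss(feats, style_feats))

def total_grad(feats):
    """Analytic gradient of total_loss with respect to feats."""
    c, n = feats.shape
    grad_content = 2.0 * (feats - content_feats) / (c * n)
    gram_diff = gram_matrix(feats) - gram_matrix(style_feats)
    grad_style = 4.0 * gram_diff @ feats / (c * c * n)
    return ALPHA * grad_content + BETA * grad_style

# Start from the content and take plain gradient descent steps.
generated = content_feats.copy()
initial_loss = total_loss(generated)
for _ in range(200):
    generated -= 0.5 * total_grad(generated)
final_loss = total_loss(generated)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

The Gram matrix discards where things are in the image and only keeps which features tend to appear together, which is why it works as a description of style rather than content. Changing the ratio between ALPHA and BETA shifts the result towards the content or towards the style.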
Benjamin Bouf & Théo Duez [Automatants P2023]