Change and Difference of models - CycleCoopNet: Image Translation with Neural Networks Based on Cooperative Learning

descriptor using a modified contrastive divergence algorithm. The descriptor takes the output of the generator and revises it toward the answer that the current descriptor has learned. The generator then learns from the revised results through the MCMC (Markov chain Monte Carlo) teaching algorithm. This is the main difference between our work and previous work.

The next problem to solve is how to define a "correct result". Although the generated results may be diverse, we cannot accept all of them: irrational pictures, for example, must be rejected, so the diverse results need to be checked against what we expect. We therefore add a recovered generator to our model, which transforms each generated image back toward the original picture. That is, the image produced by each of our generators must be restorable, as closely as possible, to the original picture by the recovered generator.

These generated pictures are the "correct results" that we expect in our training.

Therefore, our main contribution is a model that uses a supervised learning strategy for image-to-image translation. The descriptor learns from the finite observed data and helps us revise the generator's results; that is, the descriptor can provide virtually infinite labeled data for training the generator. Our experiments show that this supervised learning strategy is more stable than the unsupervised one.

This model can also generate diverse results because of this supply of labeled data. It can produce many kinds of designs from a single draft, which is closer to real-life experience. Our work thus provides a new way of generating diverse pictures, with more stable results than previous work.

1.2 Change and Difference of models

Figure 1: This figure shows how we get the idea of CycleCoopNet, and the change of models.

Generative Adversarial Networks (GAN) GAN [4] is a model that uses two neural networks for learning. One is a generator, which uses random latent factors to generate a picture. The other is a discriminator, which tells us whether a picture is real or fake. A fake picture is one generated by the generator, not a picture from the real world. We compare the generated pictures with real pictures. The goal of the generator is to fool the discriminator, that is, to generate pictures that look real. By contrast, the goal of the discriminator is to tell whether a picture is real or fake, so it tries to avoid being fooled by the generator. Both models improve during training: once the generator succeeds in fooling the discriminator, the discriminator improves its ability to avoid being fooled; conversely, once the generator can no longer fool the discriminator, the generator improves its ability to fool it. After many epochs of training, in which the generator repeatedly fools the discriminator and the discriminator repeatedly identifies generated pictures as fake, we obtain a good generator for producing pictures, with the bonus of a good discriminator that can tell us whether pictures are real or fake.
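The two opposing goals described above can be sketched as a pair of loss functions. This is a minimal illustration, assuming the standard binary cross-entropy formulation and the non-saturating generator loss; the function names and the use of raw discriminator scores (logits) are assumptions for illustration, not details taken from this work.

```python
import numpy as np

def sigmoid(z):
    """Map a raw discriminator score to a probability of 'real'."""
    return 1.0 / (1.0 + np.exp(-z))

def discriminator_loss(d_real_logits, d_fake_logits):
    """Cross-entropy loss: real pictures are labeled 1, generated pictures 0."""
    eps = 1e-12  # guards against log(0)
    real_term = -np.log(sigmoid(d_real_logits) + eps)
    fake_term = -np.log(1.0 - sigmoid(d_fake_logits) + eps)
    return float(np.mean(real_term) + np.mean(fake_term))

def generator_loss(d_fake_logits):
    """Non-saturating loss: low when the discriminator scores fakes as real."""
    eps = 1e-12
    return float(np.mean(-np.log(sigmoid(d_fake_logits) + eps)))
```

When the discriminator confidently rejects the generated pictures, `discriminator_loss` is small and `generator_loss` is large, which is the situation in which the generator has to improve; the reverse holds when the generator fools the discriminator.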

Cooperative Learning Networks (CoopNet) CoopNet [1] also uses two neural networks to train a generator that produces pictures. One is the generator, which is the same as the generator of a GAN. The other is a descriptor, which helps us compare the latent factors of pictures. From the work on GAN [4], we know that a picture can be generated from latent factors, and that different latent factors map to different pictures. The descriptor transforms the generated pictures and the real pictures into two sets of latent factors, and we compare them to calculate the loss and update the descriptor. This is how the descriptor learns during training.

The generator learns from what the descriptor has learned: we use Langevin revision dynamics [5] to revise the pictures produced by the generator, and then treat the revised pictures as the true answers for learning. We compare the revised pictures with the generated pictures to update the generator. The likelihoods of both models involve intractable integrals, and the gradients of both log-likelihoods involve intractable expectations that can be approximated by Markov chain Monte Carlo (MCMC) [1]. The learning of the generator is based on how the MCMC changes the pictures generated by the generator. As a metaphor, the descriptor (teacher) distills its knowledge into the generator (student) via MCMC; we call this MCMC teaching.
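The revise-then-teach idea can be sketched on a toy problem. The example below assumes a hypothetical descriptor whose log-density gradient is known in closed form (a Gaussian bump centered at `mu`), and uses a squared-error regression target for the generator; `langevin_revise`, `teaching_loss`, `toy_grad_log_p`, and all parameter values are illustrative names and choices, not the implementation used in this work.

```python
import numpy as np

def langevin_revise(x, grad_log_p, step_size=0.1, n_steps=30, noise=True, rng=None):
    """Revise x with Langevin dynamics: a gradient step on the descriptor's
    log-density plus optional Gaussian noise at every iteration."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.array(x, dtype=float)
    for _ in range(n_steps):
        x = x + 0.5 * step_size**2 * grad_log_p(x)
        if noise:
            x = x + step_size * rng.standard_normal(x.shape)
    return x

def teaching_loss(generated, revised):
    """Squared error between the generator output and its revised version;
    the revised picture plays the role of the 'true answer'."""
    return float(np.mean((np.asarray(revised) - np.asarray(generated)) ** 2))

# Hypothetical descriptor: log p(x) = -0.5 * ||x - mu||^2 + const.
mu = np.array([2.0, -1.0])

def toy_grad_log_p(x):
    return -(x - mu)

initial = np.zeros(2)  # stands in for a generator output
# Noise is switched off here so that the result is deterministic.
revised = langevin_revise(initial, toy_grad_log_p, step_size=0.5, n_steps=200, noise=False)
```

In this toy setting the revision pulls the generator output toward the region the descriptor considers likely, and `teaching_loss(initial, revised)` measures how far the generator still is from that revised target.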

Through MCMC sampling, our descriptor model has the benefits of FRAME (Filters, Random field, And Maximum Entropy) models [6]. A FRAME model is a random field model that defines a probability distribution on the image space. Images can be generated from this distribution once the model has learned it from the observed data. The distribution is a maximum entropy distribution that reproduces the statistical properties of the filter responses in the observed images. Because of the maximum entropy property, it is the most random distribution consistent with the observed statistics of the filter responses. Images sampled from this distribution can therefore be considered typical images that share the statistical properties of the observed images.
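This section does not write out the descriptor's density. In the exponential-tilting form commonly used for FRAME-style descriptors (an assumption here; the symbols $f$, $\theta$, $q$, and $Z$ are notation introduced for illustration), it can be sketched as:

```latex
p(x;\theta) = \frac{1}{Z(\theta)} \exp\big[f(x;\theta)\big]\, q(x),
\qquad
Z(\theta) = \int \exp\big[f(x;\theta)\big]\, q(x)\, dx,

\frac{\partial}{\partial\theta} \log p(x;\theta)
  = \frac{\partial}{\partial\theta} f(x;\theta)
  - \mathrm{E}_{p(\cdot;\theta)}\!\left[\frac{\partial}{\partial\theta} f(x;\theta)\right].
```

Here $f(x;\theta)$ would be a score built from filter responses, $q(x)$ a reference distribution such as Gaussian white noise, and $Z(\theta)$ the normalizing constant. Under this form, $Z(\theta)$ is an intractable integral and the expectation in the gradient is an intractable expectation, which is why MCMC sampling is needed to approximate it.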

We use this descriptor to help the generator learn. The descriptor learns from a finite amount of observed data, and the generator learns from virtually infinite amounts of revised data. The generator accumulates the MCMC transitions of the descriptor via MCMC teaching and reproduces them by direct ancestral sampling. In other words, the descriptor distills its MCMC algorithm into the generator. This also turns the unsupervised learning of the generator [7] into supervised learning.

Cycle-Consistent Adversarial Networks (CycleGAN) CycleGAN [2] uses two GANs and lets the output of one GAN be the input of the other. One GAN learns to transform a picture from domain A to domain B, and the other GAN recovers the picture from domain B back to domain A. We call this pair of GANs a CycleGAN. If domain A is one picture style and domain B is another, we can perform image-to-image translation.

CycleCoopNet We call our model CycleCoopNet. Knowing the power of CoopNet, we use it to build image-to-image translation models. We use two CoopNets and let the output of one CoopNet be the input of the other. This idea comes from CycleGAN: we know that one generator can transform pictures from one style to another, so we use a second generator to transform the style back and try to make the result of the full cycle consistent with the original picture.
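The cycle requirement can be illustrated with a small sketch. It assumes a mean absolute (L1) distance, as is common in CycleGAN-style training, and uses simple invertible functions as hypothetical stand-ins for the two generators; none of the names below come from this work.

```python
import numpy as np

def cycle_consistency_loss(x, forward, backward):
    """Mean absolute error between x and its reconstruction backward(forward(x))."""
    recovered = backward(forward(x))
    return float(np.mean(np.abs(recovered - x)))

# Hypothetical stand-ins for the two generators (style A -> B and B -> A).
def a_to_b(x):
    return 2.0 * x + 1.0

def b_to_a_good(y):
    return (y - 1.0) / 2.0  # exact inverse of a_to_b

def b_to_a_bad(y):
    return y  # ignores the style change, so the cycle does not close

x = np.linspace(-1.0, 1.0, 8).reshape(2, 4)  # a tiny stand-in "picture"
good = cycle_consistency_loss(x, a_to_b, b_to_a_good)
bad = cycle_consistency_loss(x, a_to_b, b_to_a_bad)
```

A recovering generator that truly inverts the translation drives this loss toward zero, while one that fails to restore the original picture is penalized, which is the check on "correct results" that the recovered generator provides.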
