Generative Adversarial Network (GAN)

A Generative Adversarial Network (GAN) is a type of deep learning model that consists of two main parts: a generator and a discriminator. The generator is a neural network that takes random noise as input and generates new data, such as images, that are intended to be similar to the input dataset. The discriminator is another neural network that receives both the generated data and real data from the input dataset, and attempts to distinguish between the two. The goal of the generator is to produce data that the discriminator can’t tell apart from the real data, while the goal of the discriminator is to correctly identify which data is real and which is generated.
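To make this concrete, here is a minimal sketch of the two networks in PyTorch (PyTorch is my choice here, not something the text prescribes; the 100-dimensional noise vector, 28×28 grayscale image size, and layer widths are all arbitrary illustrative assumptions):

```python
import torch
import torch.nn as nn

LATENT_DIM = 100   # size of the random noise vector (an arbitrary choice)
IMG_DIM = 28 * 28  # flattened 28x28 grayscale image

# Generator: maps random noise to a fake image
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256),
    nn.ReLU(),
    nn.Linear(256, IMG_DIM),
    nn.Tanh(),  # outputs in [-1, 1], matching real images normalized to that range
)

# Discriminator: maps an image to the probability that it is real
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
    nn.Sigmoid(),  # probability that the input image is real
)

# One forward pass: noise -> fake image -> realism score
z = torch.randn(16, LATENT_DIM)      # batch of 16 noise vectors
fake_images = generator(z)           # shape: (16, 784)
scores = discriminator(fake_images)  # shape: (16, 1), values in (0, 1)
```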

The training process of a GAN consists of iteratively updating the parameters of the generator and discriminator to improve their performance. In each iteration, the generator produces a batch of new data, the discriminator is trained to evaluate how realistic that data is, and the generator is then updated to produce more realistic data. This process is repeated until the generator produces data that is indistinguishable from the real data.
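Continuing the sketch above, one training iteration might look like the following (the learning rate and batch size are arbitrary assumptions, and `real_images` is a stand-in for a batch drawn from the actual dataset):

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)

real_images = torch.randn(16, IMG_DIM)  # placeholder for a real data batch
real_labels = torch.ones(16, 1)
fake_labels = torch.zeros(16, 1)

# 1) Update the discriminator: real images should score 1, fakes should score 0
z = torch.randn(16, LATENT_DIM)
fake_images = generator(z)
loss_d = criterion(discriminator(real_images), real_labels) + \
         criterion(discriminator(fake_images.detach()), fake_labels)
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

# 2) Update the generator: its fakes should now fool the discriminator (score 1)
loss_g = criterion(discriminator(fake_images), real_labels)
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```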

At a high level, the generator and discriminator are playing a two-player minimax game, where the generator tries to produce realistic data and the discriminator tries to identify the generated data.
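Formally, this game is usually written as the minimax objective from the original GAN paper, where $G$ is the generator, $D$ is the discriminator, $p_{\text{data}}$ is the real data distribution, and $p_z$ is the noise distribution:

$$\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]$$

The discriminator tries to maximize this value (scoring real data near 1 and generated data near 0), while the generator tries to minimize it.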

The generator of a GAN is typically a deep neural network, such as a Convolutional Neural Network (CNN) or a Deconvolutional Neural Network (DeconvNet), that is used to produce images. The discriminator is also typically a deep neural network, such as a CNN, that is used to classify the images as either real or fake.
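As an illustrative sketch rather than a prescribed architecture, a DCGAN-style pairing in PyTorch might use transposed (“deconvolutional”) convolutions in the generator and strided convolutions in the discriminator; the 32×32 RGB image size and channel counts below are arbitrary choices:

```python
import torch.nn as nn

# Generator: upsamples a (100, 1, 1) noise tensor to a 32x32 RGB image
conv_generator = nn.Sequential(
    nn.ConvTranspose2d(100, 128, kernel_size=4, stride=1, padding=0),  # 1x1 -> 4x4
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # 4x4 -> 8x8
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=4, padding=0),     # 8x8 -> 32x32
    nn.Tanh(),
)

# Discriminator: downsamples the image to a single real/fake probability
conv_discriminator = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),    # 32x32 -> 16x16
    nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),  # 16x16 -> 8x8
    nn.LeakyReLU(0.2),
    nn.Conv2d(128, 1, kernel_size=8, stride=1, padding=0),   # 8x8 -> 1x1
    nn.Sigmoid(),
)
```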

The GAN architecture allows the generator to learn the underlying probability distribution of the input data and generate new samples from that distribution. This makes GANs powerful models for tasks such as image generation, text-to-image synthesis, and video prediction.
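Once training is complete, drawing new samples from the learned distribution is just a matter of feeding fresh noise through the generator; continuing the earlier sketch:

```python
import torch

generator.eval()                     # switch off training-only layer behavior
with torch.no_grad():                # no gradients needed for sampling
    z = torch.randn(64, LATENT_DIM)  # fresh noise vectors, never seen in training
    samples = generator(z)           # 64 new images from the learned distribution
```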

It’s worth noting that GANs are difficult to train, because the generator and discriminator can easily get stuck in a state where the generator produces poor-quality samples and the discriminator is unable to improve any further. Closely related is the “mode collapse” problem, where the generator produces samples that are very similar to one another and do not capture the diversity of the input dataset. There are several techniques that can be used to overcome these problems, such as using different architectures for the generator and discriminator, adjusting the learning rates and optimizers, and using techniques such as batch normalization and dropout to stabilize the training process.
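For instance, batch normalization is typically inserted between the generator’s layers, while dropout can regularize the discriminator so it does not overpower the generator. A hypothetical sketch (layer sizes again arbitrary):

```python
import torch.nn as nn

# Generator with batch normalization between layers, a common stabilization trick
stabilized_generator = nn.Sequential(
    nn.Linear(100, 256),
    nn.BatchNorm1d(256),  # normalizes activations across the batch
    nn.ReLU(),
    nn.Linear(256, 784),
    nn.Tanh(),
)

# Discriminator with dropout, so it does not overpower the generator
stabilized_discriminator = nn.Sequential(
    nn.Linear(784, 256),
    nn.LeakyReLU(0.2),
    nn.Dropout(0.3),  # randomly drops 30% of activations during training
    nn.Linear(256, 1),
    nn.Sigmoid(),
)
```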

Another issue with GANs is that the training process can be quite unstable, and it can be difficult to get the generator and discriminator to converge to a stable equilibrium; when they fail to converge, the generator produces poor-quality samples and the discriminator stops improving. To overcome this, it is common to use techniques such as careful weight initialization, regularization, and gradient clipping to stabilize the training process.
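One common concrete recipe, shown below as a sketch, is DCGAN-style Gaussian weight initialization combined with gradient-norm clipping before each optimizer step (the std of 0.02 and max norm of 1.0 are conventional but arbitrary values):

```python
import torch
import torch.nn as nn

def init_weights(module):
    # DCGAN-style initialization: small-variance Gaussian weights, zero biases
    if isinstance(module, (nn.Linear, nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

generator.apply(init_weights)
discriminator.apply(init_weights)

# Inside the training loop, clip gradients between backward() and step():
# loss_g.backward()
torch.nn.utils.clip_grad_norm_(generator.parameters(), max_norm=1.0)
# opt_g.step()
```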

GANs are also sensitive to the choice of input noise distribution and to the architecture of the generator and discriminator. It’s important to choose the noise distribution carefully and to experiment with different architectures to find the one that works best for a specific dataset or task.
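The two most common choices of noise distribution are a standard Gaussian and a uniform distribution, for example:

```python
import torch

z_gaussian = torch.randn(16, 100)        # standard normal noise, the most common choice
z_uniform = torch.rand(16, 100) * 2 - 1  # uniform noise in [-1, 1], another common option
```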

GANs have seen a lot of progress in recent years, and many variations and extensions of the original GAN architecture have been developed to improve its performance and stability. These include Wasserstein GANs (WGANs), Least Squares GANs (LSGANs), and Spectral Normalization GANs (SNGANs).
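As one example of how these variants change the recipe, a WGAN replaces the discriminator with a “critic” that has no final sigmoid, and swaps the cross-entropy loss for a Wasserstein loss with weight clipping. A minimal sketch (the learning rate and clipping range follow the original WGAN paper’s conventions, but this is an illustration, not a full implementation):

```python
import torch
import torch.nn as nn

# WGAN critic: like a discriminator, but outputs an unbounded "realness" score
critic = nn.Sequential(
    nn.Linear(784, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),  # note: no sigmoid at the end
)
opt_critic = torch.optim.RMSprop(critic.parameters(), lr=5e-5)

real_images = torch.randn(16, 784)  # placeholder for a real data batch
fake_images = torch.randn(16, 784)  # placeholder for generator output

# Critic step: maximize critic(real) - critic(fake), i.e. minimize the negation
loss_critic = -(critic(real_images).mean() - critic(fake_images).mean())
opt_critic.zero_grad()
loss_critic.backward()
opt_critic.step()

# Weight clipping: the original WGAN's way of enforcing the Lipschitz constraint
for p in critic.parameters():
    p.data.clamp_(-0.01, 0.01)
```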

In summary, Generative Adversarial Networks (GANs) are a powerful type of deep learning model that can be used to generate new data, such as images, resembling a given input dataset. GANs consist of two main parts, a generator and a discriminator, which are trained against each other until the generator produces realistic data. GANs are powerful models, but they can be difficult to train and require careful tuning to achieve good results.
