I've enjoyed learning about AI Engineering and the many technologies that make up this field. For three months, it's felt like I'm discovering a new concept every day. I know it's almost impossible to learn everything you come across, so I've learnt to give myself grace. Something I've 'discovered' recently is the Machine Learning (ML) sub-field. I mean, I've heard the term in the past, but for obvious reasons, I never thought too much about it. But I'm taking AI seriously now, and I see how ML, deep learning, neural networks, and many other concepts and techniques work together to make the AI industry what it is today.

This may be a little vain (yeah, just a little), but I've found ML—for want of a better word—sexy. The idea, the practical application, the effect, and the real-world use case have me honestly smitten. I want to know more about this—you know, take a peek under the hood.

So I'm researching ML concepts, and I come across the Lex Fridman podcast on YouTube, where he interviewed Ian Goodfellow, who is essentially the father of Generative Adversarial Networks (GANs). That hour-long video gave me real joy and plenty of insight into a concept I had never come across before. One really amazing thing about that interview was Goodfellow talking about how the idea of GANs hit him during a drunken conversation with friends at a bar in 2014. I’m just going to say I know where I'd be tonight, haha. Ian Goodfellow and his team at the University of Montreal also wrote the original paper introducing GANs to the world.

https://youtu.be/Z6rxFNMGdn0?si=xc5jXftNbz5T2zH5

Naturally, I went down a GANs rabbit hole and boy, did I learn a lot. I even tried my hand at a project so I could practice some of the things I was learning about. In this piece, I'll be telling you about GANs at their core level and some of the important techniques and concepts. I'll also do a walkthrough/explainer on the project I took on and the thinking that guided that.

What is a GAN?

A generative adversarial network is a type of machine learning model that is trained on a set of data, such as images, audio, or text, to generate new samples that look real. Before GANs, most machine learning models were discriminative, meaning they were mostly used for classification or regression tasks. GANs "changed the game", ushering in the era of creativity in machine learning.

A prominent figure in the field of artificial intelligence and machine learning, and Chief AI Scientist at Meta, Yann LeCun described GANs as “the most interesting idea in the last 10 years in Machine Learning”, and I agree. At the heart of GANs is a deceptively simple idea: generate samples that look realistic by pitting two neural networks against each other in a game of deception versus detection.

When a GAN creates a new image, e.g., a cat, it uses a neural network to produce a cat that has never existed before. It isn't compositing different photos of cats together, taking the eye from one cat and the ear from another to assemble the final image. Instead, you train the neural network on a lot of data, and it generates images of entirely new cats by sampling from the probability distribution it has learned from that data.

Remember how I mentioned earlier that the idea behind GANs is pitting two neural networks against each other? Yes, GANs at their core have two models playing a competitive game:

The Generator: takes random noise as input and creates fake samples, trying to pass them off as real.

The Discriminator: looks at samples and tries to tell which ones are real and which ones came from the generator.

How GANs Work

This is how GANs work.

The generator first produces nonsense because it has no idea what real data looks like, so the discriminator easily spots the fakes. Over time, the generator learns how to fool the discriminator better, and the discriminator adjusts, adapts, and gets sharper at spotting better fakes. This back-and-forth 'duel' makes the generator's outputs more and more indistinguishable from real data. Eventually, you end up with websites like this one, where a totally fake but very realistic human face pops up on every refresh.

Basically, this is like a cat-and-mouse game. The generator in this instance is the mouse, and it keeps trying to sneak in fakes. The discriminator is the cat that gets sharper at catching these fakes. This duel is the essence of Generative Adversarial Networks (GANs).
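To make the duel concrete, here is a minimal sketch of one training step in PyTorch. The toy fully connected networks, noise size, and learning rates below are illustrative assumptions, not the setup of any model discussed later in this piece.

```python
import torch
import torch.nn as nn

# Toy networks for illustration; real GANs use much deeper, usually convolutional, models.
noise_dim, data_dim = 64, 784
G = nn.Sequential(nn.Linear(noise_dim, 128), nn.ReLU(), nn.Linear(128, data_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(data_dim, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_batch):
    m = real_batch.size(0)
    real_labels, fake_labels = torch.ones(m, 1), torch.zeros(m, 1)

    # 1) Train the discriminator (the cat): call real data "real" and generated data "fake".
    fake_batch = G(torch.randn(m, noise_dim)).detach()  # detach so only D updates here
    loss_D = bce(D(real_batch), real_labels) + bce(D(fake_batch), fake_labels)
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # 2) Train the generator (the mouse): try to make D call its fakes "real".
    loss_G = bce(D(G(torch.randn(m, noise_dim))), real_labels)
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```

Each call to train_step is one round of the cat-and-mouse game; repeated over many batches, the generator's fakes gradually get harder to catch.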

Applications of GANs

GANs are used a lot in generative design. Adobe uses them to build Photoshop tools, IBM uses them for data augmentation, Google has used them for text generation, and social media platforms like Instagram, Snapchat, and TikTok use them to create image filters.

The Math Behind GANs

  1. The Minimax Value Function (V(D, G))

This minimax value function is a core mathematical representation of the GAN framework that both networks compete over:

min_G max_D V(D, G) = E_{x~p_data(x)} [log D(x)] + E_{z~p_z(z)} [log (1 - D(G(z)))]

The minimax value function above is at the heart of GANs. Here, D(x) is the discriminator's estimate of the probability that a real sample x is real, G(z) is the generator's output for a noise vector z drawn from the prior p_z, and p_data is the distribution of real data. The discriminator tries to maximize V(D, G), while the generator tries to minimize it.

  2. Discriminator Loss

The discriminator's goal is to correctly classify real data as real and generated data as fake; in doing so, it maximizes the value function. Its loss over a minibatch of size m can be written as:

J_D = - (1/m) Σ log D(x_i) - (1/m) Σ log(1 - D(G(z_i)))

  3. Generator Loss

The generator's goal is the opposite: it wants to fool the discriminator into believing its generated samples are real, which means it tries to minimize the value function. In the original minimax form, its loss can be written as:

J_G = (1/m) Σ log(1 - D(G(z_i)))

In practice, this form saturates when the discriminator is winning, so the non-saturating variant J_G = - (1/m) Σ log D(G(z_i)) is often used instead, because it gives the generator stronger gradients early in training.
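For readers who prefer code to sigma notation, here is how those minibatch losses might be computed on PyTorch tensors. The function names are mine, and the small epsilon is an assumption added purely for numerical stability.

```python
import torch

def discriminator_loss(d_real, d_fake, eps=1e-8):
    # J_D = -(1/m) Σ log D(x_i) - (1/m) Σ log(1 - D(G(z_i)))
    # d_real: discriminator outputs on real samples; d_fake: outputs on generated samples (both in (0, 1)).
    return -(torch.log(d_real + eps).mean() + torch.log(1 - d_fake + eps).mean())

def generator_loss(d_fake, eps=1e-8):
    # Minimax form: J_G = (1/m) Σ log(1 - D(G(z_i))), which the generator minimizes.
    # The non-saturating alternative would be: -torch.log(d_fake + eps).mean()
    return torch.log(1 - d_fake + eps).mean()
```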

Why GANs Are Hard to Train

GANs are famously unstable because of the following challenges:

Mode collapse: the generator keeps producing a narrow set of samples that reliably fool the discriminator instead of covering the full variety of the data.

Vanishing gradients: if the discriminator gets too good too quickly, the generator receives almost no useful signal to learn from.

Non-convergence: because the two networks are optimizing against each other, training can oscillate or diverge instead of settling into an equilibrium.

Hyperparameter sensitivity: small changes to learning rates, architectures, or batch sizes can make or break a run.

Common Fixes

Over the years, researchers have invested in tricks like:

One-sided label smoothing, so the discriminator never becomes overconfident on real samples.

Spectral normalization and gradient penalties, to keep the discriminator well-behaved.

Alternative objectives such as the Wasserstein loss (WGAN, WGAN-GP).

Feature matching and minibatch discrimination, to fight mode collapse.

Careful scheduling, such as using different learning rates for the two networks.

GANs are powerful but temperamental and need constant tuning. In other words, you'd have to tweak a lot before your model reaches acceptable scores on the evaluation metrics discussed below.
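As a small illustration of two of the cheaper tricks (the exact recipe varies from project to project, and this is not the setup used in LagosGAN), here is how one-sided label smoothing and spectral normalization might look in PyTorch:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Spectral normalization constrains how sharply the discriminator can change its output,
# which tends to stabilize the adversarial game.
D = nn.Sequential(
    spectral_norm(nn.Linear(784, 256)), nn.LeakyReLU(0.2),
    spectral_norm(nn.Linear(256, 1)), nn.Sigmoid(),
)

bce = nn.BCELoss()

def d_loss_with_label_smoothing(d_real, d_fake, smooth=0.9):
    # One-sided label smoothing: real targets become 0.9 instead of 1.0 so the
    # discriminator never gets completely confident; fake targets stay at 0.
    real_targets = torch.full_like(d_real, smooth)
    fake_targets = torch.zeros_like(d_fake)
    return bce(d_real, real_targets) + bce(d_fake, fake_targets)
```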

Types of GANs

Since the original 2014 paper, many variants have appeared. The ones you'll run into most often include:

Vanilla GAN: the original formulation from the 2014 paper.

DCGAN: swaps in convolutional architectures and became the workhorse for image generation.

Conditional GAN (cGAN): generation guided by a label or another input.

CycleGAN: unpaired image-to-image translation between two domains (the one I use later for Lagos2Duplex).

StyleGAN family: high-resolution, style-controllable image synthesis (the basis of AfroCover).

Evaluation Metrics

GANs do not have 'accuracy' the way classifiers do. They are instead judged on quality and diversity. These are the metrics most commonly used to evaluate GANs (and the ones I come back to later in this piece):

Fréchet Inception Distance (FID): compares the feature statistics of real and generated images; lower is better.

Kernel Inception Distance (KID): a related score that is more reliable on small sample sets; lower is better.

LPIPS: a learned perceptual similarity measure, useful for checking how diverse the generated samples are or how close a translation stays to its source.
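As a quick sketch of what tracking one of these looks like in practice, here is FID computed with the torchmetrics library (an assumption on my part: it needs the torchmetrics image extras installed, and the random tensors below are stand-ins for real batches):

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# FID compares Inception-v3 feature statistics of real vs. generated images; lower is better.
fid = FrechetInceptionDistance(feature=2048, normalize=True)  # normalize=True: float images in [0, 1]

real_images = torch.rand(32, 3, 256, 256)       # stand-in for a batch of real samples
generated_images = torch.rand(32, 3, 256, 256)  # stand-in for generator outputs

fid.update(real_images, real=True)
fid.update(generated_images, real=False)
print(f"FID: {fid.compute().item():.2f}")
```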

Why I Chose This Project

I knew the applications of GANs included (but were not limited to) creating hyper-realistic faces, imaginary landscapes, and even deepfakes of celebrities. But I wanted something that I could relate to. Something close to home and grounded in an African context.

Two ideas stood out:

AfroCover: generating brand-new African album cover art from scratch.

Lagos2Duplex: translating photos of old Lagos houses into modern duplex concepts.

So I thought, what if a CycleGAN trained to translate old houses into modern duplex concepts could imagine these transformations?

Could we take a photo of an old Lagos house and instantly see a modern duplex version, not drawn by an architect but generated by an algorithm?

This is what's possible with Generative Adversarial Networks, so for this project I built LagosGAN, a two-part experiment in synthetic creativity. The goal of the monorepo containing these projects was not to replace designers or architects, but to explore how AI might act as a sketch partner, providing visual starting points that spark creativity and conversation.

Data & Ethics

The soul of any GAN is data. To build the datasets for both models, I curated licensed and openly available African album art for AfroCover, and for Lagos2Duplex I collected photos of old Lagos houses and modern duplexes. To do this, I built a script that crawled a few relevant platforms. The results of the scraping weren't 100% clean, but I managed to gather almost 3,000 images, enough to pick a core sample from.

Every dataset entry was checked for usage rights and tagged with its source metadata. To keep the boundaries clear, I built Dataset Cards that documented sources, licenses, and limitations. Let me be explicit in this article: these images are concept art, not construction drawings.
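To give a sense of what that cleanup and tagging pass looks like, here is a rough sketch in Python. The directory layout, metadata fields, and 256x256 target size are illustrative assumptions, not the repo's actual scripts:

```python
import json
from pathlib import Path
from PIL import Image

RAW_DIR = Path("data/raw")      # illustrative paths, not the real repo layout
CLEAN_DIR = Path("data/clean")
CLEAN_DIR.mkdir(parents=True, exist_ok=True)

records = []
for path in sorted(RAW_DIR.glob("*.jpg")):
    try:
        img = Image.open(path).convert("RGB")
    except OSError:
        continue                         # skip corrupt or partial downloads
    img = img.resize((256, 256))         # AfroCover trains at 256x256; adjust per project
    out_path = CLEAN_DIR / path.name
    img.save(out_path)
    records.append({
        "file": out_path.name,
        "source_url": "unknown",         # filled in from the crawler's logs in practice
        "license": "unknown",            # verified manually before inclusion
    })

# A lightweight stand-in for the per-entry metadata behind the Dataset Cards.
(CLEAN_DIR / "metadata.json").write_text(json.dumps(records, indent=2))
```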

Building LagosGAN

When I started the project, I had a simple question in mind: What does AI-generated creativity that celebrates African culture rather than just remixing Western datasets look like? The start of the answer to this question was a single workspace hosting two very different but complementary GAN projects:

AfroCover (StyleGAN2-ADA)

With this project, I curated roughly 1,200 album covers, cleaned them up to 256x256 resolution, and fine-tuned a StyleGAN2 generator with Adaptive Discriminator Augmentation (ADA), which stabilizes learning when training data is limited. StyleGAN is perfect for this because it is famous for generating ultra-sharp, high-resolution images, and it works by controlling “styles” at different layers of the network: color palettes, patterns, and textures.

A few practical notes:

In the Gradio demo, the AfroCover tab samples from the generator. Every time you hit “Generate,” a fresh hybrid of vibrant palettes, bold typography, and geometric motifs is produced. Under the hood, the model loads its weights straight from the Hugging Face model repo (theelvace/afrocover/latest.pt), which makes updates as simple as pushing a new checkpoint.
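As a sketch of how that wiring might look (the checkpoint format, latent size, and generator interface are assumptions on my part; hf_hub_download and the theelvace/afrocover repo id come from the setup described above):

```python
import torch
import gradio as gr
from huggingface_hub import hf_hub_download

# Pull the latest generator checkpoint from the Hugging Face model repo.
ckpt_path = hf_hub_download(repo_id="theelvace/afrocover", filename="latest.pt")

# Assumption: the checkpoint unpickles to a generator that maps a latent vector z
# to an image tensor in [-1, 1]; adapt this to however the weights were exported.
G = torch.load(ckpt_path, map_location="cpu", weights_only=False)
G.eval()

def generate_cover():
    z = torch.randn(1, 512)                  # StyleGAN2's usual 512-dim latent
    with torch.no_grad():
        img = G(z)                           # assumed shape (1, 3, 256, 256), values in [-1, 1]
    img = (img.clamp(-1, 1) + 1) / 2         # rescale to [0, 1]
    return (img[0].permute(1, 2, 0).numpy() * 255).astype("uint8")

demo = gr.Interface(fn=generate_cover, inputs=None, outputs="image", title="AfroCover")
demo.launch()
```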

Lagos2Duplex (CycleGAN)

I chose CycleGAN for the Lagos2Duplex project and built the standard CycleGAN setup: two generators (one mapping old houses to duplexes, the other mapping duplexes back to old houses) and two discriminators, one per domain.

The setup had Domain A = old Lagos houses and Domain B = modern duplexes, with a heavy reliance on cycle-consistency loss (weighted at 10x the adversarial loss) to ensure that A → B → A reconstructs the original house.
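In code terms, the cycle-consistency idea looks roughly like this (a sketch with placeholder generator names, not the project's actual modules):

```python
import torch.nn as nn

l1 = nn.L1Loss()
LAMBDA_CYCLE = 10.0  # cycle loss weighted at 10x the adversarial loss, as noted above

def cycle_consistency_loss(G_AB, G_BA, real_A, real_B):
    # G_AB: old house -> duplex generator; G_BA: duplex -> old house generator (placeholder names).
    fake_B = G_AB(real_A)           # A -> B
    reconstructed_A = G_BA(fake_B)  # B -> A should land back on the original house
    fake_A = G_BA(real_B)
    reconstructed_B = G_AB(fake_A)
    return LAMBDA_CYCLE * (l1(reconstructed_A, real_A) + l1(reconstructed_B, real_B))
```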

The Monorepo Glue

Everything, ranging from the data prep scripts and training configs to the checkpoints and docs, lives in one repo, so shipping updates can be done quickly. Some highlights in this repo include:

Evaluation

Working on this project has made me realize how tricky evaluating GANs is. Accuracy doesn't make sense here; instead, metrics that capture realism and diversity are used. I already talked about some of these metrics: FID, KID, LPIPS. These were what I tracked, but numbers only tell part of the story. With this in mind, I ran a small human test by showing the outputs to some people and asking two questions:

Could you see this AfroCover sample being used as a real album or song cover?

Does this Lagos2Duplex output look real enough to spark design ideas?

For AfroCover, 75% of respondents said they could see one of the covers being used for an album or song cover. For Lagos2Duplex, 40% said the duplexes looked “real enough to spark ideas.”

Not perfect, but promising.

Lessons Learned

Conclusion

I decided to learn about GANs and how they work because I thought they were a pivotal part of the evolution of the AI/ML field, and the podcast featuring Ian Goodfellow had me sold. Building LagosGAN has been both a very frustrating and a very fun experience. It was less about chasing state-of-the-art metrics and more about exploring what happens when a technology like GANs is put to work on African problems.

Hugging Face Space — https://huggingface.co/spaces/theelvace/lagos-gan-demo