AI Video Has Come a Long Way — I Have the Receipts
From a single Midjourney image in 2023 to full cinematic multi-shot videos today, here's how AI video tools have evolved and what I'm actually using right now.
In November 2023, I built an AI video overnight that outsold Russell Brunson's own team at his own competition. We didn't take the official win, but we should have.
Here's the thing: the tools I used back then? They're barely recognizable compared to what's available right now. And most people still don't know this workflow exists.
AI video isn't a gimmick. It's a production tool — and if you're not using it in your marketing yet, you're leaving serious leverage on the table.
The Funnel Games Origin Story
Back in November 2023, ClickFunnels ran something called the Funnel Games for certified funnel builders. The challenge: build a landing page from scratch (sales copy, design, video) driving traffic to Russell's Magnetic Marketing offer. The catch? We had roughly one night to do it.
My team — me, Susan, Andrea, and Nicole — decided to go all in. I'd been experimenting with AI video tools and had an idea: what if we created an animated narrator character? Something cinematic, storytelling-focused. Not a standard talking-head ad. A full AI-generated figure telling the story of marketing's past, present, and future.
The character was a single Midjourney-generated image. I took that image into HeyGen, uploaded an ElevenLabs audio file of the script I'd written, and that still photo started talking. Moving. Telling a story.
We built the entire thing — script, video, landing page — in one night.
And we got more purchases than anyone else. Including Russell's team.
We technically came in second on a different metric (page-to-checkout traffic volume), but on what actually matters — people buying — we won. The lesson I walked away with: a great story, even from an AI character, converts.
What Those Tools Looked Like Back Then
To be honest? A little janky. Beautiful in concept, clunky in execution.
The original workflow (there's a code sketch just after this list):
- Midjourney → generate a high-quality character image
- ElevenLabs → write a script, generate the audio with a custom voice
- HeyGen → animate the still image with the audio
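If you want to script that middle step instead of clicking through the ElevenLabs UI, here's a minimal sketch against their public text-to-speech REST endpoint. The API key, voice ID, and script text are placeholders; the Midjourney image and the HeyGen upload stay manual, same as they did in 2023.

```python
# Minimal sketch: turn a written script into an MP3 via ElevenLabs'
# text-to-speech REST API, ready to upload into HeyGen alongside the
# Midjourney still. API key and voice ID are placeholders.
import requests

ELEVENLABS_API_KEY = "your-api-key"
VOICE_ID = "your-custom-voice-id"

script = "Marketing has a past, a present, and a future..."

resp = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": ELEVENLABS_API_KEY},
    json={"text": script, "model_id": "eleven_multilingual_v2"},
    timeout=120,
)
resp.raise_for_status()

# The response body is the rendered audio (MP3 by default).
with open("narrator.mp3", "wb") as f:
    f.write(resp.content)
```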
The result: a talking head where the face and mouth moved, but the body and beard were almost completely static. You had to add snow effects in Premiere Pro. Flicker the candle flames manually. Rotate the background slightly so the eye didn't catch how frozen everything was.
It was impressive for 2023. And it also very clearly looked like an AI video if you watched it for more than 30 seconds.
What's Possible Now
Here's what blew me away when I revisited this: I used the exact same source image for the 2026 version. Same character. And the difference is night and day.
With HeyGen's Avatar 5 (released just weeks ago), the entire body moves naturally. The beard shifts. The coat breathes. The hands gesture. You can tell something's still AI-generated if you know what to look for, but the uncanny valley is basically gone.
But HeyGen isn't even where I'm doing most of my video work now.
The New Stack: Higgsfield + Kling + ElevenLabs
My current workflow is built around Higgsfield as the front-end interface, running Kling 3.0 as the underlying model. Here's why this combo is different:
Character reference sheets. You build a reference sheet (multiple angles of your character, a color palette, detail shots of clothing) and Kling uses that as a consistent anchor. It's not perfect, but it's dramatically better than trying to prompt your way back to the same character across different scenes.
Multi-shot generation. Instead of a single talking-head clip, I can generate multiple distinct shots — a wide shot, a close-up, a different angle — that all feel like the same character in the same world. Cut them together and it starts feeling like an actual production.
Longer clips. One of the original frustrations with AI video was the 5-to-8-second limit. That's basically useless for storytelling. Higgsfield gets you much more runway per clip when you want sustained motion.
The voice situation is the one place I still prefer a separate workflow: generate the audio in ElevenLabs first, then bring it into Higgsfield. The native voices inside Kling are fine, but ElevenLabs gives me more control over pacing, pauses, and emotion. For consistent character voices, like my Funnel Baby, I've cloned a voice in Kling and keep using that same file every time.
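Here's a hedged sketch of how that audio-first habit looks in practice: one ElevenLabs render per planned shot, with the voice ID and voice settings pinned so the character sounds identical across clips. The shot plan is purely illustrative, and the resulting MP3s still go into Higgsfield by hand.

```python
# Sketch: render one audio file per planned shot with a fixed voice,
# so the character sounds the same in every Higgsfield/Kling clip.
# The shot plan, API key, and voice ID are illustrative placeholders.
import requests

ELEVENLABS_API_KEY = "your-api-key"
VOICE_ID = "your-cloned-voice-id"

# Hypothetical shot plan: one line of dialogue per generated clip.
shots = [
    ("shot_01_wide", "Let me tell you a story about marketing."),
    ("shot_02_closeup", "It starts with one image and one idea."),
    ("shot_03_angle", "And it ends somewhere nobody expected."),
]

for name, line in shots:
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": ELEVENLABS_API_KEY},
        json={
            "text": line,
            "model_id": "eleven_multilingual_v2",
            # Pinned settings keep pacing and delivery consistent.
            "voice_settings": {"stability": 0.5, "similarity_boost": 0.8},
        },
        timeout=120,
    )
    resp.raise_for_status()
    with open(f"{name}.mp3", "wb") as f:
        f.write(resp.content)
```

Naming the files after the shots also makes the final stitch trivial, which matters in the next section.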
Funnel Baby Is the Best Example
If you've seen any of my Funnel Baby content, that character is entirely AI-generated: the image, the motion, the voice. Each video is built shot by shot, 15 seconds at a time, then stitched together. The voice has been the same cloned file since the first video. The character reference ensures she looks consistent from clip to clip.
The whole thing costs pennies per clip. We're talking less than a dollar for a 30-second video. And what used to take hours of manual compositing in Premiere Pro — adding movement, flickering lights, layering snow effects — happens in the generation itself.
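The stitch itself doesn't even need a Premiere timeline for a straight cut-together: ffmpeg's concat demuxer joins same-codec clips losslessly. A sketch, assuming your shots export as MP4s named shot_01.mp4, shot_02.mp4, and so on:

```python
# Sketch: stitch per-shot MP4s into one video using ffmpeg's concat
# demuxer (lossless as long as every clip shares the same codec and
# settings). Assumes ffmpeg is installed and on your PATH.
import subprocess
from pathlib import Path

clips = sorted(Path(".").glob("shot_*.mp4"))

# The concat demuxer reads a plain text file listing the inputs.
with open("clips.txt", "w") as f:
    for clip in clips:
        f.write(f"file '{clip.name}'\n")

subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "clips.txt",
     "-c", "copy", "funnel_baby_final.mp4"],
    check=True,
)
```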
This is the shift: AI video went from requiring a video editor to requiring a storyteller.
The Bottom Line
The question isn't whether AI video is good enough. It is. The question is whether you're willing to learn the workflow.
If I had the current toolset back in November 2023, we wouldn't have just technically beaten Russell's team. We would have won the whole thing — and it wouldn't have been close.
The tools are there. The barrier is lower than it's ever been. All it takes is one afternoon to get your first character set up, your first voice cloned, and your first real clip out the door.
Start there. The rest gets easier fast.