
As generative AI tools evolve, businesses are asking the same question:
Can AI really create video content that’s good enough to use?
To find out, we ran an experiment. Our team set out to create a 45-second educational children’s TV show using only AI tools—on a total software budget of just $50.
Why a kids' show? Because it’s a challenging, expressive, and high-expectation format—and it allowed us to explore educational use cases in a playful, visually demanding setting.
This article shares our full process and learnings. It's a breakdown of where today’s AI tools shine, where they fall short, and what’s possible right now for businesses looking to produce fast, flexible, lower-cost content—without compromising on creativity.

The Goal:
Test the current state of AI video generation by producing an entire short film using generative tools—script, visuals, music, voiceover, and animation—with no traditional production involved.
We wanted to learn:
- Which tools are truly usable today?
- How consistent and realistic can AI video outputs be?
- Where are the trade-offs—cost, quality, control?
- Could this approach work for branded content, internal comms, education, or healthcare storytelling?
Our 10-Step AI Video Workflow:
Tools Used:
- Screenplay: ChatGPT
- Storyboard: Storyboarder.ai
- Title Sequence Design: CoPilot
- Title Sequence Animation: Runway
- Dr Jazz Bones Voice: Google Veo 3 (cloned in ElevenLabs)
- Videos: Runway, Google Veo 2, Google Veo 3
- Title & Bloopers Music: Suno
- Editing: Premiere (manual)
Total Cost: $50
Process and Workflow:
We began by giving the brief to ChatGPT, which promptly generated a suitable screenplay on its first attempt. We then input that script into Storyboarder.ai, an AI tool that converts screenplays into shot lists and generates accompanying storyboard images. While not without flaws—particularly the “consistent character” feature, which failed to maintain a female doctor’s appearance across shots—it produced a usable storyboard.
Next, we focused on defining the short’s style and character design. Using CoPilot, we generated images of the doctor character based on the ChatGPT script, as well as a doctor’s office backdrop. The results were mixed; while some images were promising, we ultimately found the built-in tools of AI video platforms more effective for our needs.
Our first video platform was Runway. We used it to build a more detailed storyboard, referencing images from Storyboarder.ai for composition. Runway’s ability to transpose elements—such as characters, backgrounds, and visual styles—between reference images was particularly useful. After considerable experimentation, we achieved a consistent enough aesthetic to proceed. While Runway isn’t the most powerful for generating full videos—especially when prompts lack visual references—it excels at producing stills. That said, some final-cut clips were generated in Runway, proving it’s more than just a storyboarding tool.
The standout tools for video generation were Google’s Veo 2 and Veo 3 models. They outperformed Runway in consistency, clarity, and prompt interpretation. They can adapt or build upon reference images and are especially effective when generating content without visual references. Veo 3 is a significant upgrade over Veo 2—not only can it generate audio (including lip-synced speech), but its output is more realistic and refined. Its clarity, clip consistency, and well-timed audio generation are impressive, though video resolution and output dimensions remain unpredictable.
Veo 2 remains useful for silent visuals, scenes without human characters, or when reference images are needed for consistency. Veo 3 excels at producing realistic individual clips, but unless you subscribe to its highest tier, you can’t feed it multiple reference images (called “ingredients”), which can make consistency across clips challenging.
To manage this, we used Veo 2 and Runway for scenes requiring consistent backgrounds, and Veo 3 for the initial clip (which we then used as a reference in Runway) and for cutaways in nondescript environments. We also focused on a skeleton character, which was easier to keep visually consistent. Veo was even able to generate legible text after a few tries, which was impressive.
A downside of both Veo and Runway is the need for frequent regenerations and prompt refinements, which quickly consume credits and can become costly. Each attempt—successful or not—uses the same amount of credits. Fortunately, we used two free trials of Veo and subscribed to a mid-tier Runway plan. For more complex projects, costs could rise significantly.
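To see how quickly those regenerations compound, here is a rough back-of-envelope sketch in Python. The per-attempt credit figure and retry counts are illustrative assumptions, not the platforms’ actual pricing:

```python
# Rough estimate of generation credits consumed on a multi-clip project.
# All numbers below are illustrative assumptions, not real platform pricing.

def estimate_credits(clips: int, credits_per_attempt: int, avg_attempts: float) -> float:
    """Every attempt costs the same amount, whether or not the clip is usable."""
    return clips * credits_per_attempt * avg_attempts

# Assumed scenario: 12 final clips at 100 credits per attempt.
first_time = estimate_credits(clips=12, credits_per_attempt=100, avg_attempts=1)
realistic = estimate_credits(clips=12, credits_per_attempt=100, avg_attempts=4)

print(first_time)  # 1200 credits if every clip worked first time
print(realistic)   # 4800 credits with ~4 generations per usable clip
```

Under these assumed numbers, needing four generations per usable clip quadruples the spend, which is why flexibility about your vision matters so much on a fixed budget.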
One standout result from Veo 3 was a video of the doctor character speaking directly to the camera with accurate lip-sync and clear pronunciation. It was visually and audibly convincing. However, generating multiple consistent clips of the same character would be difficult. To address this, we cloned the Veo 3 voice using ElevenLabs, allowing us to generate narration in the doctor’s voice for the rest of the video. As usual with ElevenLabs, some intonation tweaks required punctuation adjustments and multiple regenerations, but the results were strong.
The title image was generated in CoPilot, then extended to a 16:9 ratio and animated in Runway.
For music, we used Suno. We requested a children’s TV theme with the lyrics “Dr Jazz Bones and Benny,” and it generated several convincing tracks within seconds. Slightly soulless, perhaps—but then again, most background music is.
Finally, we assembled the video manually in Premiere, with minimal editing. Everything you see and hear—aside from the editing—was created using generative AI. And we’re likely not far from AI handling some of that editing too.
Reflections:
Overall, there’s a degree of jankiness, even in the best clips, but the progress in AI video generation is remarkable. Sure, the anatomy isn’t perfect—we had elbows with patellas and the occasional visual glitch. It can also get expensive and has an environmental impact.
But if you need a specific piece of stock footage or a 10-second talking-head clip, Google Veo 3 feels like magic. For longer productions with consistent characters and settings on a tight budget, it can be frustrating. Flexibility is key—if you’re rigid about your vision, you’ll burn through credits fast. But if you adapt to the tools’ strengths and limitations, you can create something that doesn’t scream “AI-generated.” A more constrained creative approach can actually improve overall quality. Compromise is part of the process.
Takeaways:
1. Yes, AI Can Make Entire Videos. But Not Without Compromise.
We were genuinely surprised by how far the tools have come—especially Google Veo 3, which generated realistic video clips with clear, lip-synced speech. But building a whole, consistent sequence of scenes still takes a lot of trial, error, and prompt tuning.
2. Flexibility Is Essential
If you’re rigid about visual continuity or shot-specific accuracy, you’ll burn through credits fast. Being creatively agile—choosing a character that’s easy to render (like a skeleton) or simplifying backdrops—dramatically improves success rates and visual cohesion.
3. Voice and Music Are Now Plug-and-Play
ElevenLabs made cloning a believable voice easy, and Suno delivered custom theme music instantly. For branded videos or internal content with modest audio needs, these tools are ready for prime time.
4. Storyboard-First Still Wins
Combining Storyboarder.ai, Runway, and Veo gave us better visual alignment and shot consistency. The more you plan visually upfront, the better AI can follow your lead.
5. Costs Scale Fast
We kept to a $50 budget by leveraging free trials and mid-tier plans. But regenerating visuals (especially with premium tools like Veo 3) can quickly escalate costs for larger or more polished projects.
What This Means for Healthcare & Enterprise Marketing:
AI video creation is no longer science fiction. For healthcare communications, education, internal training, or light-touch promotional content, the current generation of tools can offer:
- Faster production cycles
- Lower costs vs. traditional animation or filming
- Creative agility for testing ideas and formats
- Personalised or contextualised visuals and voiceovers
However, it’s not yet a plug-and-play replacement for high-end video. For now, it’s best used for short-form content, experimental formats, and cost-effective visuals where speed, not perfection, is the goal.
Final Thought:
We’re only scratching the surface of what’s possible with AI video tools. The progress is fast—and while there's still friction (inconsistency, cost, limited control), the creative potential is real.
As these tools mature, they’ll unlock new ways for brands—especially in sectors like healthcare—to communicate, educate, and inspire.
If you're curious about how AI content creation could work for your team or your brand, let’s talk.
Now Watch the Bloopers!
