Leveraging generative A.I. to visualize a cinematic universe inspired by The Wonderful Wizard of Oz (1900)
Discord
Midjourney
Pika.art
When powerful consumer AI hit global markets in 2022, I felt overwhelmed; the emergence of these tools carried some obvious and heavy implications about the future.
I had just begun work on a personal writing project: a sequel trilogy to L. Frank Baum’s The Wonderful Wizard of Oz, narrated through a science fiction lens for a mature reading audience:
Like Gregory Maguire’s Wicked, but in outer space.
As I dove into the writing process, I eventually made an important connection:
💭Generative-AI could potentially help me realize this vision from a truly cinematic perspective.
What if this project is actually the perfect sandbox for overcoming feelings of future shock about generative-AI?
I engineered my first image prompts without referring to any Midjourney documentation.
Some of the initial images were nice to look at, but they ignored the majority of the prompt.
Midjourney and similar tools require specific prompt structures and established syntax in order to achieve the best results.
My generations improved significantly once I began to employ these built-in parameters; however, proper structure and syntax did not guarantee that a given prompt would succeed on the first pass.
Example cheat sheet detailing Midjourney’s syntax and structure best-practices (Credit: Tristan Wolff)
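As a purely illustrative sketch of this structure (not an actual prompt from the project), a Midjourney prompt typically reads subject first, then medium and style cues, then parameters appended with double dashes: `--ar` sets the aspect ratio, `--stylize` the strength of Midjourney’s default aesthetic, `--chaos` how much the four grid images vary from each other, and `--no` excludes unwanted elements.

```
/imagine prompt: a glowing green tornado tearing across a Kansas prairie at dusk, 1939 Technicolor film still, dramatic backlighting --ar 16:9 --stylize 250 --chaos 15 --no text, watermark
```

Parameter values like these interact unpredictably with the descriptive text, which is why even a well-structured prompt often needs several passes.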
I made use of my knowledge of art history and visual culture to make precise references to historical periods, artistic styles and mediums, artists and filmmakers, and the titles of their work.
Forms
Style
The process of layering elements by naming them throughout the body of the prompt produced a series of phantasmagoria from some Art Basel of the future
In addition to producing the desired “green tornado”, I was able to generate an abundance of variations that interpreted the root prompt in unique ways
Midjourney and similar tools enable the user to deploy images as references within the body of the prompt.
The author can steer the prompt to fuse, combine, juxtapose or otherwise blend elements from multiple images in combination with text prompting.
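To sketch how image referencing works (the URLs below are placeholders, not real references from the project), image links are placed at the start of the prompt ahead of the text, and the `--iw` parameter adjusts how strongly the reference images weigh against the text description:

```
/imagine prompt: https://example.com/mgm-winged-monkey.jpg https://example.com/baroque-portrait.jpg a winged monkey rendered as a 17th century baroque oil painting --iw 1.5 --ar 3:2
```

Raising `--iw` pulls the output closer to the reference images; lowering it gives the text description more control.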
With generative tools, it’s possible to rapidly create an enormous amount and variety of content riffing on a common theme or style of image.
When deriving such variation, the user produces a branch-like map of outputs as they participate in steering the model towards a desired outcome.
It was most often through small tweaks to the root prompt and repeated rounds of variation that I arrived at the most impactful images.
An example of the abundance possible when generating variations.
The final image was produced over a series of working sessions, where the prompt was adjusted or completely re-written multiple times before I arrived at the image that appealed the most to my creative sensibilities and vision for the character.
I struggled to produce non-human skin tone by text alone; I eventually achieved the results I was looking for by loading image references into the body of the prompt.
Winged monkeys are not a common fantasy creature compared to, for example, the majestic unicorn.
Representations of unicorns likely appeared in abundance during the model training process, unlike L. Frank Baum’s winged creations, meaning the model can easily reproduce that concept.
To successfully generate the winged monkeys, I had to creatively steer the prompt and select the most accurate variations until I arrived at the desired combination of elements.
Initial generations humorously misinterpreted the cultural reference to “flying monkeys”.
Follow-up generations made use of an image reference from the 1939 MGM film, but also drew on unwanted sources of inspiration thematically and stylistically.
A third pass attempted to reference visual culture and style via text, but the images suffered from what I call AI sheen: a quality I noticed in generative images that seems to recreate the textural look of modern digital painting.
In the next phase I pivoted to a different style, attempting to make reference to 17th century baroque oil painting.
I was also attempting to imbue the winged monkeys with qualities of other species of animal such as the colouring of pigeons and the tails of rats with varying degrees of success.
In the final phase, I achieved the desired formal and stylistic qualities and began to generate an abundance of variations, experimenting with shifting the background locale.
The model had a tendency to output generations that appeared too artificially youthful and conventionally attractive, despite the intentional use of age descriptors in the body of the prompt.
Initial generations had a tendency to appear overly youthful despite the deliberate inclusion of descriptive language regarding age.
A set of variations eventually took on a naturally aged appearance; further variations were prompted from that set.
A final image depicts the character of Glinda the Good with a more naturally aged appearance.
The uncanny valley is a hypothesized psychological and aesthetic relation between an object’s degree of resemblance to a human being and the emotional response to the object [source].
Witnessing the uncanny valley can feel like a physical sense of eeriness, of being unsettled by the representation.
Human representations that fall into the uncanny valley are a common critique of generative images.
The degree of uncanny valley present in these images varies:
There is a limit to how many parcels of information a model can process in a single prompt.
Prompt engineers are tasked with reducing the volume of text in the prompt, while still providing the necessary information for the model to successfully generate what the user envisions.
This can become challenging when attempting to describe a phenomenon that does not exist as an easily reproducible image, and which may be difficult to describe in words.
For this series of generations, I focused on referencing the work of video artist Matthew Barney, and specifically his visionary film series, The Cremaster Cycle.
In the second novel, The Fortress In The Sand, Dorothy will construct a Yellow Brick Fortress at the edge of the Deadly Desert, an important locale in the Oz lore.
In the first novel, I explore Queen Ozma’s use of humanoid robots to keep the Emerald City secure from the Nome King, the story’s main antagonist.
In the second novel, Dorothy will continue this trend by creating a modular robotic hivemind known as Blockette, that will be responsible for keeping the Yellow Brick Fortress secure.
Video generation involved much more randomness; I needed to rework the prompt and re-roll the generation many times to achieve results I was happy with, given the abilities of the freeware tool I was using.
In the first novel of my trilogy, Dorothy Gale will spend a large portion of her adult life piecing together an esoteric machine, created from parts collected during her global travels in search of the Wizard’s genealogical ancestors.
In the second novel, Dorothy will lure the winged monkeys into her Yellow Brick Fortress in order to fashion them into emissaries of the Deadly Desert.
In the first novel, Queen Ozma will make use of an army of robot guards to keep the Emerald City and her throne safe from the Nome King and his forces.
Throughout her life on Earth, Dorothy experiences vivid dream sequences where the laws of physics break down, such as in this sequence taking place in the poppy fields of Flanders, Belgium.
In the third novel, Dorothy will be visited by a series of messengers, including the Ancestors of Oz, a mysterious liquid marble frieze with muttering faces, warping into and out of existence.
In the final novel, readers will be introduced to a mysterious dome world known as Oobliad, a world doomed to annihilation in the foreseeable future, as are all the other dome worlds adrift in this proto-universe that is somehow adjacent to Oz’s and Earth’s universal planes.
Google’s Veo 3 model broke new ground with the release of the Flow tool, which enables creators to generate cinematic sequences and stitch them together seamlessly using an interface designed with filmmakers in mind.
This short film was made entirely using Google’s Flow tool, which runs on the Veo 3 generative model.
Novelty animal videos are not a new phenomenon unique to the age of social media.
Cute animals doing silly things was a hallmark component of America’s Funniest Home Videos, a syndicated TV program that debuted in 1989, well over thirty years ago. This type of media is an obvious target for AI content engineers to mimic and attempt to spread virally across the web.
This AI-generated video of bunnies jumping on a trampoline at night, “in the style of” security cam footage, was convincing to the untrained eye.
Given the ease of production and replication already possible with the current generation of models, the sheer volume of AI-generated content flooding the web has caused me to wonder: could it eventually overtake actual user-generated content made by more traditional means of production?
This BBC report details the rise of AI-generated social media influencers currently convincing millions, and even generating real income for the engineers responsible for these fake personas flooding social media.
Possibly even more troubling than fake AI-influencers selling wellness products, is the fact that real people can have their likeness co-opted to be used, for example, in smear campaigns as a form of political propaganda.
This CBC report details the viral spread of a convincing deepfake of Canada’s former Prime Minister, Justin Trudeau.
An important question:
Must we now question every single thing we see online by default? Is this simply the new normal?
The emergence of generative-AI as a new creative tool has been highly divisive, with relative acceptance and adoption by some, and complete pushback and deliberate non-use by others.
Oliver Richman is an indie musician participating in an ongoing “song-a-day” challenge.
In the case of UX and product design and research, advocates of AI tools capable of replicating user interfaces, such as Base44 and Lovable, suggest that generative-AI is “democratizing design”.
I might venture to challenge this notion if the real result is a devaluation of the discipline of product and UX design.
“Democratization” feels like a contemporary buzzword that distracts from the very real disenfranchisement of individuals who are losing opportunities for economic mobility as these tools emerge.
Opinion:
Capability alone does not equal:
1. vision
2. taste level
3. experience
If adoption is inevitable, I believe creative thinkers still have the edge in this process of “democratization”, but we will have to be brave and visionary throughout this uncomfortable process, acting as innovative leaders at the forefront of the ethical use of AI.
The sudden emergence and rapid proliferation of generative tools has placed a measurable strain on natural resources.
The data centres, where a sudden and massive influx of AI-generated content is now being housed, use large quantities of fresh water to cool their hardware.
Additionally, the communities where these data centres are built are directly impacted.
This infographic from the Capgemini Research Institute describes some of the environmental impact of generative-AI
Economic research suggests that AI and automation will replace a significant number of jobs over the coming years.
This article from the World Economic Forum discusses perceptions of job displacement and the value of labour in the AI-driven economy.
Additionally, big tech companies including Meta, Google and OpenAI all trained their models on content that was non-consensually scraped from across the web; vast quantities of artwork, photography, audio, video and written content were used without a single creator being compensated for the use of their IP in the model-training process.
L. Frank Baum wrote thirteen sequel novels and several short stories and spin-offs after publishing the original novel in May 1900, before his passing in 1919.
All of this IP exists in the public domain; it is a rich library of characters, objects and settings from which I am able to derive an evolution of the series from a new perspective and with the ability to eventually publish the derivative work.
I began hunting down physical copies of these sequel novels, as well as other media directly related to or inspired by the Oz lore.
My quest to locate copies of this literature lives in parallel to my plot device of Dorothy scouring the globe for pieces of the Wizard’s machine.
Like Captain Ahab in Herman Melville’s classic novel, Moby Dick, I eventually tracked down an early printing of The Wonderful Wizard of Oz in a local used bookstore. I left a glowing 5-star review.
My project is inspired by 20th century science fiction literature, written by authors including Alice Bradley Sheldon (aka James Tiptree Jr.) and Larry Niven.
An inspired idea is nothing without strong technical execution.
To improve my craft and to develop an understanding of the mechanics of writing fiction, I am listening to successful authors I admire discuss their work and process.
I am also consulting guides for specific information about fiction best-practices, literary structures and writing processes.
When the idea for this project first occurred to me, I was going through a creative drought. I was creating for others, but not for myself. I felt like I was losing sight of why I create in the first place.
The idea for the writing project occurred to me when I came across a copy of The Wonderful Wizard of Oz in a Little Free Library. I have been fascinated by the story since childhood. This chance encounter with a copy of the novel is what reignited my passion to create.
Much like the strategy of facing down my future shock by engaging with generative-AI, engaging with these topics and themes is allowing me the opportunity to confront and process internalized fears about global affairs through a creative rather than consumptive lens.
© Copyright. Matthew Crans. 2025.