Create your first avatar
How we crafted Elsa, from idea to concept: a walkthrough of the full avatar creation process inside MITO.
Every character starts with a decision about what they carry.
For Elsa, it was restraint. A performer who communicates through stillness more than movement. Before we opened MITO, we wrote three sentences about her. That’s the actual prompt architecture — not the technical one, but the creative one.
Here’s how we built her.
Step 1: Define the character before the model
Write three sentences. Not about appearance — about behavior.
- How does she enter a room?
- What does she notice first?
- What does she never say out loud?
These constraints become your prompt anchors. They are what keeps the character consistent across generations.
Step 2: Reference before generation
Don’t start with a blank prompt. Gather 5–8 visual references. Not for copying — for establishing the chromatic and textural grammar you’re working within.
In Elsa’s case: high-contrast editorial photography, Scandinavian winter light, minimal costuming. The references told us what not to generate as much as what to target.
Step 3: First generation — establish the base
Start wide. Generate 12–16 variants at low detail. You’re not looking for the final image — you’re looking for the one that has the right energy. The face you’ll build from.
Filter down to two or three candidates. The rest are compost.
Step 4: Iterate on the detail layer
Now go narrow. Take your best candidate and run 8–10 variations focusing on a single variable at a time: lighting angle, expression micro-adjustments, background relationship.
This is where most people rush. Don’t. The iteration layer is where the character becomes real.
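One way to keep yourself honest about "a single variable at a time" is to write the variation plan down before generating anything. The sketch below is only an illustration in plain Python. It does not call MITO, and the `Settings` class, its field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings for one candidate; the fields are assumptions."""
    lighting_angle: str
    expression: str
    background: str


def single_variable_variations(base, field_name, values):
    """Return copies of `base` that differ from it in exactly one field."""
    return [replace(base, **{field_name: value}) for value in values]


# The best candidate from the wide pass (placeholder values).
base = Settings(
    lighting_angle="front",
    expression="neutral",
    background="plain wall",
)

# One pass changes the lighting only; expression and background stay fixed.
lighting_pass = single_variable_variations(
    base, "lighting_angle", ["side", "overhead", "backlit"]
)
```

Because every variant in a pass shares everything except one field, any difference you see in the results can be traced back to that field.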
Step 5: Lock and document
Once you have your avatar, extract the key prompt parameters. Document them somewhere permanent. Every future generation of Elsa should reference this foundation.
You’re not creating a static image. You’re creating a character spec.
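If it helps to see what a character spec might look like on disk, here is a minimal sketch in plain Python. It is not a MITO export format. The `CharacterSpec` class, its field names, and the placeholder parameter are assumptions made for illustration, and only the anchor and reference notes come from the Elsa walkthrough above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CharacterSpec:
    """Hypothetical record of what defines a character across generations."""
    name: str
    behavior_anchors: list[str]
    reference_notes: list[str]
    locked_parameters: dict[str, str] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the spec so it can be stored somewhere permanent."""
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)


elsa = CharacterSpec(
    name="Elsa",
    behavior_anchors=[
        "Communicates through stillness more than movement.",
    ],
    reference_notes=[
        "high-contrast editorial photography",
        "Scandinavian winter light",
        "minimal costuming",
    ],
    # Placeholder only: record the parameters you actually settled on.
    locked_parameters={"lighting_angle": "to be filled in"},
)

spec_json = elsa.to_json()
```

Saving `spec_json` next to the final image gives every future generation of the character one file to start from.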