Code Safari

Chapter 90·Beginner·10 min read

Prompting AI Image Generators: A Practical Guide That Isn't Magic Words

How to write image prompts that work — a practical, mechanism-based guide. Structure a prompt like a brief, use style and lighting vocabulary deliberately, set seed, steps and guidance like you know what they do, and iterate like a director instead of a slot-machine player.

July 19, 2026

You now know the machinery: words become embeddings, every region of the canvas consults them at every denoising step, and sliders like guidance and steps have precise meanings. This chapter turns that knowledge into technique. None of it is magic words — it's mechanism, applied.

If you've read our prompt engineering guide for text models, the philosophy transfers directly: be specific, iterate deliberately, and understand what the model actually receives.

Write a brief, not a spell

The internet is full of "secret Midjourney phrases." Ignore them. A prompt is compressed art direction — the model fills every decision you don't make, so a good prompt simply makes the decisions that matter. Cover what a director would:

| Slot | Question it answers | Example |
| --- | --- | --- |
| Subject | Who/what, doing what? | "an elderly clockmaker inspecting a pocket watch" |
| Setting | Where, when? | "in a cluttered workshop, dusk" |
| Medium/style | Photo? Painting? Whose sensibility? | "35mm documentary photograph" |
| Lighting | The single biggest mood lever | "single warm desk lamp, deep shadows" |
| Framing | Camera distance and angle | "close-up, shallow depth of field" |

Assembled: "An elderly clockmaker inspecting a pocket watch, cluttered workshop at dusk, 35mm documentary photograph, single warm desk lamp with deep shadows, close-up with shallow depth of field."
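
If you generate prompts from a script, the five slots can be kept as separate fields and joined at the end. This is a minimal sketch, not a required format: the `Brief` class and its `to_prompt` method are names invented here for illustration, and the comma-joined output is just one reasonable way to assemble the brief.

```python
from dataclasses import dataclass


@dataclass
class Brief:
    """The five slots of a prompt brief (hypothetical helper, for illustration)."""
    subject: str
    setting: str
    medium: str
    lighting: str
    framing: str

    def to_prompt(self) -> str:
        # Join the decisions in order, subject first, skipping any slot left empty.
        parts = [self.subject, self.setting, self.medium, self.lighting, self.framing]
        return ", ".join(p.strip() for p in parts if p.strip())


brief = Brief(
    subject="an elderly clockmaker inspecting a pocket watch",
    setting="cluttered workshop at dusk",
    medium="35mm documentary photograph",
    lighting="single warm desk lamp with deep shadows",
    framing="close-up with shallow depth of field",
)
prompt = brief.to_prompt()
print(prompt)
```

Keeping the slots separate makes it easy to swap one decision (say, the lighting) later without retyping the rest.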

That's it. No incantations — five decisions, stated plainly. The corgi-astronaut version of this from chapter 1 worked for exactly the same reason.

Style vocabulary: a few strong levers

Because the encoder was trained on captioned images from the real world, words that photographers and artists actually use map to strong, well-learned patterns. A small vocabulary goes far:

  • Lighting: golden hour, overcast, soft studio light, harsh noon sun, neon, candlelit, backlit, volumetric light
  • Medium: oil painting, watercolor, charcoal sketch, 3D render, film photograph, risograph, ukiyo-e
  • Camera: macro, wide-angle, telephoto compression, drone shot, fisheye, tilt-shift
  • Texture/finish: film grain, glossy, matte, weathered, pristine

Precision beats volume. "Cinematic, epic, stunning, masterpiece, 8K, best quality" is mostly noise: vague superlatives map to nothing specific, while one concrete term like "backlit through fog" redirects the whole image. (Some older Stable Diffusion checkpoints did respond to quality-spam because community fine-tunes were trained on captions carrying those tags; modern models largely don't need it.)

The director's loop: fix the seed

Here's the workflow upgrade that separates deliberate work from slot-machine pulls, straight from chapter 2: the seed determines the "marble block," so freeze it.

[Figure: Iterating like a director: one variable at a time against a frozen seed. Flow: generate a few seeds, pick the best composition, freeze that seed, change ONE thing per regen.]

  1. Generate a small batch with random seeds — you're auditioning compositions.
  2. Pick the most promising one and note its seed.
  3. Now iterate with the seed fixed: adjust the lighting phrase, swap the medium, nudge guidance. Each regen changes only what you changed, so you can see cause and effect.

Without a fixed seed, every regeneration reshuffles everything and you can't tell whether your prompt edit helped. With one, you're directing.
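
The loop above can be sketched in plain Python without committing to any particular image tool. Everything here is an assumption for illustration: `starting_noise` is a toy stand-in for a sampler's initial noise, `one_change_variants` is a hypothetical helper, and the seed value 1234 and the settings keys are made up. The point is only the shape of the workflow: the seed stays frozen and each run differs from the base in exactly one setting.

```python
import random


def starting_noise(seed: int, n: int = 4) -> list[float]:
    """Toy stand-in for the sampler's initial noise: same seed, same values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def one_change_variants(base: dict, changes: list[tuple[str, object]]) -> list[dict]:
    """Return one settings dict per change, each differing from base in one key."""
    variants = []
    for key, value in changes:
        if key == "seed":
            raise ValueError("keep the seed frozen while iterating")
        if key not in base:
            raise KeyError(key)
        variant = dict(base)   # copy, so the base settings stay untouched
        variant[key] = value   # the single change for this regeneration
        variants.append(variant)
    return variants


# Hypothetical base settings, chosen after auditioning a few random seeds.
base = {
    "seed": 1234,
    "lighting": "single warm desk lamp, deep shadows",
    "medium": "35mm documentary photograph",
    "guidance": 7.0,
}

variants = one_change_variants(
    base,
    [("lighting", "overcast"), ("medium", "oil painting"), ("guidance", 9.0)],
)

for v in variants:
    changed = [k for k in base if v[k] != base[k]]
    print(v["seed"], changed)

# The frozen seed reproduces the same starting noise on every run.
print(starting_noise(base["seed"]) == starting_noise(base["seed"]))
```

In a real tool you would pass each variant to the generator in place of the print; the bookkeeping is the same.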

Settings, now that you know what they mean

Every one of these maps to machinery from earlier chapters:

| Setting | What it really is | Sensible default |
| --- | --- | --- |
| Steps | Number of denoising passes | 20–30; more is mostly slower |
| Guidance (CFG) | How hard the prompt-direction is exaggerated | 5–9; raise for literalism, expect frying past ~12 |
| Seed | The starting noise (the marble block) | Random to explore, fixed to iterate |
| Aspect ratio | Shape of the latent canvas | Match intent: it changes composition, not just crop |
| Negative prompt | The steer-away baseline | Sparingly: name observed problems, don't paste rituals |

Two notes. Aspect ratio is compositional: a portrait canvas doesn't crop a landscape image, it composes a different image — models learned that tall frames hold portraits and towers, wide frames hold panoramas. And negative prompts work best reactively: add "watermark" when you see watermarks, rather than opening with a fifty-term pasted ritual that muddies the guidance signal.
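
To make "exaggerating the prompt-direction" concrete, here is the usual classifier-free guidance arithmetic on toy numbers. This is a sketch under assumptions: `guided_prediction` is a name chosen here, the two-element lists stand in for what are really large tensors, and individual tools may implement guidance with variations. In common implementations a negative prompt takes the place of the empty-prompt baseline, which is why it acts as the thing the image is steered away from.

```python
def guided_prediction(baseline, conditioned, scale):
    """Classifier-free guidance: baseline + scale * (conditioned - baseline)."""
    return [b + scale * (c - b) for b, c in zip(baseline, conditioned)]


# Toy two-number "noise predictions"; values are made up for illustration.
baseline = [0.0, 1.0]      # prediction for the empty (or negative) prompt
conditioned = [1.0, 3.0]   # prediction for your prompt

for scale in (1.0, 7.0, 12.0):
    # scale 1.0 returns the conditioned prediction unchanged;
    # larger scales push further along the same direction.
    print(scale, guided_prediction(baseline, conditioned, scale))
```

At scale 1.0 the result is just the prompt's own prediction. Higher scales extrapolate past it along the baseline-to-prompt direction, which is the source of both the extra literalism and the "frying" at extreme values.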

Know what prompting can't fix

The most practical skill is recognising failures that are mechanical limits, not prompt problems:

  • Swapped or bleeding attributes ("red cube on blue sphere" comes back recoloured) — the encoder's meaning-soup, from chapter 4. Workarounds: simplify the scene, or generate elements separately and composite.
  • Garbled small text — latent compression, from chapter 3. Workaround: use a recent model with strong text rendering, keep text large and short, or add it in an editor.
  • Anatomy at the margins (hands, crowds, distant faces) — see the limits chapter. Workarounds: inpainting/regional edits, or a different seed.

Rewriting your prompt for the ninth time won't fix what the architecture can't represent. Knowing when to stop prompting and switch tools — inpaint, upscale, edit — is the skill.

Recap

  • A prompt is a brief, not a spell: decide subject, setting, medium, lighting, framing — the model fills everything you leave open.
  • Lead with what matters: early terms tend to carry more weight, and text encoders have a token limit (77 tokens for CLIP-based ones), so overlong prompts get truncated.
  • Use concrete style vocabulary (lighting and medium words especially); skip vague superlative-spam.
  • Freeze the seed to iterate — one variable per regeneration turns generation into direction.
  • Settings are machinery you now understand: steps ≈ 20–30, CFG ≈ 5–9, aspect ratio shapes the composition, and negative prompts are added reactively.
  • Recognise mechanical limits (attribute swaps, tiny text, hands) and switch to editing tools instead of prompt-thrashing.

Still images are one denoising loop. Video asks the same machinery to hold the world steady across hundreds of frames — a much harder trick. Continue to How AI video generation works.
