Chapter 15·Beginner·11 min read
Few-Shot Prompting: Teaching an LLM by Example
What is few-shot prompting? A practical guide to teaching an LLM with examples — how many to use, how to choose and format them, why examples beat instructions for format and style, and the common pitfalls to avoid.
June 29, 2026
In the last two chapters you learned to write clear instructions (basic prompting) and to prompt with no examples (zero-shot). Now we add the single most powerful upgrade for consistency: showing the model what you want. This is few-shot prompting, and for format and style tasks it's often the difference between "close" and "exactly right."
What few-shot prompting is
Few-shot prompting means including a handful of worked examples — input paired with the ideal output — directly in your prompt, before the real input. The model sees the pattern and continues it.
Classify the sentiment.
Review: "Loved it, works perfectly." → Positive
Review: "Broke after a day." → Negative
Review: "It's okay, nothing special." → Neutral
Review: "Best purchase I've made all year." →
The model, having seen three examples, completes the fourth with "Positive." You didn't describe the categories or the format — you demonstrated them. (One example is "one-shot"; several is "few-shot.")
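The prompt above can also be assembled in code. The following is a minimal sketch, assuming the examples are stored as (review, label) pairs; the function name `build_sentiment_prompt` is hypothetical and not part of any library.

```python
def build_sentiment_prompt(examples, new_review):
    """Assemble a few-shot sentiment prompt from (review, label) pairs."""
    lines = ["Classify the sentiment.", ""]
    for review, label in examples:
        # Each worked example pairs an input with its ideal output.
        lines.append(f'Review: "{review}" → {label}')
    # The final line leaves the label blank for the model to fill in.
    lines.append(f'Review: "{new_review}" →')
    return "\n".join(lines)


examples = [
    ("Loved it, works perfectly.", "Positive"),
    ("Broke after a day.", "Negative"),
    ("It's okay, nothing special.", "Neutral"),
]
prompt = build_sentiment_prompt(examples, "Best purchase I've made all year.")
print(prompt)
```

Keeping the examples in a list rather than in a hand-typed string makes it easy to add, remove, or reorder them later.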
Why it works: in-context learning
Here's the genuinely surprising part. The model is not retrained when you give it examples; its parameters stay frozen while it generates. Instead, it picks up the pattern from the examples in your prompt, on the spot, and applies it. This ability is called in-context learning, and it emerged as models were scaled up rather than being explicitly programmed.
Mechanically, attention lets the model "look back" at the tokens of your examples while it processes the new input, and continue their pattern. That's why putting examples in the prompt is enough: no weights need to change.
When to use few-shot over zero-shot
Reach for few-shot when zero-shot is inconsistent — especially for:
| Situation | Why examples help |
|---|---|
| Specific output format | Shows the exact shape instead of describing it |
| Particular style or tone | Demonstrates voice better than adjectives |
| Custom labels / categories | Defines your scheme by example |
| Ambiguous tasks | Pins down what you actually mean |
| Consistency across runs | The pattern anchors the output shape |
If zero-shot already nails it, don't add examples — they cost tokens for no gain. Few-shot is the escalation, not the default.
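One rough way to decide whether escalation is needed is to run the zero-shot prompt several times and see how often the outputs agree. The sketch below assumes a hypothetical `call_model(prompt)` callable that wraps whatever client you use; `fake_model` is only a stand-in so the example runs, and the 0.8 threshold is an arbitrary illustration.

```python
import itertools
from collections import Counter


def agreement_rate(call_model, prompt, runs=5):
    """Fraction of runs that return the single most common output."""
    outputs = [call_model(prompt).strip() for _ in range(runs)]
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / runs


def choose_prompt(call_model, zero_shot, few_shot, threshold=0.8):
    """Keep the zero-shot prompt unless its outputs are inconsistent."""
    if agreement_rate(call_model, zero_shot) >= threshold:
        return zero_shot
    return few_shot


# Stand-in for a real model (assumption): its format drifts without examples.
_drifting = itertools.cycle(
    ["Positive", "positive.", "The sentiment is positive", "Positive", "POSITIVE"]
)


def fake_model(prompt):
    if "→ Positive" in prompt:  # examples present, so the format is anchored
        return "Positive"
    return next(_drifting)


zero_shot = 'Classify the sentiment.\nReview: "Great value." →'
few_shot = (
    'Classify the sentiment.\n'
    'Review: "Loved it." → Positive\n'
    'Review: "Broke after a day." → Negative\n'
    'Review: "Great value." →'
)
chosen = choose_prompt(fake_model, zero_shot, few_shot)
```

Exact string matching is a deliberately strict measure: it treats "Positive" and "positive." as different outputs, which is the kind of format drift few-shot examples are meant to remove.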
How many examples?
More is not automatically better. For most tasks the sweet spot is 2 to 5 clean examples.
The first few examples deliver most of the benefit; after that you hit diminishing returns while burning more context-window tokens on every call. Add examples until the pattern is unambiguous, then stop.
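The "add examples until the pattern is unambiguous, then stop" rule can be sketched as a small search. This assumes you have some way to score a prompt with k examples (for instance accuracy on a small held-out set); `score_with_k` is a hypothetical callable, and the scores below are made-up placeholders for illustration, not measured results.

```python
def smallest_sufficient_k(score_with_k, max_k=5, min_gain=0.02):
    """Return the example count after which one more example
    improves the score by less than min_gain."""
    best_k, best_score = 0, score_with_k(0)
    for k in range(1, max_k + 1):
        score = score_with_k(k)
        if score - best_score < min_gain:
            break  # diminishing returns: stop adding examples
        best_k, best_score = k, score
    return best_k


# Placeholder scores for illustration only (assumption, not real data).
placeholder_scores = {0: 0.70, 1: 0.82, 2: 0.90, 3: 0.93, 4: 0.94, 5: 0.94}
k = smallest_sufficient_k(placeholder_scores.__getitem__)
print(k)  # 3 with these placeholder numbers
```

The `min_gain` threshold encodes how much extra quality you consider worth the extra tokens on every call.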
Choosing good examples
The examples are the instruction, so their quality is the ceiling on your results.
- Be consistent. Every example should follow the same format exactly. Inconsistent examples teach inconsistency.
- Be correct. A wrong example will be faithfully copied. Mistakes in your examples become mistakes in the output.
- Cover the tricky cases. If your task has an edge case — an empty input, a special category, a tie — show it. The model generalises from what you demonstrate.
- Be representative. Pick examples that look like the real inputs you'll send, not idealised toy cases.
- Watch order and balance. For classification, don't list all the positives first; mix the labels so the model doesn't infer a spurious pattern from ordering.
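Two of the checks above are easy to automate: that every label is one you intended and that every intended label has an example, and that labels are mixed rather than grouped. The helpers below are a minimal sketch with hypothetical names, assuming examples are (text, label) pairs.

```python
from collections import defaultdict
from itertools import zip_longest


def check_examples(examples, allowed_labels):
    """Return a list of problems found in the example set."""
    problems = []
    for text, label in examples:
        if label not in allowed_labels:
            problems.append(f"unknown label {label!r} for {text!r}")
    missing = set(allowed_labels) - {label for _, label in examples}
    for label in sorted(missing):
        problems.append(f"no example for label {label!r}")
    return problems


def interleave_by_label(examples):
    """Reorder examples round-robin so no label appears in one long run."""
    groups = defaultdict(list)
    for text, label in examples:
        groups[label].append((text, label))
    mixed = []
    for row in zip_longest(*groups.values()):
        mixed.extend(item for item in row if item is not None)
    return mixed


examples = [
    ("Loved it.", "Positive"),
    ("Great value.", "Positive"),
    ("Broke after a day.", "Negative"),
    ("Arrived late.", "Negative"),
    ("It's okay.", "Neutral"),
]
labels = ["Positive", "Negative", "Neutral"]
problems = check_examples(examples, labels)
mixed = interleave_by_label(examples)
```

Round-robin ordering is only one option; shuffling with a fixed random seed would also break up runs of the same label.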
Format your examples clearly
The model relies on structure to tell where one example ends and the next begins. Use a clear, consistent delimiter and layout:
Input: [the input]
Output: [the ideal output]
---
Input: [the next input]
Output: [the next ideal output]
Consistent labels (Input:/Output:) and a separator (---) make the pattern unmistakable. When you reuse this structure across many calls, you've essentially built a prompt template — which is a whole chapter of its own.
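As a sketch of that layout in code, the helper below renders examples with the Input:/Output: labels and the --- separator. It also includes a small post-processing step, on the assumption that a model may keep going and write further examples after its answer. Both function names are hypothetical, and the date-reformatting task is only an illustration.

```python
SEPARATOR = "---"


def render(examples, new_input):
    """Render (input, output) pairs plus the real input in one fixed layout."""
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    # The last block leaves the output blank for the model to complete.
    blocks.append(f"Input: {new_input}\nOutput:")
    return f"\n{SEPARATOR}\n".join(blocks)


def first_answer(completion):
    """Keep only the text before the model starts another example."""
    return completion.split(SEPARATOR, 1)[0].strip()


date_examples = [
    ("2024-03-05", "5 March 2024"),
    ("1999-12-31", "31 December 1999"),
]
template_prompt = render(date_examples, "2021-07-04")
print(template_prompt)
```

Because the separator appears between every pair of examples, the same string is a natural place to cut the completion if the model continues the pattern past its answer.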
The cost trade-off
Every example sits in the context window on every single call, costing tokens (and money, and a little speed) each time. So:
- Use the fewest examples that make the pattern reliable.
- If you're sending thousands of calls, those example tokens add up fast — consider whether fine-tuning would be cheaper at high volume, since it bakes the behaviour in instead of re-sending examples.
For most use, though, a few examples in the prompt is the simplest, most flexible win.
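To see how the overhead scales, here is a back-of-the-envelope calculation. Every figure is a placeholder assumption (tokens per example, call volume, and price all vary by model and provider), and `example_overhead` is a hypothetical helper.

```python
def example_overhead(tokens_per_example, num_examples, calls, price_per_million_tokens):
    """Extra input tokens and cost from re-sending examples on every call."""
    extra_tokens = tokens_per_example * num_examples * calls
    extra_cost = extra_tokens / 1_000_000 * price_per_million_tokens
    return extra_tokens, extra_cost


# Placeholder numbers for illustration only (assumptions, not real pricing).
tokens, cost = example_overhead(
    tokens_per_example=40,
    num_examples=4,
    calls=100_000,
    price_per_million_tokens=1.00,
)
print(tokens, cost)  # 16000000 tokens, 16.0 in the chosen currency
```

The overhead grows linearly with both the number of examples and the number of calls, which is why trimming one unnecessary example matters more at high volume.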
Recap
- Few-shot prompting includes a handful of input→output examples so the model copies the pattern.
- It works via in-context learning — the model infers the pattern from your prompt, with no retraining.
- Use it when zero-shot is inconsistent, especially for format, style, custom labels, and consistency.
- 2–5 clean examples is usually the sweet spot; benefit plateaus quickly.
- Examples must be consistent, correct, representative, and cover edge cases — they are the instruction.
- Examples cost context tokens on every call; at high volume, fine-tuning may be cheaper.
Examples teach the model what to produce. But for hard problems, you also want to shape how it thinks. That's the next leap. Continue to Chain of Thought prompting.