Code Safari

Chapter 18 · Intermediate · 11 min read

Fine-Tuning LLMs: Adapting a Model to Your Task

What is fine-tuning in AI? A clear guide to fine-tuning large language models — what it does, how it differs from prompting and RAG, when to use LoRA and parameter-efficient methods, and when fine-tuning is the wrong tool.

June 29, 2026

In the previous chapter we saw how a model is trained into a general-purpose assistant. But "general" isn't always what you want. Maybe you need a model that writes in your company's exact voice, classifies your support tickets, or speaks fluent legalese. Fine-tuning is how you specialise a model — and knowing when not to use it is just as valuable.

What fine-tuning actually is

Fine-tuning means taking an already-trained model and continuing its training on a focused set of your own examples. You're not starting from scratch — you're nudging a model that already knows language toward your specific task, domain, or style.

[Figure: pretrained model + your examples → continue training → specialised model. Fine-tuning builds on a pretrained model rather than starting over.]

Mechanically it's the same loop as training — predict, compare, nudge parameters — just on a smaller, targeted dataset. The result is a model whose parameters have shifted toward your use case.
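To make the predict, compare, nudge loop concrete, here is a minimal sketch on a toy two-parameter model rather than an LLM. The "pretrained" starting values, the learning rate, and the three task examples are all assumptions chosen for illustration; a real fine-tune runs the same kind of loop over billions of parameters using a training library.

```python
# Toy stand-in for a model: y = w * x + b.
# The starting weights below are assumed, not taken from any real model.

def predict(w, b, x):
    return w * x + b

def fine_tune(w, b, examples, lr=0.05, epochs=200):
    """Continue training (w, b) on a small, targeted dataset."""
    for _ in range(epochs):
        for x, target in examples:
            error = predict(w, b, x) - target  # predict, then compare
            w -= lr * error * x                # nudge each parameter along the
            b -= lr * error                    # gradient of the squared error
    return w, b

def mean_squared_error(w, b, examples):
    return sum((predict(w, b, x) - t) ** 2 for x, t in examples) / len(examples)

pretrained_w, pretrained_b = 1.0, 0.0                  # assumed starting point
task_examples = [(1.0, 2.5), (2.0, 4.5), (3.0, 6.5)]   # follows y = 2x + 0.5

before = mean_squared_error(pretrained_w, pretrained_b, task_examples)
tuned_w, tuned_b = fine_tune(pretrained_w, pretrained_b, task_examples)
after = mean_squared_error(tuned_w, tuned_b, task_examples)

print(f"loss before: {before:.3f}, loss after: {after:.6f}")
print(f"parameters moved from (1.0, 0.0) to ({tuned_w:.3f}, {tuned_b:.3f})")
```

The parameters start where earlier training left them and end up shifted toward the new examples, which is the whole idea in miniature.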

The crucial distinction: prompting vs fine-tuning vs RAG

This is where people get confused and waste money. There are three ways to make a model do what you want, and they operate at different levels:

Approach    | What it changes                   | Best for
Prompting   | The input (instructions/examples) | Most tasks; fast, free to iterate
RAG         | The input (retrieved knowledge)   | Giving the model fresh or private facts
Fine-tuning | The model's parameters            | Baking in behaviour: style, format, task patterns

If you find yourself pasting the same long instructions into every prompt, fine-tuning can bake that behaviour in so you stop repeating it. But if you need the model to know new facts, fine-tuning is the wrong tool — see below.
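One way to see that prompting and RAG both work on the input is to notice that each one is just string assembly before the model is called. The sketch below is hypothetical throughout: build_prompt, retrieve, and KNOWLEDGE_BASE are invented names, and the keyword lookup stands in for a real retriever. No model is called, because nothing about the model changes.

```python
# Hypothetical helpers for illustration only; not a real library API.

KNOWLEDGE_BASE = {
    "refund": "Refunds are issued within 14 days of purchase.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}

def retrieve(question, knowledge_base):
    """Naive keyword lookup standing in for a real retriever."""
    q = question.lower()
    return [text for keyword, text in knowledge_base.items() if keyword in q]

def build_prompt(question, instructions="", few_shot=(), retrieved=()):
    """Assemble the text that would be sent to an unchanged model."""
    parts = []
    if instructions:
        parts.append(instructions)
    for example_q, example_a in few_shot:
        parts.append(f"Q: {example_q}\nA: {example_a}")
    if retrieved:
        parts.append("Context:\n" + "\n".join(retrieved))
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

question = "How long does a refund take?"
instructions = "Answer in one short sentence."

# Prompting: instructions and examples change the input.
prompting_only = build_prompt(
    question,
    instructions=instructions,
    few_shot=[("Do you ship abroad?", "Yes, to most countries.")],
)

# RAG: retrieved knowledge changes the input.
with_rag = build_prompt(
    question,
    instructions=instructions,
    retrieved=retrieve(question, KNOWLEDGE_BASE),
)

print(with_rag)
```

Fine-tuning has no equivalent line in this sketch: it would change the model that receives the prompt, not the prompt itself.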

What fine-tuning is good at (and bad at)

Fine-tuning shines for behaviour and form, and disappoints for facts.

Good fits:

  • A consistent tone or voice (your brand, a persona).
  • A specific output format every time (always valid JSON, a fixed report structure).
  • A narrow, repetitive task (classify tickets into 8 categories, extract fields from invoices).
  • A domain style (medical notes, legal drafting) where general phrasing isn't enough.

Poor fits:

  • Teaching new facts. Models are bad at reliably memorising specifics from a small fine-tune, and they'll happily hallucinate around the gaps. Use RAG for knowledge.
  • Fast-changing information. Re-training every time facts change is absurd; retrieve them instead.

LoRA: why fine-tuning got affordable

Fine-tuning every one of a model's billions of parameters is expensive and storage-heavy. Parameter-efficient fine-tuning fixes that — and LoRA (Low-Rank Adaptation) is the most popular flavour.

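The idea: freeze the original model and train only a small set of add-on weights (adapters) that adjust its behaviour. In LoRA, each adapter is a pair of small low-rank matrices whose product is added to one of the frozen weight matrices. You're tuning a tiny fraction of the parameters instead of all of them.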
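The freeze-and-add idea can be sketched for a single weight matrix in plain Python. The layer size (256 by 256) and rank (2) are assumptions picked so the arithmetic is easy to follow, and lora_forward is a hypothetical helper, not a library call. One adapter matrix starts random and the other starts at zero, so the adapted layer initially behaves exactly like the frozen one.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_forward(x, W, A, B, scale=1.0):
    """Compute x W + scale * (x A) B. W stays frozen; only A and B would be trained."""
    base = matmul(x, W)
    update = matmul(matmul(x, A), B)
    return [[b + scale * u for b, u in zip(base_row, update_row)]
            for base_row, update_row in zip(base, update)]

d_in, d_out, rank = 256, 256, 2   # illustrative sizes, not from a real model

random.seed(0)
W = [[random.gauss(0, 0.02) for _ in range(d_out)] for _ in range(d_in)]  # frozen
A = [[random.gauss(0, 0.02) for _ in range(rank)] for _ in range(d_in)]   # trainable
B = [[0.0] * d_out for _ in range(rank)]                                   # trainable, starts at zero

full_params = d_in * d_out            # what a full fine-tune would update
lora_params = rank * (d_in + d_out)   # what the adapter pair adds and trains

x = [[random.gauss(0, 1) for _ in range(d_in)]]   # one input row
unchanged = lora_forward(x, W, A, B) == matmul(x, W)

print(f"full: {full_params} params, LoRA: {lora_params} params "
      f"({100 * lora_params / full_params:.2f}% of full)")
print("output unchanged before training:", unchanged)
```

The trainable count grows with rank times the sum of the two dimensions, while the full count grows with their product, which is why the fraction shrinks further as layers get larger.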

[Figure: share of parameters trained. Full fine-tune: 100%; LoRA: ~1%. LoRA trains a tiny fraction of the parameters (illustrative).]

The payoff is large:

  • Far cheaper and faster to train.
  • Tiny to store: adapters are typically megabytes, while a full copy of the model is gigabytes.
  • Swappable — keep multiple adapters for different tasks and load the one you need.
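The storage and swapping points can be sketched with some back-of-the-envelope arithmetic and a small registry. Every number and name here is an assumption for illustration: a 7-billion-parameter base model, a 20-million-parameter adapter, 16-bit weights, and a hypothetical AdapterRegistry class with made-up file paths. Real adapter sizes depend on the rank and on which layers are adapted.

```python
BYTES_PER_PARAM = 2  # assumes 16-bit weights

def size_mb(param_count):
    """Approximate storage for a given number of parameters, in megabytes."""
    return param_count * BYTES_PER_PARAM / 1_000_000

base_params = 7_000_000_000   # assumed base model size
adapter_params = 20_000_000   # assumed adapter size

class AdapterRegistry:
    """Hypothetical bookkeeping: one shared base model, several named adapters."""

    def __init__(self, base_model_name):
        self.base_model_name = base_model_name
        self.adapters = {}
        self.active = None

    def register(self, task, adapter_path):
        self.adapters[task] = adapter_path

    def activate(self, task):
        if task not in self.adapters:
            raise KeyError(f"no adapter registered for {task!r}")
        self.active = task
        return self.adapters[task]

registry = AdapterRegistry("base-model")
registry.register("support-tickets", "adapters/support-tickets.bin")  # made-up path
registry.register("legal-drafting", "adapters/legal-drafting.bin")    # made-up path
registry.activate("legal-drafting")

print(f"base model:  {size_mb(base_params):,.0f} MB")
print(f"one adapter: {size_mb(adapter_params):,.0f} MB")
print(f"active adapter: {registry.active}")
```

Under these assumptions, a second task costs one more small adapter file rather than a second full copy of the model.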

LoRA is why fine-tuning, once the preserve of big labs, is now within reach of small teams.

Data quality is everything

A fine-tune is only as good as its examples. The most common failure isn't the technique — it's the data.

  • Consistency beats volume. A few hundred clean, consistent examples usually beat thousands of noisy ones. The model will faithfully learn whatever patterns are in your data, including the mistakes.
  • Cover the real distribution. Include the edge cases and formats you actually expect at runtime.
  • Mind contradictions. If two examples answer the same input differently, you're teaching the model to be inconsistent.
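Contradictions and duplicates are cheap to check for before you spend anything on training. The sketch below assumes each example is a dict with "input" and "output" keys, which is an invented format for illustration, and it only catches inputs that match after lowercasing and whitespace cleanup. Paraphrased contradictions would need a fuzzier comparison.

```python
from collections import defaultdict

def normalise(text):
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(text.lower().split())

def find_contradictions(examples):
    """Return each input that appears with more than one distinct output."""
    outputs_by_input = defaultdict(set)
    for example in examples:
        outputs_by_input[normalise(example["input"])].add(normalise(example["output"]))
    return {inp: sorted(outs) for inp, outs in outputs_by_input.items() if len(outs) > 1}

def count_duplicates(examples):
    """Count examples whose (input, output) pair has already been seen."""
    seen = set()
    duplicates = 0
    for example in examples:
        pair = (normalise(example["input"]), normalise(example["output"]))
        if pair in seen:
            duplicates += 1
        seen.add(pair)
    return duplicates

# Made-up support tickets for illustration.
examples = [
    {"input": "My card was charged twice", "output": "billing"},
    {"input": "my card was charged  twice", "output": "account"},
    {"input": "I can't log in", "output": "account"},
    {"input": "I can't log in", "output": "account"},
]

contradictions = find_contradictions(examples)
duplicates = count_duplicates(examples)

for text, labels in contradictions.items():
    print(f"conflicting labels {labels} for input: {text!r}")
print(f"exact duplicates: {duplicates}")
```

Running a check like this on every new batch of examples is a small habit that catches the inconsistency before the model learns it.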

This echoes a theme from the Machine Learning guide: data quality decides everything.

A decision checklist

Before fine-tuning, ask:

  1. Have I exhausted prompting? Better instructions and few-shot examples solve more than people expect.
  2. Is this a knowledge problem? If so, use RAG, not fine-tuning.
  3. Is the behaviour stable and repetitive? Fine-tuning rewards consistency; it's wasted on one-offs.
  4. Do I have clean, representative data? If not, fix that first — it's the ceiling on results.

If you answer "prompting's not enough, it's about behaviour not facts, the task is stable, and my data is clean," then fine-tuning is the right call.
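The checklist reads naturally as a chain of early exits. This is only a sketch of the order of the questions above; the function name, argument names, and the wording of the recommendations are all invented for illustration.

```python
def recommend(prompting_exhausted, is_knowledge_problem, task_is_stable, data_is_clean):
    """Walk the four checklist questions in order and stop at the first blocker."""
    if not prompting_exhausted:
        return "Improve the prompt first"
    if is_knowledge_problem:
        return "Use RAG"
    if not task_is_stable:
        return "Keep prompting; the task is too variable to fine-tune"
    if not data_is_clean:
        return "Fix the data first"
    return "Fine-tune"

print(recommend(prompting_exhausted=False, is_knowledge_problem=False,
                task_is_stable=True, data_is_clean=True))
print(recommend(prompting_exhausted=True, is_knowledge_problem=True,
                task_is_stable=True, data_is_clean=True))
print(recommend(prompting_exhausted=True, is_knowledge_problem=False,
                task_is_stable=True, data_is_clean=True))
```

Only the last call, where every earlier question has been cleared, lands on fine-tuning.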

Recap

  • Fine-tuning continues training a pretrained model on your examples to specialise it.
  • It changes the model's parameters — unlike prompting and RAG, which change the input.
  • It's best for behaviour (tone, format, task patterns) and poor for facts — use RAG for knowledge.
  • LoRA and other parameter-efficient methods make it cheap by tuning a small set of add-on weights.
  • Data quality is the ceiling — clean, consistent, representative examples matter more than volume.
  • Try prompting → RAG → fine-tuning, in that order.

We've trained and specialised the model. Now, what actually happens in the moment you hit "send" and watch words stream back? That's inference. Continue to LLM inference: how a model generates text.
