What 'Learning From Data' Actually Means in Machine Learning

Machine learning isn't magic and it isn't ordinary programming. Instead of writing rules, you show a model examples and it discovers the rules itself. Here's what 'learning from data' really means — in plain English.

"Machine learning" sounds like a machine that thinks. It isn't. Underneath the buzzwords is one shift in how we build software — a shift so simple you can hold the whole idea in your head, yet powerful enough to drive everything from spam filters to self-driving cars to the AI that writes text.

This guide is about that shift, with no math. This first chapter is the foundation: what it actually means for a computer to learn from data. Once you have this, the rest — the different styles of learning, why models fail, how they get built and shipped — follows naturally.

Two ways to make a computer do something

For most of computing history, software worked one way. A person figured out the rules, wrote them down as code, and the computer followed them exactly. Want to convert Celsius to Fahrenheit? You know the rule, you type it in, done.

This is traditional programming: you supply the rules and the data, and the computer produces the answers.

Machine learning flips this around. You supply the data and the answers, and the computer produces the rules.

Traditional programming: Rules + Data → Answers

Machine learning: Data + Answers → Rules

Figure: Traditional programming vs. machine learning, showing what you give and what you get back.

That's the entire idea. In machine learning you don't tell the computer how to solve the problem. You show it lots of examples of the problem already solved, and it figures out the how by itself.
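To make the flip concrete, here is a minimal Python sketch built on the Celsius-to-Fahrenheit example from earlier. The first function is traditional programming: a person wrote the rule. The second recovers the same rule from already-converted temperatures. The function names, the five example pairs, and the choice of a straight-line least-squares fit are assumptions made for illustration, not a method this guide prescribes.

```python
def celsius_to_fahrenheit(celsius):
    """Traditional programming: a human wrote the rule."""
    return celsius * 9 / 5 + 32


def learn_line(inputs, answers):
    """Machine learning in miniature: fit a straight line to examples (least squares)."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(answers) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, answers))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Data + answers: temperatures that someone has already converted.
example_celsius = [-40.0, 0.0, 10.0, 37.0, 100.0]
example_fahrenheit = [-40.0, 32.0, 50.0, 98.6, 212.0]

# The "rules" come out: roughly 1.8 and 32, which nobody typed in.
slope, intercept = learn_line(example_celsius, example_fahrenheit)
print(round(slope, 3), round(intercept, 3))

# Apply the learned rule to a temperature that was not among the examples.
learned_guess = slope * 25.0 + intercept
print(round(learned_guess, 1), celsius_to_fahrenheit(25.0))
```

For a rule this clean you would simply write it by hand. The sketch only shows the direction of travel: examples go in, and a rule comes out.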

Why not just write the rules?

A fair question: if we already know the right answers in our examples, why not just write the rules directly and skip all this?

Because for a huge class of problems, nobody can write the rules. Try it. Write down a complete set of instructions for "is this photo a cat?" You might start with "has pointy ears, has whiskers, has fur." But then: dogs have fur, foxes have pointy ears, a cat curled up has no visible ears, a sphynx cat has no fur, a cat from behind has no whiskers in view. Every rule you add spawns ten exceptions.

Humans recognise a cat instantly without being able to explain the rule. That's exactly the kind of fuzzy, exception-riddled problem machine learning was made for.

Good fit for hand-written rules	Good fit for machine learning
Calculating tax on an order	Recognising a face in a photo
Sorting a list alphabetically	Flagging a fraudulent transaction
Checking if a password is 8+ chars	Deciding if a review is positive
Anything with clear, stable logic	Anything fuzzy, messy, or full of exceptions

The dividing line is this: if you can state the rule cleanly, just code it. If the rule lives in your intuition and resists being written down, that's where learning from examples earns its keep.
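As a rough illustration of that dividing line, the sketch below codes one task from each column of the table by hand. The password check is complete in a single line. The review check is a deliberately naive keyword rule (the word list and function names are invented for this example), and it already misreads a simple negation.

```python
def password_long_enough(password):
    """A rule that can be stated cleanly: 8 or more characters."""
    return len(password) >= 8


# Hypothetical word list, invented for this sketch.
POSITIVE_WORDS = {"good", "great", "love", "excellent"}


def review_is_positive(review):
    """A naive hand-written rule: positive if any positive word appears."""
    words = review.lower().replace(",", " ").replace(".", " ").split()
    return any(word in POSITIVE_WORDS for word in words)


print(password_long_enough("hunter2"))                        # False: only 7 characters
print(review_is_positive("Great phone, love it."))            # True, as intended
print(review_is_positive("Not good. Would not buy again."))   # True, but the review is negative
```

Patching the negation case by hand leads straight to the next exception ("not bad", sarcasm, and so on), which is the pattern the cat example describes.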

What does "learning" actually mean here?

Let's make "learning" concrete, because it's doing a lot of work in the phrase.

Imagine you want a program that flags spam email. You don't write rules. Instead you collect 10,000 emails that people have already marked as "spam" or "not spam." You hand all of them to a learning algorithm.

The algorithm goes through the examples looking for patterns that separate the two groups. Maybe spam emails use the word "free" far more often. Maybe they have lots of links, or all-caps subject lines, or come from addresses no one has seen before. The algorithm doesn't know what any of these things mean — it just notices which patterns reliably go with the "spam" label.

The output of all that pattern-finding is called a model: the learned rulebook that turns a new email into a prediction.

Two words you'll hear constantly:

Training — the process of feeding examples to the algorithm so it builds the model. This is the "learning" part.
Inference (or prediction) — using the finished model on a brand-new input to get an answer. This is the "doing" part.

Training happens rarely and can be expensive (sometimes weeks on huge computers). Inference then runs millions of times and is usually fast and cheap, sometimes cheap enough to run on a phone.
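The two words map onto two functions. The sketch below assumes a deliberately simple learner that just counts how often each word appears under each label. The four emails, the names `train` and `predict`, and the word-counting approach are all invented for illustration; real spam filters use far more data and more careful statistics.

```python
from collections import Counter


def train(labelled_emails):
    """Training: count how often each word appears in spam and in non-spam."""
    model = {"spam": Counter(), "not spam": Counter()}
    for text, label in labelled_emails:
        model[label].update(text.lower().split())
    return model


def predict(model, text):
    """Inference: label a new email by which side its words were seen with more."""
    spam_votes = 0
    ham_votes = 0
    for word in text.lower().split():
        spam_votes += model["spam"][word]      # a Counter returns 0 for unseen words
        ham_votes += model["not spam"][word]
    return "spam" if spam_votes > ham_votes else "not spam"


examples = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

model = train(examples)                            # done once
print(predict(model, "claim your free prize"))     # spam
print(predict(model, "team meeting on monday"))    # not spam
```

Nothing in the code says what "free" means. The model is only a table of counts, and `predict` can be called on as many new emails as you like without training again.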

Memorising is not learning

Here's the trap that separates a useless model from a useful one.

Suppose our spam algorithm got lazy and simply memorised all 10,000 emails — storing each one with its label. Test it on those same 10,000 emails and it scores a perfect 100%. Looks brilliant.

Now a brand-new email arrives, one it never saw. The memoriser has nothing — it only knows the exact emails it stored. It falls apart on the one job that matters.

A real model does something harder and more valuable: it captures the underlying pattern so it can handle inputs it has never seen. That ability — to perform well on new, unseen data — is called generalisation, and it is the entire point of machine learning.

Sees examples → memorises them → fails on anything new

Sees examples → learns the pattern → handles new cases

Figure: The difference that matters is memorising the answers vs. learning the pattern behind them.
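The contrast is easy to reproduce in a few lines. In this sketch the memoriser is a plain lookup table, and the "pattern learner" keeps only the words that appeared in spam and never in legitimate mail. Both learners, their names, and the four training emails are assumptions made for illustration.

```python
def train_memoriser(labelled_emails):
    """Store every email verbatim alongside its label."""
    return dict(labelled_emails)


def memoriser_predict(memory, text):
    """Look the email up; anything that was not stored gets no answer."""
    return memory.get(text, "no idea")


def train_pattern_learner(labelled_emails):
    """Keep only the words that appeared in spam and never in non-spam."""
    spam_words, ham_words = set(), set()
    for text, label in labelled_emails:
        (spam_words if label == "spam" else ham_words).update(text.split())
    return spam_words - ham_words


def pattern_predict(spam_only_words, text):
    """Flag an email if it contains any word seen only in spam."""
    return "spam" if spam_only_words & set(text.split()) else "not spam"


training_emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

memory = train_memoriser(training_emails)
spam_only_words = train_pattern_learner(training_emails)

# On the emails it has already seen, the memoriser scores a perfect 1.0.
memoriser_train_score = sum(
    memoriser_predict(memory, text) == label for text, label in training_emails
) / len(training_emails)
print(memoriser_train_score)

# On an email neither learner has seen, only the pattern learner has an answer.
new_email = "free prize inside"
print(memoriser_predict(memory, new_email))          # no idea
print(pattern_predict(spam_only_words, new_email))   # spam
```

A perfect score on the training emails therefore says nothing about whether a model has generalised; only unseen emails can show that.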

We'll return to this tension in detail in "Training, testing, and why models overfit", because a model that memorises instead of generalising is one of the most common ways machine learning goes wrong.

A model is only as good as its examples

If the computer learns everything from the examples you give it, then the examples are everything. This has a blunt consequence:

A model can only learn what's in its data. It cannot learn what you never showed it, and it will faithfully learn any mistakes you did show it.

Train a "cat detector" only on photos of black cats, and it may decide that "cat" means "black." Show it a ginger cat and it shrugs. Train a hiring model on a company's past decisions, and if those decisions were biased, the model learns the bias — and applies it at scale, with a veneer of objectivity.
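The black-cat failure can be imitated with a toy learner. The sketch below assumes each photo has already been reduced to two hand-labelled features, colour and ear shape, and that the learner picks the single feature test that agrees with the training labels most often. Real image models work on pixels and are far more complex; the features, animals, and function names here are invented for illustration.

```python
def learn_single_rule(examples):
    """Pick the one (feature, value) test that agrees with the labels most often."""
    candidates = sorted(
        {(name, value) for features, _ in examples for name, value in features.items()}
    )

    def agreement(rule):
        name, value = rule
        return sum((features[name] == value) == is_cat for features, is_cat in examples)

    return max(candidates, key=agreement)


def looks_like_cat(rule, features):
    """Apply the learned test to a new animal."""
    name, value = rule
    return features[name] == value


# Hypothetical training set: every cat in it happens to be black.
training_photos = [
    ({"colour": "black", "ears": "pointy"}, True),
    ({"colour": "black", "ears": "pointy"}, True),
    ({"colour": "black", "ears": "pointy"}, True),
    ({"colour": "brown", "ears": "floppy"}, False),  # dog
    ({"colour": "red", "ears": "pointy"}, False),    # fox
    ({"colour": "white", "ears": "pointy"}, False),  # husky
]

rule = learn_single_rule(training_photos)
print(rule)                               # ('colour', 'black')

ginger_cat = {"colour": "ginger", "ears": "pointy"}
print(looks_like_cat(rule, ginger_cat))   # False: the model never saw a non-black cat
```

The learner did nothing wrong by its own lights: on the data it was shown, "black" separates cats from everything else perfectly. The fault lies in the examples.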

This is why practitioners obsess over data. The algorithm is the glamorous part, but the examples usually decide success or failure. We devote a whole chapter to this in "Features, labels, and why data quality is everything".

A tiny end-to-end picture

Let's tie it together with the spam filter, start to finish:

Gather examples — 10,000 emails, each labelled spam or not spam.
Train — hand them to a learning algorithm, which finds the patterns that separate the two and bundles them into a model.
Test — check the model on emails it never saw during training, to see if it actually generalised.
Use (inference) — point the model at your live inbox; for each new email it predicts spam or not.
Watch and refresh — spammers change tactics, so yesterday's patterns go stale; you retrain on fresh examples.
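The five steps above can be sketched in one short script. Eight invented emails stand in for the 10,000, the learner is a simple word counter, and the last two emails are held back for testing; every name and data point here is an assumption made for illustration.

```python
from collections import Counter

# 1. Gather examples, each labelled spam or not spam.
emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free gift today", "spam"),
    ("meeting moved to monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
    ("notes from the team meeting", "not spam"),
    ("free prize click today", "spam"),          # held out for testing
    ("team lunch moved to friday", "not spam"),  # held out for testing
]
training_set, test_set = emails[:6], emails[6:]


def train(labelled_emails):
    """2. Train: count how often each word appears under each label."""
    model = {"spam": Counter(), "not spam": Counter()}
    for text, label in labelled_emails:
        model[label].update(text.split())
    return model


def predict(model, text):
    """4. Use: label an email by which side its words were seen with more."""
    words = text.split()
    spam_votes = sum(model["spam"][word] for word in words)
    ham_votes = sum(model["not spam"][word] for word in words)
    return "spam" if spam_votes > ham_votes else "not spam"


def accuracy(model, labelled_emails):
    """3. Test: fraction of emails labelled correctly."""
    correct = sum(predict(model, text) == label for text, label in labelled_emails)
    return correct / len(labelled_emails)


model = train(training_set)
test_accuracy = accuracy(model, test_set)   # measured only on emails unseen in training
print(test_accuracy)
print(predict(model, "your free gift is waiting"))

# 5. Watch and refresh: when newly labelled emails arrive, retrain with them included.
fresh_examples = [("urgent invoice attached pay now", "spam")]
model = train(training_set + fresh_examples)
print(predict(model, "urgent invoice"))
```

With only two test emails the accuracy figure means very little. The point of the sketch is the shape of the loop, in particular that testing uses emails the model never saw while training.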

Nearly every machine learning project, however fancy, is a version of this loop. The model that powers a self-driving car and the one that recommends your next video both follow it: examples in, patterns out, predictions on new inputs.

Why this matters for everything else

You might have come here from our companion guide on how generative AI works. Here's the connection: a chatbot like ChatGPT or Claude is, at heart, a machine learning model. It was shown an enormous number of examples of text and learned the patterns of language. Everything in this guide — examples, patterns, generalisation, data quality — is the bedrock that generative AI is built on.

Get these foundations and you'll understand not just today's headline AI, but the recommendation engines, fraud detectors, and forecasting tools quietly running underneath modern life.

Recap

Traditional programming: humans write the rules. Machine learning: the computer learns the rules from examples.
Use machine learning for fuzzy problems where the rules resist being written down, such as recognising, predicting, and classifying.
A model is the learned rulebook; training builds it, inference uses it.
The goal is generalisation — performing on new inputs — not memorising the examples.
A model is only as good as its data; biased or thin examples produce a biased, thin model.

Learning from examples is the core. But there's more than one way to learn from data — depending on whether your examples come with answers, come without them, or arrive as feedback from trial and error. That's the map we draw next: The three flavors of machine learning.