Chapter 19 · Intermediate
LLM Inference Explained: How a Model Generates Text
What inference is
Inference is the trained model in action: it takes your prompt and produces a response one token at a time, with each new token chosen based on the prompt plus everything generated so far. Training builds the model; inference runs it.
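To make "one token at a time" concrete, here is a minimal sketch of a generation loop. Everything in it is an illustrative assumption: TOY_MODEL is a hand-written lookup table standing in for a trained model, and next_token and generate are hypothetical helper names, not part of any real library. A real LLM scores the next token from the whole context with a neural network, and often samples rather than always taking the top score.

```python
# Hypothetical stand-in for a trained model: maps the most recent token
# to made-up scores for the next token. A real LLM would compute these
# scores from the entire context, not just the last token.
TOY_MODEL = {
    "<start>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"<end>": 1.0},
}


def next_token(tokens):
    """Pick the highest-scoring next token (greedy choice)."""
    # Unknown tokens fall back to ending the sequence.
    scores = TOY_MODEL.get(tokens[-1], {"<end>": 1.0})
    return max(scores, key=scores.get)


def generate(prompt_tokens, max_new_tokens=10):
    """Extend the prompt one token per step until <end> or the step limit."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        if tok == "<end>":
            break
        # The new token becomes part of the input for the next step.
        tokens.append(tok)
    return tokens


print(generate(["<start>", "the"]))
# prints ['<start>', 'the', 'cat', 'sat']
```

The loop never changes TOY_MODEL. That mirrors the split described above: training produces the model, and inference only reads from it, feeding each new token back in as input for the next step.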