LLM Inference Explained: How a Model Generates Text

Chapter 19·Intermediate

What inference is

Using the trained model to generate text.

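Inference is the model in action: it takes your prompt and produces a response one token at a time, feeding each new token back in as input before predicting the next. Training sets the model's weights; inference runs the model with those weights fixed.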
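To make "one token at a time" concrete, here is a minimal sketch of a generation loop. It is a toy, not a real language model: `TOY_MODEL`, `next_token`, and `generate` are hypothetical names made up for illustration, and the "model" is just a lookup table keyed on the last token. A real LLM would instead compute a probability distribution over its whole vocabulary from the full context.

```python
import random

# Hypothetical stand-in for a trained model: maps the last token to the
# tokens that may follow it. The table never changes during generation,
# which mirrors the idea that inference does not modify the model.
TOY_MODEL = {
    "<start>": ["the"],
    "the": ["cat", "dog"],
    "cat": ["sat"],
    "dog": ["sat"],
    "sat": ["<end>"],
}


def next_token(context, rng):
    """Pick the next token from the tokens so far (toy: uses only the last one)."""
    candidates = TOY_MODEL.get(context[-1], ["<end>"])
    return rng.choice(candidates)


def generate(prompt_tokens, max_new_tokens=10, seed=0):
    """Generation loop: predict one token, append it, repeat."""
    rng = random.Random(seed)
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        token = next_token(tokens, rng)
        if token == "<end>":  # the model signals that the response is complete
            break
        tokens.append(token)  # the new token becomes part of the next input
    return tokens


if __name__ == "__main__":
    print(generate(["<start>", "the"]))
```

The loop stops either when the toy model emits `<end>` or when `max_new_tokens` is reached, which is roughly how real systems bound response length as well.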