Context Window

How much text an AI can "see" at once. The limit that determines whether it can read a tweet or an entire book.

The plain-English definition

The context window is the total amount of text an AI model can hold in its short-term memory during a single conversation. It includes everything: the system prompt, your messages, the model's replies, and any documents you've pasted in. When you exceed it, the AI starts forgetting the earliest parts.
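
The "forgetting the earliest parts" behavior can be sketched as a simple truncation policy: walk the conversation from newest to oldest and keep messages until the token budget runs out. The function names, the toy token counter, and the budget here are all illustrative, not any real chat client's implementation.

```python
# Sketch: keep the most recent messages that fit a token budget,
# dropping the oldest first (hypothetical helper, toy token counts).
def fit_to_window(messages, limit_tokens, count_tokens):
    """Return the newest messages whose combined token count fits the limit."""
    kept, total = [], 0
    for msg in reversed(messages):        # newest first
        cost = count_tokens(msg)
        if total + cost > limit_tokens:
            break                         # everything older is "forgotten"
        kept.append(msg)
        total += cost
    return list(reversed(kept))           # restore chronological order

# Toy counter: one token per whitespace-separated word.
toy_count = lambda m: len(m.split())

history = [
    "system prompt here",
    "user: hi",
    "assistant: hello!",
    "user: summarize this document",
]
print(fit_to_window(history, limit_tokens=8, count_tokens=toy_count))
# → ['user: hi', 'assistant: hello!', 'user: summarize this document']
```

Note that the system prompt is the oldest item, so a naive policy like this would drop it first; real clients typically pin the system prompt and truncate only the conversation history.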

Why it matters

A small context window means the AI can only handle short, focused conversations. A large one means you can paste an entire book and ask questions across it. This is the difference between an AI that's useful for quick chats and one that's useful for serious work with long documents.

2026 context window sizes

Numbers shift quarterly as models update. The relative ordering tends to stay similar.

What "tokens" means

Models count text in tokens — chunks roughly ¾ of a word. So a 100,000-token context fits about 75,000 words. For reference, the Bible runs roughly 750,000 words, or about a million tokens.
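
The ¾-of-a-word rule of thumb converts in both directions. A minimal sketch, using only the ratio stated above (real tokenizers vary by model and language):

```python
# Rule of thumb from the text: 1 token ≈ 3/4 of a word.
WORDS_PER_TOKEN = 0.75

def tokens_to_words(tokens: int) -> int:
    """Approximate words that fit in a given token budget."""
    return round(tokens * WORDS_PER_TOKEN)

def words_to_tokens(words: int) -> int:
    """Approximate tokens needed for a given word count."""
    return round(words / WORDS_PER_TOKEN)

print(tokens_to_words(100_000))   # → 75000
print(words_to_tokens(750_000))   # → 1000000 (a Bible-length text)
```

Treat these numbers as estimates only; code, non-English text, and unusual formatting all tokenize less efficiently than plain English prose.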

The practical limits

Bigger context windows aren't always better. A model may accept 200,000 tokens, but its recall degrades well before that limit: "needle in a haystack" tests show that even the best models miss details buried deep in long inputs.

Practical advice: paste only what's relevant. Don't dump entire databases when a focused excerpt would work better.
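
One way to "paste only what's relevant" is to score each paragraph of a document by keyword overlap with your question and keep only the top few. This is a deliberately naive sketch — the function name and scoring rule are made up for illustration, and real systems use embeddings or retrieval libraries instead:

```python
# Naive relevance filter: rank paragraphs by word overlap with the
# question and keep the top_n best matches (illustrative only).
def relevant_excerpts(document: str, question: str, top_n: int = 3):
    q_words = set(question.lower().split())
    paragraphs = [p for p in document.split("\n\n") if p.strip()]
    scored = sorted(
        paragraphs,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:top_n]

doc = "cats purr when happy\n\ncontext windows hold tokens\n\nthe sky is blue"
print(relevant_excerpts(doc, "how many tokens fit in a context window", top_n=1))
# → ['context windows hold tokens']
```

Even a crude filter like this keeps the prompt focused, which usually beats dumping the whole source into the window.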
