Context length is the number of tokens actually consumed in a single model call, measured against the model's maximum context window. The phrase has two related meanings: sometimes it describes the model's ceiling ('200K context'), sometimes the actual token total of a specific request. API cost, latency, and KV cache size all grow with this number, so long context is never free. From an engineering standpoint, 'how much context can we fit?' is only half the question. The other half is 'does every token earn its place?', because irrelevant context dilutes the model's attention.
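To make the two meanings concrete, here is a minimal Python sketch. It checks one request against a window limit and estimates how KV cache memory grows with token count. The function names and the model shape (32 layers, 8 KV heads, head dimension 128, 16-bit values) are hypothetical and chosen only for illustration. The sketch also assumes that output tokens count against the same window as the prompt, which is common but worth confirming for a given API.

```python
def fits_in_window(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if prompt plus reserved output fits in the window.

    Assumption: input and output tokens share one context window.
    """
    return prompt_tokens + max_output_tokens <= context_window


def kv_cache_bytes(n_tokens: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Estimate KV cache size for one sequence with standard attention.

    Each token stores one key vector and one value vector (hence the
    factor 2) per layer and per KV head.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * n_tokens


if __name__ == "__main__":
    # Hypothetical model shape, for illustration only.
    shape = dict(n_layers=32, n_kv_heads=8, head_dim=128)

    for n_tokens in (8_000, 32_000, 200_000):
        gib = kv_cache_bytes(n_tokens, **shape) / 2**30
        print(f"{n_tokens:>7} tokens -> {gib:.2f} GiB of KV cache")

    # A request close to the ceiling leaves no room for the answer.
    print(fits_in_window(199_000, 4_096, 200_000))
```

Under these assumed numbers, each token costs 128 KiB of cache, so memory grows in a straight line with context length. This is one reason trimming irrelevant tokens pays off even when the request would fit in the window.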
MEVZU · N°124 · ISTANBUL · YEAR I · VOL. III
Glossary · Beginner · 2018
Context Length
The total number of tokens consumed in a single model call, counted against the model's context-window limit.
- EN (English term): Context Length
- TR (Turkish term): Bağlam Uzunluğu