What is the context window?

AI models don’t work with words; instead, they work with chunks of characters called tokens. | Photo Credit: Solen Feyissa/Unsplash

A: In the context of artificial intelligence (AI), specifically large language models (LLMs) like GPT-5 and Claude, the context window is the maximum amount of text the model can see at any one time while generating a response.

AI models don’t work with words; instead, they work with chunks of characters called tokens. Typically, 1 token is roughly equivalent to 0.75 words (in English), so 1,000 tokens correspond to about 750 words. So when a model has a context window of 8,000 tokens, for example, it can handle about 6,000 words of information at once.
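
The rule of thumb above can be turned into a quick estimate. This is only a sketch: the 0.75 figure is the approximate English-language ratio quoted here, the function names are illustrative, and real token counts depend on the model's own tokenizer.

```python
import math

# Rule of thumb from the text: 1 token is roughly 0.75 English words.
# This is an approximation, not an exact property of any tokenizer.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens: int) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def words_to_tokens(words: int) -> int:
    """Estimate how many tokens a given number of English words needs."""
    return math.ceil(words / WORDS_PER_TOKEN)


if __name__ == "__main__":
    print(tokens_to_words(1_000))  # about 750 words
    print(tokens_to_words(8_000))  # about 6,000 words
```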

Each context window needs to hold three things simultaneously: the rules telling the AI how to behave; the history of the current chat; and the space required for the AI to generate its next answer.

If the limit is 8,000 tokens and your conversation history is 7,900 tokens long, the AI has only 100 tokens left. If a conversation exceeds the context window, the model might start dropping the oldest parts of the conversation.
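
The budgeting described above can be sketched in a few lines. The example assumes each message already comes with a token count supplied by the caller, and that the oldest messages are dropped first; the function names and the sample numbers in the usage lines are hypothetical, and real chat systems may summarise old messages instead of deleting them.

```python
from typing import List, Tuple

Message = Tuple[str, int]  # (text, token count), oldest message first


def remaining_tokens(limit: int, used: int) -> int:
    """Tokens still free in the context window (never negative)."""
    return max(limit - used, 0)


def trim_history(messages: List[Message], system_tokens: int,
                 reply_tokens: int, limit: int) -> List[Message]:
    """Drop the oldest messages until rules + history + reply space fit."""
    # Space left for history after reserving the rules and the next answer.
    budget = limit - system_tokens - reply_tokens
    kept = list(messages)
    total = sum(count for _, count in kept)
    while kept and total > budget:
        _, count = kept.pop(0)  # remove the oldest message first
        total -= count
    return kept


if __name__ == "__main__":
    print(remaining_tokens(8_000, 7_900))  # 100 tokens left
    history = [("first", 4_000), ("second", 3_000), ("third", 2_000)]
    print(trim_history(history, system_tokens=500, reply_tokens=1_000, limit=8_000))
```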

The context window is linked to the computational resources available to the model. If you increase the context window size by 2x, the computing power required increases by about 4x. So models with larger windows are much more expensive to run.
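
The 2x-to-4x figure corresponds to a cost that grows with the square of the window length. The sketch below only applies that rule of thumb; the function name is illustrative, and actual costs vary with the model's design and hardware.

```python
def relative_cost(scale: float) -> float:
    """Approximate cost multiplier when the context window grows by `scale`.

    Assumes cost grows with the square of the window length, which is the
    rule of thumb behind the 2x -> 4x figure; real systems can differ.
    """
    return scale ** 2


if __name__ == "__main__":
    for scale in (2, 4, 8):
        print(f"{scale}x window -> about {relative_cost(scale):.0f}x compute")
```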

Sometimes, even if a model can accept a lakh (100,000) tokens, it may struggle to find a piece of information buried in the middle. This is called the ‘lost in the middle’ phenomenon.

Published - January 12, 2026 06:00 am IST