Why AI language models choke on too much text

Dec 20, 2024

Large language models represent text using tokens, each of which covers a few characters. Short words are represented by a single token (like “the” or “it”), whereas longer words may be represented by several tokens (GPT-4o represents “indivisible” with “ind,” “iv,” and “isible”).
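To make the idea of splitting words into tokens concrete, here is a minimal sketch in Python. It is not how GPT-4o's tokenizer works: real tokenizers use a large learned vocabulary, while this toy version uses a tiny hypothetical vocabulary (`TOY_VOCAB`) and a simple greedy longest-match rule, chosen only so that it reproduces the split described above.

```python
# Toy illustration only. Real LLM tokenizers use a large learned vocabulary;
# TOY_VOCAB and the greedy longest-match rule below are assumptions made
# for this example, not GPT-4o's actual algorithm.
TOY_VOCAB = {"the", "it", "ind", "iv", "isible"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Split `word` greedily into the longest vocabulary pieces available."""
    tokens = []
    pos = 0
    while pos < len(word):
        # Try the longest candidate first, shrinking until one is in the vocabulary.
        for end in range(len(word), pos, -1):
            if word[pos:end] in vocab:
                tokens.append(word[pos:end])
                pos = end
                break
        else:
            # No vocabulary piece matched: fall back to a single character.
            tokens.append(word[pos])
            pos += 1
    return tokens


print(toy_tokenize("the"))          # ['the']
print(toy_tokenize("indivisible"))  # ['ind', 'iv', 'isible']
```

A short word that is in the vocabulary comes back as one token, while a longer word is broken into several pieces.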

When OpenAI released ChatGPT two years ago, it had a memory—known as a context window—of just 8,192 tokens. That works out to roughly 6,000 words of text. This meant that if you fed it more than about 15 pages of text, it would “forget” information from the beginning of its context. This limited the size and complexity of tasks ChatGPT could handle.
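The arithmetic behind those figures can be sketched in a few lines of Python. The 8,192-token window comes from the text above; the ratio of 0.75 words per token and the 400 words per page are assumptions picked to match the rough figures of 6,000 words and 15 pages. The `fit_to_window` helper is also hypothetical: it shows one simple way a fixed window can "forget" early text, by keeping only the most recent tokens, and is not a description of how ChatGPT actually handled overflow.

```python
CONTEXT_WINDOW_TOKENS = 8_192  # figure from the article
WORDS_PER_TOKEN = 0.75         # assumption: rough average for English text
WORDS_PER_PAGE = 400           # assumption: ~6,000 words spread over ~15 pages


def approx_words(tokens):
    """Estimate how many words a given number of tokens covers."""
    return tokens * WORDS_PER_TOKEN


def approx_pages(tokens):
    """Estimate how many pages a given number of tokens covers."""
    return approx_words(tokens) / WORDS_PER_PAGE


def fit_to_window(tokens, window=CONTEXT_WINDOW_TOKENS):
    """Keep only the most recent `window` tokens; earlier ones are dropped."""
    return tokens[-window:]


print(approx_words(CONTEXT_WINDOW_TOKENS))  # 6144.0
print(approx_pages(CONTEXT_WINDOW_TOKENS))  # 15.36

# A 10,000-token conversation loses its first 1,808 tokens.
history = list(range(10_000))
kept = fit_to_window(history)
print(len(kept), kept[0])  # 8192 1808
```

Under these assumptions, anything past roughly 15 pages pushes the oldest material out of the window.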

Today’s LLMs are far more capable.
