How AI Sees Your Words

Quick Recap

In the previous lesson we learned the most important idea about Large Language Models.

An LLM does not “know” facts the way humans do. Instead, it generates text by predicting what piece of language should come next.

It repeats this process many times, slowly building a full answer.

But this raises an important question.

If a model is predicting pieces of language, what exactly are those pieces?

To understand that, we need to look at how computers break language into smaller parts.

Why Computers Cannot Read Words

Imagine you show a computer this sentence:

The sky is blue

To you, this sentence is easy to understand. You immediately recognize the words and their meaning.

But a computer does not see words the way humans do. Inside a computer, everything must eventually become numbers. Letters, pictures, music, and videos are all stored as numbers.
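To make the idea above concrete, here is a minimal Python sketch that prints the numbers behind the example sentence. It assumes the common UTF-8 text encoding; the variable names are only for illustration.

```python
# Every character in a string is stored as one or more numbers.
sentence = "The sky is blue"

# Unicode code points: one integer per character.
code_points = [ord(ch) for ch in sentence]

# UTF-8 bytes: the numbers that would be written to memory or disk.
utf8_bytes = list(sentence.encode("utf-8"))

print(code_points)
print(utf8_bytes)
```

For plain English letters and spaces the two lists are identical, because each of these characters fits in a single byte.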

Language is no different.

So before a model can work with text, the sentence must first be broken into smaller pieces that a computer can process.

Breaking Language Into Pieces

Large Language Models do not work with a sentence as one single block of text. Instead, the text is first split into small units.

These units are called tokens.

A token can represent:

a whole word
part of a word
punctuation
or a short fragment of text

For example, the sentence:

Why is the sky blue?

might be split into pieces like this:

Why | is | the | sky | blue | ?

These pieces become the building blocks the model works with.
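The split shown above can be reproduced with a very small sketch. The function name toy_tokenize is hypothetical, and the rule it uses (runs of word characters, or single punctuation marks) is a simplification chosen for illustration. Real LLM tokenizers typically use a vocabulary of word fragments learned from data, so their splits can look different.

```python
import re

def toy_tokenize(text):
    """Split text into runs of word characters and single punctuation marks.

    This is a toy rule for illustration, not how a production tokenizer works.
    """
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("Why is the sky blue?")
print(tokens)  # ['Why', 'is', 'the', 'sky', 'blue', '?']
```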

Why Tokens Matter

Remember from the previous lesson that an LLM generates text by predicting what comes next.

The model is not predicting full sentences.

It is predicting tokens, one piece at a time.

Imagine the model is generating this answer:

The sky appears blue because sunlight scatters in the atmosphere

Internally, the model produces the response step by step (shown here as one word per token for simplicity):

The → sky → appears → blue → because → sunlight → scatters → in → the → atmosphere

Each step is a prediction of the next token.

After one token is generated, the model looks at all the text so far, including the token it just produced, and predicts the next token.

Then the next.

And the next.

This process continues until the response is complete.
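The loop described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the NEXT_TOKEN table stands in for a trained model, the "<end>" marker is a made-up stop signal, and the toy predict_next function looks only at the last token, whereas a real LLM takes all the previous tokens into account.

```python
# Hypothetical lookup table standing in for a trained model.
NEXT_TOKEN = {
    "The": "sky", "sky": "appears", "appears": "blue", "blue": "because",
    "because": "sunlight", "sunlight": "scatters", "scatters": "in",
    "in": "the", "the": "atmosphere", "atmosphere": "<end>",
}

def predict_next(tokens):
    """Toy 'model': picks the next token based only on the last one."""
    return NEXT_TOKEN[tokens[-1]]

def generate(prompt_tokens, max_tokens=20):
    """Append one predicted token at a time until the end marker appears."""
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        next_token = predict_next(tokens)
        if next_token == "<end>":
            break
        tokens.append(next_token)  # the new token becomes part of the input
    return tokens

print(" ".join(generate(["The"])))
# The sky appears blue because sunlight scatters in the atmosphere
```

The important part is the shape of the loop: predict one token, add it to the text, and predict again.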

A Simple Way to Picture It

Think of tokens as small pieces of language, similar to the pieces of a puzzle.

A Large Language Model does not place the whole puzzle at once. Instead, it places one piece at a time, each time choosing a piece that fits the patterns it learned during training.

Because the model has seen enormous amounts of writing, it often chooses pieces that produce natural and understandable sentences.
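As a rough illustration of learning patterns from writing, the sketch below counts which word follows which in a tiny, made-up set of sentences and then picks the most frequent follower. This is only an analogy under stated assumptions: the corpus and the names follow_counts and most_likely_next are invented for this example, and a real LLM learns its patterns with a neural network rather than a table of counts.

```python
from collections import Counter, defaultdict

# A tiny, made-up "training set".
corpus = [
    "the sky is blue",
    "the sky is clear",
    "the sky is blue today",
    "the sea is blue",
]

# For each word, count the words that come right after it.
follow_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        follow_counts[current][following] += 1

def most_likely_next(word):
    """Return the most frequent follower of a word seen in the corpus."""
    return follow_counts[word].most_common(1)[0][0]

print(most_likely_next("is"))   # blue
print(most_likely_next("the"))  # sky
```

With more text to learn from, the counts reflect more of how people actually write, which is why the chosen pieces tend to sound natural.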

The Key Idea

Computers cannot work directly with words. They must break language into smaller pieces first.

These pieces are called tokens.

Large Language Models generate text by predicting tokens one by one and assembling them into sentences.

Understanding tokens is the first step toward understanding how language becomes something a neural network can process.

Summary

Before a model can generate language, text must be split into small units called tokens.

A Large Language Model predicts one token at a time and gradually builds a full response.

In the next lesson, we will see how tokens are converted into numbers and vectors, allowing neural networks to learn patterns in language.

Foundational Intelligence

Phase 1. Core mechanics
