Phase 1. Core mechanics

How Words Become Numbers

Computers cannot learn from text unless it is converted into numbers. In this lesson we explore how tokens are transformed into numerical representations called embeddings, allowing neural networks to learn patterns in language.

Track: Foundations
Lesson: 3 of 4
Duration: 15 min
Level: Beginner

Quick Recap

In the previous lesson we learned that Large Language Models do not read full sentences directly.

Instead, text is broken into small pieces called tokens.

When an LLM generates a response, it predicts one token at a time and gradually builds a sentence.

But this raises another important question.

If tokens are pieces of language, how does a computer actually understand them?

Computers cannot think in words. They can only process numbers.

So before a model can learn from language, those tokens must first be converted into numbers.


Why Computers Need Numbers

Imagine showing a computer the word:

cat

To a human, the meaning is obvious. You picture an animal.

But a computer does not see meaning. It only sees characters stored as data.

For a neural network to learn patterns in language, words must be translated into a numerical form that mathematics can operate on.

This numerical representation is called an embedding.
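
One common way to picture this translation is a lookup table: each token gets an integer ID, and each ID points to a row of numbers. The sketch below is only an illustration. The four-token vocabulary, the row length of 4, and the random starting values are all assumptions made for this example; a real model learns its values during training and uses a far larger vocabulary.

```python
import random

# Hypothetical four-token vocabulary mapping each token to an integer ID.
vocab = {"the": 0, "cat": 1, "sat": 2, "down": 3}
embedding_dim = 4  # kept tiny for readability; an assumption for this sketch

# One row of numbers per token ID. Random values stand in for the values
# a model would adjust during training.
rng = random.Random(0)
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(embedding_dim)]
    for _ in vocab
]

def embed(tokens):
    """Look up the row of numbers (the embedding) for each token."""
    return [embedding_table[vocab[token]] for token in tokens]

sentence = ["the", "cat", "sat"]
vectors = embed(sentence)
for token, vector in zip(sentence, vectors):
    print(token, [round(x, 2) for x in vector])
```

Each token in the sentence comes back as a list of four numbers, which is the form a neural network can do math on.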


From Words to Vectors

An embedding is a list of numbers that represents a piece of language.

For example, a token might be represented like this:

cat → [0.21, -0.44, 0.73, 0.10, ...]

No single number describes the word on its own. Taken together, the numbers place the word inside a mathematical space where similar meanings end up closer together.

For example, words like:

  • cat
  • dog
  • rabbit

will often appear near each other in this space because they are used in similar contexts.

Meanwhile, unrelated words like:

  • airplane
  • database
  • volcano

will appear in very different regions.

This structure allows the model to learn relationships between words.
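
The idea of "closer together" can be measured. The sketch below uses cosine similarity, one common way to compare two vectors, on three hand-picked toy vectors. The numbers are made up for illustration and do not come from any real model; real embeddings are learned and usually have hundreds or thousands of dimensions.

```python
import math

# Hand-picked toy vectors for illustration only (not from a real model).
toy_embeddings = {
    "cat":      [0.9, 0.8, 0.1],
    "dog":      [0.8, 0.9, 0.2],
    "airplane": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
cat_airplane = cosine_similarity(toy_embeddings["cat"], toy_embeddings["airplane"])

print(f"cat vs dog:      {cat_dog:.2f}")
print(f"cat vs airplane: {cat_airplane:.2f}")
```

With these toy values, the score for cat and dog comes out higher than the score for cat and airplane, which matches the intuition that the first pair is more closely related.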


A Simple Way to Picture It

Imagine a large map where every word is represented by a point.

Words that appear in similar contexts are placed closer together.

Words used in very different situations are placed far apart.

This map is called a vector space.

Embeddings are the coordinates that place each word on that map.
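
The map picture can be sketched with two coordinates per word. The points below are invented for this example, and two dimensions is a simplification, since real embedding spaces have many more. The hypothetical helper nearest_word finds the closest other point using straight-line distance.

```python
import math

# Made-up 2D coordinates for illustration; not taken from a real model.
word_map = {
    "cat": (1.0, 1.2),
    "dog": (1.3, 1.0),
    "rabbit": (0.6, 1.7),
    "airplane": (6.0, 0.5),
    "volcano": (0.5, 7.0),
}

def nearest_word(word, points):
    """Return the other word whose point lies closest to `word`."""
    origin = points[word]
    others = (w for w in points if w != word)
    return min(others, key=lambda w: math.dist(origin, points[w]))

print("nearest to cat:", nearest_word("cat", word_map))
print("nearest to rabbit:", nearest_word("rabbit", word_map))
```

On this toy map, the animals sit in one corner and find each other as nearest neighbors, while airplane and volcano sit far away from them.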


Why Embeddings Matter

Once tokens are converted into embeddings, the neural network can begin learning patterns.

Because words with similar meanings sit near each other in the vector space, what the network learns about one word can carry over to its neighbors.

This allows the model to:

  • understand relationships between words
  • recognize patterns in language
  • predict which token should come next

Without embeddings, a neural network would only see arbitrary symbols, with nothing to indicate which ones are related.

Embeddings give language a mathematical structure the model can learn from.


The Key Idea

Computers cannot learn directly from words.

Tokens must first be converted into numerical representations called embeddings.

These embeddings place words inside a mathematical space where relationships between language patterns can be learned.


Summary

Before a model can learn language, tokens must be converted into numbers.

These numerical representations are called embeddings. They allow neural networks to analyze patterns in language and predict which token should come next.

In the next lesson, we will explore how transformer models use attention to decide which parts of a sentence matter most when generating text.
