---
type: glossary
title: "Embedding"
tags: ["glossary", "representation", "input"]
created: 2025-01-01
---
Embedding
Definition: A learned dense vector representation that maps a discrete token (or other categorical input) to a continuous vector space; in Transformers, the input embedding layer converts each token index to a vector of dimension $d_{\text{model}}$ (512 in the base Transformer).
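A minimal sketch of the lookup described above, using NumPy with a hypothetical vocabulary size of 10,000 and the base Transformer's $d_{\text{model}} = 512$ (the random table stands in for learned weights):

```python
import numpy as np

# Hypothetical toy setup: vocab of 10,000 tokens, d_model = 512 (base Transformer).
vocab_size, d_model = 10_000, 512

# The embedding layer is a learned lookup table of shape (vocab_size, d_model);
# random values stand in for trained weights here.
rng = np.random.default_rng(0)
embedding_table = rng.normal(0.0, 0.02, size=(vocab_size, d_model))

# Converting token indices to continuous vectors is a row lookup.
token_ids = np.array([42, 7, 1999])       # a 3-token input sequence
embedded = embedding_table[token_ids]     # shape: (3, 512)
print(embedded.shape)  # (3, 512)
```

In a framework such as PyTorch the same operation is a trainable layer (e.g. `nn.Embedding(vocab_size, d_model)`), but the underlying mechanics are this indexed lookup.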
Used in: Transformer, Positional Encoding, Self-Attention
Do not confuse with: Positional Encoding (which is added to embeddings but encodes position, not token identity) or hidden states (which are the transformed representations produced by encoder/decoder layers, not the raw input embedding lookup).
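The distinction above can be sketched in code: positional encoding is a separate, position-dependent signal added elementwise to the token embeddings, leaving the shape unchanged. This uses the sinusoidal formulation from the original Transformer; the embeddings are random stand-ins:

```python
import numpy as np

d_model, seq_len = 512, 3
rng = np.random.default_rng(0)
embedded = rng.normal(size=(seq_len, d_model))  # stand-in token embeddings

# Sinusoidal positional encoding:
#   PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
#   PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
even = np.arange(0, d_model, 2)[None, :]   # the 2i values, (1, d_model/2)
angles = pos / np.power(10000.0, even / d_model)
pe = np.zeros((seq_len, d_model))
pe[:, 0::2] = np.sin(angles)
pe[:, 1::2] = np.cos(angles)

# Position is injected by addition; it encodes *where* a token sits,
# while the embedding encodes *which* token it is.
x = embedded + pe
print(x.shape)  # (3, 512)
```

Note that `pe` depends only on position, not on token identity, which is exactly why the two must not be conflated.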