🔤 Tokenizer Visualization

See how text is split into tokens for language models

📝 Enter Text

🧩 Tokens

Word start
Continuation
Punctuation
Space/Special
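The four legend categories above can be sketched with a small classifier. This is an illustrative sketch only, not the app's actual logic; it assumes GPT-2's byte-level BPE convention of marking word-initial tokens with a leading "Ġ" (an encoded space), and the function name `classify_token` is hypothetical.

```python
def classify_token(token: str) -> str:
    """Bucket a GPT-2-style token into one of the legend's four categories."""
    # Assumption: word-start tokens carry GPT-2's "Ġ" space marker.
    body = token.lstrip("Ġ")
    if body.strip() == "":
        return "Space/Special"   # pure whitespace or special token
    if not any(c.isalnum() for c in body):
        return "Punctuation"     # no letters or digits at all
    if token.startswith("Ġ"):
        return "Word start"      # begins a new word
    return "Continuation"        # continues the previous word

for tok in ["ĠHello", "iz", "ation", "!", "Ġ"]:
    print(tok, "→", classify_token(tok))
```

Note that GPT-2 does not prefix the very first token of a text with "Ġ", so a real visualizer would treat position 0 as a word start as well.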

🔢 Token IDs

[...]
Characters: 0
Tokens: 0
Chars/Token: 0
Fun fact: GPT-2 and GPT-3 use a vocabulary of ~50,000 tokens; GPT-4's tokenizer uses ~100,000. On average, 1 token ≈ 4 characters or ¾ of a word in English.
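The Characters / Tokens / Chars-per-Token stats can be sketched with a naive word-and-punctuation splitter standing in for a real BPE tokenizer (the helper names `naive_tokenize` and `token_stats` are hypothetical; a real app would call an actual tokenizer library):

```python
import re

def naive_tokenize(text: str) -> list[str]:
    # Stand-in for a real BPE tokenizer: words, single punctuation
    # marks, and individual whitespace characters each become a token.
    return re.findall(r"\w+|[^\w\s]|\s", text)

def token_stats(text: str) -> tuple[int, int, float]:
    tokens = naive_tokenize(text)
    chars = len(text)
    ratio = chars / len(tokens) if tokens else 0.0
    return chars, len(tokens), ratio

print(token_stats("Hello, world!"))  # (13, 5, 2.6)
```

A real BPE tokenizer would merge differently (e.g. splitting rare words into subword pieces), which is what pushes the English average toward ~4 characters per token.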