TokenRate
How Many Tokens in 1,000 Words?

Quick reference guide for converting between words, characters, pages, and AI tokens for any major language model.

TL;DR

1,000 words of English text is approximately 1,333 AI tokens — based on a ratio of ~1.33 tokens per word, or ~4 characters per token. A single page (250 words) is ~333 tokens, a 10-page document (2,500 words) is ~3,333 tokens, and a full novel (80,000 words) is ~106,667 tokens. The ratio varies less than 5% across the major tokenizers used by OpenAI (cl100k_base / o200k_base), Anthropic, and Google.

The Quick Answer

1,000 words of English text ≈ 1,333 tokens.

This follows from the widely used estimate of 4 characters per token and an average English word length of about 5.3 characters once the trailing space is counted: 5.3 / 4 is roughly 1.33 tokens per word. The exact number varies by text type: simple, common words tokenize more efficiently than technical jargon or code.
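
The arithmetic behind that figure can be sketched in a few lines of Python. The constant and function names here are illustrative and not part of any library, and the result is only an estimate for English prose.

```python
# Rule-of-thumb constants taken from the text above (English prose only).
CHARS_PER_TOKEN = 4.0  # ~4 characters per token
CHARS_PER_WORD = 5.3   # average word length, counting the trailing space

# 5.3 / 4 = 1.325, which the guide rounds to ~1.33 tokens per word.
TOKENS_PER_WORD = CHARS_PER_WORD / CHARS_PER_TOKEN


def estimate_tokens_from_words(word_count: int, tokens_per_word: float = 4 / 3) -> int:
    """Estimate the token count for a given number of English words.

    The default ratio of 4/3 (~1.33) is the one that yields the
    1,000 words -> ~1,333 tokens figure quoted in this guide.
    """
    return round(word_count * tokens_per_word)


if __name__ == "__main__":
    for words in (100, 250, 1_000, 80_000):
        print(f"{words:>6,} words -> ~{estimate_tokens_from_words(words):,} tokens")
```

Passing a different `tokens_per_word` is a simple way to model denser text such as code or jargon-heavy writing.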

Word-to-Token Reference Table

Here are common conversions for English prose:

• 100 words → ~133 tokens
• 250 words (1 page) → ~333 tokens
• 500 words → ~667 tokens
• 1,000 words → ~1,333 tokens
• 2,000 words → ~2,667 tokens
• 5,000 words (short story) → ~6,667 tokens
• 10,000 words (long article) → ~13,333 tokens
• 80,000 words (novel) → ~106,667 tokens

Character-to-Token Conversion

For a finer-grained estimate, work from characters: 1 token ≈ 4 characters (including spaces and punctuation).

• 1,000 characters → ~250 tokens
• 10,000 characters → ~2,500 tokens
• 100,000 characters → ~25,000 tokens

This 4:1 ratio holds well for typical English prose. Code, markdown, and other structured text can deviate from it, because symbols, indentation, and uncommon identifiers often split into more tokens per character.
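
As a rough illustration, both rules of thumb can be applied to an actual string and compared. This is a minimal sketch: the function names are hypothetical, and neither result is an exact tokenizer count.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate tokens from the character count (spaces and punctuation included)."""
    return round(len(text) / chars_per_token)


def estimate_tokens_by_words(text: str, tokens_per_word: float = 4 / 3) -> int:
    """Estimate tokens from the whitespace-separated word count."""
    return round(len(text.split()) * tokens_per_word)


sample = "The quick brown fox jumps over the lazy dog."
print("by characters:", estimate_tokens(sample))
print("by words:     ", estimate_tokens_by_words(sample))
```

For this 44-character, 9-word sample the two methods give 11 and 12 tokens. A small disagreement like that is expected, because the sample's words are shorter than the 5.3-character average.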

Does the Model Affect Token Count?

Yes — each provider uses a different tokenizer, which means the same text may produce a slightly different token count across models:

• OpenAI uses cl100k_base (GPT-4, GPT-3.5) or o200k_base (GPT-4o)
• Anthropic uses a custom tokenizer optimized for Claude
• Google's Gemini uses SentencePiece

In practice, the differences are small (typically <5%) for standard English text. Our 4 chars/token estimate works as a reliable approximation for all major models.

Token Cost for 1,000 Words

How much does it cost to send 1,000 words to different models?

• Claude Opus 4: 1,333 tokens × $15/1M = $0.020
• Claude Sonnet 4: 1,333 tokens × $3/1M = $0.004
• GPT-4o: 1,333 tokens × $2.50/1M = $0.003
• Gemini 2.0 Flash: 1,333 tokens × $0.10/1M = $0.0001
• Claude Haiku 4: 1,333 tokens × $0.25/1M = $0.0003
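
The same calculation can be scripted. This sketch copies the per-million prices from the list above as a snapshot. Provider pricing changes over time, so the figures are assumptions to check against current rate cards, and the helper name is illustrative.

```python
# Prices in USD per 1M input tokens, copied from the list above.
# These are a snapshot and may be out of date; check provider pricing pages.
PRICE_PER_MILLION_USD = {
    "Claude Opus 4": 15.00,
    "Claude Sonnet 4": 3.00,
    "GPT-4o": 2.50,
    "Gemini 2.0 Flash": 0.10,
    "Claude Haiku 4": 0.25,
}


def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of sending `tokens` tokens at a given per-million price."""
    return tokens * price_per_million / 1_000_000


TOKENS_IN_1000_WORDS = 1_333  # estimate used throughout this guide

for model, price in PRICE_PER_MILLION_USD.items():
    cost = input_cost_usd(TOKENS_IN_1000_WORDS, price)
    print(f"{model:<18} ${cost:.4f}")
```

This covers only the cost of the text sent to the model. Output tokens are typically billed at a separate rate.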

Use the TokenRate calculator to compare these costs across all models instantly.

Frequently Asked Questions

Is 1,000 words always exactly 1,333 tokens?

No — 1,333 is the average for English prose. Technical writing, code, and non-English text can vary by ±20%. Common short words tokenize more efficiently than rare or compound words.

How do I count tokens exactly for my text?

Use OpenAI's tiktoken library (cl100k_base for GPT-4, o200k_base for GPT-4o), Anthropic's count_tokens endpoint, or Google's Gemini count_tokens method. For ad-hoc estimation, multiply word count by 1.33.
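
A minimal sketch of exact counting with tiktoken, assuming the package is installed (`pip install tiktoken`). The `count_tokens` wrapper and its fallback to the 4-characters-per-token estimate are illustrative choices and not part of tiktoken itself. Counts from an OpenAI encoding apply to OpenAI models only.

```python
try:
    import tiktoken  # third-party package: pip install tiktoken
except ImportError:
    tiktoken = None


def count_tokens(text: str, encoding_name: str = "cl100k_base") -> tuple[int, bool]:
    """Return (token_count, is_exact).

    Uses tiktoken when it is available; otherwise falls back to the
    ~4 characters per token estimate and reports is_exact=False.
    """
    if tiktoken is not None:
        try:
            encoding = tiktoken.get_encoding(encoding_name)
            return len(encoding.encode(text)), True
        except Exception:
            # Encoding files are fetched on first use; if that fails
            # (for example, no network access), use the estimate instead.
            pass
    return round(len(text) / 4), False


count, exact = count_tokens("How many tokens in 1,000 words?")
print(count, "(exact)" if exact else "(estimated)")
```

Pass `encoding_name="o200k_base"` to count with the GPT-4o encoding instead.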

Why do Chinese and Japanese cost more per word?

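Most tokenizers were trained primarily on English text. Non-Latin scripts often need 1–2 tokens per character, versus about 0.25 tokens per character for English, which is 4–8× more tokens per character. Because a single Chinese or Japanese character usually carries more meaning than a single Latin letter, the gap for the same content is smaller than the per-character figure suggests, but the same passage still tends to cost more tokens than its English equivalent.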

Try the TokenRate Calculator

Paste any text and see the exact token count and cost across every major model — Claude, GPT-4o, Gemini, and more.