The Quick Answer
As a rule of thumb, 1,000 English words come to roughly 1,333 tokens, or about 1.33 tokens per word. This is based on the widely used estimate of 4 characters per token and an average English word length of ~5.3 characters including spaces (5.3 ÷ 4 ≈ 1.33). The exact number varies by text type: simple, common words tokenize more efficiently than technical jargon or code.
Word-to-Token Reference Table
• 100 words → ~133 tokens
• 250 words (1 page) → ~333 tokens
• 500 words → ~667 tokens
• 1,000 words → ~1,333 tokens
• 2,000 words → ~2,667 tokens
• 5,000 words (short story) → ~6,667 tokens
• 10,000 words (long article) → ~13,333 tokens
• 80,000 words (novel) → ~106,667 tokens
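Every row in the table follows from a single ratio. Here is a minimal Python sketch, assuming the 4/3 tokens-per-word rule of thumb used above; the function name `estimate_tokens_from_words` is illustrative and not part of any library.

```python
# Rule-of-thumb ratio behind the table: about 1.33 tokens per English word.
TOKENS_PER_WORD = 4 / 3  # assumption: the article's estimate, not an exact figure


def estimate_tokens_from_words(word_count: int) -> int:
    """Return an approximate token count for a given number of English words."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    return round(word_count * TOKENS_PER_WORD)


if __name__ == "__main__":
    # Reproduce the reference table from the word counts alone.
    for words in (100, 250, 500, 1_000, 2_000, 5_000, 10_000, 80_000):
        tokens = estimate_tokens_from_words(words)
        print(f"{words:>6,} words -> ~{tokens:,} tokens")
```

Because the ratio is an average, treat the result as a planning figure rather than a billing-grade count.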
Character-to-Token Conversion
• 1,000 characters → ~250 tokens
• 10,000 characters → ~2,500 tokens
• 100,000 characters → ~25,000 tokens
This 4:1 ratio holds well for typical English text. Code, markdown, and other structured text often deviate from it, typically yielding somewhat more tokens per character.
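When you have the text itself rather than a word count, the character rule is easier to apply. The sketch below assumes the 4:1 ratio above; `estimate_tokens_from_text` and the sample sentence are hypothetical names chosen for illustration.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: the article's 4:1 estimate for typical English


def estimate_tokens_from_text(text: str) -> int:
    """Approximate the token count of `text` from its character length."""
    # Spaces and punctuation count as characters. Rounding up means any
    # non-empty text is counted as at least one token.
    return math.ceil(len(text) / CHARS_PER_TOKEN)


if __name__ == "__main__":
    sample = "Tokens are the units a language model reads and writes."
    estimate = estimate_tokens_from_text(sample)
    print(f"{len(sample)} characters -> ~{estimate} tokens (estimate)")
```

For code or heavily formatted text, expect the true count to drift from this estimate.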
Does the Model Affect Token Count?
Yes, but only slightly. Each provider uses its own tokenizer:
• OpenAI uses cl100k_base (GPT-4, GPT-3.5) or o200k_base (GPT-4o)
• Anthropic uses a custom tokenizer optimized for Claude
• Google's Gemini uses SentencePiece
In practice, the differences are small (typically <5%) for standard English text, so our 4 chars/token estimate works as a reasonable approximation for all major models. For an exact count, run the text through the tokenizer of the model you plan to use.
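To see how the OpenAI encodings named above compare on your own text, you can count tokens directly. This sketch assumes the optional `tiktoken` package (OpenAI's open-source tokenizer library) is installed. If it is missing or its vocabulary files cannot be loaded, the hypothetical helper `count_tokens` falls back to the 4 chars/token estimate. It covers only the two OpenAI encodings; other providers' counts need their own tooling.

```python
import math


def count_tokens(text: str, encoding_name: str = "cl100k_base") -> tuple[int, str]:
    """Return (token_count, method) for `text`.

    `method` is the encoding name when tiktoken was used, or "estimate"
    when the 4-characters-per-token fallback was applied.
    """
    try:
        import tiktoken  # optional dependency: pip install tiktoken

        encoding = tiktoken.get_encoding(encoding_name)
        return len(encoding.encode(text)), encoding_name
    except Exception:
        # tiktoken is not installed, the encoding is unknown to this version,
        # or its vocabulary could not be loaded (for example, when offline).
        return math.ceil(len(text) / 4), "estimate"


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog."
    for name in ("cl100k_base", "o200k_base"):
        count, method = count_tokens(sample, name)
        print(f"{name}: {count} tokens (method: {method})")
```

Comparing the two lines of output on a representative sample of your own text is a quick way to check how far the encodings diverge for your use case.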
Token Cost for 1,000 Words
• Claude Opus 4: 1,333 tokens × $15/1M = $0.020
• Claude Sonnet 4: 1,333 tokens × $3/1M = $0.004
• GPT-4o: 1,333 tokens × $2.50/1M = $0.003
• Gemini 2.0 Flash: 1,333 tokens × $0.10/1M = $0.0001
• Claude Haiku 4: 1,333 tokens × $0.25/1M = $0.0003
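The arithmetic in this list is the token count multiplied by the per-million price. The sketch below copies the prices from the list as-is. They are assumptions that may be out of date, so check each provider's pricing page before relying on them. The names `PRICE_PER_MILLION_USD` and `estimate_cost_usd` are illustrative.

```python
# USD per 1M tokens, copied from the list above (assumed, may be outdated).
PRICE_PER_MILLION_USD = {
    "Claude Opus 4": 15.00,
    "Claude Sonnet 4": 3.00,
    "GPT-4o": 2.50,
    "Gemini 2.0 Flash": 0.10,
    "Claude Haiku 4": 0.25,
}


def estimate_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Return the cost in USD of `tokens` tokens at a given per-1M-token price."""
    return tokens * price_per_million_usd / 1_000_000


if __name__ == "__main__":
    tokens = 1_333  # roughly 1,000 English words under the 4/3 rule of thumb
    for model, price in PRICE_PER_MILLION_USD.items():
        print(f"{model}: ${estimate_cost_usd(tokens, price):.4f}")
```

Printing four decimal places keeps the cheapest models from rounding to zero.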
Use the TokenRate calculator to compare these costs across all models instantly.