TokenRate

Llama 3.1 70B Pricing

Balanced

Meta · 128K tokens context

Llama 3.1 70B from Meta costs $0.400 per 1 million input tokens and $0.400 per 1 million output tokens as of May 2026 (live OpenRouter data). The model supports a 128,000-token context window (approximately 96,000 words) with a 4,096-token maximum output. At this rate, 1,000 input tokens cost $0.0004 and 10,000 input tokens cost $0.0040.
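The per-request arithmetic above can be sketched in a few lines of Python. This is only an illustration using the prices quoted on this page; the function and constant names are hypothetical and are not part of any OpenRouter or TokenRate API.

```python
# Prices quoted on this page, in USD per 1 million tokens (assumed current).
INPUT_PRICE_PER_M = 0.400
OUTPUT_PRICE_PER_M = 0.400


def request_cost(input_tokens: int, output_tokens: int = 0) -> float:
    """Return the USD cost of one request at the rates above."""
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    print(f"{request_cost(1_000):.4f}")   # 0.0004
    print(f"{request_cost(10_000):.4f}")  # 0.0040
```

Because input and output are priced identically here, the total depends only on the combined token count; the two rates are kept separate so the sketch still works if they diverge.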

Llama 3.1 70B pricing and capability summary
Input price: $0.400 / 1M tokens
Output price: $0.400 / 1M tokens
Output / input ratio: 1.0×
Context window: 128,000 tokens (~96,000 words)
Maximum output: 4,096 tokens
Cost per 1K tokens (input): $0.0004
Tier: Balanced

Llama 3.1 70B is the mid-size open-weight model in the 3.1 family — a popular sweet spot for production workloads that need GPT-4o-mini-class quality at open-weight prices.

Live pricing from OpenRouter

Input Price: $0.400 per 1 million tokens
Output Price: $0.400 per 1 million tokens
Context Window: 128K tokens (max 4K output)

Cost Examples

| Request Type | Tokens | Input Cost | Output Cost |
|---|---|---|---|
| 1,000 word article | 1,333 | $0.000533 | $0.00016 |
| 10-page document (2,500 words) | 3,333 | $0.00133 | $0.0004 |
| 1,000 lines of code | 5,000 | $0.002 | $0.0006 |
| 100K token document | 100,000 | $0.04 | $0.012 |

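Output cost is estimated by assuming output tokens equal 30% of the input token count. For the 100K-token document that assumption implies 30,000 output tokens, more than the 4,096-token maximum output of a single request, so that figure would have to be spread over several requests. Use the calculator for exact figures.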
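As a rough check, the table's figures can be reproduced with the sketch below. It assumes the 30% output ratio and the $0.400 rate stated on this page; the names `estimate`, `OUTPUT_RATIO`, and `MAX_OUTPUT_TOKENS` are hypothetical and chosen only for illustration.

```python
PRICE_PER_M = 0.400        # USD per 1M tokens, same rate for input and output
OUTPUT_RATIO = 0.30        # the table's assumption: output = 30% of input
MAX_OUTPUT_TOKENS = 4_096  # per-request output limit from the summary


def estimate(input_tokens: int) -> dict:
    """Estimate input and output cost for one row of the table."""
    output_tokens = input_tokens * OUTPUT_RATIO
    return {
        "input_cost": input_tokens * PRICE_PER_M / 1_000_000,
        "output_cost": output_tokens * PRICE_PER_M / 1_000_000,
        # True when the assumed output would not fit in a single response.
        "exceeds_max_output": output_tokens > MAX_OUTPUT_TOKENS,
    }


rows = [
    ("1,000 word article", 1_333),
    ("10-page document", 3_333),
    ("1,000 lines of code", 5_000),
    ("100K token document", 100_000),
]
for label, tokens in rows:
    row = estimate(tokens)
    print(label, f"{row['input_cost']:.6f}", f"{row['output_cost']:.6f}",
          row["exceeds_max_output"])
```

The `exceeds_max_output` flag is only true for the 100,000-token row, where 30% of the input is well above 4,096 tokens.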

Strengths

  • Affordable hosted pricing under $1/1M
  • Strong general-purpose quality
  • Open weights

Limitations

  • Below frontier on hardest tasks
  • Quality varies by host

Best Use Cases

  • Production chat at scale
  • Self-hosted inference on a single H100 node
  • Fine-tuning base

Calculate Llama 3.1 70B Costs

Use the TokenRate calculator to convert any budget, token count, or text into exact Llama 3.1 70B costs — and compare across all models.
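The conversions described above (text to tokens, budget to tokens) can be approximated offline. The sketch below is not the TokenRate calculator; it assumes the $0.400 rate and the roughly 0.75 words-per-token ratio implied by this page's "128,000 tokens (~96,000 words)" figure, and its function names are hypothetical. Real token counts depend on the tokenizer and the text.

```python
PRICE_PER_M = 0.400      # USD per 1M tokens, as quoted on this page
WORDS_PER_TOKEN = 0.75   # implied by 128,000 tokens ~ 96,000 words


def words_to_tokens(words: int) -> int:
    """Approximate token count for a given word count."""
    return round(words / WORDS_PER_TOKEN)


def tokens_for_budget(budget_usd: float) -> int:
    """Approximate number of tokens a USD budget buys at the quoted rate."""
    return round(budget_usd / PRICE_PER_M * 1_000_000)


if __name__ == "__main__":
    print(words_to_tokens(1_000))   # 1333
    print(tokens_for_budget(1.0))   # 2500000
```

At the quoted rate, one dollar covers about 2.5 million tokens, counting input and output together since both are priced the same.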
