TokenRate

Llama 3.1 8B Pricing

Fast

Meta · 128K tokens context

Llama 3.1 8B from Meta costs $0.020 per 1 million input tokens and $0.030 per 1 million output tokens as of May 2026 (live OpenRouter data). The model supports a 128,000-token context window (approximately 96,000 words) with a 4,096-token maximum output. A typical 1,000-token request costs $0.00002 in input charges; a 10,000-token request costs $0.0002.
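The per-request figures above follow from simple arithmetic on the listed prices. The sketch below is a minimal illustration of that calculation; the function and constant names are hypothetical and are not part of any TokenRate or OpenRouter API.

```python
# Prices in USD per 1 million tokens, copied from the figures quoted above.
INPUT_PRICE_PER_M = 0.020
OUTPUT_PRICE_PER_M = 0.030


def request_cost(input_tokens: int, output_tokens: int = 0) -> float:
    """Return the estimated USD cost of a single request (hypothetical helper)."""
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


# Input-only cost of the two request sizes mentioned above.
print(f"${request_cost(1_000):.5f}")   # $0.00002
print(f"${request_cost(10_000):.4f}")  # $0.0002
```

Passing a second argument adds the output charge, for example `request_cost(1_000, 300)` for a 1,000-token prompt with a 300-token reply.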

Llama 3.1 8B pricing and capability summary
Input price: $0.020 / 1M tokens
Output price: $0.030 / 1M tokens
Output / input ratio: 1.5×
Context window: 128,000 tokens (~96,000 words)
Maximum output: 4,096 tokens
Cost per 1K tokens (input): $0.00002
Tier: Fast

Llama 3.1 8B is the smallest open-weight Llama 3.1 model: extremely cheap to host or call, and good enough for classification, extraction, and basic chat.

Live pricing from OpenRouter

Input Price: $0.020 per 1 million tokens
Output Price: $0.030 per 1 million tokens
Context Window: 128K tokens (max 4K output)

Cost Examples

Request Type | Tokens | Input Cost | Output Cost
1,000-word article | 1,333 | $0.0000267 | $0.000012
10-page document (2,500 words) | 3,333 | $0.0000667 | $0.00003
1,000 lines of code | 5,000 | $0.0001 | $0.000045
100K token document | 100,000 | $0.002 | $0.0009

Output cost assumes the response is 30% as long as the input (for example, about 400 output tokens for a 1,333-token input). Use the calculator for exact figures.
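The rows in the table can be approximated with two assumptions taken from this page: roughly 0.75 words per token (implied by 128,000 tokens being about 96,000 words) and an output length of 30% of the input. The sketch below is illustrative only; the helper names are hypothetical, and real token counts depend on the tokenizer and the text.

```python
WORDS_PER_TOKEN = 0.75      # assumption: implied by 128,000 tokens ~ 96,000 words
OUTPUT_RATIO = 0.30         # assumption: output is 30% of the input token count
INPUT_PRICE_PER_M = 0.020   # USD per 1 million input tokens
OUTPUT_PRICE_PER_M = 0.030  # USD per 1 million output tokens


def words_to_tokens(words: int) -> int:
    """Rough token estimate for English prose (hypothetical helper)."""
    return round(words / WORDS_PER_TOKEN)


def estimate_row(input_tokens: int) -> tuple[float, float]:
    """Return (input_cost, output_cost) in USD for one table row."""
    output_tokens = input_tokens * OUTPUT_RATIO
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost, output_cost


rows = {
    "1,000-word article": words_to_tokens(1_000),
    "10-page document (2,500 words)": words_to_tokens(2_500),
    "1,000 lines of code": 5_000,   # token count taken directly from the table
    "100K token document": 100_000,
}

for label, tokens in rows.items():
    input_cost, output_cost = estimate_row(tokens)
    print(f"{label}: {tokens} tokens, input ${input_cost:.7f}, output ${output_cost:.7f}")
```

Changing `OUTPUT_RATIO` is the quickest way to see how chat-style workloads, where replies are often longer than prompts, shift the balance toward output cost.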

Strengths

  • Among the cheapest hosted LLMs
  • Easy to self-host (fits on a single consumer GPU)
  • 128K context

Limitations

  • Limited reasoning ability compared with larger models
  • Not suited to complex generation tasks

Best Use Cases

  • Edge inference
  • Classification
  • Bulk extraction

Calculate Llama 3.1 8B Costs

Use the TokenRate calculator to convert any budget, token count, or text into exact Llama 3.1 8B costs, and to compare those costs across all models.
