TokenRate

Llama 3.1 405B Pricing

Flagship

Meta · 128K-token context

Llama 3.1 405B from Meta costs $2.70 per 1 million input tokens and $2.70 per 1 million output tokens as of May 2026. The model supports a 128,000-token context window (approximately 96,000 words) with a 4K-token maximum output. A typical 1,000-token request costs $0.0027 in input charges; a 10,000-token request costs $0.0270.
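The per-request figures above follow from a linear per-token rate. Here is a minimal Python sketch of that arithmetic, assuming the listed $2.70 per 1 million tokens in both directions. The function and constant names are illustrative and are not part of any TokenRate or Meta API.

```python
from decimal import Decimal

# Listed reference prices in USD per 1 million tokens (taken from this page).
INPUT_PRICE_PER_M = Decimal("2.70")
OUTPUT_PRICE_PER_M = Decimal("2.70")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int = 0) -> Decimal:
    """Return the USD cost of one request at the listed per-token rates."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M / TOKENS_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / TOKENS_PER_M
    return input_cost + output_cost


print(request_cost(1_000))    # input-only cost of a 1,000-token request
print(request_cost(10_000))   # input-only cost of a 10,000-token request
```

`Decimal` is used instead of `float` so that small per-request amounts do not pick up binary rounding noise when summed over many requests.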

Llama 3.1 405B pricing and capability summary
Input price: $2.70 / 1M tokens
Output price: $2.70 / 1M tokens
Output / input ratio: 1.0×
Context window: 128,000 tokens (~96,000 words)
Maximum output: 4,096 tokens
Cost per 1K tokens (input): $0.0027
Tier: Flagship
Last verified

Llama 3.1 405B is Meta's largest open-weight model. It is competitive with GPT-4-class models on many benchmarks and, unlike most models in that class, can be self-hosted. Symmetric input/output pricing is common across hosted providers.

Reference pricing · updated 2026-05-22

Input Price: $2.70 per 1 million tokens

Output Price: $2.70 per 1 million tokens

Context Window: 128K tokens (max 4K output)

Cost Examples

| Request Type | Tokens | Input Cost | Output Cost |
|---|---|---|---|
| 1,000 word article | 1,333 | $0.0036 | $0.00108 |
| 10-page document (2,500 words) | 3,333 | $0.009 | $0.0027 |
| 1,000 lines of code | 5,000 | $0.0135 | $0.00405 |
| 100K token document | 100,000 | $0.27 | $0.081 |

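Output cost is estimated by assuming the output contains 30% as many tokens as the input. For the 100K-token row, that estimate (30,000 output tokens) exceeds the 4,096-token maximum output of a single response. Use the calculator for exact figures.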
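The cost examples can be reproduced with a short sketch. It assumes the listed $2.70 per 1M tokens and the 30% output heuristic stated above. `estimate_costs` is a hypothetical helper name, and the token counts are copied from the table rather than computed by a tokenizer.

```python
from decimal import Decimal

PRICE_PER_M = Decimal("2.70")   # USD per 1M tokens, same for input and output
OUTPUT_SHARE = Decimal("0.30")  # page heuristic: output tokens = 30% of input tokens


def estimate_costs(input_tokens: int) -> tuple[Decimal, Decimal]:
    """Return (input_cost, estimated_output_cost) in USD."""
    input_cost = Decimal(input_tokens) * PRICE_PER_M / Decimal(1_000_000)
    # Scaling the cost by 30% equals pricing 30% of the tokens,
    # but only because input and output prices are identical.
    output_cost = input_cost * OUTPUT_SHARE
    return input_cost, output_cost


for label, tokens in [
    ("1,000 word article", 1_333),
    ("10-page document", 3_333),
    ("1,000 lines of code", 5_000),
    ("100K token document", 100_000),
]:
    input_cost, output_cost = estimate_costs(tokens)
    print(f"{label}: input ${input_cost:.5f}, output ${output_cost:.5f}")
```

For a model with asymmetric pricing, the output estimate would need its own rate rather than a fraction of the input cost.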

Strengths

  • Open-weight: can be self-hosted
  • Frontier-class quality on many tasks
  • Symmetric in/out pricing simplifies cost modeling

Limitations

  • Slower than smaller Llama variants
  • No native multimodal support
  • 128K context (matches GPT-4o but trails Gemini)

Best Use Cases

  • On-prem and air-gapped deployments
  • Fine-tuning for domain models
  • Avoiding vendor lock-in

Calculate Llama 3.1 405B Costs

Use the TokenRate calculator to convert any budget, token count, or text into exact Llama 3.1 405B costs — and compare across all models.
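The budget-to-tokens direction is the same rate inverted. The sketch below is an illustration under the listed $2.70 per 1M tokens, not the TokenRate calculator's actual implementation. `tokens_for_budget` is a hypothetical name.

```python
from decimal import Decimal

PRICE_PER_M = Decimal("2.70")  # USD per 1M tokens (listed input and output price)


def tokens_for_budget(budget_usd: str) -> int:
    """Return how many whole tokens a USD budget buys at the listed rate."""
    tokens = Decimal(budget_usd) * Decimal(1_000_000) / PRICE_PER_M
    return int(tokens)  # truncate: a fraction of a token cannot be bought


print(tokens_for_budget("1.00"))
print(tokens_for_budget("27.00"))
```

Because input and output are priced identically, the result is a combined token allowance and does not depend on how it is split between prompt and completion.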
