TokenRate

NVIDIA API Pricing

NVIDIA publishes the open Nemotron family — models post-trained for reasoning, agentic workflows, and tool use, including Llama-Nemotron variants. They're designed to run efficiently on NVIDIA hardware and are freely available for self-hosting.

Official site: build.nvidia.com

Cheapest: Nemotron 3 Nano 30B A3B, $0.050 per 1M input tokens

Flagship: Nemotron 3 Ultra, $0.500 per 1M input tokens

Models: 4 tracked. All tiers, latest pricing.

All NVIDIA Models

Model                              Tier      Input / 1M  Output / 1M  Context
Nemotron 3 Nano 30B A3B            fast      $0.050      $0.200       262K
Nemotron 3 Super                   fast      $0.085      $0.400       1M
Llama 3.3 Nemotron Super 49B V1.5  balanced  $0.400      $0.400       131K
Nemotron 3 Ultra                   flagship  $0.500      $2.20        1M

Calculate NVIDIA API Costs

Use the TokenRate calculator to estimate what NVIDIA models will cost for your workload.
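
As a rough sketch of the arithmetic behind such an estimate (this is not TokenRate's actual calculator), cost is the input token count times the input price plus the output token count times the output price, each divided by one million. The prices are copied from the table above; the function name `estimate_cost` and the example workload of 10M input and 2M output tokens are illustrative assumptions.

```python
# USD per 1M tokens as (input, output), copied from the pricing table above.
PRICES = {
    "Nemotron 3 Nano 30B A3B": (0.050, 0.200),
    "Nemotron 3 Super": (0.085, 0.400),
    "Llama 3.3 Nemotron Super 49B V1.5": (0.400, 0.400),
    "Nemotron 3 Ultra": (0.500, 2.20),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Assumed example workload: 10M input tokens and 2M output tokens.
    for name in PRICES:
        cost = estimate_cost(name, 10_000_000, 2_000_000)
        print(f"{name}: ${cost:.2f}")
```

For this assumed workload the cheapest model comes to about $0.90 and the flagship to about $9.40. The estimate ignores anything the table does not list, such as caching discounts or rate-limit tiers.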
