
Understanding LLM pricing: input, output, and cached tokens

Pricing Intelligence · 4 min read

How providers charge for tokens and what it means for your bill.

CostLynx Team · Editorial · Source: CostLynx Editorial Analysis

Most frontier models bill by tokens: the units of text the provider’s tokenizer produces from your prompt and completion. Input (prompt) and output (completion) tokens are often priced differently, with output tokens frequently more expensive per token.
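To make that concrete, here is a minimal sketch of the per-request math. The prices are hypothetical placeholders, not any provider’s actual rates; check your model’s price sheet.

```python
# Hypothetical rates for illustration only -- real prices vary by
# provider, model, and region.
PRICE_PER_1M_INPUT = 3.00    # USD per 1M input tokens (assumed)
PRICE_PER_1M_OUTPUT = 15.00  # USD per 1M output tokens (assumed)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request, billed per token."""
    return (
        input_tokens / 1_000_000 * PRICE_PER_1M_INPUT
        + output_tokens / 1_000_000 * PRICE_PER_1M_OUTPUT
    )

# Example: a 2,000-token prompt with an 800-token completion.
print(f"${request_cost(2_000, 800):.4f}")  # $0.0180
```

Note that even though the prompt here is 2.5x longer than the completion, the completion accounts for two thirds of the cost, which is why output-heavy workloads dominate many bills.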

Cached or prompt-cached tokens can reduce cost when the provider reuses a prefix across requests. Not every model exposes caching the same way—check the provider’s docs for your exact model and region.
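The sketch below extends the cost math to a cached prefix, assuming cached input tokens are billed at a flat discount. The 10% multiplier is an assumption for illustration; real discounts differ by provider, and some providers also charge separately for cache writes.

```python
CACHED_INPUT_DISCOUNT = 0.10  # assumed: cached tokens cost 10% of normal

def input_cost(total_input: int, cached: int, price_per_1m: float) -> float:
    """Split input tokens into cached vs. uncached and price each part."""
    uncached = total_input - cached
    return (
        uncached / 1_000_000 * price_per_1m
        + cached / 1_000_000 * price_per_1m * CACHED_INPUT_DISCOUNT
    )

# Example: 10,000 input tokens, 8,000 of them served from cache.
print(f"${input_cost(10_000, 8_000, 3.00):.4f}")  # $0.0084 vs. $0.0300 uncached
```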

What to track in production

For FinOps, break down spend by team, project, and environment, then slice by model, as sketched below. Spikes in output tokens often mean longer answers or chain-of-thought-style completions; spikes in input tokens may mean bloated system prompts or oversized context.
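A sketch of that breakdown, assuming each usage record is tagged with team, project, environment, and model. The field names and records are hypothetical; adapt them to your own logging schema.

```python
from collections import defaultdict

# Hypothetical usage records -- in practice these come from your
# gateway logs or provider usage exports.
records = [
    {"team": "search", "project": "rag", "env": "prod",
     "model": "model-a", "input_tokens": 1_200, "output_tokens": 400},
    {"team": "search", "project": "rag", "env": "prod",
     "model": "model-a", "input_tokens": 900, "output_tokens": 2_500},
    {"team": "support", "project": "bot", "env": "staging",
     "model": "model-b", "input_tokens": 3_000, "output_tokens": 300},
]

# Aggregate token counts per (team, project, env, model) tuple.
totals = defaultdict(lambda: {"input_tokens": 0, "output_tokens": 0})
for r in records:
    key = (r["team"], r["project"], r["env"], r["model"])
    totals[key]["input_tokens"] += r["input_tokens"]
    totals[key]["output_tokens"] += r["output_tokens"]

for key, t in sorted(totals.items()):
    print(key, t)  # spikes in either column point at different root causes
```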