Free LLM Token Cost Calculator

Count your tokens. Compare every model's cost.

Paste any prompt to get its exact token count, then see what it costs on GPT-5, Claude, Gemini, and 280+ models — priced at each provider's cheapest live rate with zero platform markup.

Real tokenizer, in your browser · 280+ models · Free, no signup

Processing high volume?

Enterprise plans include volume discounts, dedicated support, custom SLAs, and extended data retention.

Ready to cut your LLM costs?

Start for free with no platform fees. No credit card required.

How the LLM cost calculator works

Count the exact tokens in your prompt and price it across every major model in three steps, then see how much routing through LLM Gateway saves you.

  1. Paste your prompt or document

    Drop in real text, code, or a JSON payload. A BPE tokenizer runs in your browser to count the exact tokens, the same way the model bills you, with nothing uploaded.

  2. Set your output size and volume

    Choose how long a response you expect and how many requests you send. Or switch to Estimate mode to enter input and output token volumes directly across multiple models.

  3. Compare every model and save

    See your prompt ranked across GPT-5, Claude, Gemini, and 280+ models at each provider's cheapest live rate, then route through LLM Gateway to pay that rate automatically with zero markup.
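The three steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the calculator's actual code: the model names and per-million prices are placeholders, and `ModelPrice`, `monthly_cost`, and `rank_models` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_per_million: float   # USD per 1M input tokens (placeholder value)
    output_per_million: float  # USD per 1M output tokens (placeholder value)


def monthly_cost(price: ModelPrice, input_tokens: int, output_tokens: int, requests: int) -> float:
    """Cost of `requests` identical calls, each with the given token counts."""
    per_request = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000
    return per_request * requests


def rank_models(prices, input_tokens, output_tokens, requests):
    """Return (name, monthly cost) pairs sorted from cheapest to most expensive."""
    costs = [(p.name, monthly_cost(p, input_tokens, output_tokens, requests)) for p in prices]
    return sorted(costs, key=lambda pair: pair[1])


# Placeholder prices, not real provider rates.
PRICES = [
    ModelPrice("model-a", 2.00, 8.00),
    ModelPrice("model-b", 0.50, 1.50),
    ModelPrice("model-c", 3.00, 15.00),
]

# Step 1: 1,200 prompt tokens. Step 2: 400 output tokens, 50,000 requests a month.
ranking = rank_models(PRICES, input_tokens=1200, output_tokens=400, requests=50_000)

# Step 3: print the comparison, cheapest first.
for name, cost in ranking:
    print(f"{name}: ${cost:,.2f} per month")
```

With these placeholder numbers the cheapest model comes to about $60 a month and the most expensive to about $480, which is the kind of spread the comparison is meant to surface.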

Understanding LLM token costs

Every large language model bills by the token, a small chunk of text that the model reads or writes. Roughly speaking, one token is about four characters of English, so 1,000 tokens is around 750 words, but the only accurate way to know is to tokenize the exact text. This calculator does that in your browser with a real BPE tokenizer, so the counts reflect how the model actually bills you rather than a rough character estimate. Providers quote prices per million tokens, and they charge separately for the tokens you send (input) and the tokens the model generates (output).

Output tokens usually cost several times more than input tokens, often four times or more on flagship models, so the ratio between your prompt size and response size has a big impact on your bill. A summarization workload that reads a lot and writes a little costs very differently from a code-generation workload that writes long responses. The calculator above keeps the two separate so your estimate reflects how you actually use each model.
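To make the ratio concrete, here is a small sketch comparing two workloads that move the same total number of tokens. The prices are made-up round numbers (output at four times input), and `request_cost` is a hypothetical helper, so treat the result as an illustration only.

```python
# Hypothetical prices in USD per 1M tokens; output is 4x input here.
INPUT_PRICE = 1.00
OUTPUT_PRICE = 4.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request, billing input and output tokens separately."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


# Both workloads use 10,000 tokens per request, split differently.
summarization = request_cost(input_tokens=9_000, output_tokens=1_000)
code_generation = request_cost(input_tokens=1_000, output_tokens=9_000)

print(f"summarization:   ${summarization:.4f} per request")
print(f"code generation: ${code_generation:.4f} per request")
print(f"ratio: {code_generation / summarization:.2f}x")
```

Under these assumptions the output-heavy workload costs nearly three times as much per request, even though the total token count is identical.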

Prices also vary by provider. A single popular model is often hosted by several providers at different rates, and those rates change as providers compete on price. Instead of locking yourself into one provider, LLM Gateway routes each request to the cheapest available provider for that model through one OpenAI-compatible API, with no platform markup. That is the gap the calculator shows: the official list price versus the lowest live price you would actually pay.
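The gap between list price and lowest live price can be sketched as follows. The provider names and rates are invented for illustration, `request_cost` and `cheapest_provider` are hypothetical helpers, and this is not a description of how LLM Gateway's routing is implemented.

```python
def request_cost(rate, input_tokens: int, output_tokens: int) -> float:
    """`rate` is an (input, output) pair in USD per 1M tokens."""
    input_price, output_price = rate
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_provider(rates: dict, input_tokens: int, output_tokens: int) -> str:
    """Name of the provider with the lowest cost for this particular request."""
    return min(rates, key=lambda name: request_cost(rates[name], input_tokens, output_tokens))


# Hypothetical rates for one model hosted by three providers.
LIST_RATE = (3.00, 12.00)  # the official list price
RATES = {
    "provider-x": (3.00, 12.00),
    "provider-y": (2.40, 9.60),
    "provider-z": (2.70, 12.00),
}

best = cheapest_provider(RATES, input_tokens=2_000, output_tokens=500)
list_cost = request_cost(LIST_RATE, 2_000, 500)
live_cost = request_cost(RATES[best], 2_000, 500)
saving = 1 - live_cost / list_cost

print(f"{best}: ${live_cost:.4f} vs list ${list_cost:.4f} ({saving:.0%} lower)")
```

Because input and output rates can differ independently, the cheapest provider may depend on the shape of the request, which is why the comparison is done per request rather than once per model.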

Use it to budget a new feature, compare GPT-4o against Claude or Gemini before you commit, or build the business case for switching providers. When the numbers look good, you can start for free and keep the same estimate in production.

Frequently asked questions

Everything you need to know about estimating and lowering your LLM token costs.

How do I count the tokens in my prompt?

Paste your text into the calculator and it counts the exact tokens in your browser using a real BPE tokenizer (the GPT-4o / o200k_base encoding), the same kind of tokenizer the models use to bill you. Nothing is uploaded — the counting happens locally. You instantly see the token count alongside characters and words, plus what that text costs on every major model.

How many tokens is 1,000 words or one page of text?

As a rule of thumb, 1,000 English words is roughly 1,300–1,500 tokens, and one token is about four characters, so 1,000 tokens is around 750 words. Code, JSON, and non-English text tokenize less efficiently and use more tokens per word, which is exactly why pasting your real text into the tokenizer gives a far more accurate count than a word-based estimate.
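If you only need a ballpark before pasting real text, the rules of thumb above translate into a couple of one-liners. The function names are hypothetical and the ratios are the rough English-text figures quoted above, so expect code, JSON, and non-English text to come in higher.

```python
def tokens_from_words(word_count: int, low: float = 1.3, high: float = 1.5) -> tuple:
    """Rough (low, high) token range for English prose: 1.3 to 1.5 tokens per word."""
    return round(word_count * low), round(word_count * high)


def tokens_from_chars(char_count: int, chars_per_token: float = 4.0) -> int:
    """Rough token count assuming about four characters per token."""
    return round(char_count / chars_per_token)


def words_from_tokens(token_count: int, words_per_token: float = 0.75) -> int:
    """Rough word count assuming about 0.75 words per token."""
    return round(token_count * words_per_token)


print(tokens_from_words(1_000))   # range for a 1,000-word document
print(tokens_from_chars(4_000))   # estimate for 4,000 characters
print(words_from_tokens(1_000))   # words in roughly 1,000 tokens
```

These are estimates only; tokenizing the actual text remains the way to get a count you can budget against.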

How is the cost of LLM tokens calculated?

Providers bill separately for input tokens (your prompt) and output tokens (the model's response), priced per million tokens. Your total cost is (input tokens × input price) + (output tokens × output price), where each price is the per-token rate, that is, the per-million price divided by 1,000,000. This calculator counts your input tokens exactly, lets you set an expected output length, and runs that math for every model.
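Written out with per-million prices, the calculation looks like this. The symbols are introduced here for illustration: T_in and T_out are the input and output token counts, P_in and P_out are the prices in USD per million tokens, and the numbers in the worked line are placeholders rather than real rates.

```latex
\[
\text{cost} = \frac{T_{\text{in}} \cdot P_{\text{in}} + T_{\text{out}} \cdot P_{\text{out}}}{10^{6}}
\]

% Worked example: 1,200 input tokens at $2.00 per million,
% 400 output tokens at $8.00 per million.
\[
\text{cost} = \frac{1200 \cdot 2.00 + 400 \cdot 8.00}{10^{6}}
            = \frac{2400 + 3200}{10^{6}}
            = 0.0056 \ \text{USD per request}
\]
```

At 50,000 such requests a month, that works out to 50,000 × 0.0056 = 280 USD.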

What is the difference between input and output tokens?

Input tokens are everything you send to the model, including your prompt, system message, and conversation history. Output tokens are what the model generates back. Output tokens almost always cost more than input tokens, which is why the split matters when you estimate spend.

Why do the same model's prices differ between providers?

Popular models are often served by several providers at different rates, and prices change as providers compete. LLM Gateway routes each request to the cheapest available provider for that model, so you pay the lowest live rate without changing any code.

Does LLM Gateway add a markup or platform fee?

No. LLM Gateway passes through provider pricing with zero platform markup, so you pay exactly what the provider charges (and less when a cheaper provider or volume discount is available). You only add a payment method once you start sending real traffic.

How accurate are these cost estimates?

Input token counts come from a real BPE tokenizer running on your exact text, so they closely match what providers measure. Costs use each model's current published per-token prices. The main variables are output length (you estimate it, since it isn't known until the model responds), prompt caching, reasoning tokens on thinking models, and any negotiated rates. Treat the numbers as a tight planning estimate rather than a final invoice.
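To see how those variables can move an estimate, here is a rough sketch. It assumes that reasoning tokens are billed at the output rate and that cached input tokens are billed at a discounted input rate; both are assumptions that vary by provider and model, the 50% discount is a placeholder, and `adjusted_cost` is a hypothetical helper.

```python
def adjusted_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float,          # USD per 1M input tokens
    output_price: float,         # USD per 1M output tokens
    *,
    cached_input_tokens: int = 0,
    cache_discount: float = 0.5,  # assumed: cached input billed at 50% of the input rate
    reasoning_tokens: int = 0,    # assumed: billed at the output rate
) -> float:
    """Per-request cost after accounting for cached input and reasoning tokens."""
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    billable_input = fresh_tokens + cached_input_tokens * (1 - cache_discount)
    billable_output = output_tokens + reasoning_tokens
    return (billable_input * input_price + billable_output * output_price) / 1_000_000


# Placeholder prices: $2.00 input, $8.00 output per 1M tokens.
baseline = adjusted_cost(1_200, 400, 2.00, 8.00)
with_extras = adjusted_cost(
    1_200, 400, 2.00, 8.00, cached_input_tokens=1_000, reasoning_tokens=600
)

print(f"baseline:    ${baseline:.4f}")
print(f"with extras: ${with_extras:.4f}")
```

In this example caching lowers the input side while reasoning tokens more than double the output side, so the two effects can pull the real bill in opposite directions from the planning estimate.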

Do different models count tokens differently?

Yes. Each model family has its own tokenizer, so the same text can produce different counts. This tool standardizes on the GPT-4o (o200k_base) tokenizer, the modern OpenAI standard, because other families like Claude, Gemini, and Llama do not ship a tokenizer that runs in the browser. Its counts land within roughly ±15% of those families, which is close enough for budgeting.
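If you are budgeting for a non-OpenAI model, one way to account for that difference is to carry a band around the count instead of a single number. This sketch simply applies the rough ±15% figure quoted above; the helper names are hypothetical and the price is a placeholder.

```python
def token_band(o200k_tokens: int, tolerance_pct: int = 15) -> tuple:
    """(low, high) token range around an o200k_base count, using integer math."""
    low = (o200k_tokens * (100 - tolerance_pct)) // 100        # rounds down
    high = -((-o200k_tokens * (100 + tolerance_pct)) // 100)   # rounds up
    return low, high


def input_cost_range(o200k_tokens: int, input_price_per_million: float, tolerance_pct: int = 15) -> tuple:
    """(low, high) input cost in USD for one request."""
    low, high = token_band(o200k_tokens, tolerance_pct)
    return (
        low * input_price_per_million / 1_000_000,
        high * input_price_per_million / 1_000_000,
    )


band = token_band(2_000)
cost_low, cost_high = input_cost_range(2_000, input_price_per_million=2.00)

print(f"tokens: {band[0]} to {band[1]}")
print(f"input cost: ${cost_low:.4f} to ${cost_high:.4f} per request")
```

Planning against the upper end of the band is the conservative choice when the target model's tokenizer is not the one used for counting.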

What is the cheapest way to call LLMs like GPT-4o, Claude, and Gemini?

Route through a gateway that compares providers and picks the lowest price per request. Because LLM Gateway supports 280+ models behind one OpenAI-compatible API, you can switch models or providers based on cost without rewriting your integration.

Is the token cost calculator free to use?

Yes, the calculator is completely free and requires no signup. You can compare as many models and token volumes as you like, then create a free LLM Gateway account when you are ready to start sending requests.