Billing Rules and Quota Management

This document explains Real200's billing model, token pricing rules, quota management, and reconciliation processes.

Billing Model

Real200 uses a pay-per-token billing model:

Cost = prompt_tokens × Input Price + completion_tokens × Output Price
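As a rough illustration of the formula above, the per-call cost could be computed as follows. The function name and the per-token prices are hypothetical values chosen for the example, not actual Real200 prices.

```python
from decimal import Decimal


def call_cost(prompt_tokens: int, completion_tokens: int,
              input_price: Decimal, output_price: Decimal) -> Decimal:
    """Cost = prompt_tokens x Input Price + completion_tokens x Output Price."""
    return prompt_tokens * input_price + completion_tokens * output_price


# Hypothetical per-token prices, for illustration only.
INPUT_PRICE = Decimal("0.000002")
OUTPUT_PRICE = Decimal("0.000006")

# A call with 1,200 prompt tokens and 300 completion tokens.
cost = call_cost(1200, 300, INPUT_PRICE, OUTPUT_PRICE)
print(cost)
```

Decimal is used instead of float so that many small per-call amounts can be summed without rounding drift.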

Token Types

| Type | Description |
| --- | --- |
| Prompt Tokens | Tokens in your request to the model (input) |
| Completion Tokens | Tokens in the model's response (output) |
| Total Tokens | Input + Output |

Billing Cycle

  • Each call is billed in real time
  • The cost is deducted and the quota is updated immediately after each call
  • When the quota is insufficient, the API returns a 402 Payment Required error
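A client may want to treat the 402 status differently from other errors, since retrying will not help until the balance is topped up. The sketch below shows one way to detect it; the function name is hypothetical, and only the status code itself comes from the rules above.

```python
from http import HTTPStatus


def is_quota_error(status_code: int) -> bool:
    """Return True when the response status is 402 Payment Required."""
    return status_code == HTTPStatus.PAYMENT_REQUIRED


# Example: decide what to do with a status code returned by a call.
status_code = 402
if is_quota_error(status_code):
    print("Quota is insufficient: top up before sending more requests.")
```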

Pricing Rules

Base Pricing

Real200 pricing is based on official provider prices, adjusted by channel multipliers. Different channels may enjoy different discounts.
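One plausible reading of "adjusted by channel multipliers" is a simple multiplication of the official price by a per-channel factor. The sketch below assumes exactly that; the price and the multiplier are made-up values, and the actual adjustment rule may differ.

```python
from decimal import Decimal


def channel_price(official_price: Decimal, channel_multiplier: Decimal) -> Decimal:
    """Assumed rule: effective price = official price x channel multiplier."""
    return official_price * channel_multiplier


# Hypothetical official price and a hypothetical multiplier of 0.8 (a 20% discount).
official = Decimal("10.00")
discounted = channel_price(official, Decimal("0.8"))
print(discounted)
```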

Smart Routing Pricing

When smart routing is enabled, Real200 automatically selects the most cost-effective available provider without manual comparison.

:::tip Cost Savings

With smart routing enabled, Real200 typically costs 15–30% less than calling the official APIs directly.

:::
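Conceptually, smart routing can be pictured as choosing the cheapest provider among those currently available. The sketch below is a simplified model of that idea, not Real200's actual routing logic: the `ProviderQuote` type, the provider names, and the prices are all hypothetical, and a real router may weigh factors other than price.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ProviderQuote:
    name: str
    price_per_token: Decimal
    available: bool


def cheapest_available(quotes: list[ProviderQuote]) -> Optional[ProviderQuote]:
    """Pick the lowest-priced provider that is currently available, if any."""
    candidates = [q for q in quotes if q.available]
    return min(candidates, key=lambda q: q.price_per_token, default=None)


# Hypothetical quotes for the same model from three providers.
quotes = [
    ProviderQuote("provider-a", Decimal("0.000003"), available=True),
    ProviderQuote("provider-b", Decimal("0.000002"), available=True),
    ProviderQuote("provider-c", Decimal("0.000001"), available=False),
]
choice = cheapest_available(quotes)
print(choice.name if choice else "no provider available")
```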

Quota Management

Account Balance

  • Each account has a total balance (quota)
  • The balance can be increased via Top-up
  • When the balance is insufficient, calls from all API Keys are rejected

Key-Level Quotas

You can set independent quota limits for each API Key:

| Quota Type | Description |
| --- | --- |
| Total Quota | Maximum amount this Key can consume over its lifetime |
| Monthly Quota | Maximum amount this Key can consume per month |
| Daily Quota | Maximum amount this Key can consume per day |
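To make the interaction of the three limits concrete, here is a minimal sketch of a client-side pre-check. It assumes that a call is allowed only if it stays within every limit that is set, and that an unset limit means no restriction. The `KeyQuota` and `KeyUsage` types and all amounts are hypothetical; how Real200 enforces the limits internally is not specified here.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class KeyQuota:
    total_limit: Optional[Decimal] = None    # lifetime cap, None = unlimited
    monthly_limit: Optional[Decimal] = None  # per-month cap
    daily_limit: Optional[Decimal] = None    # per-day cap


@dataclass
class KeyUsage:
    total_spent: Decimal = Decimal("0")
    monthly_spent: Decimal = Decimal("0")
    daily_spent: Decimal = Decimal("0")


def within_quota(quota: KeyQuota, usage: KeyUsage, next_cost: Decimal) -> bool:
    """Return True if spending next_cost keeps the Key within every set limit."""
    checks = [
        (quota.total_limit, usage.total_spent),
        (quota.monthly_limit, usage.monthly_spent),
        (quota.daily_limit, usage.daily_spent),
    ]
    return all(limit is None or spent + next_cost <= limit
               for limit, spent in checks)


quota = KeyQuota(monthly_limit=Decimal("100"), daily_limit=Decimal("5"))
usage = KeyUsage(total_spent=Decimal("40"), monthly_spent=Decimal("40"),
                 daily_spent=Decimal("4.5"))
print(within_quota(quota, usage, Decimal("0.5")))
print(within_quota(quota, usage, Decimal("1")))
```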

Reconciliation and Billing

Call Logs

Real200 records detailed information for each call:

  • Call time
  • API Key used
  • Request/response token counts
  • Actual cost
  • Provider information
  • Latency and status code

Billing Export

On the Logs page in the console, you can:

  • Filter by date range
  • Filter by API Key
  • Filter by model
  • Export as CSV
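An exported CSV can then be aggregated for reconciliation, for example by summing the cost per API Key. The sketch below assumes the export has `api_key` and `cost` columns; these column names and the sample rows are hypothetical, so adjust them to match the headers in your actual export.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def cost_by_key(csv_text: str) -> dict[str, Decimal]:
    """Sum the cost column per API Key from exported log rows."""
    totals: defaultdict[str, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["api_key"]] += Decimal(row["cost"])
    return dict(totals)


# Hypothetical export contents, for illustration only.
SAMPLE = """api_key,model,cost
key-a,model-x,0.0042
key-b,model-x,0.0010
key-a,model-y,0.0008
"""

totals = cost_by_key(SAMPLE)
for key, amount in sorted(totals.items()):
    print(key, amount)
```

For a file on disk, the same function can be fed the file's contents, for example `cost_by_key(open(path, newline="").read())`.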

Common Questions

How do I view real-time costs?

The console homepage shows today's costs, token usage, and call counts. The Logs page shows the detailed cost of each call.

How are streaming responses billed?

Streaming and non-streaming responses are billed identically, based on the final prompt_tokens and completion_tokens counts.

Are failed calls billed?

If a request never reaches the provider (e.g., it is blocked by risk control or routing fails), no token charges apply.