Billing Rules and Quota Management
This document explains Real200's billing model, token pricing rules, quota management, and reconciliation processes.
Billing Model
Real200 uses a pay-per-token billing model:
Cost = prompt_tokens × Input Price + completion_tokens × Output Price
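The formula above can be sketched in a few lines of Python. This is an illustration only: the function name `call_cost` is hypothetical, and the sketch assumes prices are quoted per 1 million tokens, a unit this document does not specify. `Decimal` is used so that small per-call amounts do not pick up floating-point rounding error.

```python
from decimal import Decimal

# Assumption: prices are quoted per 1M tokens (the unit is not stated in this document).
TOKENS_PER_MILLION = Decimal(1_000_000)


def call_cost(prompt_tokens: int, completion_tokens: int,
              input_price_per_m: Decimal, output_price_per_m: Decimal) -> Decimal:
    """Cost = prompt_tokens x Input Price + completion_tokens x Output Price."""
    input_cost = Decimal(prompt_tokens) * input_price_per_m / TOKENS_PER_MILLION
    output_cost = Decimal(completion_tokens) * output_price_per_m / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical prices: 2.00 per 1M input tokens, 8.00 per 1M output tokens.
cost = call_cost(1200, 300, Decimal("2.00"), Decimal("8.00"))
print(cost)
```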
Token Types
| Type | Description |
|---|---|
| Prompt Tokens | Tokens in your request to the model (input) |
| Completion Tokens | Tokens in the model's response (output) |
| Total Tokens | Input + Output |
Billing Cycle
- Calls are billed in real time, per call
- The cost is deducted and the quota updated immediately after each call
- When quota is insufficient, the API returns a `402 Payment Required` error
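Because a 402 response signals exhausted quota rather than a transient fault, client code will usually want to surface it instead of retrying. A minimal sketch, assuming the caller already has the HTTP status code in hand; `QuotaExhaustedError` and `check_billing_status` are hypothetical names, not part of any Real200 SDK:

```python
from http import HTTPStatus


class QuotaExhaustedError(Exception):
    """Raised when the API answers 402 Payment Required (hypothetical helper type)."""


def check_billing_status(status_code: int) -> None:
    """Raise if the response indicates insufficient quota; otherwise do nothing."""
    if status_code == HTTPStatus.PAYMENT_REQUIRED:  # 402
        raise QuotaExhaustedError(
            "Quota is insufficient: top up the account or raise the key's quota."
        )


try:
    check_billing_status(402)
except QuotaExhaustedError as exc:
    # Retrying will not help until the balance or quota changes.
    print(f"Stopping: {exc}")
```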
Pricing Rules
Base Pricing
Real200 pricing is based on official provider prices, adjusted by channel multipliers. Different channels may enjoy different discounts.
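As a rough illustration of the multiplier adjustment, the effective price can be modelled as the official price times the channel multiplier. The function name and the sample numbers below are assumptions for the sketch; actual multipliers are set per channel and are not listed in this document.

```python
from decimal import Decimal


def effective_price(official_price: Decimal, channel_multiplier: Decimal) -> Decimal:
    """Apply a channel multiplier to an official provider price."""
    return official_price * channel_multiplier


# Hypothetical values: an official price of 10.00 on a channel with multiplier 0.85.
price = effective_price(Decimal("10.00"), Decimal("0.85"))
print(price)
```

A multiplier below 1 corresponds to a discount relative to the official price; a multiplier of exactly 1 leaves the price unchanged.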
Smart Routing Pricing
When smart routing is enabled, Real200 automatically selects the most cost-effective available provider without manual comparison.
:::tip Cost Savings
Through smart routing, Real200 typically saves 15–30% compared to using official APIs directly.
:::
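The routing criteria themselves are not documented here. One plausible reading of "most cost-effective available provider" is the available provider with the lowest estimated cost for the request, which the following sketch models. The `Provider` type, its fields, and `pick_cheapest` are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    input_price: Decimal   # price per input token (hypothetical unit)
    output_price: Decimal  # price per output token (hypothetical unit)
    available: bool


def estimated_cost(p: Provider, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the cost of one call on provider p."""
    return prompt_tokens * p.input_price + completion_tokens * p.output_price


def pick_cheapest(providers: List[Provider], prompt_tokens: int,
                  expected_completion_tokens: int) -> Optional[Provider]:
    """Return the cheapest available provider, or None if none is available."""
    candidates = [p for p in providers if p.available]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda p: estimated_cost(p, prompt_tokens, expected_completion_tokens),
    )


providers = [
    Provider("a", Decimal("0.000002"), Decimal("0.000008"), available=True),
    Provider("b", Decimal("0.000001"), Decimal("0.000004"), available=False),
    Provider("c", Decimal("0.0000015"), Decimal("0.000006"), available=True),
]
choice = pick_cheapest(providers, prompt_tokens=1000, expected_completion_tokens=500)
print(choice.name if choice else "no provider available")
```

In this example provider `b` is cheapest on paper but unavailable, so the selection falls to `c`.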
Quota Management
Account Balance
- Each account has a total balance (quota)
- Balance can be increased via Top-up
- When balance is insufficient, all API Key calls will be rejected
Key-Level Quotas
You can set independent quota limits for each API Key:
| Quota Type | Description |
|---|---|
| Total Quota | Maximum amount this Key can consume over its lifetime |
| Monthly Quota | Maximum amount this Key can consume per month |
| Daily Quota | Maximum amount this Key can consume per day |
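The three quota types in the table can be thought of as independent ceilings that a charge must fit under. The sketch below models that check. It assumes that an unset limit means "no limit" and that a charge is rejected when it would push spending past a limit; neither detail is confirmed by this document, and all names are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class KeyQuota:
    # None means "no limit configured" (an assumption of this sketch).
    total_limit: Optional[Decimal] = None
    monthly_limit: Optional[Decimal] = None
    daily_limit: Optional[Decimal] = None


@dataclass
class KeyUsage:
    total_spent: Decimal = Decimal("0")
    month_spent: Decimal = Decimal("0")
    day_spent: Decimal = Decimal("0")


def exceeded_quota(quota: KeyQuota, usage: KeyUsage, charge: Decimal) -> Optional[str]:
    """Return the name of the first quota the charge would exceed, or None."""
    checks = [
        ("total", quota.total_limit, usage.total_spent),
        ("monthly", quota.monthly_limit, usage.month_spent),
        ("daily", quota.daily_limit, usage.day_spent),
    ]
    for name, limit, spent in checks:
        if limit is not None and spent + charge > limit:
            return name
    return None


quota = KeyQuota(total_limit=Decimal("100"), daily_limit=Decimal("5"))
usage = KeyUsage(total_spent=Decimal("40"), month_spent=Decimal("12"), day_spent=Decimal("4.90"))
print(exceeded_quota(quota, usage, Decimal("0.25")))
```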
Reconciliation and Billing
Call Logs
Real200 records detailed information for each call:
- Call time
- API Key used
- Request/response token counts
- Actual cost
- Provider information
- Latency and status code
Billing Export
On the Logs page in the console, you can:
- Filter by date range
- Filter by API Key
- Filter by model
- Export as CSV
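Once exported, the CSV can be reconciled with a short script. The sketch below totals cost per API Key. The column names `api_key` and `cost` are assumptions, since the export schema is not documented here; adjust them to match the header row of the actual file.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal
from typing import Dict


def cost_per_key(csv_text: str) -> Dict[str, Decimal]:
    """Sum the cost column per API Key from exported log text."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)  # Decimal() starts at 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Column names are hypothetical; match them to the real export header.
        totals[row["api_key"]] += Decimal(row["cost"])
    return dict(totals)


# Hypothetical export content for illustration.
sample = (
    "api_key,model,cost\n"
    "key-a,model-x,0.0048\n"
    "key-b,model-x,0.0100\n"
    "key-a,model-y,0.0012\n"
)
totals = cost_per_key(sample)
for key, total in sorted(totals.items()):
    print(key, total)
```

To process a real export, read the file contents with `open(path, newline="", encoding="utf-8").read()` and pass the result to `cost_per_key`.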
Common Questions
How do I view real-time costs?
The console homepage shows today's costs, token usage, and call counts. The Logs page shows the detailed cost of each call.
How are streaming responses billed?
Streaming and non-streaming responses are billed identically, based on the final prompt_tokens + completion_tokens.
Are failed calls billed?
If the request never reached the provider (e.g., blocked by risk control, routing failure), no token charges apply.