Billing and usage

The platform supports multiple billing models: subscriptions, usage-based metering, prepaid credits, and tiered pricing. Usage is tracked so developers can charge correctly and enforce quotas.

Subscriptions and quotas

Users subscribe to a service, or to a product or plan offered by the service. Depending on the product, a subscription can combine an included usage pool, caps, and metered overage. When a request arrives through the gateway, the platform checks entitlement before forwarding it. The gateway adds HMAC-signed context headers such as X-Tollara-Subscription-Status and X-Tollara-Service-Product-ID and, when the caller is a subscriber, X-Tollara-Billing-Model, X-Tollara-Measurement-Type, and X-Tollara-Unit-Label. Header values are part of the signed HMAC material (see Request signing).

The gateway does not send a remaining-quota header. For a numeric pre-flight check (credits, caps, estimated cost), call the core usage estimate endpoint: POST /billing/usage/estimate with a user JWT, or POST /service-keys/estimate-usage with key + secret.
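As an illustration of consuming the context headers named above, the following minimal sketch collects them into a small structure inside your service. The header names come from this page; the GatewayContext class and the read_gateway_context helper are hypothetical names, and the sketch deliberately does not verify the HMAC signature, which should happen first as described in Request signing.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GatewayContext:
    """Hypothetical container for the X-Tollara-* context headers."""
    subscription_status: Optional[str]
    service_product_id: Optional[str]
    # The three fields below are only sent when the caller is a subscriber.
    billing_model: Optional[str] = None
    measurement_type: Optional[str] = None
    unit_label: Optional[str] = None

    @property
    def has_billing_details(self) -> bool:
        # True only when the subscriber-only billing header was present.
        return self.billing_model is not None


def read_gateway_context(headers: Mapping[str, str]) -> GatewayContext:
    """Collect the context headers from an incoming request.

    HTTP header names are case-insensitive, so keys are lowercased first.
    This helper does NOT check the signature; verify it before trusting
    any of these values.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return GatewayContext(
        subscription_status=lowered.get("x-tollara-subscription-status"),
        service_product_id=lowered.get("x-tollara-service-product-id"),
        billing_model=lowered.get("x-tollara-billing-model"),
        measurement_type=lowered.get("x-tollara-measurement-type"),
        unit_label=lowered.get("x-tollara-unit-label"),
    )
```

Because no remaining-quota header exists, a structure like this can only tell you who is calling and under which billing model. Numeric checks still require one of the estimate endpoints.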

Developers can offer promotion coupons on eligible plans so customers receive a discount at checkout.

Usage reporting

For proxied requests (through the gateway), the platform records usage automatically. For non-proxied services, where you call the usage API yourself, report usage via POST .../api/usage/report with a signed body. For async jobs, send progress and completion to the URLs returned in the async response, also with HMAC-signed bodies. See the Usage API and the SDK overview for report/progress/complete semantics.
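To make "signed body" concrete, here is a rough sketch of producing and checking an HMAC over a serialized usage payload. Everything specific in it is an assumption made for illustration: HMAC-SHA256 with a hex digest, compact JSON with sorted keys, and the payload field names. The actual algorithm, signed material, and header names are defined in Request signing and the Usage API, so treat this only as the general shape of the technique.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, Tuple


def build_signed_report(secret: str, payload: Dict[str, Any]) -> Tuple[bytes, str]:
    """Serialize a usage payload and sign the exact bytes that will be sent.

    ASSUMPTIONS (not confirmed by this page): HMAC-SHA256, hex digest,
    and compact JSON with sorted keys as the serialization.
    """
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return body, signature


def verify_signed_body(secret: str, body: bytes, signature: str) -> bool:
    """Recompute the HMAC over the received bytes and compare in constant time."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Whatever the real scheme is, one design point usually holds: send the signed bytes unchanged. Re-serializing the payload after signing can reorder keys or change whitespace, and the signature will then no longer match.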

Spending caps, prepaid checks, and estimates protect users from unexpected charges, but they are operational checks rather than a hard real-time guarantee for every concurrent request. If several requests arrive before earlier usage has been recorded, tollara.ai may mark later usage as unbillable instead of charging the subscriber above their cap.

Developers using non-proxied or SDK-reported usage should keep their own request logs with tollara.ai request ids where available. These logs are useful for support, refunds, and billing reconciliation.
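One simple way to keep such a log is an append-only JSON Lines file with one entry per reported request. The sketch below is only an illustration: the function names and entry fields (request_id, units, note) are hypothetical, and a production service would more likely write to its existing logging or database layer.

```python
import json
import time
from pathlib import Path
from typing import List, Optional


def log_usage_request(log_path: Path, request_id: Optional[str],
                      units: float, note: str = "") -> dict:
    """Append one usage entry; request_id may be None when none was returned."""
    entry = {
        "logged_at": time.time(),  # local timestamp in seconds since the epoch
        "request_id": request_id,
        "units": units,
        "note": note,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def find_by_request_id(log_path: Path, request_id: str) -> List[dict]:
    """Return every logged entry for a request id, in the order written."""
    with log_path.open("r", encoding="utf-8") as f:
        return [e for e in map(json.loads, f) if e["request_id"] == request_id]
```

During a support or refund conversation, looking up entries by request id gives you your own record of the units reported, which you can compare against what the platform billed.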

Billing models

Supported models include subscription (recurring), usage-based (e.g. per request, per token), prepaid credits, and tiered pricing. The exact fields and behavior depend on how the developer configures products; the SDK and usage API support reporting units and optional metadata for the platform to apply the right billing logic.

Tiered metered pricing (graduated vs volume)

For metered subscriptions (billed after usage each period), you can define tiers: thresholds and a price per unit for each band. Stripe supports two ways to apply those tiers:

Graduated: usage is split across thresholds. Units in each band are billed at that band's rate, and the charges add up (stacked). Example: with the first 10 units at $1 and the next 10 at $0.75, 15 units cost 10×$1 + 5×$0.75 = $13.75.
Volume: total usage in the billing period picks one tier, and every unit in that period is charged at that tier's rate. Example: if 15 units fall in the "11–20" band, all 15 units use that band's price. With the rates above, that is 15×$0.75 = $11.25.

When subscribing, users can often set an optional spending cap on metered charges. That cap limits how much they spend on usage; it does not change how tier math works.

In both modes, a tier's threshold is the minimum unit count where that tier's rate starts (inclusive). The next tier's threshold is the upper bound (exclusive, shown as <). The last tier has no upper bound (ends at inf). Tier 1 starts at 0.
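The two modes can be sketched in a few lines of Python. This is an illustration of the tier math as described on this page, not the platform's or Stripe's implementation. The Tier class and function names are hypothetical, and the handling of usage that lands exactly on a threshold in volume mode follows the inclusive-threshold rule stated above, so check it against your actual product configuration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    threshold: int       # unit count where this tier's rate starts (inclusive)
    unit_price: Decimal  # price per unit inside this tier


def _validate(tiers: Sequence[Tier]) -> None:
    if not tiers or tiers[0].threshold != 0:
        raise ValueError("tier 1 must start at 0")
    for prev, nxt in zip(tiers, tiers[1:]):
        if nxt.threshold <= prev.threshold:
            raise ValueError("thresholds must be strictly increasing")


def graduated_cost(tiers: Sequence[Tier], units: int) -> Decimal:
    """Bill the units inside each band at that band's rate and sum the bands."""
    _validate(tiers)
    total = Decimal("0")
    for i, tier in enumerate(tiers):
        if units <= tier.threshold:
            break  # usage never reaches this band
        # Upper bound is the next tier's threshold; the last tier is unbounded.
        upper: Optional[int] = tiers[i + 1].threshold if i + 1 < len(tiers) else None
        band_end = units if upper is None else min(units, upper)
        total += (band_end - tier.threshold) * tier.unit_price
    return total


def volume_cost(tiers: Sequence[Tier], units: int) -> Decimal:
    """Pick the single tier that total usage falls in and bill all units at its rate."""
    _validate(tiers)
    rate = tiers[0].unit_price
    for tier in tiers:
        if units >= tier.threshold:
            rate = tier.unit_price  # highest tier whose threshold is reached
    return units * rate
```

With tiers of $1.00 starting at 0 and $0.75 starting at 10, graduated_cost for 15 units gives 13.75 (10 units at $1.00 plus 5 at $0.75), while volume_cost for the same usage gives 11.25 (all 15 units at $0.75). Decimal is used instead of float so that currency amounts are not affected by binary rounding.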