Fair Use Policy

Last updated: April 20, 2026

KlicForge enforces fair use limits on all pricing plans to ensure reliable service and fair access for all users.

Overview

KlicForge is a shared, multi-tenant platform. Fair use limits exist to prevent accidental over-consumption and ensure reliable service for all users. Every pricing plan is subject to these limits; no plan tier is exempt, although the specific budgets and rate limits vary by plan.

Token Budgets

Your pricing plan includes a daily token budget. Tokens are counted across all agent requests:

  • Input tokens (user messages, system prompts, knowledge base context)
  • Output tokens (agent responses)
  • Tool execution tokens (function calls and results)
  • Vision processing tokens (image analysis, ~1,000 tokens per image)
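
As a rough illustration of how these four categories add up for a single request, here is a minimal Python sketch. The `RequestUsage` class and its field names are hypothetical and are not part of any KlicForge API; the per-image figure is the approximate value quoted above, so actual metering may differ.

```python
from dataclasses import dataclass

# Approximate figure from the list above; real vision usage may vary per image.
VISION_TOKENS_PER_IMAGE = 1_000


@dataclass
class RequestUsage:
    """Hypothetical container for the token categories counted per request."""
    input_tokens: int    # user messages, system prompts, knowledge base context
    output_tokens: int   # agent responses
    tool_tokens: int     # function calls and results
    images: int = 0      # number of images sent for vision analysis

    def total(self) -> int:
        """Estimated tokens this request counts against the daily budget."""
        return (
            self.input_tokens
            + self.output_tokens
            + self.tool_tokens
            + self.images * VISION_TOKENS_PER_IMAGE
        )


usage = RequestUsage(input_tokens=1_200, output_tokens=350, tool_tokens=200, images=2)
print(usage.total())  # 1200 + 350 + 200 + 2 * 1000 = 3750
```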

Daily Token Budgets

Plan         Daily Budget   Burst Allowance        Overage Cost
Sandbox      50,000         250,000 (5-hour)       Soft throttle only
Startup      200,000        1,000,000 (5-hour)     $0.01 per 1,000 tokens
Business     750,000        3,750,000 (5-hour)     $0.005 per 1,000 tokens
Enterprise   Negotiated     Negotiated             Per SLA

Soft Throttling

When you exceed your daily token budget, we apply soft throttling rather than hard denial:

  • Requests are queued with 5–30 second added latency
  • Tool calls may be deprioritized or skipped to conserve tokens
  • Streaming responses are disabled
  • Vision (image analysis) requests are temporarily unavailable

Soft throttling lifts when your daily usage resets at UTC midnight and your token usage is back under the daily budget.
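
To estimate on the client side whether an account is likely to be throttled and how long until the reset, something like the following could be used. This is a sketch only: the function names are hypothetical, the budgets are copied from the Daily Token Budgets table, and the server's own accounting is authoritative.

```python
from datetime import datetime, timedelta, timezone

# Daily budgets from the table above (Enterprise is negotiated, so it is omitted).
DAILY_BUDGETS = {"sandbox": 50_000, "startup": 200_000, "business": 750_000}


def is_throttled(plan: str, tokens_used_today: int) -> bool:
    """True once usage exceeds (not merely reaches) the plan's daily budget."""
    return tokens_used_today > DAILY_BUDGETS[plan]


def seconds_until_reset(now: datetime) -> float:
    """Seconds from `now` (a timezone-aware datetime) until the next UTC midnight."""
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now_utc).total_seconds()


example_time = datetime(2026, 4, 20, 18, 30, tzinfo=timezone.utc)
print(is_throttled("startup", 210_000))    # True
print(seconds_until_reset(example_time))   # 19800.0 (5.5 hours)
```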

Rate Limits

In addition to token budgets, each plan enforces per-minute and per-hour request rate limits to protect upstream providers:

Request Rate Limits

Plan         Per Minute     Per Hour
Sandbox      10 req/min     600 req/hour
Startup      60 req/min     3,600 req/hour
Business     300 req/min    18,000 req/hour
Enterprise   Per SLA        Per SLA

Note: Rate limits apply per agent, not per tenant. Each agent has its own rate limit allowance, so running multiple agents spreads your request load across those allowances.
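
A client can stay under a per-minute limit by pacing its own requests. The sketch below is one possible approach, a sliding-window limiter kept per agent; it is not how KlicForge enforces limits on the server, and the class name and the agent name `support-bot` are made up for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side limiter: allow at most `max_requests` per `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock      # injectable so the limiter can be tested
        self._sent = deque()    # timestamps of requests still inside the window

    def try_acquire(self) -> bool:
        """Record a request and return True if it fits in the current window."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False


# One limiter per agent, since limits apply per agent (Startup: 60 req/min).
limiters = {"support-bot": SlidingWindowLimiter(max_requests=60)}

if limiters["support-bot"].try_acquire():
    pass  # safe to send the request for this agent
else:
    pass  # wait briefly before trying again
```

A per-hour limit could be handled the same way with a second limiter using `window_seconds=3600`.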

Overage Billing

Sandbox tier: No overage charges. Soft throttling applies when budget is exceeded.

Startup & Business tiers: If you exceed your daily token budget on multiple consecutive days, overage charges apply:

  • Startup: $0.01 per 1,000 overage tokens
  • Business: $0.005 per 1,000 overage tokens

Overage charges appear on your next monthly invoice. If you exceed your budget on more than 10 days in a month, we may recommend upgrading your plan.
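
For a sense of scale, the per-day overage arithmetic can be sketched as follows. This makes two assumptions the policy does not state: partial thousands are prorated rather than rounded, and the consecutive-day condition has already been met. The function name is hypothetical, and your invoice is the authoritative figure.

```python
from decimal import Decimal

# Budgets and rates copied from this policy; Sandbox has no overage charges.
DAILY_BUDGETS = {"sandbox": 50_000, "startup": 200_000, "business": 750_000}
OVERAGE_RATE_PER_1K = {
    "sandbox": Decimal("0"),
    "startup": Decimal("0.01"),
    "business": Decimal("0.005"),
}


def daily_overage_cost(plan: str, tokens_used: int) -> Decimal:
    """Estimated overage in USD for one day (assumes prorated billing)."""
    overage_tokens = max(0, tokens_used - DAILY_BUDGETS[plan])
    return OVERAGE_RATE_PER_1K[plan] * Decimal(overage_tokens) / Decimal(1000)


# Startup using 250,000 tokens: 50,000 over budget -> 50 x $0.01 = $0.50
print(daily_overage_cost("startup", 250_000))
# Business using 1,000,000 tokens: 250,000 over budget -> 250 x $0.005 = $1.25
print(daily_overage_cost("business", 1_000_000))
```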

Upstream Provider Limits

KlicForge relies on third-party AI providers (Cloudflare, OpenAI, OpenRouter). These providers enforce their own rate limits. If we hit a provider limit, your requests are automatically queued and retried. You may experience slightly longer latency, but no errors.

Abuse & Suspension

We reserve the right to suspend your account if we detect abuse, including:

  • Deliberate circumvention of fair use limits
  • Resource exhaustion attacks
  • Unauthorized service reselling
  • Malicious use (spam, scraping, phishing)

For suspected abuse, we typically issue a warning before suspension. Suspended accounts may appeal via support@klicforge.ai.

Requesting Higher Limits

If your legitimate use case exceeds fair use budgets, contact our sales team at hello@klicforge.ai. We can approve higher limits or recommend an appropriate plan upgrade.

Best Practices

  • Optimize knowledge retrieval: Limit context to top 3–5 relevant chunks
  • Batch non-urgent requests: Run bulk operations during off-peak hours
  • Disable expensive features: Only enable vision analysis when necessary
  • Use smaller models: Claude Haiku or Llama for simple tasks; Opus/GPT-4 for complex reasoning
  • Pre-plan for peaks: Contact sales before anticipated traffic spikes
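
The first practice above, limiting context to the most relevant chunks, might look like this in client code. It is a minimal sketch assuming your retrieval step already returns `(score, text)` pairs with higher scores meaning more relevant; the function name and sample data are illustrative only.

```python
def top_chunks(scored_chunks, k=5):
    """Keep only the k highest-scoring chunks, best first."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)
    return [text for _score, text in ranked[:k]]


retrieved = [
    (0.91, "Refund policy summary"),
    (0.42, "Office locations"),
    (0.87, "Refund processing times"),
    (0.15, "Company history"),
]

# Sending 2 chunks instead of 4 reduces the input tokens for this request.
print(top_chunks(retrieved, k=2))
# ['Refund policy summary', 'Refund processing times']
```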

FAQ

Q: Can I prepay for higher token budgets?

Yes, Enterprise customers can. Startup/Business customers should contact sales for options.

Q: Do unused tokens roll over to next month?

No. Budgets are daily and reset at UTC midnight, so unused tokens do not carry over to the next day or the next month. Enterprise customers have monthly cushions.

Q: Can I split my budget across multiple agents?

Yes. Your budget is per-tenant. All agents in your workspace share the daily token pool.

Q: What happens if I exceed rate limits?

Requests are automatically queued and retried. You'll see longer latency but no errors.