Early adopter pricing — first 50 customers lock this rate for life.

Pricing

Predictable pricing. No token markup.

You bring your own API keys. We optimize every call.

All plans support the OpenAI-compatible endpoint (/api/v1/chat/completions).

Free

Free

Forever

Try routing with your own keys. Perfect for testing and small projects.

  • Up to 1,000 routed requests / month (enforced)
  • Up to 2 connected providers (enforced)
  • Deterministic routing: lowest_cost, balanced, max_reliability, fastest

Starter

For early-stage AI SaaS teams validating model optimization.

$49/ month
  • Up to 50,000 routed requests / month (enforced)
  • Up to 3 connected providers — OpenAI, Anthropic, Google (enforced)
  • Deterministic routing: lowest_cost, balanced, max_reliability, fastest
  • Streaming + non-streaming endpoints (POST /api/route, /api/route/stream)
  • Cost estimation per request + model comparison (alternatives)
  • Confidence scoring + task_type in response
  • Pre-stream fallback protection
  • Recommendation-only endpoint (POST /api/recommend)
  • Email support

Best for teams spending $1k–$5k/month on LLM APIs.

Most Popular

Growth

For production AI systems requiring reliability and visibility.

$149/ month
  • Up to 200,000 routed requests / month (enforced)
  • Unlimited connected providers (OpenAI, Anthropic, Google, Groq, DeepSeek)
  • Advanced cost controls (max_cost enforcement)
  • force_model override
  • Request logs stored per request (view in Control Center)
  • Latency metrics per request (metrics.latency_ms)
  • Fallback indicator (fallback_used in response)
  • Priority email support

Best for teams spending $5k–$25k/month on LLM APIs.

Scale

For high-volume vertical AI platforms and infrastructure teams.

$399/ month
  • Up to 750,000 routed requests / month (enforced)
  • Higher rate limits
  • 99.9% routing availability SLA
  • Request logs (view in Control Center)
  • Provider performance metrics (roadmap)
  • Dedicated Slack support
  • Early access to optimization updates

Best for teams spending $25k+/month on LLM APIs.

Contact Sales

What unlocks in Growth?

  • Full Control Center dashboard
  • Detailed request logs
  • Provider exposure tracking
  • Strategy enforcement visibility
  • Operational metrics per request

Starter gives you routing.
Growth gives you control.

Enterprise

Custom pricing

  • Unlimited routing volume
  • Dedicated infrastructure
  • Custom provider integrations
  • VPC / private deployment options
  • Custom SLA
  • Dedicated technical support
Contact sales

How Billing Works

  • StepBlend charges a fixed monthly subscription.
  • You provide your own API keys.
  • You are billed directly by OpenAI, Anthropic, Google, etc.
  • StepBlend never marks up compute costs.
  • No per-token billing from us.

Infrastructure pricing. No surprises.

Frequently Asked Questions

Do you charge per token?
No. You are billed directly by model providers using your own API keys.
What counts as a routed request?
Each call to the OpenAI-compatible endpoint (/api/v1/chat/completions) or the native API (/api/route or /api/route/stream) counts as one routed request. Fallback retries count as a single request.
Do I need multiple providers connected?
Full optimization works best with at least two providers connected.
Can I force a specific model?
Yes. Use force_model to override routing.
Is routing deterministic?
Yes. The same input, strategy, and constraints will produce the same model selection.
What happens when I hit my monthly request cap?
The API returns 429 with X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers. Your limit resets at the end of the current month (UTC).

Ready to optimize your AI spend?

Try the Optimizer free. Add your keys and see routing in action.

Try Optimizer