Documentation
Everything you need to start using Waterfall. OpenAI-compatible API -- change two lines and go.
Quick Start
Get up and running in under a minute. Waterfall is a drop-in replacement for the OpenAI API.
Get an API key
Create an account and grab your key from the dashboard.
Point your SDK at Waterfall
Set base_url to https://api.getwaterfall.org/v1 and use your Waterfall API key.
Send requests
We handle routing, fallbacks, and caching automatically. Use model: "auto" for smart routing, or specify a model directly.
API Reference
Base URL: https://api.getwaterfall.org/v1
Authentication: Authorization: Bearer wf-sk-...
Format: OpenAI-compatible JSON

POST /chat/completions
Create a chat completion. Fully compatible with the OpenAI chat completions API. Supports streaming, function calling, and all standard parameters.
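Since the endpoint supports streaming, here is a minimal streaming sketch, assuming Waterfall mirrors the OpenAI SDK's stream=True chunk format (the API key and prompt are placeholders):

```python
# Minimal streaming sketch, assuming OpenAI-compatible stream chunks.
import openai

client = openai.OpenAI(
    base_url="https://api.getwaterfall.org/v1",
    api_key="wf-sk-your-key-here",
)

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,  # yield chunks as tokens are generated
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```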
Models
auto -- Smart routing to the cheapest capable model (default)
deepseek/deepseek-chat -- Specify a model directly
anthropic/claude-sonnet-4 -- Any supported model ID
See the models page for all available models and pricing.
Routing Strategies
Control how Waterfall selects models by passing a routing strategy header or using the model field.
auto (default) -- Free models first, then cheapest capable paid model. Best for most use cases.
free-only -- Only route to free models (Groq, Cerebras, Gemini free tier). Requests fail if no free model is available.
speed-first -- Route to the fastest available provider. Great for real-time applications.
quality-first -- Route to the best model available regardless of cost. Use when accuracy matters most.
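With the Python SDK, a strategy can be attached per request via the SDK's extra_headers passthrough; a sketch (the key is a placeholder):

```python
# Sketch: selecting a routing strategy per request using the OpenAI
# SDK's extra_headers parameter to carry the X-Waterfall-Strategy header.
import openai

client = openai.OpenAI(
    base_url="https://api.getwaterfall.org/v1",
    api_key="wf-sk-your-key-here",
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={"X-Waterfall-Strategy": "free-only"},  # free models only
)
```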
Usage via header:
X-Waterfall-Strategy: free-only

Code Examples
import openai

client = openai.OpenAI(
    base_url="https://api.getwaterfall.org/v1",
    api_key="wf-sk-your-key-here"
)

response = client.chat.completions.create(
    model="auto",  # Smart routing to cheapest capable model
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

print(response.choices[0].message.content)

Install: pip install openai
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.getwaterfall.org/v1",
  apiKey: "wf-sk-your-key-here",
});

const response = await client.chat.completions.create({
  model: "auto", // Smart routing to cheapest capable model
  messages: [{ role: "user", content: "Explain quantum computing" }],
});

console.log(response.choices[0].message.content);

Install: npm install openai
curl https://api.getwaterfall.org/v1/chat/completions \
  -H "Authorization: Bearer wf-sk-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Pricing & Limits
Pricing
Waterfall adds a flat 4% markup on token costs. No credit purchase fees, no hidden charges. Free models remain free.
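As a worked example of the flat 4% markup (the function name is illustrative, not part of any SDK):

```python
# Worked example of the flat 4% markup on token costs.
MARKUP = 0.04

def waterfall_cost(upstream_cost_usd: float) -> float:
    """Total charge for a request given the upstream provider's token cost."""
    return upstream_cost_usd * (1 + MARKUP)

print(waterfall_cost(1.00))  # $1.00 of upstream usage costs $1.04 total
print(waterfall_cost(0.00))  # free models remain free
```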
Rate limits
Rate limits depend on the upstream provider. Waterfall automatically handles rate limit errors by falling through to the next provider in the cascade.
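Conceptually, the fallthrough works like the sketch below (illustrative only, not Waterfall's actual implementation; the exception and provider names are made up):

```python
# Illustrative sketch of rate-limit fallthrough across a provider cascade.
class RateLimited(Exception):
    """Raised by a provider call when it hits a rate limit."""

def cascade(providers, request):
    """Try each provider in order; fall through to the next on a rate limit."""
    for call in providers:
        try:
            return call(request)
        except RateLimited:
            continue  # this provider is rate-limited; try the next one
    raise RuntimeError("all providers exhausted")

# Example: the first provider is rate-limited, the second succeeds.
def limited(req):
    raise RateLimited()

def ok(req):
    return f"handled: {req}"

print(cascade([limited, ok], "Hello!"))  # handled: Hello!
```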
Caching
Identical requests are automatically cached. Cached responses are free and instant. Your dashboard shows your cache hit rate.
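To illustrate what "identical requests" means here, a sketch of a request-body cache (how Waterfall actually derives its cache key is an assumption; this is not its real implementation):

```python
# Illustrative sketch: keying a response cache on the exact request body.
import hashlib
import json

cache = {}

def cache_key(body: dict) -> str:
    # Canonical JSON (sorted keys) so field order doesn't change the key.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def cached_completion(body: dict, call_upstream):
    key = cache_key(body)
    if key not in cache:          # miss: pay for an upstream call
        cache[key] = call_upstream(body)
    return cache[key]             # hit: free and instant

body = {"model": "auto", "messages": [{"role": "user", "content": "Hello!"}]}
calls = []
first = cached_completion(body, lambda b: calls.append(b) or "Hi!")
second = cached_completion(body, lambda b: calls.append(b) or "Hi!")
print(len(calls))  # the identical second request is served from cache
```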
Ready to get started?
Get your API key