Model Prism
Multi-tenant LLM gateway with intelligent routing and cost control. Drop-in replacement for the OpenAI API — for your entire organization.
Everything you need
Model Prism bundles all critical features of a production-ready LLM gateway into a single, easily deployable service.
Intelligent Auto-Routing
Classifier-based routing directs every request to the optimal model based on prompt complexity, context and configured cost tiers. Rule sets and fallback chains give you maximum control.
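The idea can be pictured as a small classifier plus a fallback chain. Here is a minimal Python sketch; the model names, tiers and the toy complexity heuristic are illustrative assumptions, not Model Prism's shipped configuration:

```python
# Illustrative cost tiers; a real deployment would load these from configuration.
TIERS = {
    "cheap":   ["gpt-4o-mini"],
    "premium": ["gpt-4o", "claude-sonnet"],  # first entry preferred, rest are fallbacks
}

def classify(prompt: str) -> str:
    """Toy complexity classifier: long or heavily structured prompts go premium."""
    return "premium" if len(prompt) > 2000 or prompt.count("\n") > 20 else "cheap"

def route(prompt: str, unavailable=None) -> str:
    """Return the first available model in the classified tier, falling back as needed."""
    unavailable = unavailable or set()
    for model in TIERS[classify(prompt)]:
        if model not in unavailable:
            return model
    raise RuntimeError("no model available in tier")
```

A short prompt lands on the cheap tier; a long prompt whose primary premium model is down falls back to the next model in the tier.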
Multi-Tenant & RBAC
Complete tenant isolation: every team and customer gets their own API keys, quotas and permissions. RBAC at tenant and model level, LDAP/SSO integration.
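Conceptually, each request resolves to a tenant record and is checked against that tenant's model whitelist and roles. A sketch of such a check, with made-up tenant data and field names (not Model Prism's internal schema):

```python
# Illustrative tenant records; in a real gateway these live in a store and are
# resolved from the request's API key.
TENANTS = {
    "team-data": {"allowed_models": {"gpt-4o-mini", "gpt-4o"}, "roles": {"admin"}},
    "cust-acme": {"allowed_models": {"gpt-4o-mini"},           "roles": {"user"}},
}

def authorize(tenant_id: str, model: str, action: str = "invoke") -> bool:
    """Model-level RBAC: the tenant must exist and the model must be whitelisted;
    non-invoke (administrative) actions additionally require the admin role."""
    tenant = TENANTS.get(tenant_id)
    if tenant is None or model not in tenant["allowed_models"]:
        return False
    if action != "invoke" and "admin" not in tenant["roles"]:
        return False
    return True
```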
Real-Time Cost Control
Token-accurate cost tracking per tenant, model and time period. Budget alerts, automatic throttling on overage and detailed analytics dashboards.
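Token-accurate tracking boils down to multiplying token counts by per-model prices and comparing the running total against a budget. A sketch with illustrative prices and a toy alert/throttle policy (the thresholds and price table are assumptions for the example):

```python
# Illustrative per-1M-token prices in USD; real prices change and would come from config.
PRICES = {"gpt-4o-mini": {"input": 0.15, "output": 0.60}}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single request in USD, from exact token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

def budget_action(spent_usd: float, budget_usd: float) -> str:
    """Toy policy: alert at 80% of the period budget, throttle at 100%."""
    if spent_usd >= budget_usd:
        return "throttle"
    if spent_usd >= 0.8 * budget_usd:
        return "alert"
    return "ok"
```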
OpenAI-Compatible API
Drop-in replacement for the OpenAI API — no code changes required. Supports Chat Completions, Embeddings and Function Calling. Compatible with every OpenAI SDK.
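Because the request and response bodies match the OpenAI schema, switching a client to the gateway is just a base-URL change. A stdlib-only Python sketch (the localhost URL and tenant key are placeholders for your deployment):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # the gateway; host and port are placeholders

def build_payload(messages, model="auto"):
    """The same request body the OpenAI Chat Completions API expects;
    'auto' asks Model Prism to pick the model."""
    return {"model": model, "messages": messages}

def chat(messages, api_key="your-tenant-key"):
    """POST to the gateway exactly as one would to api.openai.com."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(messages)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# With the official OpenAI SDK, only the base_url changes:
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-tenant-key")
```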
Prompt Logging & Audit Trail
Complete audit log of all LLM requests — prompt, response, model, tokens, cost and timestamp. Exportable for compliance and debugging.
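One entry per request, serialized as JSON Lines, is a common shape for such a log. A sketch of what a single record could look like; the field names are assumptions for illustration, not Model Prism's documented export schema:

```python
import json
import time

def audit_record(tenant: str, model: str, prompt: str, response: str,
                 tokens: int, cost_usd: float) -> str:
    """Serialize one LLM request as a JSON Lines audit entry."""
    return json.dumps({
        "timestamp": int(time.time()),
        "tenant": tenant,
        "model": model,
        "prompt": prompt,
        "response": response,
        "tokens": tokens,
        "cost_usd": cost_usd,
    })
```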
Horizontal Scaling
Stateless architecture for easy horizontal scaling. Kubernetes-ready, health checks, graceful shutdown. From single node to enterprise cluster.
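Because the service is stateless, scaling on Kubernetes is a matter of replica count plus probes. A Deployment excerpt as a sketch; the image tag, port and probe paths are assumptions, not a published manifest:

```yaml
# Illustrative Deployment excerpt; image, port and /healthz path are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-prism
spec:
  replicas: 3                      # stateless, so scale by replica count alone
  selector:
    matchLabels: {app: model-prism}
  template:
    metadata:
      labels: {app: model-prism}
    spec:
      containers:
        - name: model-prism
          image: ghcr.io/ohara-systems/model-prism:latest
          ports: [{containerPort: 8080}]
          readinessProbe:
            httpGet: {path: /healthz, port: 8080}
          livenessProbe:
            httpGet: {path: /healthz, port: 8080}
```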
Up and running in 5 minutes
Start Model Prism instantly with Docker Compose. No database setup, no complex configuration.
# 1. Download docker-compose.yml
curl -O https://raw.githubusercontent.com/ohara-systems/model-prism/main/docker-compose.yml
# 2. Configure API keys
cat > .env <<EOF
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
PRISM_ADMIN_KEY=your-secure-admin-key
EOF
# 3. Start
docker compose up -d
# 4. Test — drop-in for OpenAI API
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer your-tenant-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello, Model Prism!"}]
  }'

Response
{
  "id": "chatcmpl-prism-7f3a2b1",
  "model": "gpt-4o-mini", // routed automatically
  "choices": [{ "message": { "content": "Hello! ..." } }],
  "usage": { "total_tokens": 24, "cost_usd": 0.000014 }
}

Supported Providers
Model Prism connects to all major LLM providers through unified adapters.
+ any OpenAI-compatible endpoint (vLLM, LM Studio, LocalAI, ...)
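Registering an additional OpenAI-compatible endpoint could look like the following provider configuration. The file layout and keys here are illustrative assumptions, not Model Prism's documented schema:

```yaml
# Illustrative provider configuration sketch.
providers:
  - name: openai
    api_key: ${OPENAI_API_KEY}
  - name: anthropic
    api_key: ${ANTHROPIC_API_KEY}
  - name: local-vllm              # any OpenAI-compatible endpoint works the same way
    base_url: http://vllm.internal:8000/v1
    models: [llama-3.1-70b]
```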