Model Prism
Multi-tenant LLM gateway with intelligent routing and cost control. Drop-in replacement for the OpenAI API — for your entire organization.
Everything you need
Model Prism bundles all critical features of a production-ready LLM gateway into a single, easily deployable service.
Intelligent Auto-Routing
Classifier-based routing directs every request to the optimal model based on prompt complexity, context and configured cost tiers. Rule sets and fallback chains give you maximum control.
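The idea can be pictured as a small classifier plus a fallback chain. Here is a minimal Python sketch; the model names, tiers and the toy complexity heuristic are illustrative assumptions, not Model Prism's shipped configuration:

```python
# Illustrative cost tiers; a real deployment would load these from configuration.
TIERS = {
    "cheap":   ["gpt-4o-mini"],
    "premium": ["gpt-4o", "claude-sonnet"],  # first entry preferred, rest are fallbacks
}

def classify(prompt: str) -> str:
    """Toy complexity classifier: long or heavily structured prompts go premium."""
    return "premium" if len(prompt) > 2000 or prompt.count("\n") > 20 else "cheap"

def route(prompt: str, unavailable=None) -> str:
    """Return the first available model in the classified tier, falling back as needed."""
    unavailable = unavailable or set()
    for model in TIERS[classify(prompt)]:
        if model not in unavailable:
            return model
    raise RuntimeError("no model available in tier")
```

A short prompt lands on the cheap tier; a long prompt whose primary premium model is down falls back to the next model in the tier.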
Multi-Tenant & RBAC
Complete tenant isolation: every team and customer gets their own API keys, quotas and permissions. RBAC at tenant and model level, LDAP/SSO integration.
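Conceptually, each request resolves to a tenant record and is checked against that tenant's model whitelist and roles. A sketch of such a check, with made-up tenant data and field names (not Model Prism's internal schema):

```python
# Illustrative tenant records; in a real gateway these live in a store and are
# resolved from the request's API key.
TENANTS = {
    "team-data": {"allowed_models": {"gpt-4o-mini", "gpt-4o"}, "roles": {"admin"}},
    "cust-acme": {"allowed_models": {"gpt-4o-mini"},           "roles": {"user"}},
}

def authorize(tenant_id: str, model: str, action: str = "invoke") -> bool:
    """Model-level RBAC: the tenant must exist and the model must be whitelisted;
    non-invoke (administrative) actions additionally require the admin role."""
    tenant = TENANTS.get(tenant_id)
    if tenant is None or model not in tenant["allowed_models"]:
        return False
    if action != "invoke" and "admin" not in tenant["roles"]:
        return False
    return True
```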
Real-Time Cost Control
Token-accurate cost tracking per tenant, model and time period. Budget alerts, automatic throttling on overage and detailed analytics dashboards.
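Token-accurate tracking boils down to multiplying token counts by per-model prices and comparing the running total against a budget. A sketch with illustrative prices and a toy alert/throttle policy (the thresholds and price table are assumptions for the example):

```python
# Illustrative per-1M-token prices in USD; real prices change and would come from config.
PRICES = {"gpt-4o-mini": {"input": 0.15, "output": 0.60}}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single request in USD, from exact token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

def budget_action(spent_usd: float, budget_usd: float) -> str:
    """Toy policy: alert at 80% of the period budget, throttle at 100%."""
    if spent_usd >= budget_usd:
        return "throttle"
    if spent_usd >= 0.8 * budget_usd:
        return "alert"
    return "ok"
```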
OpenAI-Compatible API
Drop-in replacement for the OpenAI API — no code changes required. Supports Chat Completions, Embeddings and Function Calling. Compatible with every OpenAI SDK.
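Because the request and response bodies match the OpenAI schema, switching a client to the gateway is just a base-URL change. A stdlib-only Python sketch (the localhost URL and tenant key are placeholders for your deployment):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # the gateway; host and port are placeholders

def build_payload(messages, model="auto"):
    """The same request body the OpenAI Chat Completions API expects;
    'auto' asks Model Prism to pick the model."""
    return {"model": model, "messages": messages}

def chat(messages, api_key="your-tenant-key"):
    """POST to the gateway exactly as one would to api.openai.com."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(messages)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# With the official OpenAI SDK, only the base_url changes:
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-tenant-key")
```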
Prompt Logging & Audit Trail
Complete audit log of all LLM requests — prompt, response, model, tokens, cost and timestamp. Exportable for compliance and debugging.
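One entry per request, serialized as JSON Lines, is a common shape for such a log. A sketch of what a single record could look like; the field names are assumptions for illustration, not Model Prism's documented export schema:

```python
import json
import time

def audit_record(tenant: str, model: str, prompt: str, response: str,
                 tokens: int, cost_usd: float) -> str:
    """Serialize one LLM request as a JSON Lines audit entry."""
    return json.dumps({
        "timestamp": int(time.time()),
        "tenant": tenant,
        "model": model,
        "prompt": prompt,
        "response": response,
        "tokens": tokens,
        "cost_usd": cost_usd,
    })
```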
Horizontal Scaling
Stateless architecture for easy horizontal scaling. Kubernetes-ready, health checks, graceful shutdown. From single node to enterprise cluster.
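Because the service is stateless, scaling on Kubernetes is a matter of replica count plus probes. A Deployment excerpt as a sketch; the image tag, port and probe paths are assumptions, not a published manifest:

```yaml
# Illustrative Deployment excerpt; image, port and /healthz path are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-prism
spec:
  replicas: 3                      # stateless, so scale by replica count alone
  selector:
    matchLabels: {app: model-prism}
  template:
    metadata:
      labels: {app: model-prism}
    spec:
      containers:
        - name: model-prism
          image: ghcr.io/ohara-systems/model-prism:latest
          ports: [{containerPort: 8080}]
          readinessProbe:
            httpGet: {path: /healthz, port: 8080}
          livenessProbe:
            httpGet: {path: /healthz, port: 8080}
```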
Up and running in 5 minutes
Start Model Prism instantly with Docker Compose. No database setup, no complex configuration.
# 1. Download docker-compose.yml
curl -O https://raw.githubusercontent.com/ohara-systems/model-prism/main/docker-compose.yml
# 2. Configure API keys
cat > .env <<EOF
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
PRISM_ADMIN_KEY=your-secure-admin-key
EOF
# 3. Start
docker compose up -d
# 4. Test — drop-in for OpenAI API
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer your-tenant-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello, Model Prism!"}]
  }'

Response
{
  "id": "chatcmpl-prism-7f3a2b1",
  "model": "gpt-4o-mini", // routed automatically
  "choices": [{ "message": { "content": "Hello! ..." } }],
  "usage": { "total_tokens": 24, "cost_usd": 0.000014 }
}

Supported Providers
Model Prism connects to all major LLM providers through unified adapters.
+ any OpenAI-compatible endpoint (vLLM, LM Studio, LocalAI, ...)
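Registering an additional OpenAI-compatible endpoint could look like the following provider configuration. The file layout and keys here are illustrative assumptions, not Model Prism's documented schema:

```yaml
# Illustrative provider configuration sketch.
providers:
  - name: openai
    api_key: ${OPENAI_API_KEY}
  - name: anthropic
    api_key: ${ANTHROPIC_API_KEY}
  - name: local-vllm              # any OpenAI-compatible endpoint works the same way
    base_url: http://vllm.internal:8000/v1
    models: [llama-3.1-70b]
```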