Date: 2025-11-29
Architect: ding
Version: 1.0
Project Type: web-app
Project Level: 4
Status: Draft
This document defines the system architecture for Claude Code Hub. It provides the technical blueprint for implementation, addressing all functional and non-functional requirements from the PRD.
Related Documents:
- docs/prd-claude-code-hub-2025-11-29.md
- docs/product-brief-claude-code-hub-2025-11-29.md

Claude Code Hub (CCH) is an intelligent AI API proxy platform built on a Modular Monolith architecture using Next.js 15 with App Router and Hono. The system provides multi-provider management, intelligent load balancing, circuit breaker patterns, and comprehensive monitoring for AI coding tools like Claude Code and Codex.
The architecture emphasizes:
These requirements heavily influence architectural decisions:
| NFR | Requirement | Architectural Impact |
|---|---|---|
| NFR-001 | Proxy latency < 50ms overhead | Requires in-memory caching, connection pooling, streaming optimization |
| NFR-002 | 100+ concurrent sessions | Requires stateless design, Redis for session state, connection limits |
| NFR-003 | Reliable streaming | Requires proper backpressure handling, chunked transfer, timeout management |
| NFR-007 | High availability | Requires multi-provider failover, circuit breaker, Redis fail-open strategy |
| NFR | Requirement | Architectural Impact |
|---|---|---|
| NFR-004 | Secure authentication | API key hashing, constant-time comparison, rate limiting |
| NFR-005 | Data protection | TLS everywhere, key encryption at rest, masked logging |
| NFR-008 | Horizontal scalability | Stateless app design, shared Redis, connection pooling |
| NFR-011 | Code quality | TypeScript strict mode, Drizzle ORM for type safety |
Claude Code Hub follows a Modular Monolith pattern with clear internal boundaries:
┌─────────────────────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Claude Code │ │ Codex CLI │ │ Cursor IDE │ │ Admin UI │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
└─────────┼────────────────┼────────────────┼────────────────┼────────────────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ API GATEWAY LAYER │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Next.js 15 + Hono Router │ │
│ │ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │ │
│ │ │ /v1/ │ │ /api/ │ │ /settings │ │ /dashboard│ │ │
│ │ │ messages │ │ actions │ │ (UI) │ │ (UI) │ │ │
│ │ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ │ │
│ └────────┼──────────────┼──────────────┼──────────────┼───────────────┘ │
└───────────┼──────────────┼──────────────┼──────────────┼────────────────────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ GUARD PIPELINE LAYER │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │ Auth │→│Version │→│ Probe │→│Session │→│Sensitive│→│ Rate │→... │
│ │ Guard │ │ Guard │ │Handler │ │ Guard │ │ Guard │ │ Limit │ │
│ └────────┘ └────────┘ └────────┘ └────────┘ └────────┘ └────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PROXY CORE LAYER │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Provider │ │ Format │ │ Response │ │
│ │ Selector │ │ Converter │ │ Handler │ │
│ │ (weighted LB) │ │ (Claude↔OpenAI) │ │ (streaming) │ │
│ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │ │
│ ┌────────┴────────────────────┴────────────────────┴────────┐ │
│ │ Circuit Breaker │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ CLOSED │←→│ OPEN │←→│HALF-OPEN│ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ UPSTREAM PROVIDERS │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Anthropic│ │ OpenAI │ │ Gemini │ │ Relay │ │ Custom │ │
│ │ (claude) │ │ (codex) │ │ (gemini) │ │(claude- │ │(openai- │ │
│ │ │ │ │ │ │ │ auth) │ │compatible│ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ DATA LAYER │
│ ┌─────────────────────────┐ ┌─────────────────────────┐ │
│ │ PostgreSQL │ │ Redis │ │
│ │ ┌─────┐ ┌─────┐ ┌────┐│ │ ┌─────┐ ┌─────┐ ┌────┐│ │
│ │ │users│ │keys │ │prov││ │ │sess │ │rate │ │circ││ │
│ │ ├─────┤ ├─────┤ ├────┤│ │ │ions │ │limit│ │uit ││ │
│ │ │msgs │ │rules│ │conf││ │ └─────┘ └─────┘ └────┘│ │
│ │ └─────┘ └─────┘ └────┘│ └─────────────────────────┘ │
│ └─────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
flowchart TB
subgraph Clients["Client Layer"]
CC[Claude Code CLI]
CX[Codex CLI]
CU[Cursor IDE]
WEB[Admin Dashboard]
end
subgraph Gateway["API Gateway (Next.js + Hono)"]
V1["/v1/* Proxy Endpoints"]
API["/api/actions/* Server Actions"]
UI["React Admin UI"]
end
subgraph Guards["Guard Pipeline"]
AUTH[Auth Guard]
VER[Version Guard]
SESS[Session Guard]
SENS[Sensitive Guard]
RATE[Rate Limit Guard]
PROV[Provider Selector]
end
subgraph Core["Proxy Core"]
FMT[Format Converter]
FWD[Request Forwarder]
RSP[Response Handler]
CB[Circuit Breaker]
end
subgraph Providers["Upstream Providers"]
ANT[Anthropic API]
OAI[OpenAI API]
GEM[Gemini API]
RELAY[Relay Services]
end
subgraph Data["Data Layer"]
PG[(PostgreSQL)]
RD[(Redis)]
end
CC & CX & CU --> V1
WEB --> UI
UI --> API
V1 --> AUTH --> VER --> SESS --> SENS --> RATE --> PROV
PROV --> FMT --> FWD --> RSP
FWD <--> CB
CB --> ANT & OAI & GEM & RELAY
AUTH & RATE --> RD
SESS --> RD
CB --> RD
API --> PG
RSP --> PG
Pattern: Modular Monolith with Guard Pipeline
Rationale:
Why Not Microservices:
Choice: Next.js 15 (App Router) + React 19 + Tailwind CSS + shadcn/ui
Rationale:
Trade-offs:
Key Libraries:
- next-intl - Internationalization (i18n)
- zustand - Client state management
- recharts - Dashboard charts
- lucide-react - Icon library

Choice: Hono + Next.js API Routes + Server Actions
Rationale:
Trade-offs:
Key Libraries:
- undici - HTTP client for upstream requests (faster than node-fetch)
- zod - Runtime schema validation
- next-safe-action - Type-safe Server Actions with OpenAPI generation

Choice: PostgreSQL + Drizzle ORM
Rationale:
Trade-offs:
Key Features:
- pg driver (connection pooling)

Choice: Redis (ioredis)
Rationale:
Trade-offs:
Fail-Open Strategy:
Choice: Docker + Docker Compose (Self-hosted)
Rationale:
Stack:
services:
app: # Next.js application
postgres: # PostgreSQL database
redis: # Redis cache
Future Consideration: Kubernetes Helm charts for enterprise deployments
| Service | Purpose | Integration |
|---|---|---|
| Anthropic API | Primary AI provider | Claude Messages API |
| OpenAI API | Codex provider | Responses API |
| Google Gemini | Alternative provider | Gemini API |
| LiteLLM | Pricing data sync | Periodic fetch |
| Various Relays | Alternative access | Claude-auth format |
Version Control: Git + GitHub
Package Manager: Bun (faster than npm/yarn)
Build Tool: Turbopack (Next.js built-in)
CI/CD: GitHub Actions
Code Quality:
Purpose: Ordered chain of request processors for cross-cutting concerns
Responsibilities:
Interfaces:
interface GuardStep {
name: string;
execute(session: ProxySession): Promise<Response | null>;
}
interface GuardPipeline {
run(session: ProxySession): Promise<Response | null>;
}
Dependencies:
FRs Addressed: FR-010, FR-011, FR-012, FR-021, FR-023
Pipeline Configuration:
// Full pipeline for chat requests
const CHAT_PIPELINE = [
"auth", // API key validation
"version", // Client version check
"probe", // Probe request handling
"session", // Session stickiness
"sensitive", // Content filtering
"rateLimit", // Rate limiting
"provider", // Provider selection
"messageContext", // Request logging
];
// Minimal pipeline for count_tokens
const COUNT_TOKENS_PIPELINE = ["auth", "version", "probe", "provider"];
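A minimal runner for such a pipeline can be sketched as follows. The `GuardStep` shape mirrors the interface above; the key property is short-circuiting — the first guard that returns a `Response` ends processing (e.g. a 401 from auth or a 429 from the rate limiter), and `null` from every guard means the request proceeds to forwarding. The simplified `ProxySession` record is illustrative, not the production type:

```typescript
interface ProxySession {
  // request context, simplified to an open record for this sketch
  [key: string]: unknown;
}

interface GuardStep {
  name: string;
  execute(session: ProxySession): Promise<Response | null>;
}

// Run guards in order; the first guard returning a Response
// short-circuits the pipeline.
async function runPipeline(
  steps: GuardStep[],
  session: ProxySession
): Promise<Response | null> {
  for (const step of steps) {
    const early = await step.execute(session);
    if (early !== null) return early;
  }
  return null; // all guards passed; continue to provider forwarding
}
```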
Purpose: Intelligently select optimal upstream provider for each request
Responsibilities:
Algorithm:
1. Filter enabled providers
2. Filter by circuit breaker state (exclude OPEN)
3. Filter by effective provider group (key.providerGroup overrides user.providerGroup; key.providerGroup is admin-only; user.providerGroup is derived from Key groups on Key changes)
4. Check session cache for sticky provider
5. If no sticky: weighted random selection by weight
6. Return selected provider or null (all unavailable)
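The weighted random selection in step 5 can be sketched like this. The `Provider` shape is reduced to the fields that matter here, and the injectable `rand` parameter exists only to make the sketch deterministic for testing:

```typescript
interface Provider {
  id: number;
  weight: number; // relative selection weight, assumed > 0
}

// Weighted random selection: pick a point in [0, totalWeight)
// and walk the list until the cumulative weight passes it.
function pickWeighted(
  providers: Provider[],
  rand: () => number = Math.random
): Provider | null {
  if (providers.length === 0) return null; // all providers unavailable
  const total = providers.reduce((sum, p) => sum + p.weight, 0);
  let point = rand() * total;
  for (const p of providers) {
    point -= p.weight;
    if (point < 0) return p;
  }
  return providers[providers.length - 1]; // floating-point edge case
}
```

A provider with weight 3 is selected three times as often as a provider with weight 1, which matches the proportional-traffic intent of the weight field.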
Interfaces:
interface ProviderSelection {
provider: Provider;
isSticky: boolean;
}
class ProxyProviderResolver {
static ensure(session: ProxySession): Promise<Response | null>;
}
Dependencies:
FRs Addressed: FR-003, FR-005, FR-007
Purpose: Bidirectional conversion between different AI API formats
Responsibilities:
Supported Conversions:
| From | To | Direction |
|---|---|---|
| Claude Messages | OpenAI Chat | Bidirectional |
| Claude Messages | Codex Response | Bidirectional |
| Claude Messages | Gemini | Bidirectional |
Interfaces:
interface FormatConverter {
convertRequest(request: any, targetFormat: string): any;
convertResponse(response: any, sourceFormat: string): any;
convertStreamChunk(chunk: any, sourceFormat: string): any;
}
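As a concrete illustration of the request direction, here is a minimal Claude-to-OpenAI mapping covering only the common fields (Claude carries the system prompt as a top-level field, OpenAI as the first message). The type shapes are simplified sketches, not the full API schemas:

```typescript
interface ClaudeRequest {
  model: string;
  system?: string;
  max_tokens: number;
  messages: { role: "user" | "assistant"; content: string }[];
}

interface OpenAIRequest {
  model: string;
  max_tokens: number;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

// Map a Claude Messages body to an OpenAI Chat Completions body.
// Only the overlapping fields are handled in this sketch.
function claudeToOpenAI(req: ClaudeRequest): OpenAIRequest {
  const messages: OpenAIRequest["messages"] = [];
  // Claude's top-level system prompt becomes OpenAI's first message
  if (req.system) messages.push({ role: "system", content: req.system });
  messages.push(...req.messages);
  return { model: req.model, max_tokens: req.max_tokens, messages };
}
```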
FRs Addressed: FR-006, FR-026, FR-027, FR-028
Purpose: Prevent cascade failures by isolating failing providers
Responsibilities:
State Machine:
failure threshold reached
┌───────────────────────────────────────┐
│ ▼
┌───────┐ ┌─────────┐
│CLOSED │ │ OPEN │
└───┬───┘ └────┬────┘
│ │
│ success │ timeout expires
│ ▼
│ ┌───────────┐
└───────────────────────────────│ HALF-OPEN │
success threshold └───────────┘
Configuration (per provider):
{
circuitBreakerFailureThreshold: 5, // Failures before opening
circuitBreakerOpenDuration: 1800000, // 30 min open duration
circuitBreakerHalfOpenSuccessThreshold: 2 // Successes to close
}
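The transitions in the state machine can be sketched as a pure function over one request outcome. In production this state lives in Redis keyed by provider; the plain object here is only for illustration, and the config fields correspond to the per-provider settings above:

```typescript
type CircuitState = "CLOSED" | "OPEN" | "HALF-OPEN";

interface BreakerConfig {
  failureThreshold: number;         // circuitBreakerFailureThreshold
  openDuration: number;             // circuitBreakerOpenDuration (ms)
  halfOpenSuccessThreshold: number; // circuitBreakerHalfOpenSuccessThreshold
}

interface Circuit {
  state: CircuitState;
  failures: number;
  successes: number;
  openedAt: number;
}

// Apply one request outcome and return the next circuit state.
function onResult(c: Circuit, ok: boolean, now: number, cfg: BreakerConfig): Circuit {
  let { state, failures, successes, openedAt } = c;
  // An expired OPEN circuit admits a probe request as HALF-OPEN
  if (state === "OPEN" && now - openedAt >= cfg.openDuration) {
    state = "HALF-OPEN";
    successes = 0;
  }
  if (ok) {
    if (state === "HALF-OPEN" && ++successes >= cfg.halfOpenSuccessThreshold) {
      return { state: "CLOSED", failures: 0, successes: 0, openedAt: 0 };
    }
    if (state === "CLOSED") failures = 0; // success resets the failure streak
    return { state, failures, successes, openedAt };
  }
  // Any failure in HALF-OPEN, or enough failures in CLOSED, opens the circuit
  if (state === "HALF-OPEN" || ++failures >= cfg.failureThreshold) {
    return { state: "OPEN", failures: 0, successes: 0, openedAt: now };
  }
  return { state, failures, successes, openedAt };
}
```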
FRs Addressed: FR-004, FR-007
Purpose: Process and transform upstream responses for clients
Responsibilities:
Streaming Architecture:
Upstream Provider (SSE)
│
▼
┌───────────────────┐
│ Response Handler │
│ - Parse chunks │
│ - Transform │
│ - Count tokens │
│ - Calculate cost │
└────────┬──────────┘
│
▼
Client (SSE)
FRs Addressed: FR-006, FR-013
Purpose: Log and track all proxy requests
Responsibilities:
Captured Data:
interface MessageRequestLog {
providerId: number;
userId: number;
key: string;
model: string;
sessionId: string;
durationMs: number;
costUsd: number;
inputTokens: number;
outputTokens: number;
statusCode: number;
errorMessage?: string;
providerChain: { id: number; name: string }[];
}
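The `costUsd` field implies a per-request cost calculation from token counts and model pricing. A sketch, assuming per-million-token prices — the actual layout of `model_prices.priceData` is not specified here, so the `ModelPrice` field names are illustrative:

```typescript
// Illustrative pricing shape; the real priceData JSON may differ.
interface ModelPrice {
  inputPerMTok: number;  // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

// Compute the USD cost of one request from its token usage.
function calculateCostUsd(
  inputTokens: number,
  outputTokens: number,
  p: ModelPrice
): number {
  return (inputTokens * p.inputPerMTok + outputTokens * p.outputPerMTok) / 1_000_000;
}
```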
FRs Addressed: FR-013, FR-015, FR-016
Purpose: Type-safe RPC for admin operations
Modules:
- providers.ts - Provider CRUD, testing, stats
- users.ts - User management
- keys.ts - API key management
- error-rules.ts - Error rule configuration
- sensitive-words.ts - Content filter rules
- statistics.ts - Usage analytics
- model-prices.ts - Pricing management

Features:
FRs Addressed: FR-001, FR-008, FR-009, FR-019-FR-025
┌─────────────────┐ ┌─────────────────┐
│ users │ │ providers │
├─────────────────┤ ├─────────────────┤
│ id (PK) │ │ id (PK) │
│ name │ │ name │
│ role │ │ url │
│ rpmLimit │ │ key (encrypted) │
│ dailyLimitUsd │ │ weight │
│ limit5hUsd │ │ priority │
│ limitWeeklyUsd │ │ providerType │
│ limitMonthlyUsd │ │ isEnabled │
│ createdAt │ │ circuitBreaker* │
│ deletedAt │ │ limits* │
└────────┬────────┘ │ timeouts* │
│ │ createdAt │
│ │ deletedAt │
│ └────────┬────────┘
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ keys │ │ message_request │
├─────────────────┤ ├─────────────────┤
│ id (PK) │ │ id (PK) │
│ userId (FK) │────▶│ userId (FK) │
│ key (unique) │ │ providerId (FK) │◀────┘
│ name │ │ key │
│ isEnabled │ │ model │
│ expiresAt │ │ sessionId │
│ limits* │ │ costUsd │
│ createdAt │ │ inputTokens │
│ deletedAt │ │ outputTokens │
└─────────────────┘ │ durationMs │
│ statusCode │
│ providerChain │
│ createdAt │
└─────────────────┘
┌─────────────────┐ ┌─────────────────┐
│ error_rules │ │ sensitive_words │
├─────────────────┤ ├─────────────────┤
│ id (PK) │ │ id (PK) │
│ pattern │ │ word │
│ matchType │ │ matchType │
│ category │ │ isEnabled │
│ isEnabled │ │ createdAt │
│ priority │ └─────────────────┘
└─────────────────┘
┌─────────────────┐ ┌─────────────────────┐
│ model_prices │ │ system_settings │
├─────────────────┤ ├─────────────────────┤
│ id (PK) │ │ id (PK) │
│ modelName │ │ siteTitle │
│ priceData (JSON)│ │ currencyDisplay │
│ createdAt │ │ enableAutoCleanup │
└─────────────────┘ │ cleanupRetention │
└─────────────────────┘
┌───────────────────────┐
│ notification_settings │
├───────────────────────┤
│ id (PK) │
│ enabled │
│ circuitBreakerWebhook │
│ dailyLeaderboard* │
│ costAlert* │
└───────────────────────┘
Indexing Strategy:
| Table | Index | Purpose |
|---|---|---|
| users | idx_users_active_role_sort | User list query optimization |
| keys | idx_keys_user_id | Foreign key lookup |
| providers | idx_providers_enabled_priority | Provider selection |
| message_request | idx_message_request_user_date_cost | Statistics aggregation |
| message_request | idx_message_request_session_id | Session grouping |
| error_rules | unique_pattern | Pattern uniqueness |
Soft Delete Pattern:
- deletedAt column marks deletion
- All queries filter with WHERE deleted_at IS NULL

JSON Columns:

- providers.modelRedirects - Model name mapping
- providers.allowedModels - Model whitelist
- message_request.providerChain - Request routing history
- model_prices.priceData - Flexible pricing structure

Request Flow:
1. Client Request
│
2. Auth Guard: Lookup key in PostgreSQL
│
3. Session Guard: Check Redis for sticky session
│
4. Rate Limit Guard: Check Redis counters (Lua script)
│
5. Provider Selector: Query enabled providers from cache
│
6. Forwarder: Send to upstream provider
│
7. Response Handler: Stream response to client
│
8. Message Service: Write to message_request table
Session Data Flow:
Redis Key: session:{session_id}
Value: { providerId: number, createdAt: timestamp }
TTL: 300 seconds (5 minutes)
Lookup: O(1) hash get
Creation: O(1) hash set with TTL
Rate Limit Data Flow:
Redis Keys:
- ratelimit:rpm:{userId}:{minute}
- ratelimit:5h:{userId}:{window}
- ratelimit:daily:{userId}:{day}
- ratelimit:weekly:{userId}:{week}
- ratelimit:monthly:{userId}:{month}
Operations: Lua scripts for atomic increment + check
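A sketch of the atomic check-and-increment. The Lua script runs entirely inside Redis, so concurrent requests cannot race between reading and incrementing a counter; the key builder matches the layout above. The usage line assumes an ioredis instance and illustrative variable names:

```typescript
// Lua: increment the counter, set its TTL on first use,
// and return 0 (reject) if the new value exceeds the limit.
const RATE_LIMIT_LUA = `
local current = redis.call("INCR", KEYS[1])
if current == 1 then
  redis.call("EXPIRE", KEYS[1], ARGV[2])
end
if current > tonumber(ARGV[1]) then
  return 0
end
return 1
`;

// Key builder matching the layout above, e.g. ratelimit:rpm:42:29372160
function rpmKey(userId: number, now: number = Date.now()): string {
  const minute = Math.floor(now / 60_000);
  return `ratelimit:rpm:${userId}:${minute}`;
}

// Usage (requires a running Redis; `redis` is an ioredis instance):
// const allowed = await redis.eval(RATE_LIMIT_LUA, 1, rpmKey(userId), rpmLimit, 60);
// if (allowed === 0) return new Response("Rate limit exceeded", { status: 429 });
```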
Pattern: REST + Server Actions
Versioning: URL path versioning (/v1/)
Authentication:
- Authorization: Bearer {key} or x-api-key: {key}

Response Format: JSON (Claude API compatible for proxy endpoints)
| Method | Endpoint | Description | FR |
|---|---|---|---|
| POST | /v1/messages | Claude Messages API | FR-026 |
| POST | /v1/messages/count_tokens | Token counting | FR-026 |
| POST | /v1/chat/completions | OpenAI Chat API | FR-027 |
| POST | /v1/responses | Codex Response API | FR-028 |
| POST | /v1/generateContent | Gemini API | FR-002 |
Request Headers:
Authorization: Bearer {api_key}
Content-Type: application/json
x-session-id: {optional_session_id}
x-client-version: {optional_version}
anthropic-version: 2023-06-01
Response Headers:
Content-Type: application/json (or text/event-stream for streaming)
x-request-id: {request_id}
Providers:
POST /api/actions - createProvider
GET /api/actions - listProviders
GET /api/actions - getProvider
PUT /api/actions - updateProvider
DELETE /api/actions - deleteProvider
POST /api/actions - testProviderConnection
GET /api/actions - getProviderStats
Users:
POST /api/actions - createUser
GET /api/actions - listUsers
GET /api/actions - getUser
PUT /api/actions - updateUser
DELETE /api/actions - deleteUser
GET /api/actions - getUserStats
Keys:
POST /api/actions - createKey
GET /api/actions - listKeys
PUT /api/actions - updateKey
DELETE /api/actions - revokeKey
Statistics:
GET /api/actions - getOverview
GET /api/actions - getUsageByUser
GET /api/actions - getUsageByProvider
GET /api/actions - getUsageByModel
GET /api/actions - getLeaderboard
API Key Authentication:
// Key format: cch_xxx...xxx (32+ characters)
// Storage: SHA-256 hash in database
// Lookup: O(1) indexed query
async function authenticateKey(key: string): Promise<AuthResult> {
const hash = sha256(key);
const keyRecord = await db.query.keys.findFirst({
where: eq(keys.key, hash),
});
  // Validate: exists, enabled, not expired
  if (!keyRecord || !keyRecord.isEnabled) return { authenticated: false };
  if (keyRecord.expiresAt && keyRecord.expiresAt < new Date()) return { authenticated: false };
  return { authenticated: true, key: keyRecord };
}
Admin Token Authentication:
// Environment variable: ADMIN_TOKEN
// Stored in HTTP-only cookie after login
// Validated on each admin request
Authorization Model:
Roles:
- admin: Full access to admin UI and API
- user: Access to personal stats and settings
Permissions:
- API key gives user-level access
- Admin token gives admin access
Requirement: Proxy overhead < 50ms for request processing
Architecture Solution:
Implementation Notes:
// Use undici for upstream requests (faster than node-fetch)
import { fetch } from "undici";
// Redis connection pooling
const redis = new Redis({ lazyConnect: true });
// Streaming response forwarding
return new Response(
new ReadableStream({
async start(controller) {
for await (const chunk of upstreamResponse.body) {
controller.enqueue(chunk);
      }
      controller.close(); // end the client stream when upstream completes
    },
})
);
Validation:
Requirement: Support 100+ concurrent streaming sessions
Architecture Solution:
Implementation Notes:
// Session stored in Redis with TTL
await redis.setex(
`session:${sessionId}`,
300,
JSON.stringify({
providerId,
createdAt: Date.now(),
})
);
// Provider concurrent session limit
if (activeSessions >= provider.limitConcurrentSessions) {
return new Response("Too many concurrent sessions", { status: 429 });
}
Validation:
Requirement: Reliable SSE streaming without message loss
Architecture Solution:
Implementation Notes:
// Streaming with backpressure
const stream = new ReadableStream({
async pull(controller) {
const chunk = await reader.read();
if (chunk.done) {
controller.close();
} else {
controller.enqueue(chunk.value);
}
},
});
// Idle timeout (provider-configurable): reset on every received chunk;
// if no chunk arrives within idleTimeout, cancel the stream
let idleTimer: ReturnType<typeof setTimeout>;
function resetIdleTimer() {
  clearTimeout(idleTimer);
  idleTimer = setTimeout(() => stream.cancel(), idleTimeout);
}
Validation:
Requirement: Secure API key handling
Architecture Solution:
Implementation Notes:
// Key generation
function generateKey(): string {
return `cch_${randomBytes(32).toString("hex")}`;
}
// Key hashing
function hashKey(key: string): string {
return createHash("sha256").update(key).digest("hex");
}
// Constant-time comparison of hash digests
function hashesMatch(a: string, b: string): boolean {
  const bufA = Buffer.from(a);
  const bufB = Buffer.from(b);
  return bufA.length === bufB.length && timingSafeEqual(bufA, bufB);
}
Validation:
Requirement: Service continuity despite provider failures
Architecture Solution:
Implementation Notes:
// Circuit breaker states in Redis
const state = await redis.hget(`cb:${providerId}`, "state");
if (state === "OPEN") {
// Skip this provider, try next
return selectNextProvider();
}
// Fail-open for Redis
try {
await checkRateLimit();
} catch (redisError) {
console.warn("Redis unavailable, allowing request");
// Continue without rate limiting
}
Validation:
API Key Authentication:
- cch_ prefix + 32 random bytes, hex-encoded
- Sent via Authorization: Bearer header

Admin Authentication:
RBAC Model:
Roles:
├── admin
│ ├── Manage providers
│ ├── Manage users
│ ├── Manage keys
│ ├── View all statistics
│ └── Configure system
└── user
├── Use proxy endpoints
├── View own statistics
└── Manage own keys
Permission Enforcement:
// Guard middleware pattern
const adminOnly = async (ctx, next) => {
const token = ctx.cookies.get("adminToken");
if (token !== process.env.ADMIN_TOKEN) {
return ctx.json({ error: "Unauthorized" }, 401);
}
return next();
};
In Transit:
At Rest:
Input Validation:
// Zod schemas for all inputs
const createProviderSchema = z.object({
name: z.string().min(1).max(255),
url: z.string().url(),
key: z.string().min(10),
// ...
});
SQL Injection Prevention:
XSS Prevention:
Rate Limiting:
Logging Security:
Horizontal Scaling (Primary):
Load Balancer
│
┌───────────────┼───────────────┐
│ │ │
┌───┴───┐ ┌───┴───┐ ┌───┴───┐
│ CCH-1 │ │ CCH-2 │ │ CCH-3 │
└───┬───┘ └───┬───┘ └───┬───┘
│ │ │
└───────────────┼───────────────┘
│
┌───────────┴───────────┐
│ │
┌────┴────┐           ┌────┴────┐
│  Redis  │           │Postgres │
│ (Master)│           │(Primary)│
└────┬────┘           └────┬────┘
     │                     │
┌────┴────┐           ┌────┴────┐
│  Redis  │           │Postgres │
│(Replica)│           │(Replica)│
└─────────┘           └─────────┘
Scaling Triggers:
Query Optimization:
N+1 Prevention:
Memory Efficiency:
Cache Layers:
┌─────────────────────────────────────┐
│ CDN (Static Assets) │ TTL: Long
├─────────────────────────────────────┤
│ Application Cache (In-Memory) │ TTL: Short
│ - Provider list │
│ - Error rules │
│ - Sensitive words │
├─────────────────────────────────────┤
│ Redis Cache │ TTL: Medium
│ - Session mappings │
│ - Rate limit counters │
│ - Circuit breaker state │
├─────────────────────────────────────┤
│ PostgreSQL │ Source of truth
└─────────────────────────────────────┘
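The in-memory application-cache tier for hot, rarely-changing data (provider list, error rules, sensitive words) can be as simple as a TTL-stamped holder. This is a sketch; class and method names are illustrative, not the project's actual cache API:

```typescript
// Tiny single-value TTL cache: serve from memory until the TTL
// expires, then reload via the supplied loader.
class TtlCache<T> {
  private value: T | null = null;
  private expiresAt = 0;

  constructor(private ttlMs: number) {}

  async get(load: () => Promise<T>, now: number = Date.now()): Promise<T> {
    if (this.value === null || now >= this.expiresAt) {
      this.value = await load(); // cache miss or stale: reload
      this.expiresAt = now + this.ttlMs;
    }
    return this.value;
  }

  invalidate(): void {
    this.value = null; // e.g. called after an admin mutation
  }
}
```

Explicit `invalidate()` after admin writes keeps the short TTL as a safety net rather than the primary consistency mechanism.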
Cache Invalidation:
Strategy: Round-robin with health checks
Health Check Endpoint: GET /api/health
{
"status": "healthy",
"checks": {
"database": "ok",
"redis": "ok"
}
}
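Aggregating the individual checks into that response shape is straightforward; a sketch follows. The "degraded" status label for a partial failure is an assumption — only the healthy case is shown above:

```typescript
type CheckResult = "ok" | "error";

interface HealthReport {
  status: "healthy" | "degraded"; // "degraded" label is an assumption
  checks: { database: CheckResult; redis: CheckResult };
}

// Build the health payload from individual dependency checks.
function buildHealthReport(database: CheckResult, redis: CheckResult): HealthReport {
  const healthy = database === "ok" && redis === "ok";
  return {
    status: healthy ? "healthy" : "degraded",
    checks: { database, redis },
  };
}
```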
No Single Points of Failure:
Graceful Degradation:
RPO (Recovery Point Objective): 1 hour
RTO (Recovery Time Objective): 4 hours
Backup Strategy:
Recovery Procedure:
| Data | Frequency | Retention | Method |
|---|---|---|---|
| PostgreSQL | Daily | 30 days | pg_dump + S3 |
| WAL logs | Continuous | 7 days | WAL archiving |
| Config | On change | Unlimited | Git |
| Redis | N/A | Ephemeral | Rebuilt on restart |
Metrics (Prometheus format):
# Request metrics
cch_requests_total{endpoint, status}
cch_request_duration_seconds{endpoint}
# Provider metrics
cch_provider_requests_total{provider, status}
cch_provider_circuit_state{provider}
# Resource metrics
cch_active_sessions
cch_redis_connections
cch_db_connections
Alerting Thresholds:

| Alert | Condition | Severity |
|---|---|---|
| High Error Rate | 5xx > 5% for 5min | Critical |
| High Latency | p99 > 500ms for 5min | Warning |
| Circuit Open | Any provider OPEN | Warning |
| Database Down | Health check fail | Critical |
Upstream AI Providers:
interface ProviderAdapter {
formatRequest(session: ProxySession): Request;
parseResponse(response: Response): AsyncGenerator<Chunk>;
handleError(error: Error): ErrorClassification;
}
// Implementations
class AnthropicAdapter implements ProviderAdapter {}
class OpenAIAdapter implements ProviderAdapter {}
class GeminiAdapter implements ProviderAdapter {}
Webhook Integrations:
Module Communication:
Actions Layer ←→ Repository Layer ←→ Database
↓
Guard Pipeline ←→ Redis ←→ Circuit Breaker
↓
Format Converters ←→ Provider Adapters
Current: Synchronous request-response
Future Consideration: Event-driven for:
src/
├── app/ # Next.js App Router
│ ├── v1/ # Proxy endpoints
│ │ └── _lib/ # Proxy core
│ │ ├── proxy/ # Guard pipeline, forwarder
│ │ ├── converters/ # Format converters
│ │ └── gemini/ # Gemini-specific
│ ├── api/ # REST API
│ │ └── actions/ # OpenAPI documentation
│ ├── dashboard/ # Admin UI pages
│ └── settings/ # Settings pages
├── actions/ # Server Actions
├── repository/ # Database queries
├── drizzle/ # Schema + migrations
├── lib/ # Shared utilities
│ ├── rate-limit/ # Rate limiting
│ ├── circuit-breaker/ # Circuit breaker
│ └── session/ # Session management
├── types/ # TypeScript types
└── components/ # React components
Bounded Contexts:
Dependency Rules:
Unit Tests:
Integration Tests:
E2E Tests:
Coverage Target: 70%+
name: CI/CD
on: [push, pull_request]
jobs:
lint:
- bun install
- bun run lint
- bun run typecheck
test:
- bun install
- bun test
build:
- bun run build
- docker build
deploy:
- docker push
- deploy to staging/production
| Environment | Purpose | Database | Redis |
|---|---|---|---|
| Development | Local development | Local PostgreSQL | Local Redis |
| Staging | Pre-production testing | Staging DB | Staging Redis |
| Production | Live service | Production DB | Production Redis |
Current: Docker Compose (self-hosted)
version: "3.8"
services:
app:
image: claude-code-hub:latest
ports:
- "3000:3000"
environment:
- DSN=postgresql://...
- REDIS_URL=redis://...
- ADMIN_TOKEN=...
depends_on:
- postgres
- redis
postgres:
image: postgres:15
volumes:
- pgdata:/var/lib/postgresql/data
redis:
image: redis:7
volumes:
- redisdata:/data
Future: Kubernetes with Helm charts
Docker:
Docker Compose:
| FR ID | FR Name | Components | Status |
|---|---|---|---|
| FR-001 | Multi-Provider Management | Admin Actions, Repository | ✓ |
| FR-002 | Provider Type Support | Provider Adapters, Format Converters | ✓ |
| FR-003 | Intelligent Provider Selection | Provider Selector | ✓ |
| FR-004 | Circuit Breaker | Circuit Breaker Component | ✓ |
| FR-005 | Session Stickiness | Session Guard, Redis | ✓ |
| FR-006 | Format Conversion | Format Converters | ✓ |
| FR-007 | Automatic Retry | Guard Pipeline, Circuit Breaker | ✓ |
| FR-008 | User Management | Admin Actions, Repository | ✓ |
| FR-009 | API Key Management | Admin Actions, Repository | ✓ |
| FR-010 | Authentication | Auth Guard | ✓ |
| FR-011 | Rate Limiting | Rate Limit Guard, Redis | ✓ |
| FR-012 | Concurrent Session Limiting | Session Guard | ✓ |
| FR-013 | Request Logging | Message Service | ✓ |
| FR-014 | Real-Time Dashboard | Dashboard UI | ✓ |
| FR-015 | Usage Statistics | Statistics Actions | ✓ |
| FR-016 | Active Session Monitoring | Session Actions | ✓ |
| FR-017 | Provider Health Status | Provider Stats | ✓ |
| FR-019 | Admin Dashboard | Next.js UI | ✓ |
| FR-020 | Error Rules Management | Error Rules Actions | ✓ |
| FR-021 | Sensitive Words Filtering | Sensitive Guard | ✓ |
| FR-026 | Claude Messages API | Proxy Endpoints | ✓ |
| FR-027 | OpenAI Chat Completions | Proxy Endpoints, Converters | ✓ |
| FR-028 | Codex Response API | Proxy Endpoints, Converters | ✓ |
| NFR ID | NFR Name | Solution | Validation |
|---|---|---|---|
| NFR-001 | Proxy Latency | Hono, Redis, streaming | p95 < 50ms |
| NFR-002 | Concurrent Capacity | Stateless, Redis | 100+ sessions |
| NFR-003 | Streaming Reliability | Chunked transfer | No loss test |
| NFR-004 | Authentication Security | Key hashing | Security audit |
| NFR-005 | Data Protection | TLS, encryption | Compliance check |
| NFR-007 | High Availability | Circuit breaker, failover | Chaos testing |
| NFR-008 | Horizontal Scalability | Stateless design | Load testing |
| NFR-009 | Internationalization | next-intl | UI review |
| NFR-011 | Code Quality | TypeScript strict | Coverage report |
Decision: Modular Monolith
Trade-off:
Rationale: Proxy latency requirement (< 50ms) makes network hops expensive. Single team doesn't benefit from service boundaries.
Decision: Hono
Trade-off:
Rationale: Performance is critical for proxy use case. Hono's edge-first design aligns with low-latency goals.
Decision: Redis
Trade-off:
Rationale: Session stickiness requires very fast lookups on every request. 5-minute TTL makes persistence unnecessary.
Decision: PostgreSQL
Trade-off:
Rationale: Multi-tenant workload requires proper concurrent write handling. Future multi-database support planned.
| Issue | Impact | Mitigation |
|---|---|---|
| Redis single point | Medium | Redis Sentinel/Cluster for production |
| Large log volume | Medium | Auto-cleanup, retention policies |
| Provider API changes | Low | Adapter pattern for isolation |
| Key storage security | Medium | Consider HSM/KMS integration |
Assumptions:
Constraints:
Review Status:
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2025-11-29 | ding | Initial architecture |
Run /sprint-planning to:
Key Implementation Principles:
This document was created using BMAD Method v6 - Phase 3 (Solutioning)
To continue: Run /workflow-status to see your progress and next recommended workflow.
| Category | Choice | Alternatives Considered | Key Factor |
|---|---|---|---|
| Framework | Next.js 15 | Remix, Nuxt | SSR + API in one |
| Router | Hono | Express, Fastify | Performance |
| ORM | Drizzle | Prisma, TypeORM | Type safety + SQL |
| Database | PostgreSQL | MySQL, SQLite | JSON support + reliability |
| Cache | Redis | Memcached | Data structures + Lua |
| UI Library | shadcn/ui | MUI, Chakra | Customizable + lightweight |
Baseline Assumptions:
Resource Estimates:
| Component | CPU | Memory | Storage |
|---|---|---|---|
| App (per instance) | 2 cores | 2GB | - |
| PostgreSQL | 2 cores | 4GB | 50GB |
| Redis | 1 core | 1GB | 1GB |
Scaling Triggers:
Self-Hosted (Small Team):
Self-Hosted (Medium Team):
Cloud (AWS equivalent):
Note: Excludes upstream AI API costs (user-provided)