Cache Plugin Configuration Guide

Overview

The Cache Plugin is a high-performance caching solution for AI API requests that reduces latency and cost by storing and reusing responses to identical requests. It supports both in-memory and Redis storage, making it suitable for single-node and distributed deployments alike.

Features

  • Dual Storage: Supports both in-memory cache and Redis for flexible deployment options
  • Automatic Fallback: Automatically falls back to in-memory cache when Redis is unavailable
  • Content-Based Caching: Uses SHA256 hash of request body to generate cache keys
  • Configurable TTL: Set custom time-to-live for cached items
  • Size Limits: Configurable maximum item size to prevent memory issues
  • Cache Headers: Optional headers to indicate cache hits
  • Zero-Copy Design: Efficient memory usage through buffer pooling

Configuration Example

{
    "model": "gpt-4",
    "type": 1,
    "plugin": {
        "cache": {
            "enable": true,
            "ttl": 300,
            "item_max_size": 1048576,
            "add_cache_hit_header": true,
            "cache_hit_header": "X-Cache-Status"
        }
    }
}

Configuration Fields

Plugin Configuration

Field                | Type   | Required | Default           | Description
enable               | bool   | Yes      | false             | Whether to enable the Cache plugin
ttl                  | int    | No       | 300               | Time-to-live for cached items (in seconds)
item_max_size        | int    | No       | 1048576 (1MB)     | Maximum size of a single cached item (in bytes)
add_cache_hit_header | bool   | No       | false             | Whether to add a header indicating cache hit
cache_hit_header     | string | No       | "X-Aiproxy-Cache" | Name of the cache hit header
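
For orientation, a plausible Go representation of this configuration block is sketched below. The struct name, field names, and JSON tags simply mirror the keys documented above and are assumptions; the actual definition in config.go may differ.

package main

import (
    "encoding/json"
    "fmt"
)

// CacheConfig is a hypothetical sketch of the plugin configuration.
// Field names mirror the documented JSON keys; the real struct in
// config.go may be defined differently.
type CacheConfig struct {
    Enable            bool   `json:"enable"`
    TTL               int    `json:"ttl"`           // seconds
    ItemMaxSize       int    `json:"item_max_size"` // bytes
    AddCacheHitHeader bool   `json:"add_cache_hit_header"`
    CacheHitHeader    string `json:"cache_hit_header"`
}

func main() {
    raw := `{"enable":true,"ttl":300,"item_max_size":1048576,"add_cache_hit_header":true,"cache_hit_header":"X-Cache-Status"}`

    var cfg CacheConfig
    if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", cfg)
}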

How It Works

Cache Key Generation

The plugin generates cache keys based on:

  1. Request pattern (e.g., chat completions)
  2. SHA256 hash of the request body

This ensures identical requests hit the cache while different requests don't interfere with each other.
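
As a rough illustration, the key derivation could look like the Go sketch below. The "pattern:digest" layout and the function name are assumptions for this example; the exact key format used in cache.go may differ.

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// cacheKey is an illustrative key builder: the request pattern plus the
// hex-encoded SHA256 digest of the raw request body.
func cacheKey(pattern string, body []byte) string {
    sum := sha256.Sum256(body)
    return pattern + ":" + hex.EncodeToString(sum[:])
}

func main() {
    body := []byte(`{"model":"gpt-4","messages":[{"role":"user","content":"hi"}]}`)
    fmt.Println(cacheKey("chat-completions", body))
}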

Cache Storage

The plugin uses a two-tier caching strategy:

  1. Redis (if available): Primary storage for distributed caching
  2. Memory: Fallback storage, or the primary store when Redis is not configured (see the sketch below)
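
The lookup order can be sketched as follows. This is not the plugin's actual code: the tier interface, the map-backed memory tier, and the lookup helper are stand-ins invented for this example.

package main

import (
    "fmt"
    "sync"
)

// tier is a stand-in for one cache backend (Redis or memory). The real
// plugin has its own types in cache.go; this interface is illustrative.
type tier interface {
    Get(key string) ([]byte, bool)
    Set(key string, value []byte)
}

// memTier is a minimal in-memory tier backed by a map, used here as the
// fallback (or only) storage.
type memTier struct {
    mu   sync.RWMutex
    data map[string][]byte
}

func newMemTier() *memTier { return &memTier{data: make(map[string][]byte)} }

func (m *memTier) Get(key string) ([]byte, bool) {
    m.mu.RLock()
    defer m.mu.RUnlock()
    v, ok := m.data[key]
    return v, ok
}

func (m *memTier) Set(key string, value []byte) {
    m.mu.Lock()
    defer m.mu.Unlock()
    m.data[key] = value
}

// lookup checks Redis first when it is configured, then falls back to
// the in-memory tier.
func lookup(redis, memory tier, key string) ([]byte, bool) {
    if redis != nil {
        if v, ok := redis.Get(key); ok {
            return v, true
        }
    }
    return memory.Get(key)
}

func main() {
    memory := newMemTier()
    memory.Set("chat-completions:abc123", []byte(`{"cached":true}`))

    // No Redis is configured in this example, so lookup falls back to memory.
    v, ok := lookup(nil, memory, "chat-completions:abc123")
    fmt.Println(ok, string(v))
}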

Request Flow

  1. Request Phase:

    • Checks whether caching is enabled
    • Generates the cache key from the request body
    • Looks up the cache (Redis first, then memory)
    • On a hit, immediately returns the cached response
    • On a miss, forwards the request to the upstream API
  2. Response Phase:

    • Captures the response body and headers
    • If the response is successful, stores it in the cache
    • Respects the size limit to prevent memory issues (see the sketch after this list)
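
A minimal sketch of the response-phase decision is shown below, assuming a plain map as the cache and a 200 status check for "successful"; the names and signatures are illustrative only, not the plugin's actual API.

package main

import (
    "fmt"
    "net/http"
)

// cachedResponse holds the captured upstream status, headers, and body.
type cachedResponse struct {
    StatusCode int
    Header     http.Header
    Body       []byte
}

// storeResponse applies the response-phase rules described above: only
// successful responses no larger than itemMaxSize are written to the
// cache. A 200 check stands in for "successful"; the real plugin may
// accept other status codes.
func storeResponse(cache map[string]cachedResponse, key string, resp cachedResponse, itemMaxSize int) bool {
    if resp.StatusCode != http.StatusOK {
        return false // do not cache failed responses
    }
    if len(resp.Body) > itemMaxSize {
        return false // respect the configured item size limit
    }
    cache[key] = resp
    return true
}

func main() {
    cache := make(map[string]cachedResponse)
    resp := cachedResponse{
        StatusCode: 200,
        Header:     http.Header{"Content-Type": {"application/json"}},
        Body:       []byte(`{"ok":true}`),
    }
    fmt.Println(storeResponse(cache, "chat-completions:abc123", resp, 1048576))
}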

Usage Example

{
    "plugin": {
        "cache": {
            "enable": true,
            "ttl": 60,
            "item_max_size": 524288,
            "add_cache_hit_header": true
        }
    }
}

Response Header Example

When add_cache_hit_header is enabled:

Cache Hit:

X-Aiproxy-Cache: hit

Cache Miss:

X-Aiproxy-Cache: miss
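
To observe this from a client, you can send the same request twice and read the header, as in the sketch below. The proxy URL and API key are placeholders, and the default X-Aiproxy-Cache header name is assumed.

package main

import (
    "bytes"
    "fmt"
    "net/http"
)

func main() {
    // Placeholder endpoint and key; replace with your aiproxy address and token.
    url := "http://localhost:3000/v1/chat/completions"
    body := []byte(`{"model":"gpt-4","messages":[{"role":"user","content":"hello"}]}`)

    for i := 0; i < 2; i++ {
        req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
        if err != nil {
            panic(err)
        }
        req.Header.Set("Content-Type", "application/json")
        req.Header.Set("Authorization", "Bearer sk-placeholder")

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            panic(err)
        }
        resp.Body.Close()

        // With caching enabled, the second identical request should report
        // a hit in the configured header (default: X-Aiproxy-Cache).
        fmt.Println("attempt", i+1, "cache:", resp.Header.Get("X-Aiproxy-Cache"))
    }
}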