Cache Plugin Configuration Guide

Overview

The Cache Plugin is a high-performance caching solution for AI API requests that reduces latency and cost by storing and reusing responses to identical requests. It supports both in-memory and Redis storage, making it suitable for single-node and distributed deployments alike.

Features

  • Dual Storage: Supports both in-memory cache and Redis for flexible deployment options
  • Automatic Fallback: Automatically falls back to in-memory cache when Redis is unavailable
  • Content-Based Caching: Uses SHA256 hash of request body to generate cache keys
  • Configurable TTL: Set custom time-to-live for cached items
  • Size Limits: Configurable maximum item size to prevent memory issues
  • Cache Headers: Optional headers to indicate cache hits
  • Zero-Copy Design: Efficient memory usage through buffer pooling

Configuration Example

{
    "model": "gpt-4",
    "type": 1,
    "plugin": {
        "cache": {
            "enable": true,
            "ttl": 300,
            "item_max_size": 1048576,
            "add_cache_hit_header": true,
            "cache_hit_header": "X-Cache-Status"
        }
    }
}

Configuration Fields

Plugin Configuration

Field                | Type   | Required | Default           | Description
enable               | bool   | Yes      | false             | Whether to enable the Cache plugin
ttl                  | int    | No       | 300               | Time-to-live for cached items (in seconds)
item_max_size        | int    | No       | 1048576 (1MB)     | Maximum size of a single cached item (in bytes)
add_cache_hit_header | bool   | No       | false             | Whether to add a header indicating cache hit
cache_hit_header     | string | No       | "X-Aiproxy-Cache" | Name of the cache hit header
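
For orientation, a plausible Go representation of this configuration block is sketched below. The struct name, field names, and JSON tags simply mirror the keys documented above and are assumptions; the actual definition in config.go may differ.

package main

import (
    "encoding/json"
    "fmt"
)

// CacheConfig is a hypothetical sketch of the plugin configuration.
// Field names mirror the documented JSON keys; the real struct in
// config.go may be defined differently.
type CacheConfig struct {
    Enable            bool   `json:"enable"`
    TTL               int    `json:"ttl"`           // seconds
    ItemMaxSize       int    `json:"item_max_size"` // bytes
    AddCacheHitHeader bool   `json:"add_cache_hit_header"`
    CacheHitHeader    string `json:"cache_hit_header"`
}

func main() {
    raw := `{"enable":true,"ttl":300,"item_max_size":1048576,"add_cache_hit_header":true,"cache_hit_header":"X-Cache-Status"}`

    var cfg CacheConfig
    if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", cfg)
}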

How It Works

Cache Key Generation

The plugin generates cache keys based on:

  1. Request pattern (e.g., chat completions)
  2. SHA256 hash of the request body

This ensures identical requests hit the cache while different requests don't interfere with each other.
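
As a rough illustration, the key derivation could look like the Go sketch below. The "pattern:digest" layout and the function name are assumptions for this example; the exact key format used in cache.go may differ.

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// cacheKey is an illustrative key builder: the request pattern plus the
// hex-encoded SHA256 digest of the raw request body.
func cacheKey(pattern string, body []byte) string {
    sum := sha256.Sum256(body)
    return pattern + ":" + hex.EncodeToString(sum[:])
}

func main() {
    body := []byte(`{"model":"gpt-4","messages":[{"role":"user","content":"hi"}]}`)
    fmt.Println(cacheKey("chat-completions", body))
}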

Cache Storage

The plugin uses a two-tier caching strategy:

  1. Redis (if available): Primary storage for distributed caching
  2. Memory: Fallback storage, or the primary store when Redis is not configured (see the sketch below)
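
The lookup order can be sketched as follows. This is not the plugin's actual code: the tier interface, the map-backed memory tier, and the lookup helper are stand-ins invented for this example.

package main

import (
    "fmt"
    "sync"
)

// tier is a stand-in for one cache backend (Redis or memory). The real
// plugin has its own types in cache.go; this interface is illustrative.
type tier interface {
    Get(key string) ([]byte, bool)
    Set(key string, value []byte)
}

// memTier is a minimal in-memory tier backed by a map, used here as the
// fallback (or only) storage.
type memTier struct {
    mu   sync.RWMutex
    data map[string][]byte
}

func newMemTier() *memTier { return &memTier{data: make(map[string][]byte)} }

func (m *memTier) Get(key string) ([]byte, bool) {
    m.mu.RLock()
    defer m.mu.RUnlock()
    v, ok := m.data[key]
    return v, ok
}

func (m *memTier) Set(key string, value []byte) {
    m.mu.Lock()
    defer m.mu.Unlock()
    m.data[key] = value
}

// lookup checks Redis first when it is configured, then falls back to
// the in-memory tier.
func lookup(redis, memory tier, key string) ([]byte, bool) {
    if redis != nil {
        if v, ok := redis.Get(key); ok {
            return v, true
        }
    }
    return memory.Get(key)
}

func main() {
    memory := newMemTier()
    memory.Set("chat-completions:abc123", []byte(`{"cached":true}`))

    // No Redis is configured in this example, so lookup falls back to memory.
    v, ok := lookup(nil, memory, "chat-completions:abc123")
    fmt.Println(ok, string(v))
}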

Request Flow

  1. Request Phase:

    • Checks whether caching is enabled
    • Generates the cache key from the request body
    • Looks up the cache (Redis first, then memory)
    • On a hit, immediately returns the cached response
    • On a miss, forwards the request to the upstream API
  2. Response Phase:

    • Captures the response body and headers
    • If the response is successful, stores it in the cache
    • Respects the size limit to prevent memory issues (see the sketch after this list)
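
A minimal sketch of the response-phase decision is shown below, assuming a plain map as the cache and a 200 status check for "successful"; the names and signatures are illustrative only, not the plugin's actual API.

package main

import (
    "fmt"
    "net/http"
)

// cachedResponse holds the captured upstream status, headers, and body.
type cachedResponse struct {
    StatusCode int
    Header     http.Header
    Body       []byte
}

// storeResponse applies the response-phase rules described above: only
// successful responses no larger than itemMaxSize are written to the
// cache. A 200 check stands in for "successful"; the real plugin may
// accept other status codes.
func storeResponse(cache map[string]cachedResponse, key string, resp cachedResponse, itemMaxSize int) bool {
    if resp.StatusCode != http.StatusOK {
        return false // do not cache failed responses
    }
    if len(resp.Body) > itemMaxSize {
        return false // respect the configured item size limit
    }
    cache[key] = resp
    return true
}

func main() {
    cache := make(map[string]cachedResponse)
    resp := cachedResponse{
        StatusCode: 200,
        Header:     http.Header{"Content-Type": {"application/json"}},
        Body:       []byte(`{"ok":true}`),
    }
    fmt.Println(storeResponse(cache, "chat-completions:abc123", resp, 1048576))
}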

Usage Example

{
    "plugin": {
        "cache": {
            "enable": true,
            "ttl": 60,
            "item_max_size": 524288,
            "add_cache_hit_header": true
        }
    }
}

Response Header Example

When add_cache_hit_header is enabled:

Cache Hit:

X-Aiproxy-Cache: hit

Cache Miss:

X-Aiproxy-Cache: miss
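
To observe this from a client, you can send the same request twice and read the header, as in the sketch below. The proxy URL and API key are placeholders, and the default X-Aiproxy-Cache header name is assumed.

package main

import (
    "bytes"
    "fmt"
    "net/http"
)

func main() {
    // Placeholder endpoint and key; replace with your aiproxy address and token.
    url := "http://localhost:3000/v1/chat/completions"
    body := []byte(`{"model":"gpt-4","messages":[{"role":"user","content":"hello"}]}`)

    for i := 0; i < 2; i++ {
        req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
        if err != nil {
            panic(err)
        }
        req.Header.Set("Content-Type", "application/json")
        req.Header.Set("Authorization", "Bearer sk-placeholder")

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            panic(err)
        }
        resp.Body.Close()

        // With caching enabled, the second identical request should report
        // a hit in the configured header (default: X-Aiproxy-Cache).
        fmt.Println("attempt", i+1, "cache:", resp.Header.Get("X-Aiproxy-Cache"))
    }
}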