
fix: remove explicit cache_control for Google models in OpenRouter (#4487) (#4488)

* fix: remove explicit cache_control for Google models in OpenRouter (#4487)

- Remove all Google models from OPEN_ROUTER_PROMPT_CACHING_MODELS set
- This resolves 3+ minute lag when using google/gemini-2.5-pro-preview
- OpenRouter still provides automatic implicit ephemeral caching for these models
- Updated tests to handle intentional exclusion of Google models from explicit caching

Fixes #4487

* refactor: simplify OpenRouter caching test logic

- Replace hardcoded exclusion list with simple Google model filter
- Keep original validation logic but make it more maintainable
- Still ensures all our caching models are supported by OpenRouter
- Still verifies we exclude all Google models from explicit caching

* cleanup: remove unused excludedModels variable

- Variable was defined but never used
- Keeps the test logic clean and focused

* refactor: only exclude google/gemini-2.5-pro-preview from caching

- More surgical approach - only exclude the specific problematic model
- Keep other Google models in caching (they work fine)
- Add comment explaining the exclusion with issue reference
- Update test to only exclude the specific model

This targets just the model causing 3+ minute lag while preserving
caching benefits for other Google models that work properly.
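For illustration, the gating that this set controls might look roughly like the sketch below. Only the OPEN_ROUTER_PROMPT_CACHING_MODELS name and the model ids come from this commit. The `withExplicitCaching` function, the message types, and the choice to mark the system prompt are assumptions made for the example, not the project's actual handler code.

```typescript
// Hypothetical message shapes, loosely modelled on chat-completion payloads.
type TextPart = { type: "text"; text: string; cache_control?: { type: "ephemeral" } }
type Message = { role: "system" | "user" | "assistant"; content: TextPart[] }

// Abbreviated copy of the set; the full list lives in packages/types/src/providers/openrouter.ts.
const OPEN_ROUTER_PROMPT_CACHING_MODELS = new Set<string>([
	"anthropic/claude-sonnet-4",
	"google/gemini-2.5-flash-preview",
	// "google/gemini-2.5-pro-preview" is intentionally absent (#4487).
])

// Hypothetical helper: returns a copy of `messages` whose system prompt carries an
// explicit cache marker on its last text part, but only for models in the set.
// Models outside the set are returned untouched and rely on implicit caching.
function withExplicitCaching(modelId: string, messages: Message[]): Message[] {
	if (!OPEN_ROUTER_PROMPT_CACHING_MODELS.has(modelId)) {
		return messages
	}

	return messages.map((message) => {
		if (message.role !== "system" || message.content.length === 0) {
			return message
		}

		const lastIndex = message.content.length - 1
		const content = message.content.map((part, index) =>
			index === lastIndex ? { ...part, cache_control: { type: "ephemeral" as const } } : part,
		)

		return { ...message, content }
	})
}
```

Under these assumptions, a request for google/gemini-2.5-pro-preview goes out without any cache_control field, while the other listed models keep their explicit markers.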
Hannes Rudolph, 6 months ago
Parent
Commit
bf35dcd626
2 changed files with 16 additions and 7 deletions
  1. +0 -1
      packages/types/src/providers/openrouter.ts
  2. +16 -6
      src/api/providers/fetchers/__tests__/openrouter.spec.ts

+ 0 - 1
packages/types/src/providers/openrouter.ts

@@ -39,7 +39,6 @@ export const OPEN_ROUTER_PROMPT_CACHING_MODELS = new Set([
 	"anthropic/claude-3.7-sonnet:thinking",
 	"anthropic/claude-sonnet-4",
 	"anthropic/claude-opus-4",
-	"google/gemini-2.5-pro-preview",
 	"google/gemini-2.5-flash-preview",
 	"google/gemini-2.5-flash-preview:thinking",
 	"google/gemini-2.5-flash-preview-05-20",
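With this entry removed, the set is no longer identical to the list of models OpenRouter reports as cache-capable, so a strict equality check between the two would fail. One way to express the intended relationship is a small comparison helper such as the sketch below. The `compareCachingModels` name and its return shape are hypothetical and not part of this commit.

```typescript
// Hypothetical helper: compares our explicit-caching set against the models the
// provider reports as cache-capable, allowing for intentional exclusions.
function compareCachingModels(
	ours: Iterable<string>,
	reported: Iterable<string>,
	excluded: Iterable<string>,
): { unsupported: string[]; missing: string[]; excludedButListed: string[] } {
	const ourSet = new Set(ours)
	const reportedSet = new Set(reported)
	const excludedSet = new Set(excluded)

	// Models we list that the provider does not report as cache-capable.
	const unsupported = Array.from(ourSet)
		.filter((id) => !reportedSet.has(id))
		.sort()

	// Cache-capable models that we neither list nor intentionally exclude.
	const missing = Array.from(reportedSet)
		.filter((id) => !ourSet.has(id) && !excludedSet.has(id))
		.sort()

	// Models marked as excluded that are still present in our set.
	const excludedButListed = Array.from(excludedSet)
		.filter((id) => ourSet.has(id))
		.sort()

	return { unsupported, missing, excludedButListed }
}
```

A test could then assert that all three arrays are empty, which would name the offending model ids on failure instead of diffing two long sorted lists.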

+ 16 - 6
src/api/providers/fetchers/__tests__/openrouter.spec.ts

@@ -23,12 +23,22 @@ describe("OpenRouter API", () => {
 
 			const models = await getOpenRouterModels()
 
-			expect(
-				Object.entries(models)
-					.filter(([_, model]) => model.supportsPromptCache)
-					.map(([id, _]) => id)
-					.sort(),
-			).toEqual(Array.from(OPEN_ROUTER_PROMPT_CACHING_MODELS).sort())
+			const openRouterSupportedCaching = Object.entries(models)
+				.filter(([_, model]) => model.supportsPromptCache)
+				.map(([id, _]) => id)
+
+			const ourCachingModels = Array.from(OPEN_ROUTER_PROMPT_CACHING_MODELS)
+
+			// Verify all our caching models are actually supported by OpenRouter
+			for (const modelId of ourCachingModels) {
+				expect(openRouterSupportedCaching).toContain(modelId)
+			}
+
+			// Verify we have all supported models except intentionally excluded ones
+			const excludedModels = new Set(["google/gemini-2.5-pro-preview"]) // Excluded due to lag issue (#4487)
+			const expectedCachingModels = openRouterSupportedCaching.filter((id) => !excludedModels.has(id)).sort()
+
+			expect(ourCachingModels.sort()).toEqual(expectedCachingModels)
 
 			expect(
 				Object.entries(models)