
fix(logs): improve fake 200 error logs (#790)

* fix(logs): improve fake 200 error logs (#765)

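* fix(proxy): detect 200+HTML fake 200 responses and trigger failover

* fix(utils): tighten HTML document detection to avoid false positives

* fix(proxy): add strong-signal JSON error detection for non-streaming fake 200 responses

* fix(utils): make fake-200 detection BOM-compatible

* perf(proxy): lower the read limit for non-streaming sniffing

* fix(proxy): hide internal FAKE_200_* codes from clients

* fix(logs): fill in error-cause display for endpoint_pool_exhausted/404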

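- Write attemptNumber for endpoint_pool_exhausted so it is not swallowed by initial_selection/session_reuse deduplication
- Fill in the failure state and description for resource_not_found in the decision chain/technical timeline
- Update provider-chain i18n strings and add unit test coverage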

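* fix(proxy): extend non-streaming JSON fake-200 detection to responses with Content-Length

- Also run strong-signal sniffing on 2xx responses with application/json and Content-Length<=32KiB
- Add a regression test for 200+JSON error (with Content-Length) triggering failover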

* chore: format code (fix-issue-749-fake-200-html-detection-005fad3)

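* fix(i18n): correct the ru endpoint-pool-exhausted string

- Fix the genitive plural spelling of "endpoint" in Russian (конечных точек)
- Keys are unaffected; only the display text changes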

* test(formatter): cover resource_not_found combination scenarios

- Cover the resource_not_found + retry_success multi-provider chain
- Cover the fallback rendering path when errorDetails.provider is missing

* fix(proxy): include a sanitized snippet in the FAKE_200 client message

* fix: improve FAKE_200 error reason messages

* fix(proxy): return the original fake-200 body when verboseProviderError is enabled

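- fake-200/empty response: when verboseProviderError is enabled, return a detailed report and the upstream raw body in error.details (not persisted to the database)
- forwarder: attach the detected raw body snippet to ProxyError.upstreamError.rawBody
- tests: cover verbose details and rawBody passthrough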

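* fix(proxy): harden Content-Length validation and prevent fake-200 snippet leaks

- forwarder: treat an illegal Content-Length as invalid so HTML/empty responses are not missed
- errors: second truncation plus lightweight sanitization of the FAKE_200 client detail (defensive)
- tests: cover the regression where an illegal Content-Length caused missed detection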

* docs(proxy): document the non-streaming fake-200 detection limit

* docs(settings): add a security note for verboseProviderError

* fix(proxy): basic sanitization of verboseProviderError rawBody

* chore: format code (fix-issue-749-fake-200-html-detection-b56b790)

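* docs(settings): document basic sanitization for verboseProviderError

* fix(proxy/logs): infer the status code for fake 200 responses and mark it prominently

* fix(i18n): revert verboseProviderErrorDesc to the original text

* fix(stream): do not count 404 resource-not-found toward the circuit breaker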

---------

Co-authored-by: tesgth032 <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ding <[email protected]>

* fix: add missing import for inferUpstreamErrorStatusCodeFromText

The function inferUpstreamErrorStatusCodeFromText was used in
response-handler.ts but was not imported, causing a TypeScript
compilation error during build.

Fixed:
- Added inferUpstreamErrorStatusCodeFromText to imports from @/lib/utils/upstream-error-detection

CI Run: https://github.com/ding113/claude-code-hub/actions/runs/22033028838

* fix(proxy): deduplicate getFake200ReasonKey and strengthen client-facing sanitization

Extract duplicated getFake200ReasonKey() from SummaryTab and
ProviderChainPopover into a shared fake200-reason.ts utility,
eliminating the risk of silent drift when new FAKE_200_* codes are added.

Replace the 3-pattern manual sanitization in getClientSafeMessage()
with the existing sanitizeErrorTextForDetail() (6 patterns), closing
a gap where JWT tokens, emails, and password/config paths could leak
to clients via the FAKE_200 error detail path.

Add unit tests verifying JWT, email, and password sanitization.

* fix(proxy): address bugbot review comments on fake-200 error handling

- Add i18n for HTTP status prefix in LogicTraceTab (5 languages)
- Wrap verbose details gathering in try-catch to prevent cascading failures
- Truncate rawBody to 4096 chars before sanitization in error-handler
- Tighten not_found regex to require contextual prefixes, preventing false 404 inference
- Add debug logging to silent catch blocks in readResponseTextUpTo
- Add test assertion for fake200DetectedReason display
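
The truncate-before-sanitize item in the list above can be illustrated with a minimal sketch. Everything here is hypothetical: `truncateThenRedact`, `MAX_RAW_BODY_CHARS`, and the three redaction patterns are stand-ins chosen for illustration, not the project's `sanitizeErrorTextForDetail()` or its six patterns.

```typescript
// Hypothetical limit mirroring the 4096-char truncation mentioned in the commit message.
const MAX_RAW_BODY_CHARS = 4096;

// Illustrative redaction patterns only; NOT the project's actual pattern set.
const REDACTIONS: Array<[RegExp, string]> = [
  // JWT-like tokens: three base64url segments separated by dots.
  [/\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g, "[JWT]"],
  // Email addresses.
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[EMAIL]"],
  // password=... or password: ... pairs; keep the key, drop the value.
  [/(password\s*[=:]\s*)\S+/gi, "$1[REDACTED]"],
];

function truncateThenRedact(rawBody: string, maxChars: number = MAX_RAW_BODY_CHARS): string {
  // Truncate first so the regexes never scan an unbounded upstream body.
  const bounded = rawBody.length > maxChars ? rawBody.slice(0, maxChars) : rawBody;
  // Apply each redaction in order to the bounded text.
  return REDACTIONS.reduce(
    (text, [pattern, replacement]) => text.replace(pattern, replacement),
    bounded
  );
}

console.log(truncateThenRedact("user a@b.com password=hunter2"));
// → user [EMAIL] password=[REDACTED]
```

One trade-off of this ordering: a secret that straddles the cut point may be shortened enough that its pattern no longer matches, so a fragment could survive. Whether that matters depends on how the real patterns are written.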

---------

Co-authored-by: tesgth032 <[email protected]>
Co-authored-by: tesgth032 <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
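
The detection signals named in the commit message above (empty body, HTML document, JSON `error` fields, BOM tolerance) can be sketched roughly as follows. This is an assumption-laden illustration: `detectFake200` is a hypothetical name, the check order and the HTML heuristic are guesses, and the project's real logic in `upstream-error-detection.ts` may differ. Only the `FAKE_200_*` code names are taken from the diff.

```typescript
type Fake200Code =
  | "FAKE_200_EMPTY_BODY"
  | "FAKE_200_HTML_BODY"
  | "FAKE_200_JSON_ERROR_NON_EMPTY"
  | "FAKE_200_JSON_ERROR_MESSAGE_NON_EMPTY"
  | "FAKE_200_JSON_MESSAGE_KEYWORD_MATCH";

// Hypothetical detector; returns a reason code, or null when the body looks healthy.
function detectFake200(bodyText: string): Fake200Code | null {
  // Strip a leading BOM (U+FEFF) and surrounding whitespace before sniffing.
  const text = bodyText.replace(/^\uFEFF/, "").trim();
  if (text.length === 0) return "FAKE_200_EMPTY_BODY";

  // Only flag HTML when the body starts like a document, to limit false positives.
  if (/^(?:<!doctype\s+html|<html[\s>])/i.test(text)) return "FAKE_200_HTML_BODY";

  // Anything that is not a JSON object is left alone in this sketch.
  if (!text.startsWith("{")) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const obj = parsed as Record<string, unknown>;
  const err = obj.error;
  if (typeof err === "string" && err.trim() !== "") return "FAKE_200_JSON_ERROR_NON_EMPTY";
  if (typeof err === "object" && err !== null) {
    const errMessage = (err as Record<string, unknown>).message;
    if (typeof errMessage === "string" && errMessage.trim() !== "") {
      return "FAKE_200_JSON_ERROR_MESSAGE_NON_EMPTY";
    }
    if (Object.keys(err).length > 0) return "FAKE_200_JSON_ERROR_NON_EMPTY";
  }

  // Weakest signal: a top-level `message` that mentions "error" (heuristic).
  const message = obj.message;
  if (typeof message === "string" && /error/i.test(message)) {
    return "FAKE_200_JSON_MESSAGE_KEYWORD_MATCH";
  }
  return null;
}
```

In the commit's terms, a non-null result on a 2xx response would be the cue to treat the attempt as failed and trigger failover; how that result is wired into the forwarder is not shown here.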
Ding 6 days ago
Parent
Current commit
aacaae47f2
30 files changed, with 1973 insertions and 78 deletions
  1. 13 0
      messages/en/dashboard.json
  2. 5 0
      messages/en/provider-chain.json
  3. 13 0
      messages/ja/dashboard.json
  4. 5 0
      messages/ja/provider-chain.json
  5. 13 0
      messages/ru/dashboard.json
  6. 5 0
      messages/ru/provider-chain.json
  7. 13 0
      messages/zh-CN/dashboard.json
  8. 5 0
      messages/zh-CN/provider-chain.json
  9. 13 0
      messages/zh-TW/dashboard.json
  10. 5 0
      messages/zh-TW/provider-chain.json
  11. 11 0
      src/app/[locale]/dashboard/logs/_components/error-details-dialog.test.tsx
  12. 14 2
      src/app/[locale]/dashboard/logs/_components/error-details-dialog/components/LogicTraceTab.tsx
  13. 10 0
      src/app/[locale]/dashboard/logs/_components/error-details-dialog/components/SummaryTab.tsx
  14. 15 0
      src/app/[locale]/dashboard/logs/_components/fake200-reason.ts
  15. 31 0
      src/app/[locale]/dashboard/logs/_components/provider-chain-popover.test.tsx
  16. 80 12
      src/app/[locale]/dashboard/logs/_components/provider-chain-popover.tsx
  17. 64 1
      src/app/v1/_lib/proxy/error-handler.ts
  18. 81 1
      src/app/v1/_lib/proxy/errors.ts
  19. 204 20
      src/app/v1/_lib/proxy/forwarder.ts
  20. 59 30
      src/app/v1/_lib/proxy/response-handler.ts
  21. 2 0
      src/app/v1/_lib/proxy/session.ts
  22. 107 0
      src/lib/utils/provider-chain-formatter.test.ts
  23. 59 4
      src/lib/utils/provider-chain-formatter.ts
  24. 107 1
      src/lib/utils/upstream-error-detection.test.ts
  25. 176 2
      src/lib/utils/upstream-error-detection.ts
  26. 9 0
      src/types/message.ts
  27. 184 0
      tests/unit/proxy/error-handler-verbose-provider-error-details.test.ts
  28. 53 0
      tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts
  29. 573 0
      tests/unit/proxy/proxy-forwarder-fake-200-html.test.ts
  30. 44 5
      tests/unit/proxy/response-handler-endpoint-circuit-isolation.test.ts

+ 13 - 0
messages/en/dashboard.json

@@ -244,6 +244,18 @@
       },
       "errorMessage": "Error Message",
       "fake200ForwardedNotice": "Note: For streaming requests, this failure may be detected only after the stream ends; the response content may already have been forwarded to the client.",
+      "fake200DetectedReason": "Detected reason: {reason}",
+      "fake200Reasons": {
+        "emptyBody": "Empty response body",
+        "htmlBody": "HTML document returned (likely an error page)",
+        "jsonErrorNonEmpty": "JSON has a non-empty `error` field",
+        "jsonErrorMessageNonEmpty": "JSON has a non-empty `error.message`",
+        "jsonMessageKeywordMatch": "JSON `message` contains the word \"error\" (heuristic)",
+        "unknown": "Response body indicates an error"
+      },
+      "statusCodeInferredBadge": "Inferred",
+      "statusCodeInferredTooltip": "This status code is inferred from response body content (e.g., fake 200) and may differ from the upstream HTTP status.",
+      "statusCodeInferredSuffix": "(inferred)",
       "filteredProviders": "Filtered Providers",
       "providerChain": {
         "title": "Provider Decision Chain Timeline",
@@ -315,6 +327,7 @@
         "prioritySelection": "Priority Selection",
         "attemptProvider": "Attempt: {provider}",
         "retryAttempt": "Retry #{number}",
+        "httpStatus": "HTTP {code}{inferredSuffix}",
         "sessionReuse": "Session Reuse",
         "sessionReuseDesc": "Reusing provider from session cache",
         "sessionReuseTitle": "Session Binding",

+ 5 - 0
messages/en/provider-chain.json

@@ -35,6 +35,7 @@
     "candidate": "{name}({probability}%)",
     "requestChain": "Request Chain:",
     "systemError": "System Error",
+    "resourceNotFound": "Resource Not Found (404)",
     "concurrentLimit": "Concurrent Limit",
     "http2Fallback": "HTTP/2 Fallback",
     "clientError": "Client Error",
@@ -46,6 +47,7 @@
     "retry_success": "Retry Success",
     "retry_failed": "Retry Failed",
     "system_error": "System Error",
+    "resource_not_found": "Resource Not Found (404)",
     "client_error_non_retryable": "Client Error",
     "concurrent_limit_failed": "Concurrent Limit",
     "http2_fallback": "HTTP/2 Fallback",
@@ -128,11 +130,13 @@
     "candidateInfo": "  • {name}: weight={weight} cost={cost} probability={probability}%",
     "selected": "✓ Selected: {provider}",
     "requestFailed": "Request Failed (Attempt {attempt})",
+    "resourceNotFoundFailed": "Resource Not Found (404) (Attempt {attempt})",
     "attemptNumber": "Attempt {number}",
     "firstAttempt": "First Attempt",
     "nthAttempt": "Attempt {attempt}",
     "provider": "Provider: {provider}",
     "statusCode": "Status Code: {code}",
+    "statusCodeInferred": "Status Code (inferred): {code}",
     "error": "Error: {error}",
     "requestDuration": "Request Duration: {duration}ms",
     "requestDurationSeconds": "Request Duration: {duration}s",
@@ -158,6 +162,7 @@
     "meaning": "Meaning",
     "notCountedInCircuit": "This error is not counted in provider circuit breaker",
     "systemErrorNote": "Note: This error is not counted in provider circuit breaker",
+    "resourceNotFoundNote": "Note: This error is not counted in the circuit breaker and will trigger failover after retries are exhausted.",
     "reselection": "Reselecting Provider",
     "reselect": "Reselecting Provider",
     "excluded": "Excluded: {providers}",

+ 13 - 0
messages/ja/dashboard.json

@@ -244,6 +244,18 @@
       },
       "errorMessage": "エラーメッセージ",
       "fake200ForwardedNotice": "注意:ストリーミング要求では、失敗判定がストリーム終了後になる場合があります。応答内容は既にクライアントへ転送されている可能性があります。",
+      "fake200DetectedReason": "検出理由:{reason}",
+      "fake200Reasons": {
+        "emptyBody": "レスポンス本文が空です",
+        "htmlBody": "HTML ドキュメントが返されました (エラーページの可能性)",
+        "jsonErrorNonEmpty": "JSON の `error` フィールドが空ではありません",
+        "jsonErrorMessageNonEmpty": "JSON の `error.message` が空ではありません",
+        "jsonMessageKeywordMatch": "JSON の `message` に \"error\" が含まれています (ヒューリスティック)",
+        "unknown": "レスポンス本文がエラーを示しています"
+      },
+      "statusCodeInferredBadge": "推定",
+      "statusCodeInferredTooltip": "このステータスコードは応答本文の内容(例: fake 200)から推定されており、上流の HTTP ステータスと異なる場合があります。",
+      "statusCodeInferredSuffix": "(推定)",
       "filteredProviders": "フィルタされたプロバイダー",
       "providerChain": {
         "title": "プロバイダー決定チェーンタイムライン",
@@ -315,6 +327,7 @@
         "prioritySelection": "優先度選択",
         "attemptProvider": "試行: {provider}",
         "retryAttempt": "再試行 #{number}",
+        "httpStatus": "HTTP {code}{inferredSuffix}",
         "sessionReuse": "セッション再利用",
         "sessionReuseDesc": "セッションキャッシュからプロバイダーを再利用",
         "sessionReuseTitle": "セッションバインディング",

+ 5 - 0
messages/ja/provider-chain.json

@@ -35,6 +35,7 @@
     "candidate": "{name}({probability}%)",
     "requestChain": "リクエストチェーン:",
     "systemError": "システムエラー",
+    "resourceNotFound": "リソースが見つかりません(404)",
     "concurrentLimit": "同時実行制限",
     "http2Fallback": "HTTP/2 フォールバック",
     "clientError": "クライアントエラー",
@@ -46,6 +47,7 @@
     "retry_success": "リトライ成功",
     "retry_failed": "リトライ失敗",
     "system_error": "システムエラー",
+    "resource_not_found": "リソースが見つかりません(404)",
     "client_error_non_retryable": "クライアントエラー",
     "concurrent_limit_failed": "同時実行制限",
     "http2_fallback": "HTTP/2 フォールバック",
@@ -128,11 +130,13 @@
     "candidateInfo": "  • {name}: 重み={weight} コスト={cost} 確率={probability}%",
     "selected": "✓ 選択: {provider}",
     "requestFailed": "リクエスト失敗(試行{attempt})",
+    "resourceNotFoundFailed": "リソースが見つかりません(404)(試行{attempt})",
     "attemptNumber": "試行 {number}",
     "firstAttempt": "初回試行",
     "nthAttempt": "試行{attempt}",
     "provider": "プロバイダー: {provider}",
     "statusCode": "ステータスコード: {code}",
+    "statusCodeInferred": "ステータスコード(推定): {code}",
     "error": "エラー: {error}",
     "requestDuration": "リクエスト時間: {duration}ms",
     "requestDurationSeconds": "リクエスト時間: {duration}s",
@@ -158,6 +162,7 @@
     "meaning": "意味",
     "notCountedInCircuit": "このエラーはプロバイダーサーキットブレーカーにカウントされません",
     "systemErrorNote": "注記:このエラーはプロバイダーサーキットブレーカーにカウントされません",
+    "resourceNotFoundNote": "注記:このエラーはサーキットブレーカーにカウントされず、リトライ枯渇後にフェイルオーバーします。",
     "reselection": "プロバイダー再選択",
     "reselect": "プロバイダー再選択",
     "excluded": "除外済み: {providers}",

+ 13 - 0
messages/ru/dashboard.json

@@ -244,6 +244,18 @@
       },
       "errorMessage": "Сообщение об ошибке",
       "fake200ForwardedNotice": "Примечание: для потоковых запросов эта ошибка может быть обнаружена только после завершения потока; содержимое ответа могло уже быть передано клиенту.",
+      "fake200DetectedReason": "Причина обнаружения: {reason}",
+      "fake200Reasons": {
+        "emptyBody": "Пустое тело ответа",
+        "htmlBody": "Получен HTML-документ (возможно, страница ошибки)",
+        "jsonErrorNonEmpty": "В JSON непустое поле `error`",
+        "jsonErrorMessageNonEmpty": "В JSON непустое `error.message`",
+        "jsonMessageKeywordMatch": "В JSON `message` содержит слово \"error\" (эвристика)",
+        "unknown": "Тело ответа указывает на ошибку"
+      },
+      "statusCodeInferredBadge": "Предположено",
+      "statusCodeInferredTooltip": "Этот код состояния выведен по содержимому тела ответа (например, fake 200) и может отличаться от HTTP-кода апстрима.",
+      "statusCodeInferredSuffix": "(предп.)",
       "filteredProviders": "Отфильтрованные поставщики",
       "providerChain": {
         "title": "Хронология цепочки решений поставщика",
@@ -315,6 +327,7 @@
         "prioritySelection": "Выбор по приоритету",
         "attemptProvider": "Попытка: {provider}",
         "retryAttempt": "Повтор #{number}",
+        "httpStatus": "HTTP {code}{inferredSuffix}",
         "sessionReuse": "Повторное использование сессии",
         "sessionReuseDesc": "Провайдер из кэша сессии",
         "sessionReuseTitle": "Привязка сессии",

+ 5 - 0
messages/ru/provider-chain.json

@@ -35,6 +35,7 @@
     "candidate": "{name}({probability}%)",
     "requestChain": "Цепочка запросов:",
     "systemError": "Системная ошибка",
+    "resourceNotFound": "Ресурс не найден (404)",
     "concurrentLimit": "Лимит параллельных запросов",
     "http2Fallback": "Откат HTTP/2",
     "clientError": "Ошибка клиента",
@@ -46,6 +47,7 @@
     "retry_success": "Повтор успешен",
     "retry_failed": "Повтор не удался",
     "system_error": "Системная ошибка",
+    "resource_not_found": "Ресурс не найден (404)",
     "client_error_non_retryable": "Ошибка клиента",
     "concurrent_limit_failed": "Лимит параллельных запросов",
     "http2_fallback": "Откат HTTP/2",
@@ -128,11 +130,13 @@
     "candidateInfo": "  • {name}: вес={weight} стоимость={cost} вероятность={probability}%",
     "selected": "✓ Выбрано: {provider}",
     "requestFailed": "Запрос не выполнен (Попытка {attempt})",
+    "resourceNotFoundFailed": "Ресурс не найден (404) (Попытка {attempt})",
     "attemptNumber": "Попытка {number}",
     "firstAttempt": "Первая попытка",
     "nthAttempt": "Попытка {attempt}",
     "provider": "Провайдер: {provider}",
     "statusCode": "Код состояния: {code}",
+    "statusCodeInferred": "Код состояния (выведено): {code}",
     "error": "Ошибка: {error}",
     "requestDuration": "Длительность запроса: {duration}мс",
     "requestDurationSeconds": "Длительность запроса: {duration}с",
@@ -158,6 +162,7 @@
     "meaning": "Значение",
     "notCountedInCircuit": "Эта ошибка не учитывается в автомате защиты провайдера",
     "systemErrorNote": "Примечание: Эта ошибка не учитывается в автомате защиты провайдера",
+    "resourceNotFoundNote": "Примечание: Эта ошибка не учитывается в автомате защиты; после исчерпания повторов произойдёт переключение.",
     "reselection": "Повторный выбор провайдера",
     "reselect": "Повторный выбор провайдера",
     "excluded": "Исключено: {providers}",

+ 13 - 0
messages/zh-CN/dashboard.json

@@ -244,6 +244,18 @@
       },
       "errorMessage": "错误信息",
       "fake200ForwardedNotice": "提示:对于流式请求,该失败可能在流结束后才被识别;响应内容可能已原样透传给客户端。",
+      "fake200DetectedReason": "检测原因:{reason}",
+      "fake200Reasons": {
+        "emptyBody": "响应体为空",
+        "htmlBody": "返回了 HTML 文档(可能是错误页)",
+        "jsonErrorNonEmpty": "JSON 顶层 error 字段非空",
+        "jsonErrorMessageNonEmpty": "JSON 中 error.message 非空",
+        "jsonMessageKeywordMatch": "JSON message 字段包含 \"error\"(启发式)",
+        "unknown": "响应体内容指示错误"
+      },
+      "statusCodeInferredBadge": "推测",
+      "statusCodeInferredTooltip": "该状态码根据响应体内容推断(例如假200),可能与上游真实 HTTP 状态码不同。",
+      "statusCodeInferredSuffix": "(推测)",
       "filteredProviders": "被过滤的供应商",
       "providerChain": {
         "title": "供应商决策链时间线",
@@ -315,6 +327,7 @@
         "prioritySelection": "优先级选择",
         "attemptProvider": "尝试: {provider}",
         "retryAttempt": "重试 #{number}",
+        "httpStatus": "HTTP {code}{inferredSuffix}",
         "sessionReuse": "会话复用",
         "sessionReuseDesc": "从会话缓存复用供应商",
         "sessionReuseTitle": "会话绑定",

+ 5 - 0
messages/zh-CN/provider-chain.json

@@ -35,6 +35,7 @@
     "candidate": "{name}({probability}%)",
     "requestChain": "请求链路:",
     "systemError": "系统错误",
+    "resourceNotFound": "资源不存在(404)",
     "concurrentLimit": "并发限制",
     "http2Fallback": "HTTP/2 回退",
     "clientError": "客户端错误",
@@ -46,6 +47,7 @@
     "retry_success": "重试成功",
     "retry_failed": "重试失败",
     "system_error": "系统错误",
+    "resource_not_found": "资源不存在(404)",
     "client_error_non_retryable": "客户端错误",
     "concurrent_limit_failed": "并发限制",
     "http2_fallback": "HTTP/2 回退",
@@ -128,11 +130,13 @@
     "candidateInfo": "  • {name}: 权重={weight} 成本={cost} 概率={probability}%",
     "selected": "✓ 选择: {provider}",
     "requestFailed": "请求失败(第 {attempt} 次尝试)",
+    "resourceNotFoundFailed": "资源不存在(404,第 {attempt} 次尝试)",
     "attemptNumber": "第 {number} 次",
     "firstAttempt": "首次尝试",
     "nthAttempt": "第 {attempt} 次尝试",
     "provider": "供应商: {provider}",
     "statusCode": "状态码: {code}",
+    "statusCodeInferred": "状态码(推测): {code}",
     "error": "错误: {error}",
     "requestDuration": "请求耗时: {duration}ms",
     "requestDurationSeconds": "请求耗时: {duration}s",
@@ -158,6 +162,7 @@
     "meaning": "含义",
     "notCountedInCircuit": "此错误不计入供应商熔断器",
     "systemErrorNote": "说明:此错误不计入供应商熔断器",
+    "resourceNotFoundNote": "说明:该错误不计入熔断器;重试耗尽后将触发故障转移。",
     "reselection": "重新选择供应商",
     "reselect": "重新选择供应商",
     "excluded": "已排除: {providers}",

+ 13 - 0
messages/zh-TW/dashboard.json

@@ -244,6 +244,18 @@
       },
       "errorMessage": "錯誤訊息",
       "fake200ForwardedNotice": "提示:對於串流請求,此失敗可能在串流結束後才被識別;回應內容可能已原樣透傳給用戶端。",
+      "fake200DetectedReason": "檢測原因:{reason}",
+      "fake200Reasons": {
+        "emptyBody": "回應本文為空",
+        "htmlBody": "回傳了 HTML 文件(可能是錯誤頁)",
+        "jsonErrorNonEmpty": "JSON 頂層 error 欄位非空",
+        "jsonErrorMessageNonEmpty": "JSON 中 error.message 非空",
+        "jsonMessageKeywordMatch": "JSON message 欄位包含 \"error\"(啟發式)",
+        "unknown": "回應本文內容顯示錯誤"
+      },
+      "statusCodeInferredBadge": "推測",
+      "statusCodeInferredTooltip": "此狀態碼係根據回應內容推測(例如假200),可能與上游真實 HTTP 狀態碼不同。",
+      "statusCodeInferredSuffix": "(推測)",
       "filteredProviders": "被過濾的供應商",
       "providerChain": {
         "title": "供應商決策鏈時間軸",
@@ -315,6 +327,7 @@
         "prioritySelection": "優先順序選擇",
         "attemptProvider": "嘗試: {provider}",
         "retryAttempt": "重試 #{number}",
+        "httpStatus": "HTTP {code}{inferredSuffix}",
         "sessionReuse": "會話複用",
         "sessionReuseDesc": "從會話快取複用供應商",
         "sessionReuseTitle": "會話綁定",

+ 5 - 0
messages/zh-TW/provider-chain.json

@@ -35,6 +35,7 @@
     "candidate": "{name}({probability}%)",
     "requestChain": "請求鏈路:",
     "systemError": "系統錯誤",
+    "resourceNotFound": "資源不存在(404)",
     "concurrentLimit": "並發限制",
     "http2Fallback": "HTTP/2 回退",
     "clientError": "客戶端錯誤",
@@ -46,6 +47,7 @@
     "retry_success": "重試成功",
     "retry_failed": "重試失敗",
     "system_error": "系統錯誤",
+    "resource_not_found": "資源不存在(404)",
     "client_error_non_retryable": "客戶端錯誤",
     "concurrent_limit_failed": "並發限制",
     "http2_fallback": "HTTP/2 回退",
@@ -128,11 +130,13 @@
     "candidateInfo": "  • {name}: 權重={weight} 成本={cost} 概率={probability}%",
     "selected": "✓ 選擇: {provider}",
     "requestFailed": "請求失敗(第 {attempt} 次嘗試)",
+    "resourceNotFoundFailed": "資源不存在(404,第 {attempt} 次嘗試)",
     "attemptNumber": "第 {number} 次",
     "firstAttempt": "首次嘗試",
     "nthAttempt": "第 {attempt} 次嘗試",
     "provider": "供應商: {provider}",
     "statusCode": "狀態碼: {code}",
+    "statusCodeInferred": "狀態碼(推測): {code}",
     "error": "錯誤: {error}",
     "requestDuration": "請求耗時: {duration}ms",
     "requestDurationSeconds": "請求耗時: {duration}s",
@@ -158,6 +162,7 @@
     "meaning": "含義",
     "notCountedInCircuit": "此錯誤不計入供應商熔斷器",
     "systemErrorNote": "說明:此錯誤不計入供應商熔斷器",
+    "resourceNotFoundNote": "說明:該錯誤不計入熔斷器;重試耗盡後將觸發故障轉移。",
     "reselection": "重新選擇供應商",
     "reselect": "重新選擇供應商",
     "excluded": "已排除: {providers}",

+ 11 - 0
src/app/[locale]/dashboard/logs/_components/error-details-dialog.test.tsx

@@ -245,6 +245,7 @@ const messages = {
           prioritySelection: "Priority Selection",
           attemptProvider: "Attempt: {provider}",
           retryAttempt: "Retry #{number}",
+          httpStatus: "HTTP {code}{inferredSuffix}",
         },
         noError: {
           processing: "No error (processing)",
@@ -253,6 +254,15 @@ const messages = {
         },
         errorMessage: "Error message",
         fake200ForwardedNotice: "Note: detected after stream end; payload may have been forwarded",
+        fake200DetectedReason: "Detected reason: {reason}",
+        fake200Reasons: {
+          emptyBody: "Empty response body",
+          htmlBody: "HTML document returned",
+          jsonErrorNonEmpty: "JSON has non-empty error field",
+          jsonErrorMessageNonEmpty: "JSON has non-empty error.message",
+          jsonMessageKeywordMatch: 'JSON message contains "error"',
+          unknown: "Response body indicates an error",
+        },
         viewDetails: "View details",
         filteredProviders: "Filtered providers",
         providerChain: {
@@ -339,6 +349,7 @@ describe("error-details-dialog layout", () => {
 
     expect(html).toContain("FAKE_200_EMPTY_BODY");
     expect(html).toContain("Note: detected after stream end; payload may have been forwarded");
+    expect(html).toContain("Detected reason: Empty response body");
   });
 
   test("renders special settings section when specialSettings exists", () => {

+ 14 - 2
src/app/[locale]/dashboard/logs/_components/error-details-dialog/components/LogicTraceTab.tsx

@@ -39,7 +39,9 @@ function getRequestStatus(item: ProviderChainItem): StepStatus {
   if (
     item.reason === "retry_failed" ||
     item.reason === "system_error" ||
+    item.reason === "resource_not_found" ||
     item.reason === "client_error_non_retryable" ||
+    item.reason === "endpoint_pool_exhausted" ||
     item.reason === "concurrent_limit_failed"
   ) {
     return "failure";
@@ -464,10 +466,20 @@ export function LogicTraceTab({
                 subtitle={
                   isSessionReuse
                     ? item.statusCode
-                      ? `HTTP ${item.statusCode}`
+                      ? t("logicTrace.httpStatus", {
+                          code: item.statusCode,
+                          inferredSuffix: item.statusCodeInferred
+                            ? ` ${t("statusCodeInferredSuffix")}`
+                            : "",
+                        })
                       : item.name
                     : item.statusCode
-                      ? `HTTP ${item.statusCode}`
+                      ? t("logicTrace.httpStatus", {
+                          code: item.statusCode,
+                          inferredSuffix: item.statusCodeInferred
+                            ? ` ${t("statusCodeInferredSuffix")}`
+                            : "",
+                        })
                       : item.reason
                         ? tChain(`reasons.${item.reason}`)
                         : undefined
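
The hunk above repeats the same `t("logicTrace.httpStatus", ...)` call in both branches. One possible way to factor it out is sketched below. `formatHttpStatus`, the `Translate` type, and the tiny in-memory translator are hypothetical and exist only so the snippet runs standalone; the message keys and templates are the ones added in this commit.

```typescript
// Hypothetical translator shape: (key, values) => formatted string.
type Translate = (key: string, values?: Record<string, string | number>) => string;

// Hypothetical helper: builds the "HTTP {code}{inferredSuffix}" subtitle once.
function formatHttpStatus(t: Translate, statusCode: number, inferred?: boolean): string {
  return t("logicTrace.httpStatus", {
    code: statusCode,
    // Leading space keeps the suffix separated from the code, as in the diff.
    inferredSuffix: inferred ? ` ${t("statusCodeInferredSuffix")}` : "",
  });
}

// Minimal stand-in translator for demonstration; the real component gets `t` from its i18n hook.
const demoMessages: Record<string, string> = {
  "logicTrace.httpStatus": "HTTP {code}{inferredSuffix}",
  statusCodeInferredSuffix: "(inferred)",
};
const demoT: Translate = (key, values = {}) =>
  (demoMessages[key] ?? key).replace(/\{(\w+)\}/g, (_match, name: string) =>
    String(values[name] ?? "")
  );

console.log(formatHttpStatus(demoT, 429, true)); // → HTTP 429 (inferred)
console.log(formatHttpStatus(demoT, 200)); // → HTTP 200
```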

+ 10 - 0
src/app/[locale]/dashboard/logs/_components/error-details-dialog/components/SummaryTab.tsx

@@ -20,6 +20,7 @@ import { Button } from "@/components/ui/button";
 import { Link } from "@/i18n/routing";
 import { cn, formatTokenAmount } from "@/lib/utils";
 import { formatCurrency } from "@/lib/utils/currency";
+import { getFake200ReasonKey } from "../../fake200-reason";
 import {
   calculateOutputRate,
   isInProgressStatus,
@@ -67,6 +68,10 @@ export function SummaryTab({
     specialSettings && specialSettings.length > 0 ? JSON.stringify(specialSettings, null, 2) : null;
   const isFake200PostStreamFailure =
     typeof errorMessage === "string" && errorMessage.startsWith("FAKE_200_");
+  const fake200Reason =
+    isFake200PostStreamFailure && typeof errorMessage === "string"
+      ? t(getFake200ReasonKey(errorMessage, "fake200Reasons"))
+      : null;
 
   return (
     <div className="space-y-6">
@@ -426,6 +431,11 @@ export function SummaryTab({
             <p className="text-xs text-rose-800 dark:text-rose-200 line-clamp-3 font-mono">
               {errorMessage.length > 200 ? `${errorMessage.slice(0, 200)}...` : errorMessage}
             </p>
+            {isFake200PostStreamFailure && fake200Reason && (
+              <p className="mt-2 text-[11px] text-rose-800 dark:text-rose-200">
+                {t("fake200DetectedReason", { reason: fake200Reason })}
+              </p>
+            )}
             {/* 注意:假 200 检测发生在 SSE 流式结束后;此时内容已可能透传给客户端,因此需要提示用户避免误解。 */}
             {isFake200PostStreamFailure && (
               <div className="mt-2 flex items-start gap-2 rounded-md border border-amber-200 bg-amber-50 p-2 text-[11px] text-amber-800 dark:border-amber-800 dark:bg-amber-950/20 dark:text-amber-200">

+ 15 - 0
src/app/[locale]/dashboard/logs/_components/fake200-reason.ts

@@ -0,0 +1,15 @@
+// Shared mapping from internal FAKE_200_* error codes to i18n suffix keys.
+// These codes represent: upstream returned 2xx but the body looks like an error page / error JSON.
+// UI-only: does not participate in detection logic.
+
+const FAKE_200_REASON_KEYS: Record<string, string> = {
+  FAKE_200_EMPTY_BODY: "emptyBody",
+  FAKE_200_HTML_BODY: "htmlBody",
+  FAKE_200_JSON_ERROR_NON_EMPTY: "jsonErrorNonEmpty",
+  FAKE_200_JSON_ERROR_MESSAGE_NON_EMPTY: "jsonErrorMessageNonEmpty",
+  FAKE_200_JSON_MESSAGE_KEYWORD_MATCH: "jsonMessageKeywordMatch",
+};
+
+export function getFake200ReasonKey(code: string, prefix: string): string {
+  return `${prefix}.${FAKE_200_REASON_KEYS[code] ?? "unknown"}`;
+}
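
A short usage sketch for the helper above. The map and function body are copied from this file (minus `export`, so the snippet runs standalone); the sample inputs are illustrative.

```typescript
// Copied from fake200-reason.ts in this commit.
const FAKE_200_REASON_KEYS: Record<string, string> = {
  FAKE_200_EMPTY_BODY: "emptyBody",
  FAKE_200_HTML_BODY: "htmlBody",
  FAKE_200_JSON_ERROR_NON_EMPTY: "jsonErrorNonEmpty",
  FAKE_200_JSON_ERROR_MESSAGE_NON_EMPTY: "jsonErrorMessageNonEmpty",
  FAKE_200_JSON_MESSAGE_KEYWORD_MATCH: "jsonMessageKeywordMatch",
};

function getFake200ReasonKey(code: string, prefix: string): string {
  return `${prefix}.${FAKE_200_REASON_KEYS[code] ?? "unknown"}`;
}

// A known code maps to its i18n key under the caller-supplied namespace.
console.log(getFake200ReasonKey("FAKE_200_HTML_BODY", "fake200Reasons"));
// → fake200Reasons.htmlBody

// An unrecognized code falls back to the "unknown" key.
console.log(getFake200ReasonKey("FAKE_200_SOMETHING_NEW", "logs.details.fake200Reasons"));
// → logs.details.fake200Reasons.unknown
```

The lookup is an exact match on the whole string, so a value that merely starts with a known code but carries extra text would also resolve to `unknown`.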

+ 31 - 0
src/app/[locale]/dashboard/logs/_components/provider-chain-popover.test.tsx

@@ -86,6 +86,18 @@ const messages = {
       details: {
         clickStatusCode: "Click status code",
         fake200ForwardedNotice: "Note: payload may have been forwarded",
+        fake200DetectedReason: "Detected reason: {reason}",
+        statusCodeInferredBadge: "Inferred",
+        statusCodeInferredTooltip: "This status code is inferred from response body content.",
+        statusCodeInferredSuffix: "(inferred)",
+        fake200Reasons: {
+          emptyBody: "Empty response body",
+          htmlBody: "HTML document returned",
+          jsonErrorNonEmpty: "JSON has non-empty error field",
+          jsonErrorMessageNonEmpty: "JSON has non-empty error.message",
+          jsonMessageKeywordMatch: 'JSON message contains "error"',
+          unknown: "Response body indicates an error",
+        },
       },
     },
   },
@@ -276,6 +288,25 @@ describe("provider-chain-popover layout", () => {
     expect(html).toContain("Note: payload may have been forwarded");
   });
 
+  test("renders inferred status code badge when statusCodeInferred=true", () => {
+    const html = renderWithIntl(
+      <ProviderChainPopover
+        chain={[
+          {
+            id: 1,
+            name: "p1",
+            reason: "retry_failed",
+            statusCode: 429,
+            statusCodeInferred: true,
+          },
+        ]}
+        finalProvider="p1"
+      />
+    );
+
+    expect(html).toContain("Inferred");
+  });
+
   test("requestCount<=1 branch keeps truncation container shrinkable", () => {
     const html = renderWithIntl(
       <ProviderChainPopover

+ 80 - 12
src/app/[locale]/dashboard/logs/_components/provider-chain-popover.tsx

@@ -18,6 +18,7 @@ import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/comp
 import { cn } from "@/lib/utils";
 import { formatProbabilityCompact } from "@/lib/utils/provider-chain-formatter";
 import type { ProviderChainItem } from "@/types/message";
+import { getFake200ReasonKey } from "./fake200-reason";
 
 interface ProviderChainPopoverProps {
   chain: ProviderChainItem[];
@@ -33,7 +34,9 @@ interface ProviderChainPopoverProps {
  */
 function isActualRequest(item: ProviderChainItem): boolean {
   if (item.reason === "concurrent_limit_failed") return true;
+
   if (item.reason === "retry_failed" || item.reason === "system_error") return true;
+  if (item.reason === "resource_not_found") return true;
   if (item.reason === "endpoint_pool_exhausted") return true;
   if (item.reason === "vendor_type_all_timeout") return true;
   if (item.reason === "client_error_non_retryable") return true;
@@ -71,7 +74,13 @@ function getItemStatus(item: ProviderChainItem): {
       bgColor: "bg-emerald-50 dark:bg-emerald-950/30",
     };
   }
-  if (item.reason === "retry_failed" || item.reason === "system_error") {
+  if (
+    item.reason === "retry_failed" ||
+    item.reason === "system_error" ||
+    item.reason === "resource_not_found" ||
+    item.reason === "endpoint_pool_exhausted" ||
+    item.reason === "vendor_type_all_timeout"
+  ) {
     return {
       icon: XCircle,
       color: "text-rose-600",
@@ -92,13 +101,6 @@ function getItemStatus(item: ProviderChainItem): {
       bgColor: "bg-orange-50 dark:bg-orange-950/30",
     };
   }
-  if (item.reason === "endpoint_pool_exhausted" || item.reason === "vendor_type_all_timeout") {
-    return {
-      icon: XCircle,
-      color: "text-rose-600",
-      bgColor: "bg-rose-50 dark:bg-rose-950/30",
-    };
-  }
   return {
     icon: RefreshCw,
     color: "text-slate-500",
@@ -119,6 +121,9 @@ export function ProviderChainPopover({
   const hasFake200PostStreamFailure = chain.some(
     (item) => typeof item.errorMessage === "string" && item.errorMessage.startsWith("FAKE_200_")
   );
+  const fake200CodeForDisplay = chain.find(
+    (item) => typeof item.errorMessage === "string" && item.errorMessage.startsWith("FAKE_200_")
+  )?.errorMessage;
 
   // Calculate actual request count (excluding intermediate states)
   const requestCount = chain.filter(isActualRequest).length;
@@ -144,6 +149,7 @@ export function ProviderChainPopover({
       (item) => item.reason === "session_reuse" || item.selectionMethod === "session_reuse"
     );
     const sessionReuseContext = sessionReuseItem?.decisionContext;
+    const singleRequestItem = chain.find(isActualRequest);
 
     return (
       <div className={`${maxWidthClass} min-w-0 w-full`}>
@@ -166,12 +172,50 @@ export function ProviderChainPopover({
               <div className="space-y-2">
                 {/* Provider name */}
                 <div className="font-medium text-xs">{displayName}</div>
+                {singleRequestItem?.statusCode && (
+                  <div className="flex items-center gap-1">
+                    <Badge
+                      variant="outline"
+                      className={cn(
+                        "text-[10px] px-1 py-0",
+                        singleRequestItem.statusCode >= 200 && singleRequestItem.statusCode < 300
+                          ? "border-emerald-500 text-emerald-600"
+                          : "border-rose-500 text-rose-600"
+                      )}
+                    >
+                      {singleRequestItem.statusCode}
+                    </Badge>
+                    {singleRequestItem.statusCodeInferred && (
+                      <Badge
+                        variant="outline"
+                        className="text-[10px] px-1 py-0 border-amber-500 text-amber-700 dark:text-amber-300"
+                        title={t("logs.details.statusCodeInferredTooltip")}
+                      >
+                        {t("logs.details.statusCodeInferredBadge")}
+                      </Badge>
+                    )}
+                  </div>
+                )}
 
                 {/* Note: fake-200 detection runs after the SSE stream ends; by then the content may already have been forwarded to the client. */}
                 {hasFake200PostStreamFailure && (
                   <div className="flex items-start gap-1.5 text-[10px] text-amber-500 dark:text-amber-400">
                     <InfoIcon className="h-3 w-3 shrink-0 mt-0.5" aria-hidden="true" />
-                    <span>{t("logs.details.fake200ForwardedNotice")}</span>
+                    <div className="space-y-0.5">
+                      {typeof fake200CodeForDisplay === "string" && (
+                        <div>
+                          {t("logs.details.fake200DetectedReason", {
+                            reason: t(
+                              getFake200ReasonKey(
+                                fake200CodeForDisplay,
+                                "logs.details.fake200Reasons"
+                              )
+                            ),
+                          })}
+                        </div>
+                      )}
+                      <div>{t("logs.details.fake200ForwardedNotice")}</div>
+                    </div>
                   </div>
                 )}
 
@@ -458,6 +502,15 @@ export function ProviderChainPopover({
                         {item.statusCode}
                       </Badge>
                     )}
+                    {item.statusCode && item.statusCodeInferred && (
+                      <Badge
+                        variant="outline"
+                        className="text-[10px] px-1 py-0 border-amber-500 text-amber-700 dark:text-amber-300"
+                        title={t("logs.details.statusCodeInferredTooltip")}
+                      >
+                        {t("logs.details.statusCodeInferredBadge")}
+                      </Badge>
+                    )}
                     {item.reason && !item.statusCode && (
                       <span className="text-[10px] text-muted-foreground">
                         {tChain(`reasons.${item.reason}`)}
@@ -465,9 +518,24 @@ export function ProviderChainPopover({
                     )}
                   </div>
                   {item.errorMessage && (
-                    <p className="text-[10px] text-muted-foreground mt-0.5 line-clamp-1">
-                      {item.errorMessage}
-                    </p>
+                    <>
+                      <p className="text-[10px] text-muted-foreground mt-0.5 line-clamp-1">
+                        {item.errorMessage}
+                      </p>
+                      {typeof item.errorMessage === "string" &&
+                        item.errorMessage.startsWith("FAKE_200_") && (
+                          <p className="text-[10px] text-amber-700 dark:text-amber-300 mt-0.5 line-clamp-2">
+                            {t("logs.details.fake200DetectedReason", {
+                              reason: t(
+                                getFake200ReasonKey(
+                                  item.errorMessage,
+                                  "logs.details.fake200Reasons"
+                                )
+                              ),
+                            })}
+                          </p>
+                        )}
+                    </>
                   )}
                 </div>
               </div>

+ 64 - 1
src/app/v1/_lib/proxy/error-handler.ts

@@ -1,3 +1,4 @@
+import { getCachedSystemSettings } from "@/lib/config/system-settings-cache";
 import {
   isClaudeErrorFormat,
   isGeminiErrorFormat,
@@ -6,6 +7,7 @@ import {
 } from "@/lib/error-override-validator";
 import { logger } from "@/lib/logger";
 import { ProxyStatusTracker } from "@/lib/proxy-status-tracker";
+import { sanitizeErrorTextForDetail } from "@/lib/utils/upstream-error-detection";
 import { updateMessageRequestDetails, updateMessageRequestDuration } from "@/repository/message";
 import { attachSessionIdToErrorResponse } from "./error-session-id";
 import {
@@ -236,9 +238,70 @@ export class ProxyErrorHandler {
       overridden: false,
     });
 
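+    // When verboseProviderError (system setting) is enabled, return a more detailed report for upstream anomalies such as fake 200 / empty responses, to aid troubleshooting.
+    // Notes:
+    // - This logic runs after error override, so it has lower priority and never replaces user-defined overrides.
+    // - rawBody is only returned in this error response (gated by the system setting); it is not written to the database or the decision chain.
+    // - For safety, rawBody gets basic redaction here (Bearer/key/JWT/email, etc.) so an upstream error page cannot accidentally echo sensitive data.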
+    let details: Record<string, unknown> | undefined;
+    let upstreamRequestId: string | undefined;
+    const shouldAttachVerboseDetails =
+      (error instanceof ProxyError && error.message.startsWith("FAKE_200_")) ||
+      isEmptyResponseError(error);
+
+    if (shouldAttachVerboseDetails) {
+      try {
+        const settings = await getCachedSystemSettings();
+        if (settings.verboseProviderError) {
+          if (error instanceof ProxyError) {
+            upstreamRequestId = error.upstreamError?.requestId;
+            const rawBodySrc = error.upstreamError?.rawBody;
+            const rawBody =
+              typeof rawBodySrc === "string" && rawBodySrc
+                ? sanitizeErrorTextForDetail(
+                    rawBodySrc.length > 4096 ? rawBodySrc.slice(0, 4096) : rawBodySrc
+                  )
+                : rawBodySrc;
+            details = {
+              upstreamError: {
+                kind: "fake_200",
+                code: error.message,
+                statusCode: error.statusCode,
+                statusCodeInferred: error.upstreamError?.statusCodeInferred ?? false,
+                statusCodeInferenceMatcherId:
+                  error.upstreamError?.statusCodeInferenceMatcherId ?? null,
+                clientSafeMessage: error.getClientSafeMessage(),
+                rawBody,
+                rawBodyTruncated: error.upstreamError?.rawBodyTruncated ?? false,
+              },
+            };
+          } else if (isEmptyResponseError(error)) {
+            details = {
+              upstreamError: {
+                kind: "empty_response",
+                reason: error.reason,
+                clientSafeMessage: error.getClientSafeMessage(),
+                rawBody: "",
+                rawBodyTruncated: false,
+              },
+            };
+          }
+        }
+      } catch (verboseError) {
+        logger.warn("ProxyErrorHandler: failed to gather verbose details, skipping", {
+          error: verboseError instanceof Error ? verboseError.message : String(verboseError),
+        });
+      }
+    }
+
+    const safeRequestId =
+      typeof upstreamRequestId === "string" && upstreamRequestId.trim()
+        ? upstreamRequestId.trim()
+        : undefined;
+
     return await attachSessionIdToErrorResponse(
       session.sessionId,
-      ProxyResponses.buildError(statusCode, clientErrorMessage)
+      ProxyResponses.buildError(statusCode, clientErrorMessage, undefined, details, safeRequestId)
     );
   }
 

+ 81 - 1
src/app/v1/_lib/proxy/errors.ts

@@ -9,6 +9,7 @@
 import { getEnvConfig } from "@/lib/config/env.schema";
 import { type ErrorDetectionResult, errorRuleDetector } from "@/lib/error-rule-detector";
 import { redactJsonString } from "@/lib/utils/message-redaction";
+import { sanitizeErrorTextForDetail } from "@/lib/utils/upstream-error-detection";
 import type { ErrorOverrideResponse } from "@/repository/error-rules";
 import type { ProviderChainItem } from "@/types/message";
 import type { ProxySession } from "./session";
@@ -18,11 +19,41 @@ export class ProxyError extends Error {
     message: string,
     public readonly statusCode: number,
     public readonly upstreamError?: {
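-      body: string; // Raw response body (smart truncation)
+      /**
+       * Upstream response body (smart truncation).
+       *
+       * Note: this field feeds getDetailedErrorMessage() and is recorded in the database,
+       * so do not put large raw excerpts or unredacted sensitive content here.
+       */
+      body: string;
       parsed?: unknown; // Parsed JSON (if any)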
       providerId?: number;
       providerName?: string;
       requestId?: string; // Upstream request ID (injected when overriding the response)
+
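+      /**
+       * Raw upstream response body (usually a prefix fragment).
+       *
+       * Design goals:
+       * - Only returned to the client in this error response (gated by the system setting);
+       * - Not used for rule matching or persistence (avoids polluting the database/logs).
+       *
+       * Currently used mainly for fake-200 detection: the HTTP status is 2xx, but the body is actually an error page / error JSON.
+       */
+      rawBody?: string;
+      rawBodyTruncated?: boolean;
+
+      /**
+       * Marks whether this ProxyError's statusCode was inferred from the response body (rather than being the real upstream HTTP status).
+       *
+       * Typical case: upstream returns HTTP 200 but the body is an error page / error JSON (fake 200). CCH then infers a semantically closer 4xx/5xx from the body,
+       * so that failover / circuit breaking / session binding behave the same as for a real upstream error status.
+       */
+      statusCodeInferred?: boolean;
+      /**
+       * Id of the inference rule that matched (internal debugging/audit only; must not be used in user-facing text).
+       */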
+      statusCodeInferenceMatcherId?: string;
     }
   ) {
     super(message);
@@ -447,6 +478,55 @@ export class ProxyError extends Error {
    * - getClientSafeMessage(): excludes the provider name; used for responses returned to the client
    */
   getClientSafeMessage(): string {
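+    // Note: some internal detection/statistics "error codes" (e.g. FAKE_200_*) should not be exposed to clients directly.
+    // Do a minimal mapping here: when message is FAKE_200_*, return a readable reason plus a safe upstream snippet (if any).
+    if (this.message.startsWith("FAKE_200_")) {
+      // These codes all come from the internal fake-200 detection: upstream returned HTTP 200, but the body looks like an error page / error JSON.
+      // We need to:
+      // 1) give the user a clear error reason (not just an internal code);
+      // 2) avoid leaking internal error codes / provider names;
+      // 3) when detail is available, attach a short redacted and truncated upstream snippet to aid troubleshooting.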
+      const reason = (() => {
+        switch (this.message) {
+          case "FAKE_200_EMPTY_BODY":
+            return "Upstream returned HTTP 200, but the response body was empty.";
+          case "FAKE_200_HTML_BODY":
+            return "Upstream returned HTTP 200, but the response body looks like an HTML document (likely an error page).";
+          case "FAKE_200_JSON_ERROR_MESSAGE_NON_EMPTY":
+            return "Upstream returned HTTP 200, but the JSON body contains a non-empty `error.message`.";
+          case "FAKE_200_JSON_ERROR_NON_EMPTY":
+            return "Upstream returned HTTP 200, but the JSON body contains a non-empty `error` field.";
+          case "FAKE_200_JSON_MESSAGE_KEYWORD_MATCH":
+            return "Upstream returned HTTP 200, but the JSON `message` suggests an error (heuristic).";
+          default:
+            return "Upstream returned HTTP 200, but the response body indicates an error.";
+        }
+      })();
+
+      const inferredNote = this.upstreamError?.statusCodeInferred
+        ? ` Inferred HTTP status: ${this.statusCode}.`
+        : "";
+
+      const detail = this.upstreamError?.body?.trim();
+      if (detail) {
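+        // Note: on the FAKE_200_* path, upstreamError.body is expected to be the redacted and truncated snippet produced by internal detection (see upstream-error-detection.ts).
+        //
+        // To guard against a future caller putting a large unredacted raw body into upstreamError.body and leaking it,
+        // apply one more defensive pass here:
+        // - whitespace normalization (keeps multi-line content out of client logs)
+        // - a second truncation (capped at 200 characters)
+        // - lightweight redaction (avoids obvious token/key leaks)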
+        const normalized = detail.replace(/\s+/g, " ").trim();
+        const maxChars = 200;
+        const clipped =
+          normalized.length > maxChars ? `${normalized.slice(0, maxChars)}…` : normalized;
+        const safe = sanitizeErrorTextForDetail(clipped);
+        return `${reason}${inferredNote} Upstream detail: ${safe}`;
+      }
+
+      return `${reason}${inferredNote}`;
+    }
+
     return this.message;
   }
 }
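
As a rough illustration of the defensive normalize-and-clip step in `getClientSafeMessage()` above, the following standalone sketch reproduces only the whitespace normalization and the 200-character cap. `clipUpstreamDetail` is a hypothetical name used for illustration; the redaction step (`sanitizeErrorTextForDetail`) is intentionally left out because its behavior is not shown in this diff.

```typescript
// Hypothetical helper (not part of the PR): mirrors the normalize + clip steps only.
// The real code additionally passes the clipped text through sanitizeErrorTextForDetail().
function clipUpstreamDetail(detail: string, maxChars = 200): string {
  // Collapse all whitespace runs (including newlines) into single spaces.
  const normalized = detail.replace(/\s+/g, " ").trim();
  // Cap the length and mark the cut with an ellipsis.
  return normalized.length > maxChars ? `${normalized.slice(0, maxChars)}…` : normalized;
}

const shortDetail = clipUpstreamDetail("  <html>\n  <body>502</body>\n");
const longDetail = clipUpstreamDetail("x".repeat(250));
console.log(shortDetail, longDetail.length);
```

Clipping before redaction keeps the cost of the redaction pass bounded, at the price that a secret cut in half by the clip may no longer match a redaction pattern.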

+ 204 - 20
src/app/v1/_lib/proxy/forwarder.ts

@@ -25,6 +25,10 @@ import {
 import { getGlobalAgentPool, getProxyAgentForProvider } from "@/lib/proxy-agent";
 import { SessionManager } from "@/lib/session-manager";
 import { CONTEXT_1M_BETA_HEADER, shouldApplyContext1m } from "@/lib/special-attributes";
+import {
+  detectUpstreamErrorFromSseOrJsonText,
+  inferUpstreamErrorStatusCodeFromText,
+} from "@/lib/utils/upstream-error-detection";
 import {
   isVendorTypeCircuitOpen,
   recordVendorTypeAllEndpointsTimeout,
@@ -84,6 +88,81 @@ const MAX_PROVIDER_SWITCHES = 20; // Safety latch: switch providers at most 20 times (
 
 type CacheTtlOption = CacheTtlPreference | null | undefined;
 
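+// Upper bound (bytes) for non-streaming body inspection: prevents unbounded memory use when upstream returns a huge body on 2xx.
+// Notes:
+// - This check is only used for the "empty response / fake 200" heuristics, not for business-logic parsing;
+// - Beyond the limit the body is still treated as non-empty, but the JSON structure check is skipped (truncation could cause misclassification).
+const NON_STREAM_BODY_INSPECTION_MAX_BYTES = 32 * 1024; // 32 KiB
+
+/**
+ * Reads the response body as text, but at most `maxBytes` bytes (used to sniff "empty response / fake 200" on non-streaming 2xx).
+ *
+ * Notes:
+ * - This function is only for heuristic detection, not for business-logic parsing;
+ * - When the limit is exceeded it calls `cancel()` on the reader to stop holding resources;
+ * - Callers should pass `response.clone()` so the original body is not consumed, which would break later forwarding/parsing.
+ */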
+async function readResponseTextUpTo(
+  response: Response,
+  maxBytes: number
+): Promise<{ text: string; truncated: boolean }> {
+  const reader = response.body?.getReader();
+  if (!reader) {
+    return { text: "", truncated: false };
+  }
+
+  const decoder = new TextDecoder();
+  const chunks: string[] = [];
+  let bytesRead = 0;
+  let truncated = false;
+
+  try {
+    while (true) {
+      const { done, value } = await reader.read();
+      if (done) break;
+      if (!value || value.byteLength === 0) continue;
+
+      const remaining = maxBytes - bytesRead;
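+      // Note: remaining<=0 happens only after the next chunk has already been read.
+      // For heuristic sniffing it is enough to mark truncated and exit (equivalent to dropping the bytes past the limit),
+      // which avoids pointless decoding of the excess.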
+      if (remaining <= 0) {
+        truncated = true;
+        break;
+      }
+
+      if (value.byteLength > remaining) {
+        chunks.push(decoder.decode(value.subarray(0, remaining), { stream: true }));
+        bytesRead += remaining;
+        truncated = true;
+        break;
+      }
+
+      chunks.push(decoder.decode(value, { stream: true }));
+      bytesRead += value.byteLength;
+    }
+
+    const flushed = decoder.decode();
+    if (flushed) chunks.push(flushed);
+  } finally {
+    if (truncated) {
+      try {
+        await reader.cancel();
+      } catch (cancelErr) {
+        logger.debug("readResponseTextUpTo: failed to cancel reader", { error: cancelErr });
+      }
+    }
+
+    try {
+      reader.releaseLock();
+    } catch (releaseErr) {
+      logger.debug("readResponseTextUpTo: failed to release reader lock", { error: releaseErr });
+    }
+  }
+
+  return { text: chunks.join(""), truncated };
+}
+
 function resolveCacheTtlPreference(
   keyPref: CacheTtlOption,
   providerPref: CacheTtlOption
@@ -523,6 +602,9 @@ export class ProxyForwarder {
           session.addProviderToChain(currentProvider, {
             reason: "endpoint_pool_exhausted",
             strictBlockCause: strictBlockCause as ProviderChainItem["strictBlockCause"],
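+            // attemptNumber must be set here so this entry is not swallowed by initial_selection/session_reuse deduplication.
+            // It also lets the decision chain / technical timeline treat it as an actual attempt (even though no request was sent).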
+            attemptNumber: 1,
             ...(filterStats ? { endpointFilterStats: filterStats } : {}),
             errorMessage: endpointSelectionError?.message,
           });
@@ -619,7 +701,14 @@ export class ProxyForwarder {
 
           // ========== Empty-response detection (non-streaming only) ==========
           const contentType = response.headers.get("content-type") || "";
-          const isSSE = contentType.includes("text/event-stream");
+          const normalizedContentType = contentType.toLowerCase();
+          const isSSE = normalizedContentType.includes("text/event-stream");
+          const isHtml =
+            normalizedContentType.includes("text/html") ||
+            normalizedContentType.includes("application/xhtml+xml");
+          const isJson =
+            normalizedContentType.includes("application/json") ||
+            normalizedContentType.includes("+json");
 
           // ========== Streaming responses: defer the success decision (avoid "fake 200") ==========
           // Background: upstream may return HTTP 200 while the SSE content is an error JSON (e.g. {"error": "..."}).
@@ -655,29 +744,111 @@ export class ProxyForwarder {
             return response;
           }
 
-          if (!isSSE) {
-            // Non-streaming response: detect empty responses
-            const contentLength = response.headers.get("content-length");
+          // Non-streaming response: detect empty responses
+          const contentLengthHeader = response.headers.get("content-length");
+          const contentLength = contentLengthHeader?.trim() || undefined;
+          const contentLengthBytes = (() => {
+            if (!contentLength) return null;
+
+            // Content-Length must be digits only; parseInt("12abc") returns 12, which would be mistaken for a valid value
+            // and would skip the "!hasValidContentLength" check branch.
+            if (!/^\d+$/.test(contentLength)) return null;
+
+            const num = Number(contentLength);
+            if (!Number.isSafeInteger(num) || num < 0) return null;
+            return num;
+          })();
+          const hasValidContentLength = contentLengthBytes !== null;
+
+          // Detect the Content-Length: 0 case
+          if (contentLengthBytes === 0) {
+            throw new EmptyResponseError(currentProvider.id, currentProvider.name, "empty_body");
+          }
 
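-            // Detect the Content-Length: 0 case
-            if (contentLength === "0") {
+          // 200 + text/html (or xhtml) is usually an error page from an upstream gateway/WAF/Cloudflare wrapped in HTTP 200.
+          // Such a "fake 200" causes:
+          // - circuit-breaker/failover statistics to record it as a success;
+          // - session smart binding to be updated to an unusable provider (affecting later retries).
+          // So run a strong-signal check before entering the success branch: only a body that looks like a complete HTML document counts as an error.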
+          let inspectedText: string | undefined;
+          let inspectedTruncated = false;
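+          // Note: large JSON bodies (e.g. Content-Length > 32 KiB) are not checked for fake 200 here.
+          // Reasons:
+          // - the non-streaming path has to clone and read the body again, which costs extra memory/latency;
+          // - a large JSON body is more likely a normal response (not a short error JSON from a gateway/WAF).
+          // Consequence: the rare upstream anomaly "very large JSON error body + HTTP 200" may go undetected.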
+          const shouldInspectJson =
+            isJson &&
+            hasValidContentLength &&
+            contentLengthBytes <= NON_STREAM_BODY_INSPECTION_MAX_BYTES;
+          const shouldInspectBody = isHtml || !hasValidContentLength || shouldInspectJson;
+          if (shouldInspectBody) {
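+            // Note: Response.clone() tees the underlying ReadableStream, which can cause some transient memory overhead;
+            // the overhead is bounded here by reading at most 32 KiB and cancelling the cloned branch on truncation.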
+            const clonedResponse = response.clone();
+            const inspected = await readResponseTextUpTo(
+              clonedResponse,
+              NON_STREAM_BODY_INSPECTION_MAX_BYTES
+            );
+            inspectedText = inspected.text;
+            inspectedTruncated = inspected.truncated;
+          }
+
+          if (inspectedText !== undefined) {
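+            // For non-streaming 2xx responses, only "strong signal" checks are enabled (HTML document / non-empty top-level error / empty body).
+            // `message` keyword matching is a weak signal with a higher false-positive risk; that rule is mainly a supplementary check after SSE ends.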
+            const detected = detectUpstreamErrorFromSseOrJsonText(inspectedText, {
+              maxJsonCharsForMessageCheck: 0,
+            });
+
+            if (detected.isError && detected.code === "FAKE_200_EMPTY_BODY") {
               throw new EmptyResponseError(currentProvider.id, currentProvider.name, "empty_body");
             }
 
-            // When there is no Content-Length, clone the response and check the body
-            // Note: this adds some performance overhead, but is acceptable for non-streaming responses
-            if (!contentLength) {
-              const clonedResponse = response.clone();
-              const responseText = await clonedResponse.text();
-
-              if (!responseText || responseText.trim() === "") {
-                throw new EmptyResponseError(
-                  currentProvider.id,
-                  currentProvider.name,
-                  "empty_body"
-                );
-              }
+            const isStrongFake200 =
+              detected.isError &&
+              (detected.code === "FAKE_200_HTML_BODY" ||
+                detected.code === "FAKE_200_JSON_ERROR_NON_EMPTY" ||
+                detected.code === "FAKE_200_JSON_ERROR_MESSAGE_NON_EMPTY");
+
+            if (isStrongFake200) {
+              const inferredStatus = inferUpstreamErrorStatusCodeFromText(inspectedText);
+              const inferredStatusCode = inferredStatus?.statusCode;
+
+              throw new ProxyError(detected.code, inferredStatusCode ?? 502, {
+                body: detected.detail ?? "",
+                providerId: currentProvider.id,
+                providerName: currentProvider.name,
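+                // Note: rawBody only gives the client more troubleshooting information in this error response (gated by the system setting);
+                // it takes no part in rule matching/persistence, so it cannot pollute the database or trigger override rules by mistake.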
+                rawBody: inspectedText,
+                rawBodyTruncated: inspectedTruncated,
+                statusCodeInferred: inferredStatusCode !== undefined,
+                statusCodeInferenceMatcherId: inferredStatus?.matcherId,
+              });
+            }
+          }
+
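+          // When Content-Length is missing or invalid, the body has to be checked (via the clone inspected above)
+          // Note: this adds some performance overhead, but is acceptable for non-streaming responses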
+          if (!contentLength || !hasValidContentLength) {
+            const responseText = inspectedText ?? "";
 
+            if (!responseText || responseText.trim() === "") {
+              throw new EmptyResponseError(currentProvider.id, currentProvider.name, "empty_body");
+            }
+
+            if (inspectedTruncated) {
+              logger.debug(
+                "ProxyForwarder: Response body too large for non-stream content check, skipping JSON parse",
+                {
+                  providerId: currentProvider.id,
+                  providerName: currentProvider.name,
+                  contentType,
+                  maxBytes: NON_STREAM_BODY_INSPECTION_MAX_BYTES,
+                }
+              );
+            } else {
               // Try to parse the JSON and check whether it contains any output content
               try {
                 const responseJson = JSON.parse(responseText) as Record<string, unknown>;
@@ -722,7 +893,12 @@ export class ProxyForwarder {
                     // Note: do not throw here, because some requests (e.g. count_tokens) may legitimately return 0 output tokens
                   }
                 }
-              } catch (_parseError) {
+              } catch (_parseOrContentError) {
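+                // EmptyResponseError triggers retry/failover, so it must not be swallowed here as a JSON parse error.
+                if (isEmptyResponseError(_parseOrContentError)) {
+                  throw _parseOrContentError;
+                }
+
                 // JSON parsing failed but the body is not empty, so this is not treated as an empty-response error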
                 logger.debug("ProxyForwarder: Non-JSON response body, skipping content check", {
                   providerId: currentProvider.id,
@@ -964,6 +1140,7 @@ export class ProxyForwarder {
                       attemptNumber: attemptCount,
                       errorMessage,
                       statusCode: lastError.statusCode,
+                      statusCodeInferred: lastError.upstreamError?.statusCodeInferred ?? false,
                       errorDetails: {
                         provider: {
                           id: currentProvider.id,
@@ -1096,6 +1273,7 @@ export class ProxyForwarder {
                       attemptNumber: attemptCount,
                       errorMessage,
                       statusCode: lastError.statusCode,
+                      statusCodeInferred: lastError.upstreamError?.statusCodeInferred ?? false,
                       errorDetails: {
                         provider: {
                           id: currentProvider.id,
@@ -1160,6 +1338,7 @@ export class ProxyForwarder {
               providerId: currentProvider.id,
               providerName: currentProvider.name,
               statusCode: statusCode,
+              statusCodeInferred: proxyError.upstreamError?.statusCodeInferred ?? false,
               error: errorMessage,
               attemptNumber: attemptCount,
               totalProvidersAttempted,
@@ -1176,6 +1355,7 @@ export class ProxyForwarder {
               attemptNumber: attemptCount,
               errorMessage: errorMessage,
               statusCode: statusCode,
+              statusCodeInferred: proxyError.upstreamError?.statusCodeInferred ?? false,
               errorDetails: {
                 provider: {
                   id: currentProvider.id,
@@ -1298,6 +1478,7 @@ export class ProxyForwarder {
               providerId: currentProvider.id,
               providerName: currentProvider.name,
               statusCode: 404,
+              statusCodeInferred: proxyError.upstreamError?.statusCodeInferred ?? false,
               error: errorMessage,
               attemptNumber: attemptCount,
               totalProvidersAttempted,
@@ -1312,6 +1493,7 @@ export class ProxyForwarder {
               attemptNumber: attemptCount,
               errorMessage: errorMessage,
               statusCode: 404,
+              statusCodeInferred: proxyError.upstreamError?.statusCodeInferred ?? false,
               errorDetails: {
                 provider: {
                   id: currentProvider.id,
@@ -1454,6 +1636,7 @@ export class ProxyForwarder {
               providerId: currentProvider.id,
               providerName: currentProvider.name,
               statusCode: statusCode,
+              statusCodeInferred: proxyError.upstreamError?.statusCodeInferred ?? false,
               error: errorMessage,
               attemptNumber: attemptCount,
               totalProvidersAttempted,
@@ -1473,6 +1656,7 @@ export class ProxyForwarder {
               circuitFailureCount: health.failureCount + 1, // includes this failure
               circuitFailureThreshold: config.failureThreshold,
               statusCode: statusCode,
+              statusCodeInferred: proxyError.upstreamError?.statusCodeInferred ?? false,
               errorDetails: {
                 provider: {
                   id: currentProvider.id,
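
The strict `Content-Length` validation in the forwarder hunk above can be tried in isolation. This is a minimal sketch, assuming a hypothetical helper name `parseContentLength`; it follows the same digits-only rule as the diff and shows why `parseInt` alone would be too lenient.

```typescript
// Hypothetical helper (not part of the PR): same rule as the inline check in the forwarder.
// Returns the byte count for a well-formed header, or null when it is missing or invalid.
function parseContentLength(header: string | null): number | null {
  const trimmed = header?.trim();
  if (!trimmed) return null;
  // Digits only: rejects "12abc", "-1", "1e3" and "0x10".
  if (!/^\d+$/.test(trimmed)) return null;
  const num = Number(trimmed);
  // Values beyond Number.MAX_SAFE_INTEGER cannot be trusted as byte counts.
  return Number.isSafeInteger(num) ? num : null;
}

const lenient = parseInt("12abc", 10); // parseInt stops at the first non-digit
const strict = parseContentLength("12abc"); // the strict check rejects the header
console.log(lenient, strict);
```

Treating a malformed header as "no valid length" sends the response down the body-inspection branch, which is the safer default for fake-200 detection.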

+ 59 - 30
src/app/v1/_lib/proxy/response-handler.ts

@@ -12,7 +12,10 @@ import type { CostBreakdown } from "@/lib/utils/cost-calculation";
 import { calculateRequestCost, calculateRequestCostBreakdown } from "@/lib/utils/cost-calculation";
 import { hasValidPriceData } from "@/lib/utils/price-data";
 import { isSSEText, parseSSEData } from "@/lib/utils/sse";
-import { detectUpstreamErrorFromSseOrJsonText } from "@/lib/utils/upstream-error-detection";
+import {
+  detectUpstreamErrorFromSseOrJsonText,
+  inferUpstreamErrorStatusCodeFromText,
+} from "@/lib/utils/upstream-error-detection";
 import {
   updateMessageRequestCost,
   updateMessageRequestDetails,
@@ -135,7 +138,8 @@ type FinalizeDeferredStreamingResult = {
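  *   - If the content looks like an upstream error JSON (fake 200), then:
  *     - count it as a circuit-breaker failure;
  *     - do not update session smart binding (avoids pinning the session to a bad provider);
- *     - change the internal status code to 502 (affects only statistics and later retry selection, not this client response).
+ *     - change the internal status code to the inferred 4xx/5xx (falling back to 502 when nothing matches);
+ *       this affects only statistics and later retry selection, not this client response.
  *   - If the stream ends normally and no error rule matches, settle as success and update binding / circuit breaker / endpoint success rate.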
  *
  * @param streamEndedNormally - Must be a natural end where the reader reached done=true; abnormal ends such as timeouts/aborts are handled elsewhere.
@@ -166,12 +170,21 @@ async function finalizeDeferredStreamingFinalizationIfNeeded(
     : ({ isError: false } as const);
 
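   // Status code used for internal settlement (does not change the HTTP status the client actually receives).
-  // - Fake 200: mapped to 502 so internal statistics / circuit breaking / session binding treat it as a failure.
+  // - Fake 200: prefer the inferred 4xx/5xx (falling back to 502 when nothing matches) so internal statistics / circuit breaking / session binding treat it as a failure.
   // - Not ended naturally: also mapped to a failure (avoids recording aborted/partial streams as 200 completed).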
   let effectiveStatusCode: number;
   let errorMessage: string | null;
+  let statusCodeInferred = false;
+  let statusCodeInferenceMatcherId: string | undefined;
   if (detected.isError) {
-    effectiveStatusCode = 502;
+    const inferred = inferUpstreamErrorStatusCodeFromText(allContent);
+    if (inferred) {
+      effectiveStatusCode = inferred.statusCode;
+      statusCodeInferred = true;
+      statusCodeInferenceMatcherId = inferred.matcherId;
+    } else {
+      effectiveStatusCode = 502;
+    }
     errorMessage = detected.code;
   } else if (!streamEndedNormally) {
     effectiveStatusCode = clientAborted ? 499 : 502;
@@ -277,21 +290,29 @@ async function finalizeDeferredStreamingFinalizationIfNeeded(
       providerName: meta.providerName,
       upstreamStatusCode: meta.upstreamStatusCode,
       effectiveStatusCode,
+      statusCodeInferred,
+      statusCodeInferenceMatcherId: statusCodeInferenceMatcherId ?? null,
       code: detected.code,
       detail: detected.detail ?? null,
     });
 
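-    // Count toward the circuit breaker so later requests can trigger failover/circuit breaking correctly
-    try {
-      // Dynamic import: avoids a potential circular dependency between the proxy module and the circuit-breaker module.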
-      const { recordFailure } = await import("@/lib/circuit-breaker");
-      await recordFailure(meta.providerId, new Error(detected.code));
-    } catch (cbError) {
-      logger.warn("[ResponseHandler] Failed to record fake-200 error in circuit breaker", {
-        providerId: meta.providerId,
-        sessionId: session.sessionId ?? null,
-        error: cbError,
-      });
+    const chainReason = effectiveStatusCode === 404 ? "resource_not_found" : "retry_failed";
+
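+    // Count toward the circuit breaker so later requests can trigger failover/circuit breaking correctly.
+    //
+    // Note: in the forwarder, 404 semantics map to RESOURCE_NOT_FOUND and do not count toward the circuit breaker (a missing resource/model is not a provider failure).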
+    if (effectiveStatusCode !== 404) {
+      try {
+        // Dynamic import: avoids a potential circular dependency between the proxy module and the circuit-breaker module.
+        const { recordFailure } = await import("@/lib/circuit-breaker");
+        await recordFailure(meta.providerId, new Error(detected.code));
+      } catch (cbError) {
+        logger.warn("[ResponseHandler] Failed to record fake-200 error in circuit breaker", {
+          providerId: meta.providerId,
+          sessionId: session.sessionId ?? null,
+          error: cbError,
+        });
+      }
     }
 
     // NOTE: Do NOT call recordEndpointFailure here. Fake-200 errors are key-level
@@ -299,14 +320,16 @@ async function finalizeDeferredStreamingFinalizationIfNeeded(
     // the error is in the response content, not endpoint connectivity.
 
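     // Record in the decision chain (used for log display and DB persistence).
-    // Note: use effectiveStatusCode (502) here rather than upstreamStatusCode (200),
-    // so the internal chain clearly shows this as a failure (otherwise it would be misread as a success).
+    // Note: use effectiveStatusCode (the inferred 4xx/5xx, or the 502 fallback) here
+    // rather than upstreamStatusCode (200), so the internal chain clearly shows this as a failure
+    // (otherwise it would be misread as a success).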
     session.addProviderToChain(providerForChain, {
       endpointId: meta.endpointId,
       endpointUrl: meta.endpointUrl,
-      reason: "retry_failed",
+      reason: chainReason,
       attemptNumber: meta.attemptNumber,
       statusCode: effectiveStatusCode,
+      statusCodeInferred,
       errorMessage: detected.code,
     });
 
@@ -323,16 +346,21 @@ async function finalizeDeferredStreamingFinalizationIfNeeded(
       errorMessage,
     });
 
-    // Count toward the circuit breaker so later requests can trigger failover/circuit breaking correctly
-    try {
-      const { recordFailure } = await import("@/lib/circuit-breaker");
-      await recordFailure(meta.providerId, new Error(errorMessage));
-    } catch (cbError) {
-      logger.warn("[ResponseHandler] Failed to record non-200 error in circuit breaker", {
-        providerId: meta.providerId,
-        sessionId: session.sessionId ?? null,
-        error: cbError,
-      });
+    const chainReason = effectiveStatusCode === 404 ? "resource_not_found" : "retry_failed";
+
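+    // Count toward the circuit breaker so later requests can trigger failover/circuit breaking correctly.
+    // Note: consistent with the forwarder, 404 does not count toward the circuit breaker (a missing resource is not a provider failure).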
+    if (effectiveStatusCode !== 404) {
+      try {
+        const { recordFailure } = await import("@/lib/circuit-breaker");
+        await recordFailure(meta.providerId, new Error(errorMessage));
+      } catch (cbError) {
+        logger.warn("[ResponseHandler] Failed to record non-200 error in circuit breaker", {
+          providerId: meta.providerId,
+          sessionId: session.sessionId ?? null,
+          error: cbError,
+        });
+      }
     }
 
     // NOTE: Do NOT call recordEndpointFailure here. Non-200 HTTP errors (401, 429,
@@ -343,7 +371,7 @@ async function finalizeDeferredStreamingFinalizationIfNeeded(
     session.addProviderToChain(providerForChain, {
       endpointId: meta.endpointId,
       endpointUrl: meta.endpointUrl,
-      reason: "retry_failed",
+      reason: chainReason,
       attemptNumber: meta.attemptNumber,
       statusCode: effectiveStatusCode,
       errorMessage: errorMessage,
@@ -2750,7 +2778,8 @@ async function updateRequestCostFromUsage(
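  * Unified request statistics handler
  * Removes duplicated statistics logic across Gemini passthrough, regular non-streaming, and regular streaming
  *
- * @param statusCode - Internal settlement status code (may differ from the HTTP status the client receives, e.g. a fake 200 is mapped to 502)
+ * @param statusCode - Internal settlement status code (may differ from the HTTP status the client receives, e.g. a fake 200 is inferred and mapped to a semantically closer 4xx/5xx;
+ *                   falls back to 502 when no inference rule matches)
  * @param errorMessage - Optional internal error reason (used to write fake-200 / parse-failure information to the DB and monitoring)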
  */
 export async function finalizeRequestStats(
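
The settlement rules described in the response-handler hunks above (inferred status first, 502 as fallback, 404 excluded from circuit breaking) can be summarized as a pure function. This is a sketch under stated assumptions: `settleFake200` and its types are hypothetical names, and the `{ statusCode, matcherId }` shape is taken from how the diff reads the result of `inferUpstreamErrorStatusCodeFromText`.

```typescript
// Assumed shape of an inference result, based on the fields the diff reads.
type InferredStatus = { statusCode: number; matcherId: string };

interface Fake200Settlement {
  effectiveStatusCode: number;
  statusCodeInferred: boolean;
  chainReason: "resource_not_found" | "retry_failed";
  countsTowardCircuitBreaker: boolean;
}

// Hypothetical helper (not part of the PR): condenses the branching shown in the diff.
function settleFake200(inferred: InferredStatus | null): Fake200Settlement {
  // Prefer the inferred 4xx/5xx; fall back to 502 when no rule matched.
  const effectiveStatusCode = inferred?.statusCode ?? 502;
  const isNotFound = effectiveStatusCode === 404;
  return {
    effectiveStatusCode,
    statusCodeInferred: inferred !== null,
    chainReason: isNotFound ? "resource_not_found" : "retry_failed",
    // A missing resource/model is not a provider failure, so 404 is not recorded.
    countsTowardCircuitBreaker: !isNotFound,
  };
}

const fallback = settleFake200(null);
const notFound = settleFake200({ statusCode: 404, matcherId: "example-matcher" });
console.log(fallback, notFound);
```

Keeping the decision in one place like this would make the 404 exception easy to unit test, although the PR itself inlines the logic in both branches.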

+ 2 - 0
src/app/v1/_lib/proxy/session.ts

@@ -457,6 +457,7 @@ export class ProxySession {
       endpointUrl?: string;
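       // Fix: add new fields
       statusCode?: number; // Status code on success
+      statusCodeInferred?: boolean; // Whether statusCode was inferred from the response body
       circuitFailureCount?: number; // Circuit-breaker failure count
       circuitFailureThreshold?: number; // Circuit-breaker threshold
       errorDetails?: ProviderChainItem["errorDetails"]; // Structured error details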
@@ -485,6 +486,7 @@ export class ProxySession {
       errorMessage: metadata?.errorMessage, // record the error message
       // Fix: record the new fields
       statusCode: metadata?.statusCode,
+      statusCodeInferred: metadata?.statusCodeInferred,
       circuitFailureCount: metadata?.circuitFailureCount,
       circuitFailureThreshold: metadata?.circuitFailureThreshold,
       errorDetails: metadata?.errorDetails, // structured error details

+ 107 - 0
src/lib/utils/provider-chain-formatter.test.ts

@@ -375,6 +375,113 @@ describe("vendor_type_all_timeout", () => {
   });
 });
 
+// =============================================================================
+// resource_not_found reason tests
+// =============================================================================
+
+describe("resource_not_found", () => {
+  const baseNotFoundItem: ProviderChainItem = {
+    id: 1,
+    name: "provider-a",
+    reason: "resource_not_found",
+    attemptNumber: 1,
+    statusCode: 404,
+    errorMessage: "Not Found",
+    timestamp: 1000,
+    errorDetails: {
+      provider: {
+        id: 1,
+        name: "provider-a",
+        statusCode: 404,
+        statusText: "Not Found",
+      },
+    },
+  };
+
+  describe("formatProviderSummary", () => {
+    test("renders resource_not_found item as failure in summary", () => {
+      const chain: ProviderChainItem[] = [baseNotFoundItem];
+      const result = formatProviderSummary(chain, mockT);
+
+      expect(result).toContain("provider-a");
+      expect(result).toContain("✗");
+    });
+
+    test("renders resource_not_found alongside a successful retry in multi-provider chain", () => {
+      const chain: ProviderChainItem[] = [
+        baseNotFoundItem,
+        {
+          id: 2,
+          name: "provider-b",
+          reason: "retry_success",
+          statusCode: 200,
+          timestamp: 2000,
+          attemptNumber: 1,
+        },
+      ];
+      const result = formatProviderSummary(chain, mockT);
+
+      expect(result).toContain("provider-a");
+      expect(result).toContain("provider-b");
+      expect(result).toMatch(/provider-a\(.*\).*provider-b\(.*\)/);
+    });
+  });
+
+  describe("formatProviderDescription", () => {
+    test("shows resource not found label in request chain", () => {
+      const chain: ProviderChainItem[] = [baseNotFoundItem];
+      const result = formatProviderDescription(chain, mockT);
+
+      expect(result).toContain("provider-a");
+      expect(result).toContain("description.resourceNotFound");
+    });
+  });
+
+  describe("formatProviderTimeline", () => {
+    test("renders resource_not_found with status code and note", () => {
+      const chain: ProviderChainItem[] = [baseNotFoundItem];
+      const { timeline } = formatProviderTimeline(chain, mockT);
+
+      expect(timeline).toContain("timeline.resourceNotFoundFailed [attempt=1]");
+      expect(timeline).toContain("timeline.statusCode [code=404]");
+      expect(timeline).toContain("timeline.resourceNotFoundNote");
+    });
+
+    test("renders inferred status code label when statusCodeInferred=true", () => {
+      const chain: ProviderChainItem[] = [{ ...baseNotFoundItem, statusCodeInferred: true }];
+      const { timeline } = formatProviderTimeline(chain, mockT);
+
+      expect(timeline).toContain("timeline.resourceNotFoundFailed [attempt=1]");
+      expect(timeline).toContain("timeline.statusCodeInferred [code=404]");
+      expect(timeline).toContain("timeline.resourceNotFoundNote");
+    });
+
+    test("degrades gracefully when errorDetails.provider is missing", () => {
+      const chain: ProviderChainItem[] = [
+        {
+          ...baseNotFoundItem,
+          errorDetails: {
+            request: {
+              method: "POST",
+              url: "https://example.com/v1/messages",
+              headers: "{}",
+              body: "{}",
+              bodyTruncated: false,
+            },
+          },
+        },
+      ];
+      const { timeline } = formatProviderTimeline(chain, mockT);
+
+      expect(timeline).toContain("timeline.resourceNotFoundFailed [attempt=1]");
+      expect(timeline).toContain("timeline.provider [provider=provider-a]");
+      expect(timeline).toContain("timeline.statusCode [code=404]");
+      expect(timeline).toContain("timeline.error [error=Not Found]");
+      expect(timeline).toContain("timeline.resourceNotFoundNote");
+    });
+  });
+});
+
 // =============================================================================
 // Unknown reason graceful degradation
 // =============================================================================

+ 59 - 4
src/lib/utils/provider-chain-formatter.ts

@@ -63,6 +63,7 @@ function getProviderStatus(item: ProviderChainItem): "✓" | "✗" | "⚡" | "
   if (
     item.reason === "retry_failed" ||
     item.reason === "system_error" ||
+    item.reason === "resource_not_found" ||
     item.reason === "client_error_non_retryable" ||
     item.reason === "endpoint_pool_exhausted" ||
     item.reason === "vendor_type_all_timeout"
@@ -92,6 +93,7 @@ function isActualRequest(item: ProviderChainItem): boolean {
   if (
     item.reason === "retry_failed" ||
     item.reason === "system_error" ||
+    item.reason === "resource_not_found" ||
     item.reason === "client_error_non_retryable" ||
     item.reason === "endpoint_pool_exhausted" ||
     item.reason === "vendor_type_all_timeout"
@@ -127,6 +129,16 @@ function translateCircuitState(state: string | undefined, t: (key: string) => st
   }
 }
 
+function formatTimelineStatusCode(
+  item: ProviderChainItem,
+  code: number,
+  t: (key: string, values?: Record<string, string | number>) => string
+): string {
+  return item.statusCodeInferred
+    ? t("timeline.statusCodeInferred", { code })
+    : t("timeline.statusCode", { code });
+}
+
 /**
  * Helper: get the meaning of an error code
  */
@@ -313,6 +325,8 @@ export function formatProviderDescription(
         desc += ` ${t("description.http2Fallback")}`;
       } else if (item.reason === "client_error_non_retryable") {
         desc += ` ${t("description.clientError")}`;
+      } else if (item.reason === "resource_not_found") {
+        desc += ` ${t("description.resourceNotFound")}`;
       } else if (item.reason === "endpoint_pool_exhausted") {
         desc += ` ${t("description.endpointPoolExhausted")}`;
       } else if (item.reason === "vendor_type_all_timeout") {
@@ -445,6 +459,47 @@ export function formatProviderTimeline(
       continue;
     }
 
+    // === Resource not found (upstream 404) ===
+    if (item.reason === "resource_not_found") {
+      const attempt = actualAttemptNumber ?? item.attemptNumber ?? 0;
+      timeline += `${t("timeline.resourceNotFoundFailed", { attempt })}\n\n`;
+
+      if (item.errorDetails?.provider) {
+        const p = item.errorDetails.provider;
+        timeline += `${t("timeline.provider", { provider: p.name })}\n`;
+        timeline += `${formatTimelineStatusCode(item, p.statusCode, t)}\n`;
+        timeline += `${t("timeline.error", { error: p.statusText })}\n`;
+
+        // Compute the request duration
+        if (i > 0 && item.timestamp && chain[i - 1]?.timestamp) {
+          const duration = item.timestamp - (chain[i - 1]?.timestamp || 0);
+          timeline += `${t("timeline.requestDuration", { duration })}\n`;
+        }
+
+        // Error details (pretty-printed JSON)
+        if (p.upstreamParsed) {
+          timeline += `\n${t("timeline.errorDetails")}:\n`;
+          timeline += JSON.stringify(p.upstreamParsed, null, 2);
+        } else if (p.upstreamBody) {
+          timeline += `\n${t("timeline.errorDetails")}:\n${p.upstreamBody}`;
+        }
+      } else {
+        timeline += `${t("timeline.provider", { provider: item.name })}\n`;
+        if (item.statusCode) {
+          timeline += `${formatTimelineStatusCode(item, item.statusCode, t)}\n`;
+        }
+        timeline += t("timeline.error", { error: item.errorMessage || t("timeline.unknown") });
+      }
+
+      // Request details (for troubleshooting)
+      if (item.errorDetails?.request) {
+        timeline += formatRequestDetails(item.errorDetails.request, t);
+      }
+
+      timeline += `\n${t("timeline.resourceNotFoundNote")}`;
+      continue;
+    }
+
     // === Provider error (request failed) ===
     if (item.reason === "retry_failed") {
       timeline += `${t("timeline.requestFailed", { attempt: actualAttemptNumber ?? 0 })}\n\n`;
@@ -453,7 +508,7 @@ export function formatProviderTimeline(
       if (item.errorDetails?.provider) {
         const p = item.errorDetails.provider;
         timeline += `${t("timeline.provider", { provider: p.name })}\n`;
-        timeline += `${t("timeline.statusCode", { code: p.statusCode })}\n`;
+        timeline += `${formatTimelineStatusCode(item, p.statusCode, t)}\n`;
         timeline += `${t("timeline.error", { error: p.statusText })}\n`;
 
         // Compute the request duration
@@ -500,7 +555,7 @@ export function formatProviderTimeline(
         // Fallback: use errorMessage
         timeline += `${t("timeline.provider", { provider: item.name })}\n`;
         if (item.statusCode) {
-          timeline += `${t("timeline.statusCode", { code: item.statusCode })}\n`;
+          timeline += `${formatTimelineStatusCode(item, item.statusCode, t)}\n`;
         }
         timeline += t("timeline.error", { error: item.errorMessage || t("timeline.unknown") });
 
@@ -588,12 +643,12 @@ export function formatProviderTimeline(
       if (item.errorDetails?.provider) {
         const p = item.errorDetails.provider;
         timeline += `${t("timeline.provider", { provider: p.name })}\n`;
-        timeline += `${t("timeline.statusCode", { code: p.statusCode })}\n`;
+        timeline += `${formatTimelineStatusCode(item, p.statusCode, t)}\n`;
         timeline += `${t("timeline.error", { error: p.statusText })}\n`;
       } else {
         timeline += `${t("timeline.provider", { provider: item.name })}\n`;
         if (item.statusCode) {
-          timeline += `${t("timeline.statusCode", { code: item.statusCode })}\n`;
+          timeline += `${formatTimelineStatusCode(item, item.statusCode, t)}\n`;
         }
         timeline += `${t("timeline.error", { error: item.errorMessage || t("timeline.unknown") })}\n`;
       }
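
The inferred-versus-real label switch used throughout the timeline can be tried in isolation. This is a minimal sketch: `echoT` is a hypothetical stand-in translator that echoes the key and values (the test file uses its own `mockT`, whose implementation is not shown in this diff), and `ChainItemLike` is a reduced stand-in for `ProviderChainItem`.

```typescript
type Translate = (key: string, values?: Record<string, string | number>) => string;

// Hypothetical stand-in translator: echoes the key and its values.
const echoT: Translate = (key, values) => {
  if (!values) return key;
  const parts = Object.keys(values).map((k) => `${k}=${values[k]}`);
  return `${key} [${parts.join(", ")}]`;
};

// Reduced stand-in for ProviderChainItem: only the flag this sketch needs.
interface ChainItemLike {
  statusCodeInferred?: boolean;
}

// Picks the "inferred" label when the status code came from body inference.
function formatStatusCodeLine(item: ChainItemLike, code: number, t: Translate): string {
  return item.statusCodeInferred
    ? t("timeline.statusCodeInferred", { code })
    : t("timeline.statusCode", { code });
}

console.log(formatStatusCodeLine({ statusCodeInferred: true }, 404, echoT));
console.log(formatStatusCodeLine({}, 404, echoT));
```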

+ 107 - 1
src/lib/utils/upstream-error-detection.test.ts

@@ -1,5 +1,8 @@
 import { describe, expect, test } from "vitest";
-import { detectUpstreamErrorFromSseOrJsonText } from "@/lib/utils/upstream-error-detection";
+import {
+  detectUpstreamErrorFromSseOrJsonText,
+  inferUpstreamErrorStatusCodeFromText,
+} from "@/lib/utils/upstream-error-detection";
 
 describe("detectUpstreamErrorFromSseOrJsonText", () => {
   test("an empty response body is treated as an error", () => {
@@ -16,6 +19,49 @@ describe("detectUpstreamErrorFromSseOrJsonText", () => {
     });
   });
 
+  test("an obvious HTML document is treated as an error (covers the 200+text/html 'fake 200' case)", () => {
+    const html = [
+      "<!doctype html>",
+      '<html lang="en">',
+      "<head><title>New API</title></head>",
+      "<body>Something went wrong</body>",
+      "</html>",
+    ].join("\n");
+    const res = detectUpstreamErrorFromSseOrJsonText(html);
+    expect(res).toEqual({
+      isError: true,
+      code: "FAKE_200_HTML_BODY",
+      detail: expect.any(String),
+    });
+  });
+
+  test("an HTML document with a BOM is also treated as an error", () => {
+    const htmlWithBom = "\uFEFF \n<!doctype html>\n<html><head></head><body>blocked</body></html>";
+    const res = detectUpstreamErrorFromSseOrJsonText(htmlWithBom);
+    expect(res.isError).toBe(true);
+    if (res.isError) {
+      expect(res.code).toBe("FAKE_200_HTML_BODY");
+    }
+  });
+
+  test("a JSON error with a BOM is still detected", () => {
+    const jsonWithBom = '\uFEFF \n{"error":"当前无可用凭证"}';
+    const res = detectUpstreamErrorFromSseOrJsonText(jsonWithBom);
+    expect(res.isError).toBe(true);
+    if (res.isError) {
+      expect(res.code).toBe("FAKE_200_JSON_ERROR_NON_EMPTY");
+    }
+  });
+
+  test("plain JSON: <html> text inside content must not be misdetected as an HTML error", () => {
+    const body = JSON.stringify({
+      type: "message",
+      content: [{ type: "text", text: "<html>not an error</html>" }],
+    });
+    const res = detectUpstreamErrorFromSseOrJsonText(body);
+    expect(res.isError).toBe(false);
+  });
+
   test("plain JSON: a non-empty error field is treated as an error", () => {
     const res = detectUpstreamErrorFromSseOrJsonText('{"error":"当前无可用凭证"}');
     expect(res.isError).toBe(true);
@@ -211,3 +257,63 @@ describe("detectUpstreamErrorFromSseOrJsonText", () => {
     expect(res.isError).toBe(false);
   });
 });
+
+describe("inferUpstreamErrorStatusCodeFromText", () => {
+  test("does not infer a status code from empty text", () => {
+    expect(inferUpstreamErrorStatusCodeFromText("")).toBeNull();
+    expect(inferUpstreamErrorStatusCodeFromText("   \n\t  ")).toBeNull();
+  });
+
+  test("infers 429 from error text (rate limit)", () => {
+    expect(inferUpstreamErrorStatusCodeFromText('{"error":"Rate limit exceeded"}')).toEqual({
+      statusCode: 429,
+      matcherId: "rate_limit",
+    });
+  });
+
+  test("infers 401 from error text (invalid api key)", () => {
+    expect(inferUpstreamErrorStatusCodeFromText('{"error":"Invalid API key"}')).toEqual({
+      statusCode: 401,
+      matcherId: "unauthorized",
+    });
+  });
+
+  test("infers 403 from error text (access denied)", () => {
+    expect(inferUpstreamErrorStatusCodeFromText("Access denied")).toEqual({
+      statusCode: 403,
+      matcherId: "forbidden",
+    });
+  });
+
+  test("infers 402 from error text (billing hard limit)", () => {
+    expect(inferUpstreamErrorStatusCodeFromText("billing_hard_limit_reached")).toEqual({
+      statusCode: 402,
+      matcherId: "payment_required",
+    });
+  });
+
+  test("infers 404 from error text (model not found)", () => {
+    expect(inferUpstreamErrorStatusCodeFromText("model not found")).toEqual({
+      statusCode: 404,
+      matcherId: "not_found",
+    });
+  });
+
+  test("infers 413 from error text (payload too large)", () => {
+    expect(inferUpstreamErrorStatusCodeFromText("payload too large")).toEqual({
+      statusCode: 413,
+      matcherId: "payload_too_large",
+    });
+  });
+
+  test("infers 415 from error text (unsupported media type)", () => {
+    expect(inferUpstreamErrorStatusCodeFromText("Unsupported Media Type")).toEqual({
+      statusCode: 415,
+      matcherId: "unsupported_media_type",
+    });
+  });
+
+  test("does not infer when the text only contains a generic 'error' word (avoids false positives)", () => {
+    expect(inferUpstreamErrorStatusCodeFromText('{"message":"some error happened"}')).toBeNull();
+  });
+});
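
The behavior these tests pin down (ordered matchers, first match wins, `null` when nothing matches) can be sketched with a deliberately reduced rule table. The patterns below are simplified assumptions for illustration and are much narrower than the real `ERROR_STATUS_MATCHERS`; `inferStatus` is a hypothetical name.

```typescript
interface StatusMatcher {
  statusCode: number;
  matcherId: string;
  re: RegExp;
}

// Reduced, illustrative patterns (no `g` flag, so `test` keeps no state between calls).
const MATCHERS: StatusMatcher[] = [
  { statusCode: 429, matcherId: "rate_limit", re: /\brate\s*limit(?:ed|ing)?\b|\btoo\s+many\s+requests\b/i },
  { statusCode: 401, matcherId: "unauthorized", re: /\bunauthori(?:sed|zed)\b|\binvalid\s+api[-_ ]?key\b/i },
  { statusCode: 404, matcherId: "not_found", re: /\bmodel\s+not\s+found\b|\bdoes\s+not\s+exist\b/i },
];

// Only a prefix is scanned so very large bodies stay cheap to check.
const MAX_CHARS = 64 * 1024;

function inferStatus(text: string): { statusCode: number; matcherId: string } | null {
  const trimmed = text.trim();
  if (!trimmed) return null;
  const limited = trimmed.slice(0, MAX_CHARS);
  // Array order is the priority order: the first matching rule wins.
  for (const m of MATCHERS) {
    if (m.re.test(limited)) {
      return { statusCode: m.statusCode, matcherId: m.matcherId };
    }
  }
  return null;
}

console.log(inferStatus('{"error":"Rate limit exceeded"}'));
console.log(inferStatus('{"message":"some error happened"}'));
```

Because the table is scanned in order, a body that mentions both "unauthorized" and "rate limited" resolves to the earlier rule, so the ordering of entries is itself part of the heuristic.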

+ 176 - 2
src/lib/utils/upstream-error-detection.ts

@@ -18,6 +18,7 @@ import { parseSSEData } from "@/lib/utils/sse";
  *
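  * Design goals (conservative)
  * - Apply heuristics only to structured fields: `error` and `message`;
+ * - Treat an obvious HTML document (doctype/html tag) as a strong signal, covering the "fake 200" returned by some gateways/WAFs/Cloudflare;
  * - Do not scan model-generated body content (e.g. content/choices), so that "error" in user/model natural language is not misdetected as an upstream error;
  * - Enable message keyword detection only for small JSON bodies, reducing false positives and performance overhead.
  * - The returned `code` is a language-neutral error code (convenient for DB writes, monitoring, and alerting);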
@@ -31,6 +32,22 @@ export type UpstreamErrorDetectionResult =
       detail?: string;
     };
 
+/**
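+ * Result of inferring a status code from the response body text.
+ *
+ * Design goals (conservative):
+ * - Used only for the "fake 200" case: the upstream returns HTTP 200 but the body is clearly an error page or error JSON;
+ * - Adjusts the statusCode used for internal settlement, circuit breaking, and failover to a 4xx/5xx closer to the real error semantics;
+ * - If no rule matches, the caller should keep its existing default behavior (usually falling back to 502).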
+ */
+export type UpstreamErrorStatusInferenceResult = {
+  statusCode: number;
+  /**
+   * Id of the matched rule (for internal auditing/debugging; must not be used as user-facing text).
+   */
+  matcherId: string;
+};
+
 type DetectionOptions = {
   /**
    * Enable message keyword detection only for small JSON bodies, avoiding false positives and needless overhead.
@@ -53,6 +70,7 @@ const DEFAULT_MESSAGE_KEYWORD = /error/i;
 
 const FAKE_200_CODES = {
   EMPTY_BODY: "FAKE_200_EMPTY_BODY",
+  HTML_BODY: "FAKE_200_HTML_BODY",
   JSON_ERROR_NON_EMPTY: "FAKE_200_JSON_ERROR_NON_EMPTY",
   JSON_ERROR_MESSAGE_NON_EMPTY: "FAKE_200_JSON_ERROR_MESSAGE_NON_EMPTY",
   JSON_MESSAGE_KEYWORD_MATCH: "FAKE_200_JSON_MESSAGE_KEYWORD_MATCH",
@@ -63,6 +81,142 @@ const FAKE_200_CODES = {
 const MAY_HAVE_JSON_ERROR_KEY = /"error"\s*:/;
 const MAY_HAVE_JSON_MESSAGE_KEY = /"message"\s*:/;
 
+const HTML_DOC_SNIFF_MAX_CHARS = 1024;
+const HTML_DOCTYPE_RE = /^<!doctype\s+html[\s>]/i;
+const HTML_HTML_TAG_RE = /^<html[\s>]/i;
+
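+// Status code inference: to avoid the extra cost of running regexes over extremely large response bodies, only a prefix is matched.
+// Rationale: for "fake 200" error pages/error JSON, the key error information usually appears near the start.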
+const STATUS_INFERENCE_MAX_CHARS = 64 * 1024;
+
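+// Note: these regexes are used only in the "fake 200" case, and only after detectUpstreamErrorFromSseOrJsonText has already returned isError=true.
+// A small amount of keyword heuristics is therefore acceptable, but overly broad matches should still be avoided to reduce the chance of inferring the wrong status code.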
+const ERROR_STATUS_MATCHERS: Array<{ statusCode: number; matcherId: string; re: RegExp }> = [
+  {
+    statusCode: 429,
+    matcherId: "rate_limit",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+429\b|\b429\s+too\s+many\s+requests\b|\btoo\s+many\s+requests\b|\brate\s*limit(?:ed|ing)?\b|\bthrottl(?:e|ed|ing)\b|\bretry-after\b|\bRESOURCE_EXHAUSTED\b|\bRequestLimitExceeded\b|\bThrottling(?:Exception)?\b|\bError\s*1015\b|超出频率|请求过于频繁|限流|稍后重试)/iu,
+  },
+  {
+    statusCode: 402,
+    matcherId: "payment_required",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+402\b|\bpayment\s+required\b|\binsufficient\s+(?:balance|funds|credits)\b|\b(?:out\s+of|no)\s+credits\b|\binsufficient_balance\b|\bbilling_hard_limit_reached\b|\bcard\s+(?:declined|expired)\b|\bpayment\s+(?:method|failed)\b|余额不足|欠费|请充值|支付(?:失败|方式))/iu,
+  },
+  {
+    statusCode: 401,
+    matcherId: "unauthorized",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+401\b|\bunauthori(?:sed|zed)\b|\bunauthenticated\b|\bauthentication\s+failed\b|\b(?:invalid|incorrect|missing)\s+api[-_ ]?key\b|\binvalid\s+token\b|\bexpired\s+token\b|\bsignature\s+(?:invalid|mismatch)\b|\bUNAUTHENTICATED\b|未授权|鉴权失败|密钥无效|token\s*过期)/iu,
+  },
+  {
+    statusCode: 403,
+    matcherId: "forbidden",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+403\b|\bforbidden\b|\bpermission\s+denied\b|\baccess\s+denied\b|\bnot\s+allowed\b|\baccount\s+(?:disabled|suspended|banned)\b|\bnot\s+whitelisted\b|\bPERMISSION_DENIED\b|\bAccessDenied(?:Exception)?\b|\bError\s*1020\b|\b(?:region|country)\b[\s\S]{0,40}\b(?:not\s+supported|blocked)\b|地区不支持|禁止访问|无权限|权限不足|账号被封|地区(?:限制|屏蔽))/iu,
+  },
+  {
+    statusCode: 404,
+    matcherId: "not_found",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+404\b|\b(?:model|deployment|endpoint|resource|route|path|api|service|url)\s+not\s+found\b|\bunknown\s+model\b|\bdoes\s+not\s+exist\b|\bNOT_FOUND\b|\bResourceNotFoundException\b|未找到|不存在|模型不存在)/iu,
+  },
+  {
+    statusCode: 413,
+    matcherId: "payload_too_large",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+413\b|\bpayload\s+too\s+large\b|\brequest\s+entity\s+too\s+large\b|\bbody\s+too\s+large\b|\bContent-Length\b[\s\S]{0,40}\btoo\s+large\b|\bexceed(?:s|ed)?\b[\s\S]{0,40}\b(?:max(?:imum)?|limit)\b[\s\S]{0,40}\b(?:size|length)\b|请求体过大|内容过大|超过最大)/iu,
+  },
+  {
+    statusCode: 415,
+    matcherId: "unsupported_media_type",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+415\b|\bunsupported\s+media\s+type\b|\binvalid\s+content-type\b|\bContent-Type\b[\s\S]{0,40}\b(?:must\s+be|required)\b|不支持的媒体类型|Content-Type\s*错误)/iu,
+  },
+  {
+    statusCode: 409,
+    matcherId: "conflict",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+409\b|\bconflict\b|\bidempotency(?:-key)?\b|\bABORTED\b|冲突|幂等)/iu,
+  },
+  {
+    statusCode: 422,
+    matcherId: "unprocessable_entity",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+422\b|\bunprocessable\s+entity\b|\bINVALID_ARGUMENT\b[\s\S]{0,40}\bvalidation\b|\bschema\s+validation\b|实体无法处理)/iu,
+  },
+  {
+    statusCode: 408,
+    matcherId: "request_timeout",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+408\b|\brequest\s+timeout\b|请求\s*超时)/iu,
+  },
+  {
+    statusCode: 451,
+    matcherId: "legal_restriction",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+451\b|\bunavailable\s+for\s+legal\s+reasons\b|\bexport\s+control\b|\bsanctions?\b|法律原因不可用|合规限制|出口管制)/iu,
+  },
+  {
+    statusCode: 503,
+    matcherId: "service_unavailable",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+503\b|\bservice\s+unavailable\b|\boverloaded\b|\bserver\s+is\s+busy\b|\btry\s+again\s+later\b|\btemporarily\s+unavailable\b|\bmaintenance\b|\bUNAVAILABLE\b|\bServiceUnavailableException\b|\bError\s*521\b|服务不可用|过载|系统繁忙|维护中)/iu,
+  },
+  {
+    statusCode: 504,
+    matcherId: "gateway_timeout",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+504\b|\bgateway\s+timeout\b|\bupstream\b[\s\S]{0,40}\btim(?:e|ed)\s*out\b|\bDEADLINE_EXCEEDED\b|\bError\s*522\b|\bError\s*524\b|网关超时|上游超时)/iu,
+  },
+  {
+    statusCode: 500,
+    matcherId: "internal_server_error",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+500\b|\binternal\s+server\s+error\b|\bInternalServerException\b|\bINTERNAL\b|内部错误|服务器错误)/iu,
+  },
+  {
+    statusCode: 400,
+    matcherId: "bad_request",
+    re: /(?:\bHTTP\/\d(?:\.\d)?\s+400\b|\bbad\s+request\b|\bINVALID_ARGUMENT\b|\bjson\s+parse\b|\binvalid\s+json\b|\bunexpected\s+token\b|无效请求|格式错误|JSON\s*解析失败)/iu,
+  },
+];
+
+/**
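+ * Infers an HTTP status code "closer to the error semantics" from the upstream response body text (used to correct fake 200s).
+ *
+ * Notes:
+ * - This function does not decide whether the response is an error; it only infers a status code. Callers should invoke it only after an error has been detected.
+ * - Returns null when nothing matches; the caller should keep its existing default error code (usually 502).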
+ */
+export function inferUpstreamErrorStatusCodeFromText(
+  text: string
+): UpstreamErrorStatusInferenceResult | null {
+  let trimmed = text.trim();
+  if (!trimmed) return null;
+
+  // Stay consistent with detectUpstreamErrorFromSseOrJsonText: strip the UTF-8 BOM so keyword matching does not fail.
+  if (trimmed.charCodeAt(0) === 0xfeff) {
+    trimmed = trimmed.slice(1).trimStart();
+  }
+
+  const limited =
+    trimmed.length > STATUS_INFERENCE_MAX_CHARS
+      ? trimmed.slice(0, STATUS_INFERENCE_MAX_CHARS)
+      : trimmed;
+
+  for (const matcher of ERROR_STATUS_MATCHERS) {
+    if (matcher.re.test(limited)) {
+      return { statusCode: matcher.statusCode, matcherId: matcher.matcherId };
+    }
+  }
+
+  return null;
+}
+
+/**
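+ * Checks whether the text "looks very much like" a complete HTML document (strong signal).
+ *
+ * Rules (conservative):
+ * - Continue only when the text starts with `<`;
+ * - Look only within the first 1024 characters for a leading `<!doctype html ...>` or `<html ...>`;
+ * - Do no HTML parsing/sanitizing, to avoid false positives and extra overhead.
+ *
+ * Note: callers should `trim()` the text first and strip the BOM (`\uFEFF`) when needed.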
+ */
+function isLikelyHtmlDocument(trimmedText: string): boolean {
+  if (!trimmedText.startsWith("<")) return false;
+  const head = trimmedText.slice(0, HTML_DOC_SNIFF_MAX_CHARS);
+  return HTML_DOCTYPE_RE.test(head) || HTML_HTML_TAG_RE.test(head);
+}
+
 function isPlainRecord(value: unknown): value is Record<string, unknown> {
   return !!value && typeof value === "object" && !Array.isArray(value);
 }
@@ -82,7 +236,7 @@ function hasNonEmptyValue(value: unknown): boolean {
   return true;
 }
 
-function sanitizeErrorTextForDetail(text: string): string {
+export function sanitizeErrorTextForDetail(text: string): string {
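   // Note: the goal here is not "perfect redaction" but to reduce the risk of sensitive content accidentally carried in upstream error messages.
   // If more sensitive patterns turn up later, they can be added without changing the detection semantics.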
   let sanitized = text;
@@ -189,11 +343,31 @@ export function detectUpstreamErrorFromSseOrJsonText(
       messageKeyword: options.messageKeyword ?? DEFAULT_MESSAGE_KEYWORD,
     };
 
-  const trimmed = text.trim();
+  let trimmed = text.trim();
   if (!trimmed) {
     return { isError: true, code: FAKE_200_CODES.EMPTY_BODY };
   }
 
+  // Some upstreams prepend a UTF-8 BOM (\uFEFF), which breaks fast checks such as startsWith("{") / startsWith("<").
+  // Strip only a leading BOM here and run trimStart once more to avoid misdetection.
+  if (trimmed.charCodeAt(0) === 0xfeff) {
+    trimmed = trimmed.slice(1).trimStart();
+  }
+
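+  // Case 0: an obvious HTML document (usually an error page returned by a gateway/WAF/Cloudflare)
+  //
+  // Notes:
+  // - This does not rely on Content-Type: some upstreams omit it or set it incorrectly;
+  // - Only "strong signals" such as doctype/html tags are matched, so ordinary `<...>` text is not mistaken for an HTML page.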
+  if (isLikelyHtmlDocument(trimmed)) {
+    return {
+      isError: true,
+      code: FAKE_200_CODES.HTML_BODY,
+      // Avoid needless work on very large HTML: only a leading slice is used for redaction/truncation and troubleshooting
+      detail: truncateForDetail(trimmed.slice(0, 4096)),
+    };
+  }
+
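   // Case 1: plain JSON (object)
   // The upstream may set Content-Type to SSE but actually return JSON; only the object form ({...}) is handled here,
   // not arrays ([...]), to avoid false positives (array semantics differ considerably; extend later if confirmed necessary).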
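
The "case 0" HTML check can be exercised as a standalone sketch. The constants mirror the values in this diff; `looksLikeHtmlDocument` is a hypothetical name that folds the BOM handling and the sniff into one function for illustration, whereas the commit keeps them separate.

```typescript
const SNIFF_MAX_CHARS = 1024;
const DOCTYPE_RE = /^<!doctype\s+html[\s>]/i;
const HTML_TAG_RE = /^<html[\s>]/i;

function looksLikeHtmlDocument(raw: string): boolean {
  let text = raw.trim();
  // Defensive: drop a leading BOM if one survived trimming, then re-trim the start.
  if (text.charCodeAt(0) === 0xfeff) {
    text = text.slice(1).replace(/^\s+/, "");
  }
  // Cheap early exit: anything not starting with "<" cannot be an HTML document.
  if (text.charAt(0) !== "<") return false;
  // Both patterns are anchored at the start, so only the head needs to be inspected.
  const head = text.slice(0, SNIFF_MAX_CHARS);
  return DOCTYPE_RE.test(head) || HTML_TAG_RE.test(head);
}

console.log(looksLikeHtmlDocument("\uFEFF \n<!doctype html>\n<html><body>blocked</body></html>"));
console.log(looksLikeHtmlDocument(JSON.stringify({ content: "<html>not an error</html>" })));
console.log(looksLikeHtmlDocument("<div>fragment</div>"));
```

Anchoring on the document start is what keeps JSON bodies that merely contain `<html>` inside a string from being flagged.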

+ 9 - 0
src/types/message.ts

@@ -71,6 +71,15 @@ export interface ProviderChainItem {
 
   // Fix: add the status code on success
   statusCode?: number;
+  /**
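+   * Marks whether statusCode was "inferred from the response body content" (rather than the HTTP status code the upstream actually returned).
+   *
+   * Typical case: the upstream returns HTTP 200 but the body is an error page/error JSON (fake 200).
+   * To keep circuit breaking, failover, and session binding consistent with the "real error semantics", CCH infers a more reasonable 4xx/5xx.
+   *
+   * This field is used to prominently flag "this status code is inferred" in the decision chain, technical timeline, and UI, to avoid misreading.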
+   */
+  statusCodeInferred?: boolean;
 
   // Model redirect info (recorded at the provider level)
   modelRedirect?: {

+ 184 - 0
tests/unit/proxy/error-handler-verbose-provider-error-details.test.ts

@@ -0,0 +1,184 @@
+import { beforeEach, describe, expect, test, vi } from "vitest";
+
+const mocks = vi.hoisted(() => {
+  return {
+    getCachedSystemSettings: vi.fn(async () => ({ verboseProviderError: false }) as any),
+    getErrorOverrideAsync: vi.fn(async () => undefined),
+  };
+});
+
+vi.mock("@/lib/config/system-settings-cache", () => ({
+  getCachedSystemSettings: mocks.getCachedSystemSettings,
+}));
+
+vi.mock("@/lib/logger", () => ({
+  logger: {
+    debug: vi.fn(),
+    info: vi.fn(),
+    warn: vi.fn(),
+    trace: vi.fn(),
+    error: vi.fn(),
+    fatal: vi.fn(),
+  },
+}));
+
+vi.mock("@/app/v1/_lib/proxy/errors", async (importOriginal) => {
+  const actual = await importOriginal<typeof import("@/app/v1/_lib/proxy/errors")>();
+  return {
+    ...actual,
+    getErrorOverrideAsync: mocks.getErrorOverrideAsync,
+  };
+});
+
+import { ProxyErrorHandler } from "@/app/v1/_lib/proxy/error-handler";
+import { EmptyResponseError, ProxyError } from "@/app/v1/_lib/proxy/errors";
+
+function createSession(): any {
+  return {
+    sessionId: null,
+    messageContext: null,
+    startTime: Date.now(),
+    getProviderChain: () => [],
+    getCurrentModel: () => null,
+    getContext1mApplied: () => false,
+    provider: null,
+  };
+}
+
+describe("ProxyErrorHandler.handle - verboseProviderError details", () => {
+  beforeEach(() => {
+    mocks.getCachedSystemSettings.mockResolvedValue({ verboseProviderError: false } as any);
+    mocks.getErrorOverrideAsync.mockResolvedValue(undefined);
+  });
+
+  test("with verboseProviderError=false, fake-200 raw body/details must not be attached", async () => {
+    const session = createSession();
+    const err = new ProxyError("FAKE_200_JSON_ERROR_NON_EMPTY", 429, {
+      body: "sanitized",
+      providerId: 1,
+      providerName: "p1",
+      requestId: "req_123",
+      rawBody: '{"error":"boom"}',
+      rawBodyTruncated: false,
+      statusCodeInferred: true,
+      statusCodeInferenceMatcherId: "rate_limit",
+    });
+
+    const res = await ProxyErrorHandler.handle(session, err);
+    expect(res.status).toBe(429);
+
+    const body = await res.json();
+    expect(body.error.details).toBeUndefined();
+    expect(body.request_id).toBeUndefined();
+  });
+
+  test("with verboseProviderError=true, a fake 200 returns the detailed report and the upstream raw body", async () => {
+    mocks.getCachedSystemSettings.mockResolvedValue({ verboseProviderError: true } as any);
+
+    const session = createSession();
+    const err = new ProxyError("FAKE_200_HTML_BODY", 429, {
+      body: "redacted snippet",
+      providerId: 1,
+      providerName: "p1",
+      requestId: "req_123",
+      rawBody: "<!doctype html><html><body>blocked</body></html>",
+      rawBodyTruncated: false,
+      statusCodeInferred: true,
+      statusCodeInferenceMatcherId: "rate_limit",
+    });
+
+    const res = await ProxyErrorHandler.handle(session, err);
+    expect(res.status).toBe(429);
+
+    const body = await res.json();
+    expect(body.request_id).toBe("req_123");
+    expect(body.error.details).toEqual({
+      upstreamError: {
+        kind: "fake_200",
+        code: "FAKE_200_HTML_BODY",
+        statusCode: 429,
+        statusCodeInferred: true,
+        statusCodeInferenceMatcherId: "rate_limit",
+        clientSafeMessage: expect.any(String),
+        rawBody: "<!doctype html><html><body>blocked</body></html>",
+        rawBodyTruncated: false,
+      },
+    });
+  });
+
+  test("with verboseProviderError=true, rawBody gets basic redaction (to avoid leaking tokens/keys)", async () => {
+    mocks.getCachedSystemSettings.mockResolvedValue({ verboseProviderError: true } as any);
+
+    const session = createSession();
+    const err = new ProxyError("FAKE_200_HTML_BODY", 429, {
+      body: "redacted snippet",
+      providerId: 1,
+      providerName: "p1",
+      requestId: "req_123",
+      rawBody:
+        "<!doctype html><html><body>Authorization: Bearer abc123 sk-1234567890abcdef1234567890 [email protected]</body></html>",
+      rawBodyTruncated: false,
+      statusCodeInferred: true,
+      statusCodeInferenceMatcherId: "rate_limit",
+    });
+
+    const res = await ProxyErrorHandler.handle(session, err);
+    expect(res.status).toBe(429);
+
+    const body = await res.json();
+    expect(body.request_id).toBe("req_123");
+    expect(body.error.details.upstreamError.kind).toBe("fake_200");
+
+    const rawBody = body.error.details.upstreamError.rawBody as string;
+    expect(rawBody).toContain("Bearer [REDACTED]");
+    expect(rawBody).toContain("[REDACTED_KEY]");
+    expect(rawBody).toContain("[EMAIL]");
+    expect(rawBody).not.toContain("Bearer abc123");
+    expect(rawBody).not.toContain("sk-1234567890abcdef1234567890");
+    expect(rawBody).not.toContain("[email protected]");
+  });
+
+  test("with verboseProviderError=true, an empty-response error also returns the detailed report (rawBody is an empty string)", async () => {
+    mocks.getCachedSystemSettings.mockResolvedValue({ verboseProviderError: true } as any);
+
+    const session = createSession();
+    const err = new EmptyResponseError(1, "p1", "empty_body");
+
+    const res = await ProxyErrorHandler.handle(session, err);
+    expect(res.status).toBe(502);
+
+    const body = await res.json();
+    expect(body.error.details).toEqual({
+      upstreamError: {
+        kind: "empty_response",
+        reason: "empty_body",
+        clientSafeMessage: "Empty response: Response body is empty",
+        rawBody: "",
+        rawBodyTruncated: false,
+      },
+    });
+  });
+
+  test("with an error override, verbose details must not override the override logic (lower priority)", async () => {
+    mocks.getCachedSystemSettings.mockResolvedValue({ verboseProviderError: true } as any);
+    mocks.getErrorOverrideAsync.mockResolvedValue({ response: null, statusCode: 418 });
+
+    const session = createSession();
+    const err = new ProxyError("FAKE_200_JSON_ERROR_NON_EMPTY", 429, {
+      body: "sanitized",
+      providerId: 1,
+      providerName: "p1",
+      requestId: "req_123",
+      rawBody: '{"error":"boom"}',
+      rawBodyTruncated: false,
+      statusCodeInferred: true,
+      statusCodeInferenceMatcherId: "rate_limit",
+    });
+
+    const res = await ProxyErrorHandler.handle(session, err);
+    expect(res.status).toBe(418);
+
+    const body = await res.json();
+    expect(body.error.details).toBeUndefined();
+  });
+});
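
The redaction test above expects three placeholders (`Bearer [REDACTED]`, `[REDACTED_KEY]`, `[EMAIL]`). The patterns actually used by `sanitizeErrorTextForDetail` are not shown in this diff, so the following is only a sketch under assumed patterns; `redactBasic` and its regexes are hypothetical, and the sample address is a made-up placeholder.

```typescript
// Hypothetical redaction pass; the regexes are illustrative assumptions, not the project's.
function redactBasic(text: string): string {
  return text
    // Bearer tokens: keep the scheme, hide the credential.
    .replace(/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer [REDACTED]")
    // "sk-" style API keys with a reasonably long tail.
    .replace(/\bsk-[A-Za-z0-9_-]{16,}\b/g, "[REDACTED_KEY]")
    // Email addresses.
    .replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[EMAIL]");
}

const sample =
  "Authorization: Bearer abc123 sk-1234567890abcdef1234567890 user@example.com";
console.log(redactBasic(sample));
```

As the source comment on `sanitizeErrorTextForDetail` notes, this kind of pass lowers the risk of leaking secrets but is not a complete redaction, so verbose output should still be treated as sensitive.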

+ 53 - 0
tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts

@@ -562,6 +562,59 @@ describe("ProxyForwarder - endpoint audit", () => {
     expect(exhaustedItem!.errorMessage).toBeUndefined();
   });
 
+  test("endpoint_pool_exhausted should not be deduped away when initial_selection already recorded", async () => {
+    const requestPath = "/v1/messages";
+    const session = createSession(new URL(`https://example.com${requestPath}`));
+    const provider = createProvider({
+      providerType: "claude",
+      providerVendorId: 123,
+      url: "https://provider.example.com/v1/messages",
+    });
+    session.setProvider(provider);
+
+    // Simulate ProviderSelector already recorded initial_selection for the same provider
+    session.addProviderToChain(provider, { reason: "initial_selection" });
+
+    mocks.getPreferredProviderEndpoints.mockResolvedValueOnce([]);
+    mocks.getEndpointFilterStats.mockResolvedValueOnce({
+      total: 0,
+      enabled: 0,
+      circuitOpen: 0,
+      available: 0,
+    });
+
+    const doForward = vi.spyOn(
+      ProxyForwarder as unknown as { doForward: (...args: unknown[]) => unknown },
+      "doForward"
+    );
+
+    await expect(ProxyForwarder.send(session)).rejects.toThrow();
+
+    expect(doForward).not.toHaveBeenCalled();
+
+    const chain = session.getProviderChain();
+    expect(chain.some((item) => item.reason === "initial_selection")).toBe(true);
+
+    const exhaustedItems = chain.filter((item) => item.reason === "endpoint_pool_exhausted");
+    expect(exhaustedItems).toHaveLength(1);
+
+    expect(exhaustedItems[0]).toEqual(
+      expect.objectContaining({
+        id: provider.id,
+        name: provider.name,
+        reason: "endpoint_pool_exhausted",
+        strictBlockCause: "no_endpoint_candidates",
+        attemptNumber: 1,
+        endpointFilterStats: {
+          total: 0,
+          enabled: 0,
+          circuitOpen: 0,
+          available: 0,
+        },
+      })
+    );
+  });
+
   test("endpoint pool exhausted (selector_error) should record endpoint_pool_exhausted with selectorError in decisionContext", async () => {
     const requestPath = "/v1/responses";
     const session = createSession(new URL(`https://example.com${requestPath}`));
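
Editor's note: the regression test added above depends on the provider chain keeping a second entry for a provider that already has an `initial_selection` entry. The dedup rule itself is not shown in this diff, so the following is only a minimal sketch under an assumption: a same-provider entry is skipped unless it carries an `attemptNumber`. `ChainItem` and `addToChain` are hypothetical names, not the project's API.

```typescript
// Hypothetical, simplified chain item; the real type has more fields.
interface ChainItem {
  id: number;
  name: string;
  reason: string;
  attemptNumber?: number;
}

// Assumed dedup rule: skip an entry when the previous entry belongs to the
// same provider, unless the new entry carries an attemptNumber.
function addToChain(chain: ChainItem[], item: ChainItem): boolean {
  const last = chain[chain.length - 1];
  const shouldAdd =
    last === undefined || last.id !== item.id || item.attemptNumber !== undefined;
  if (shouldAdd) chain.push(item);
  return shouldAdd;
}

const demoChain: ChainItem[] = [];
addToChain(demoChain, { id: 1, name: "p1", reason: "initial_selection" });
// Dropped under the assumed rule: same provider, no attemptNumber.
addToChain(demoChain, { id: 1, name: "p1", reason: "endpoint_pool_exhausted" });
// Kept: attemptNumber marks it as a distinct attempt.
addToChain(demoChain, {
  id: 1,
  name: "p1",
  reason: "endpoint_pool_exhausted",
  attemptNumber: 1,
});
console.log(demoChain.map((item) => item.reason));
```

Under that assumption, writing `attemptNumber: 1` on the `endpoint_pool_exhausted` entry is what keeps it visible, which matches the `toHaveLength(1)` assertion in the test.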

+ 573 - 0
tests/unit/proxy/proxy-forwarder-fake-200-html.test.ts

@@ -0,0 +1,573 @@
+import { beforeEach, describe, expect, test, vi } from "vitest";
+
+const mocks = vi.hoisted(() => {
+  return {
+    pickRandomProviderWithExclusion: vi.fn(),
+    recordSuccess: vi.fn(),
+    recordFailure: vi.fn(async () => {}),
+    getCircuitState: vi.fn(() => "closed"),
+    getProviderHealthInfo: vi.fn(async () => ({
+      health: { failureCount: 0 },
+      config: { failureThreshold: 3 },
+    })),
+    updateMessageRequestDetails: vi.fn(async () => {}),
+    isHttp2Enabled: vi.fn(async () => false),
+    getPreferredProviderEndpoints: vi.fn(async () => []),
+    getEndpointFilterStats: vi.fn(async () => null),
+    recordEndpointSuccess: vi.fn(async () => {}),
+    recordEndpointFailure: vi.fn(async () => {}),
+    isVendorTypeCircuitOpen: vi.fn(async () => false),
+    recordVendorTypeAllEndpointsTimeout: vi.fn(async () => {}),
+    // ErrorCategory.PROVIDER_ERROR
+    categorizeErrorAsync: vi.fn(async () => 0),
+  };
+});
+
+vi.mock("@/lib/logger", () => ({
+  logger: {
+    debug: vi.fn(),
+    info: vi.fn(),
+    warn: vi.fn(),
+    trace: vi.fn(),
+    error: vi.fn(),
+    fatal: vi.fn(),
+  },
+}));
+
+vi.mock("@/lib/config", async (importOriginal) => {
+  const actual = await importOriginal<typeof import("@/lib/config")>();
+  return {
+    ...actual,
+    isHttp2Enabled: mocks.isHttp2Enabled,
+  };
+});
+
+vi.mock("@/lib/provider-endpoints/endpoint-selector", () => ({
+  getPreferredProviderEndpoints: mocks.getPreferredProviderEndpoints,
+  getEndpointFilterStats: mocks.getEndpointFilterStats,
+}));
+
+vi.mock("@/lib/endpoint-circuit-breaker", () => ({
+  recordEndpointSuccess: mocks.recordEndpointSuccess,
+  recordEndpointFailure: mocks.recordEndpointFailure,
+}));
+
+vi.mock("@/lib/circuit-breaker", () => ({
+  getCircuitState: mocks.getCircuitState,
+  getProviderHealthInfo: mocks.getProviderHealthInfo,
+  recordFailure: mocks.recordFailure,
+  recordSuccess: mocks.recordSuccess,
+}));
+
+vi.mock("@/lib/vendor-type-circuit-breaker", () => ({
+  isVendorTypeCircuitOpen: mocks.isVendorTypeCircuitOpen,
+  recordVendorTypeAllEndpointsTimeout: mocks.recordVendorTypeAllEndpointsTimeout,
+}));
+
+vi.mock("@/repository/message", () => ({
+  updateMessageRequestDetails: mocks.updateMessageRequestDetails,
+}));
+
+vi.mock("@/app/v1/_lib/proxy/provider-selector", () => ({
+  ProxyProviderResolver: {
+    pickRandomProviderWithExclusion: mocks.pickRandomProviderWithExclusion,
+  },
+}));
+
+vi.mock("@/app/v1/_lib/proxy/errors", async (importOriginal) => {
+  const actual = await importOriginal<typeof import("@/app/v1/_lib/proxy/errors")>();
+  return {
+    ...actual,
+    categorizeErrorAsync: mocks.categorizeErrorAsync,
+  };
+});
+
+import { ProxyForwarder } from "@/app/v1/_lib/proxy/forwarder";
+import { ProxyError } from "@/app/v1/_lib/proxy/errors";
+import { ProxySession } from "@/app/v1/_lib/proxy/session";
+import type { Provider } from "@/types/provider";
+
+function createProvider(overrides: Partial<Provider> = {}): Provider {
+  return {
+    id: 1,
+    name: "p1",
+    url: "https://provider.example.com",
+    key: "k",
+    providerVendorId: null,
+    isEnabled: true,
+    weight: 1,
+    priority: 0,
+    groupPriorities: null,
+    costMultiplier: 1,
+    groupTag: null,
+    providerType: "claude",
+    preserveClientIp: false,
+    modelRedirects: null,
+    allowedModels: null,
+    mcpPassthroughType: "none",
+    mcpPassthroughUrl: null,
+    limit5hUsd: null,
+    limitDailyUsd: null,
+    dailyResetMode: "fixed",
+    dailyResetTime: "00:00",
+    limitWeeklyUsd: null,
+    limitMonthlyUsd: null,
+    limitTotalUsd: null,
+    totalCostResetAt: null,
+    limitConcurrentSessions: 0,
+    maxRetryAttempts: 1,
+    circuitBreakerFailureThreshold: 5,
+    circuitBreakerOpenDuration: 1_800_000,
+    circuitBreakerHalfOpenSuccessThreshold: 2,
+    proxyUrl: null,
+    proxyFallbackToDirect: false,
+    firstByteTimeoutStreamingMs: 30_000,
+    streamingIdleTimeoutMs: 10_000,
+    requestTimeoutNonStreamingMs: 1_000,
+    websiteUrl: null,
+    faviconUrl: null,
+    cacheTtlPreference: null,
+    context1mPreference: null,
+    codexReasoningEffortPreference: null,
+    codexReasoningSummaryPreference: null,
+    codexTextVerbosityPreference: null,
+    codexParallelToolCallsPreference: null,
+    anthropicMaxTokensPreference: null,
+    anthropicThinkingBudgetPreference: null,
+    anthropicAdaptiveThinking: null,
+    geminiGoogleSearchPreference: null,
+    tpm: 0,
+    rpm: 0,
+    rpd: 0,
+    cc: 0,
+    createdAt: new Date(),
+    updatedAt: new Date(),
+    deletedAt: null,
+    ...overrides,
+  };
+}
+
+function createSession(): ProxySession {
+  const headers = new Headers();
+  const session = Object.create(ProxySession.prototype);
+
+  Object.assign(session, {
+    startTime: Date.now(),
+    method: "POST",
+    requestUrl: new URL("https://example.com/v1/messages"),
+    headers,
+    originalHeaders: new Headers(headers),
+    headerLog: JSON.stringify(Object.fromEntries(headers.entries())),
+    request: {
+      model: "claude-test",
+      log: "(test)",
+      message: {
+        model: "claude-test",
+        messages: [{ role: "user", content: "hi" }],
+      },
+    },
+    userAgent: null,
+    context: null,
+    clientAbortSignal: null,
+    userName: "test-user",
+    authState: { success: true, user: null, key: null, apiKey: null },
+    provider: null,
+    messageContext: null,
+    sessionId: null,
+    requestSequence: 1,
+    originalFormat: "claude",
+    providerType: null,
+    originalModelName: null,
+    originalUrlPathname: null,
+    providerChain: [],
+    cacheTtlResolved: null,
+    context1mApplied: false,
+    specialSettings: [],
+    cachedPriceData: undefined,
+    cachedBillingModelSource: undefined,
+    isHeaderModified: () => false,
+  });
+
+  return session as ProxySession;
+}
+
+describe("ProxyForwarder - fake 200 HTML body", () => {
+  beforeEach(() => {
+    vi.clearAllMocks();
+  });
+
+  test("200 + text/html HTML page should be treated as a failure and trigger provider failover", async () => {
+    const provider1 = createProvider({ id: 1, name: "p1", key: "k1", maxRetryAttempts: 1 });
+    const provider2 = createProvider({ id: 2, name: "p2", key: "k2", maxRetryAttempts: 1 });
+
+    const session = createSession();
+    session.setProvider(provider1);
+
+    mocks.pickRandomProviderWithExclusion.mockResolvedValueOnce(provider2);
+
+    const doForward = vi.spyOn(ProxyForwarder as any, "doForward");
+
+    const htmlBody = [
+      "<!doctype html>",
+      "<html><head><title>New API</title></head>",
+      "<body>blocked</body></html>",
+    ].join("\n");
+    const okJson = JSON.stringify({ type: "message", content: [{ type: "text", text: "ok" }] });
+
+    doForward.mockResolvedValueOnce(
+      new Response(htmlBody, {
+        status: 200,
+        headers: {
+          "content-type": "text/html; charset=utf-8",
+          "content-length": String(htmlBody.length),
+        },
+      })
+    );
+
+    doForward.mockResolvedValueOnce(
+      new Response(okJson, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          "content-length": String(okJson.length),
+        },
+      })
+    );
+
+    const response = await ProxyForwarder.send(session);
+    expect(await response.text()).toContain("ok");
+
+    expect(doForward).toHaveBeenCalledTimes(2);
+    expect(doForward.mock.calls[0][1].id).toBe(1);
+    expect(doForward.mock.calls[1][1].id).toBe(2);
+
+    expect(mocks.pickRandomProviderWithExclusion).toHaveBeenCalledWith(session, [1]);
+    expect(mocks.recordFailure).toHaveBeenCalledWith(
+      1,
+      expect.objectContaining({ message: "FAKE_200_HTML_BODY" })
+    );
+    const failure1 = mocks.recordFailure.mock.calls[0]?.[1];
+    expect(failure1).toBeInstanceOf(ProxyError);
+    expect((failure1 as ProxyError).getClientSafeMessage()).toContain("HTML document");
+    expect((failure1 as ProxyError).getClientSafeMessage()).toContain("Upstream detail:");
+    expect(mocks.recordSuccess).toHaveBeenCalledWith(2);
+    expect(mocks.recordSuccess).not.toHaveBeenCalledWith(1);
+  });
+
+  test("200 + text/html with a JSON error body should also be treated as a failure and trigger provider failover", async () => {
+    const provider1 = createProvider({ id: 1, name: "p1", key: "k1", maxRetryAttempts: 1 });
+    const provider2 = createProvider({ id: 2, name: "p2", key: "k2", maxRetryAttempts: 1 });
+
+    const session = createSession();
+    session.setProvider(provider1);
+
+    mocks.pickRandomProviderWithExclusion.mockResolvedValueOnce(provider2);
+
+    const doForward = vi.spyOn(ProxyForwarder as any, "doForward");
+
+    const jsonErrorBody = JSON.stringify({ error: "upstream blocked" });
+    const okJson = JSON.stringify({ type: "message", content: [{ type: "text", text: "ok" }] });
+
+    doForward.mockResolvedValueOnce(
+      new Response(jsonErrorBody, {
+        status: 200,
+        headers: {
+          // Deliberately use text/html: simulates upstreams that send a mismatched Content-Type while the body is still an error JSON
+          "content-type": "text/html; charset=utf-8",
+          "content-length": String(jsonErrorBody.length),
+        },
+      })
+    );
+
+    doForward.mockResolvedValueOnce(
+      new Response(okJson, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          "content-length": String(okJson.length),
+        },
+      })
+    );
+
+    const response = await ProxyForwarder.send(session);
+    expect(await response.text()).toContain("ok");
+
+    expect(doForward).toHaveBeenCalledTimes(2);
+    expect(doForward.mock.calls[0][1].id).toBe(1);
+    expect(doForward.mock.calls[1][1].id).toBe(2);
+
+    expect(mocks.pickRandomProviderWithExclusion).toHaveBeenCalledWith(session, [1]);
+    expect(mocks.recordFailure).toHaveBeenCalledWith(
+      1,
+      expect.objectContaining({ message: "FAKE_200_JSON_ERROR_NON_EMPTY" })
+    );
+    const failure2 = mocks.recordFailure.mock.calls[0]?.[1];
+    expect(failure2).toBeInstanceOf(ProxyError);
+    expect((failure2 as ProxyError).getClientSafeMessage()).toContain("JSON body");
+    expect((failure2 as ProxyError).getClientSafeMessage()).toContain("`error`");
+    expect((failure2 as ProxyError).getClientSafeMessage()).toContain("upstream blocked");
+    expect((failure2 as ProxyError).upstreamError?.rawBody).toBe(jsonErrorBody);
+    expect((failure2 as ProxyError).upstreamError?.rawBodyTruncated).toBe(false);
+    expect(mocks.recordSuccess).toHaveBeenCalledWith(2);
+    expect(mocks.recordSuccess).not.toHaveBeenCalledWith(1);
+  });
+
+  test("200 + application/json JSON error with Content-Length should also be treated as a failure and trigger provider failover", async () => {
+    const provider1 = createProvider({ id: 1, name: "p1", key: "k1", maxRetryAttempts: 1 });
+    const provider2 = createProvider({ id: 2, name: "p2", key: "k2", maxRetryAttempts: 1 });
+
+    const session = createSession();
+    session.setProvider(provider1);
+
+    mocks.pickRandomProviderWithExclusion.mockResolvedValueOnce(provider2);
+
+    const doForward = vi.spyOn(ProxyForwarder as any, "doForward");
+
+    const jsonErrorBody = JSON.stringify({ error: "upstream blocked" });
+    const okJson = JSON.stringify({ type: "message", content: [{ type: "text", text: "ok" }] });
+
+    doForward.mockResolvedValueOnce(
+      new Response(jsonErrorBody, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          "content-length": String(jsonErrorBody.length),
+        },
+      })
+    );
+
+    doForward.mockResolvedValueOnce(
+      new Response(okJson, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          "content-length": String(okJson.length),
+        },
+      })
+    );
+
+    const response = await ProxyForwarder.send(session);
+    expect(await response.text()).toContain("ok");
+
+    expect(doForward).toHaveBeenCalledTimes(2);
+    expect(doForward.mock.calls[0][1].id).toBe(1);
+    expect(doForward.mock.calls[1][1].id).toBe(2);
+
+    expect(mocks.pickRandomProviderWithExclusion).toHaveBeenCalledWith(session, [1]);
+    expect(mocks.recordFailure).toHaveBeenCalledWith(
+      1,
+      expect.objectContaining({ message: "FAKE_200_JSON_ERROR_NON_EMPTY" })
+    );
+    const failure3 = mocks.recordFailure.mock.calls[0]?.[1];
+    expect(failure3).toBeInstanceOf(ProxyError);
+    expect((failure3 as ProxyError).getClientSafeMessage()).toContain("JSON body");
+    expect((failure3 as ProxyError).getClientSafeMessage()).toContain("`error`");
+    expect((failure3 as ProxyError).getClientSafeMessage()).toContain("upstream blocked");
+    expect((failure3 as ProxyError).upstreamError?.rawBody).toBe(jsonErrorBody);
+    expect((failure3 as ProxyError).upstreamError?.rawBodyTruncated).toBe(false);
+    expect(mocks.recordSuccess).toHaveBeenCalledWith(2);
+    expect(mocks.recordSuccess).not.toHaveBeenCalledWith(1);
+  });
+
+  test("fake-200 JSON error matching rate limit keywords should be inferred as 429 and marked as inferred in the decision chain", async () => {
+    const provider1 = createProvider({ id: 1, name: "p1", key: "k1", maxRetryAttempts: 1 });
+    const provider2 = createProvider({ id: 2, name: "p2", key: "k2", maxRetryAttempts: 1 });
+
+    const session = createSession();
+    session.setProvider(provider1);
+
+    mocks.pickRandomProviderWithExclusion.mockResolvedValueOnce(provider2);
+
+    const doForward = vi.spyOn(ProxyForwarder as any, "doForward");
+
+    const jsonErrorBody = JSON.stringify({ error: "Rate limit exceeded" });
+    const okJson = JSON.stringify({ type: "message", content: [{ type: "text", text: "ok" }] });
+
+    doForward.mockResolvedValueOnce(
+      new Response(jsonErrorBody, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          "content-length": String(jsonErrorBody.length),
+        },
+      })
+    );
+
+    doForward.mockResolvedValueOnce(
+      new Response(okJson, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          "content-length": String(okJson.length),
+        },
+      })
+    );
+
+    const response = await ProxyForwarder.send(session);
+    expect(await response.text()).toContain("ok");
+
+    expect(mocks.recordFailure).toHaveBeenCalledWith(
+      1,
+      expect.objectContaining({ message: "FAKE_200_JSON_ERROR_NON_EMPTY" })
+    );
+
+    const failure = mocks.recordFailure.mock.calls[0]?.[1];
+    expect(failure).toBeInstanceOf(ProxyError);
+    expect((failure as ProxyError).statusCode).toBe(429);
+    expect((failure as ProxyError).upstreamError?.statusCodeInferred).toBe(true);
+
+    const chain = session.getProviderChain();
+    expect(
+      chain.some(
+        (item) =>
+          item.id === 1 &&
+          item.reason === "retry_failed" &&
+          item.statusCode === 429 &&
+          item.statusCodeInferred === true
+      )
+    ).toBe(true);
+  });
+
+  test("200 + invalid Content-Length should be treated as missing so an HTML fake 200 is not missed", async () => {
+    const provider1 = createProvider({ id: 1, name: "p1", key: "k1", maxRetryAttempts: 1 });
+    const provider2 = createProvider({ id: 2, name: "p2", key: "k2", maxRetryAttempts: 1 });
+
+    const session = createSession();
+    session.setProvider(provider1);
+
+    mocks.pickRandomProviderWithExclusion.mockResolvedValueOnce(provider2);
+
+    const doForward = vi.spyOn(ProxyForwarder as any, "doForward");
+
+    const htmlErrorBody = "<!doctype html><html><body>blocked</body></html>";
+    const okJson = JSON.stringify({ type: "message", content: [{ type: "text", text: "ok" }] });
+
+    doForward.mockResolvedValueOnce(
+      new Response(htmlErrorBody, {
+        status: 200,
+        headers: {
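+          // Deliberately omit an html/json Content-Type to cover the fake-200 detection branch that relies on body sniffing alone
+          "content-type": "text/plain; charset=utf-8",
+          // Invalid Content-Length: parseInt("12abc") returns 12; after the fix it should be treated as invalid and fall through to the body inspection branch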
+          "content-length": "12abc",
+        },
+      })
+    );
+
+    doForward.mockResolvedValueOnce(
+      new Response(okJson, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          "content-length": String(okJson.length),
+        },
+      })
+    );
+
+    const response = await ProxyForwarder.send(session);
+    expect(await response.text()).toContain("ok");
+
+    expect(doForward).toHaveBeenCalledTimes(2);
+    expect(doForward.mock.calls[0][1].id).toBe(1);
+    expect(doForward.mock.calls[1][1].id).toBe(2);
+
+    expect(mocks.pickRandomProviderWithExclusion).toHaveBeenCalledWith(session, [1]);
+    expect(mocks.recordFailure).toHaveBeenCalledWith(
+      1,
+      expect.objectContaining({ message: "FAKE_200_HTML_BODY" })
+    );
+
+    const failure = mocks.recordFailure.mock.calls[0]?.[1];
+    expect(failure).toBeInstanceOf(ProxyError);
+    expect((failure as ProxyError).upstreamError?.rawBody).toBe(htmlErrorBody);
+    expect(mocks.recordSuccess).toHaveBeenCalledWith(2);
+    expect(mocks.recordSuccess).not.toHaveBeenCalledWith(1);
+  });
+
+  test("missing content field (missing_content) should not be swallowed by the JSON parse catch and should trigger provider failover", async () => {
+    const provider1 = createProvider({ id: 1, name: "p1", key: "k1", maxRetryAttempts: 1 });
+    const provider2 = createProvider({ id: 2, name: "p2", key: "k2", maxRetryAttempts: 1 });
+
+    const session = createSession();
+    session.setProvider(provider1);
+
+    mocks.pickRandomProviderWithExclusion.mockResolvedValueOnce(provider2);
+
+    const doForward = vi.spyOn(ProxyForwarder as any, "doForward");
+
+    const missingContentJson = JSON.stringify({ type: "message", content: [] });
+    const okJson = JSON.stringify({ type: "message", content: [{ type: "text", text: "ok" }] });
+
+    doForward.mockResolvedValueOnce(
+      new Response(missingContentJson, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          // Deliberately omit content-length: covers the forwarder's clone + JSON content structure check branch
+        },
+      })
+    );
+
+    doForward.mockResolvedValueOnce(
+      new Response(okJson, {
+        status: 200,
+        headers: {
+          "content-type": "application/json; charset=utf-8",
+          "content-length": String(okJson.length),
+        },
+      })
+    );
+
+    const response = await ProxyForwarder.send(session);
+    expect(await response.text()).toContain("ok");
+
+    expect(doForward).toHaveBeenCalledTimes(2);
+    expect(doForward.mock.calls[0][1].id).toBe(1);
+    expect(doForward.mock.calls[1][1].id).toBe(2);
+
+    expect(mocks.pickRandomProviderWithExclusion).toHaveBeenCalledWith(session, [1]);
+    expect(mocks.recordFailure).toHaveBeenCalledWith(
+      1,
+      expect.objectContaining({ reason: "missing_content" })
+    );
+    expect(mocks.recordSuccess).toHaveBeenCalledWith(2);
+    expect(mocks.recordSuccess).not.toHaveBeenCalledWith(1);
+  });
+});
+
+describe("ProxyError.getClientSafeMessage - FAKE_200 sanitization", () => {
+  test("JWT and email in the upstream body should be redacted to [JWT] / [EMAIL]", () => {
+    const jwtToken =
+      "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U";
+    const email = "[email protected]";
+    const body = `Authentication failed for ${email} with token ${jwtToken}`;
+
+    const error = new ProxyError("FAKE_200_JSON_ERROR_NON_EMPTY", 502, {
+      body,
+      providerId: 1,
+      providerName: "p1",
+    });
+
+    const msg = error.getClientSafeMessage();
+    expect(msg).toContain("[JWT]");
+    expect(msg).toContain("[EMAIL]");
+    expect(msg).not.toContain(jwtToken);
+    expect(msg).not.toContain(email);
+    expect(msg).toContain("Upstream detail:");
+  });
+
+  test("password=xxx in the upstream body should be redacted", () => {
+    const body = "Config error: password=s3cretValue in /etc/app.json";
+
+    const error = new ProxyError("FAKE_200_HTML_BODY", 502, {
+      body,
+      providerId: 1,
+      providerName: "p1",
+    });
+
+    const msg = error.getClientSafeMessage();
+    expect(msg).not.toContain("s3cretValue");
+    expect(msg).toContain("[PATH]");
+    expect(msg).toContain("Upstream detail:");
+  });
+});
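
Editor's note: the invalid Content-Length test in this file rests on the fact that `parseInt("12abc")` returns 12, so a lenient parser would accept a malformed header as a small valid length. The forwarder's actual validation is not shown here; the sketch below is one possible strict parser, and `parseContentLengthStrict` is a hypothetical name.

```typescript
// Hypothetical helper (not taken from the PR): returns the declared byte
// count, or null when the header is absent or malformed.
function parseContentLengthStrict(value: string | null): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  // Accept plain decimal digits only; "12abc", "-1", "1.5" and "" are rejected.
  if (!/^\d+$/.test(trimmed)) return null;
  const parsed = Number(trimmed);
  // Guard against values too large to represent exactly.
  return Number.isSafeInteger(parsed) ? parsed : null;
}

console.log(Number.parseInt("12abc", 10)); // 12 (lenient parsing)
console.log(parseContentLengthStrict("12abc")); // null (treated as missing)
console.log(parseContentLengthStrict("48")); // 48
```

Returning `null` for a malformed value lets the caller treat the header as absent and fall back to inspecting the body, which is the behaviour the test expects.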

+ 44 - 5
tests/unit/proxy/response-handler-endpoint-circuit-isolation.test.ts

@@ -183,7 +183,7 @@ function createSession(opts?: { sessionId?: string | null }): ProxySession {
     ttfbMs: null,
     getRequestSequence: () => 1,
     addProviderToChain: function (
-      this: ProxySession & { providerChain: unknown[] },
+      this: ProxySession & { providerChain: Record<string, unknown>[] },
       prov: {
         id: number;
         name: string;
@@ -193,7 +193,8 @@ function createSession(opts?: { sessionId?: string | null }): ProxySession {
         costMultiplier: number;
         groupTag: string;
         providerVendorId?: string;
-      }
+      },
+      metadata?: Record<string, unknown>
     ) {
       this.providerChain.push({
         id: prov.id,
@@ -204,7 +205,11 @@ function createSession(opts?: { sessionId?: string | null }): ProxySession {
         weight: prov.weight,
         costMultiplier: prov.costMultiplier,
         groupTag: prov.groupTag,
-        timestamp: Date.now(),
+        timestamp:
+          typeof metadata?.timestamp === "number" && Number.isFinite(metadata.timestamp)
+            ? metadata.timestamp
+            : Date.now(),
+        ...(metadata ?? {}),
       });
     },
   });
@@ -249,8 +254,8 @@ function setDeferredMeta(session: ProxySession, endpointId: number | null = 42)
 }
 
 /** Create an SSE stream that emits a fake-200 error body (valid HTTP 200 but error in content). */
-function createFake200StreamResponse(): Response {
-  const body = `data: ${JSON.stringify({ error: { message: "invalid api key" } })}\n\n`;
+function createFake200StreamResponse(errorMessage: string = "invalid api key"): Response {
+  const body = `data: ${JSON.stringify({ error: { message: errorMessage } })}\n\n`;
   const encoder = new TextEncoder();
   const stream = new ReadableStream<Uint8Array>({
     start(controller) {
@@ -353,6 +358,40 @@ describe("Endpoint circuit breaker isolation", () => {
       expect.objectContaining({ message: expect.stringContaining("FAKE_200") })
     );
     expect(mockRecordEndpointFailure).not.toHaveBeenCalled();
+
+    const chain = session.getProviderChain();
+    expect(
+      chain.some(
+        (item) =>
+          item.id === 1 &&
+          item.reason === "retry_failed" &&
+          item.statusCode === 401 &&
+          item.statusCodeInferred === true
+      )
+    ).toBe(true);
+  });
+
+  it("fake-200 inferred 404 should NOT call recordFailure and should be marked as resource_not_found", async () => {
+    const session = createSession();
+    setDeferredMeta(session, 42);
+
+    const response = createFake200StreamResponse("model not found");
+    await ProxyResponseHandler.dispatch(session, response);
+    await drainAsyncTasks();
+
+    expect(mockRecordFailure).not.toHaveBeenCalled();
+    expect(mockRecordEndpointFailure).not.toHaveBeenCalled();
+
+    const chain = session.getProviderChain();
+    expect(
+      chain.some(
+        (item) =>
+          item.id === 1 &&
+          item.reason === "resource_not_found" &&
+          item.statusCode === 404 &&
+          item.statusCodeInferred === true
+      )
+    ).toBe(true);
   });
 
   it("non-200 HTTP status should call recordFailure but NOT recordEndpointFailure", async () => {
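
Editor's note: the assertions added in this hunk expect a fake-200 body to be mapped to an inferred status code (401 for "invalid api key", 404 for "model not found", and 429 for "Rate limit exceeded" in the forwarder tests), with an inferred 404 not counted against the circuit breaker. The real `inferUpstreamErrorStatusCodeFromText` is not shown in this section; the keyword table and function names below are assumptions made purely for illustration.

```typescript
// Hypothetical keyword table; the project's real matching rules may differ.
const STATUS_RULES: ReadonlyArray<{ status: number; pattern: RegExp }> = [
  { status: 429, pattern: /rate limit|too many requests/i },
  { status: 401, pattern: /invalid api key|unauthorized/i },
  { status: 404, pattern: /model not found/i },
];

// Returns the first matching status code, or null when no keyword matches.
function inferStatusFromText(text: string): number | null {
  for (const rule of STATUS_RULES) {
    if (rule.pattern.test(text)) return rule.status;
  }
  return null;
}

// Assumed policy based on the tests: a missing resource is a request-level
// problem, so an inferred 404 does not count toward the circuit breaker.
function countsTowardCircuitBreaker(status: number | null): boolean {
  return status !== 404;
}

for (const message of ["invalid api key", "model not found", "Rate limit exceeded"]) {
  const status = inferStatusFromText(message);
  console.log(message, status, countsTowardCircuitBreaker(status));
}
```

Because the code is guessed from text rather than read from the response, the tests also check `statusCodeInferred === true` so the decision chain can flag it as an inference.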