
Speech Service Implementation Plan

Phase 1: Module Extraction

1. AudioConverter Module (src/services/speech/AudioConverter.ts)

Purpose: Handle WebM to MP3 conversion using FFmpeg

Key Methods:

  • convertToMp3(webmPath: string): Promise<string> - Convert WebM to MP3
  • cleanup(mp3Path: string): Promise<void> - Clean up temporary MP3 files

Features:

  • Proper error handling with stderr capture
  • Automatic cleanup of temporary files
  • Optimized conversion settings (16 kHz mono, 32 kbps)
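
A minimal sketch of how this module could look, shelling out to FFmpeg via Node's child_process. The method names come from the list above; the exact flags and the .webm → .mp3 file naming are assumptions:

import { spawn } from "node:child_process"
import { unlink } from "node:fs/promises"

export class AudioConverter {
	// Convert a WebM chunk to MP3 with the settings listed above (16 kHz mono, 32 kbps).
	async convertToMp3(webmPath: string): Promise<string> {
		const mp3Path = webmPath.replace(/\.webm$/, ".mp3")
		return new Promise<string>((resolve, reject) => {
			const ffmpeg = spawn("ffmpeg", [
				"-y",
				"-i", webmPath,
				"-ar", "16000", // 16 kHz
				"-ac", "1",     // mono
				"-b:a", "32k",  // 32 kbps
				mp3Path,
			])
			let stderr = ""
			ffmpeg.stderr.on("data", (chunk) => (stderr += chunk))
			ffmpeg.on("error", reject)
			ffmpeg.on("close", (code) => {
				// Surface FFmpeg's stderr output when conversion fails.
				if (code === 0) resolve(mp3Path)
				else reject(new Error(`ffmpeg exited with ${code}: ${stderr}`))
			})
		})
	}

	// Remove the temporary MP3 once transcription is done; ignore missing files.
	async cleanup(mp3Path: string): Promise<void> {
		await unlink(mp3Path).catch(() => {})
	}
}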

2. TranscriptionClient Module (src/services/speech/TranscriptionClient.ts)

Purpose: Handle OpenAI Whisper API communication

Key Methods:

  • transcribe(filePath: string, language?: string): Promise<string> - Transcribe audio file
  • getApiKey(): string | null - Get OpenAI API key from context
  • getBaseUrl(): string - Get OpenAI base URL

Features:

  • Automatic API key detection from multiple providers
  • Proper error handling for API failures
  • Support for different audio formats
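
A sketch of the client, assuming Node 18+ globals (fetch, FormData, Blob) and OpenAI's /audio/transcriptions endpoint with the whisper-1 model. The context object passed to the constructor is hypothetical; in the real service the key and base URL would come from the existing provider configuration:

import { readFile } from "node:fs/promises"
import { basename } from "node:path"

export class TranscriptionClient {
	// Hypothetical context shape; replace with the provider config the service already has.
	constructor(private context: { apiKey?: string; baseUrl?: string } = {}) {}

	getApiKey(): string | null {
		return this.context.apiKey ?? process.env.OPENAI_API_KEY ?? null
	}

	getBaseUrl(): string {
		return this.context.baseUrl ?? "https://api.openai.com/v1"
	}

	async transcribe(filePath: string, language?: string): Promise<string> {
		const apiKey = this.getApiKey()
		if (!apiKey) throw new Error("No OpenAI API key available")

		const form = new FormData()
		form.append("file", new Blob([await readFile(filePath)]), basename(filePath))
		form.append("model", "whisper-1")
		if (language) form.append("language", language)

		const res = await fetch(`${this.getBaseUrl()}/audio/transcriptions`, {
			method: "POST",
			headers: { Authorization: `Bearer ${apiKey}` },
			body: form,
		})
		if (!res.ok) throw new Error(`Transcription failed (${res.status}): ${await res.text()}`)
		return ((await res.json()) as { text: string }).text
	}
}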

3. ChunkProcessor Module (src/services/speech/ChunkProcessor.ts)

Purpose: Handle chunk file detection and processing coordination

Key Methods:

  • startWatching(directory: string): void - Start watching for chunks
  • stopWatching(): void - Stop watching
  • processChunk(chunkPath: string): Promise<string> - Process single chunk

Events:

  • chunkReady - Emitted when a chunk is ready for processing
  • chunkProcessed - Emitted when chunk processing is complete
  • error - Emitted on processing errors
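
A sketch of the Phase 1 version. Injecting the converter and transcription client through the constructor is an assumed design; detection here uses fs.watch on the chunk directory, which is exactly the race-prone approach that the Phase 2 event flow replaces:

import { EventEmitter } from "node:events"
import { watch, FSWatcher } from "node:fs"
import { join } from "node:path"
import { AudioConverter } from "./AudioConverter"
import { TranscriptionClient } from "./TranscriptionClient"

export class ChunkProcessor extends EventEmitter {
	private watcher?: FSWatcher

	constructor(
		private converter: AudioConverter,
		private transcriber: TranscriptionClient,
	) {
		super()
	}

	startWatching(directory: string): void {
		// NOTE: fs.watch fires when a chunk file appears, not when FFmpeg has finished
		// writing it -- this is the race condition the Phase 2 event flow removes.
		this.watcher = watch(directory, (_event, filename) => {
			if (typeof filename === "string" && filename.endsWith(".webm")) {
				this.emit("chunkReady", join(directory, filename))
			}
		})
	}

	stopWatching(): void {
		this.watcher?.close()
		this.watcher = undefined
	}

	async processChunk(chunkPath: string): Promise<string> {
		try {
			const mp3Path = await this.converter.convertToMp3(chunkPath)
			const text = await this.transcriber.transcribe(mp3Path)
			await this.converter.cleanup(mp3Path)
			this.emit("chunkProcessed", { chunkPath, text })
			return text
		} catch (error) {
			this.emit("error", error)
			throw error
		}
	}
}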

4. StreamingManager Module (src/services/speech/StreamingManager.ts)

Purpose: Handle text deduplication and streaming state

Key Methods:

  • addChunkText(text: string): string - Add chunk text with deduplication
  • getSessionText(): string - Get current session text
  • reset(): void - Reset session state

Features:

  • Word-level deduplication between chunks
  • Session text accumulation
  • Progressive update events
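
A sketch of one possible deduplication strategy: find the longest word-level overlap between the tail of the session text and the head of the new chunk, then append only the remainder. The progress event name is an assumption:

import { EventEmitter } from "node:events"

export class StreamingManager extends EventEmitter {
	private sessionWords: string[] = []

	// Append chunk text, dropping the words that overlap with the tail of the
	// session text (longest suffix/prefix match, case-insensitive).
	addChunkText(text: string): string {
		const newWords = text.trim().split(/\s+/).filter(Boolean)

		let overlap = 0
		for (let n = Math.min(this.sessionWords.length, newWords.length); n > 0; n--) {
			const tail = this.sessionWords.slice(-n).join(" ").toLowerCase()
			const head = newWords.slice(0, n).join(" ").toLowerCase()
			if (tail === head) {
				overlap = n
				break
			}
		}

		this.sessionWords.push(...newWords.slice(overlap))
		const sessionText = this.getSessionText()
		this.emit("progress", sessionText) // progressive update
		return sessionText
	}

	getSessionText(): string {
		return this.sessionWords.join(" ")
	}

	reset(): void {
		this.sessionWords = []
	}
}

The case-insensitive comparison keeps capitalization differences between chunk transcriptions from defeating the match.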

Phase 2: Event-Driven Architecture

FFmpeg Segment Completion Detection

Instead of polling the chunk directory, use the completion signals FFmpeg itself provides:

ffmpeg -f avfoundation -i :default \
  -c:a libopus -b:a 32k -application voip -ar 16000 -ac 1 \
  -f segment -segment_time 3 -reset_timestamps 1 \
  -segment_list /tmp/segments.txt -segment_list_flags +live \
  /tmp/chunk_%03d.webm

Key Changes:

  • Add -segment_list to track completed segments
  • Parse FFmpeg stderr for "Opening/Closing" messages
  • Only process a chunk after its "Closing" message appears
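
A sketch of one way to consume these signals: tail the segment list that -segment_list ... +live already produces, since FFmpeg updates it as each segment finishes. The helper name and cleanup shape are assumptions; parsing stderr for the Opening/Closing messages described above is an alternative source for the same chunkReady event:

import { EventEmitter } from "node:events"
import { watch } from "node:fs"
import { readFile } from "node:fs/promises"

// Re-read the segment list whenever it changes and emit chunkReady for every entry
// not seen before. The list file must already exist when watching starts (e.g. create
// an empty one before launching FFmpeg). Entries are the segment filenames as FFmpeg
// wrote them, so they may need resolving against the output directory.
export function watchSegmentList(listPath: string, emitter: EventEmitter): () => void {
	const seen = new Set<string>()
	const watcher = watch(listPath, async () => {
		const entries = (await readFile(listPath, "utf8")).split("\n").filter(Boolean)
		for (const entry of entries) {
			if (!seen.has(entry)) {
				seen.add(entry)
				emitter.emit("chunkReady", entry)
			}
		}
	})
	return () => watcher.close()
}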

Event Flow

1. AudioRecorder starts FFmpeg with segment completion logging
2. ChunkProcessor watches FFmpeg stderr for completion events
3. On "Closing chunk_001.webm" → emit chunkReady event
4. AudioConverter converts WebM → MP3
5. TranscriptionClient transcribes MP3
6. StreamingManager deduplicates and emits progressive updates

Phase 3: SpeechService Refactor

Transform SpeechService from a monolithic class into an orchestration layer:

New Structure:

export class SpeechService extends EventEmitter {
	private audioConverter: AudioConverter
	private transcriptionClient: TranscriptionClient
	private chunkProcessor: ChunkProcessor
	private streamingManager: StreamingManager

	// Orchestrate the modules instead of doing everything
}
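
A sketch of how the orchestrator might wire the Phase 2 event flow, reusing the hypothetical constructor shapes and event names from the module sketches above. The start/stop methods, the transcript event, and the import paths are assumptions:

import { EventEmitter } from "node:events"
import { AudioConverter } from "./speech/AudioConverter"
import { TranscriptionClient } from "./speech/TranscriptionClient"
import { ChunkProcessor } from "./speech/ChunkProcessor"
import { StreamingManager } from "./speech/StreamingManager"

export class SpeechService extends EventEmitter {
	private audioConverter = new AudioConverter()
	private transcriptionClient = new TranscriptionClient()
	private chunkProcessor = new ChunkProcessor(this.audioConverter, this.transcriptionClient)
	private streamingManager = new StreamingManager()

	start(chunkDirectory: string): void {
		// Wire the Phase 2 event flow: chunkReady -> convert -> transcribe -> dedupe -> emit.
		this.chunkProcessor.on("chunkReady", (chunkPath: string) => {
			// Errors surface through the "error" event; the catch avoids unhandled rejections.
			this.chunkProcessor.processChunk(chunkPath).catch(() => {})
		})
		this.chunkProcessor.on("chunkProcessed", ({ text }: { text: string }) => {
			this.emit("transcript", this.streamingManager.addChunkText(text))
		})
		this.chunkProcessor.on("error", (error) => this.emit("error", error))
		this.chunkProcessor.startWatching(chunkDirectory)
	}

	stop(): void {
		this.chunkProcessor.stopWatching()
		this.streamingManager.reset()
	}
}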

Benefits:

  • Single responsibility principle
  • Easier testing and debugging
  • Better error isolation
  • Cleaner code organization

Implementation Order

  1. Extract AudioConverter - Self-contained, easy to test
  2. Extract TranscriptionClient - Independent API client
  3. Extract ChunkProcessor - Core event-driven logic
  4. Extract StreamingManager - Text processing logic
  5. Refactor SpeechService - Orchestration layer
  6. Add FFmpeg event parsing - Replace polling
  7. Add comprehensive error handling - Robust operation
  8. Write tests - Ensure reliability

This approach eliminates the race conditions that come from polling for partially written chunk files and makes the system more reliable and maintainable.