Kevin van Dijk 4c1e8dcd8e Rename values for Kilo Code 11 months ago
prompts e2bdce0961 Update benchmark/prompts/cpp.md 11 months ago
src 4c1e8dcd8e Rename values for Kilo Code 11 months ago
.env.local.sample f108dfaeb8 Evals 11 months ago
Dockerfile f108dfaeb8 Evals 11 months ago
README.md f108dfaeb8 Evals 11 months ago
entrypoint.sh f108dfaeb8 Evals 11 months ago
package-lock.json f108dfaeb8 Evals 11 months ago
package.json f108dfaeb8 Evals 11 months ago
tsconfig.json f108dfaeb8 Evals 11 months ago

README.md

Benchmark Harness

Configure environment variables (OpenRouter, PostHog, etc.):

cp .env.local.sample .env.local
# Update ENV vars as needed.

Build and run a Docker image with the development environment needed to run the benchmarks (C++, Go, Java, Node.js, Python & Rust):

npm run docker:start

Run an exercise:

npm run docker:benchmark -- -e exercises/javascript/binary

Select and run an exercise:

npm run cli

Select and run an exercise for a specific language:

npm run cli -- run rust

Run all exercises for a language:

npm run cli -- run rust all

Run all exercises:

npm run cli -- run all
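The four `npm run cli` invocations above differ only in the arguments that follow `--`. As a rough sketch (not code from this repository), the mapping could be written as a small TypeScript helper. The `Selection` type and the `cliArgs` function are hypothetical names introduced for illustration; only the argument strings themselves come from the commands listed above.

```typescript
// Hypothetical helper: models the four documented ways to invoke the CLI.
type Selection =
  | { kind: "interactive" }                     // npm run cli
  | { kind: "language"; language: string }      // npm run cli -- run rust
  | { kind: "languageAll"; language: string }   // npm run cli -- run rust all
  | { kind: "all" };                            // npm run cli -- run all

// Returns the argument list to pass to `npm` for a given selection.
function cliArgs(selection: Selection): string[] {
  const base = ["run", "cli"];
  switch (selection.kind) {
    case "interactive":
      return base;
    case "language":
      return [...base, "--", "run", selection.language];
    case "languageAll":
      return [...base, "--", "run", selection.language, "all"];
    case "all":
      return [...base, "--", "run", "all"];
  }
}

console.log(["npm", ...cliArgs({ kind: "languageAll", language: "rust" })].join(" "));
// prints: npm run cli -- run rust all
```
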

Run all exercises using a specific runId (useful for retrying after an unexpected error):

npm run cli -- run all --runId 1
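To illustrate the retry use case, here is a minimal TypeScript sketch that repeats the same command with a fixed runId until an attempt succeeds. It is not part of the harness. The names `Attempt` and `runAllWithRetry` are hypothetical, and the sketch assumes that the CLI signals failure with a non-zero exit code, which this README does not state.

```typescript
// Hypothetical: runs one attempt with the given npm arguments and returns
// its exit code. Assumption: 0 means success, anything else means failure.
type Attempt = (npmArgs: string[]) => number;

// Re-issues `npm run cli -- run all --runId <runId>` up to maxAttempts times,
// always with the same runId, and reports whether any attempt succeeded.
function runAllWithRetry(runId: number, maxAttempts: number, attempt: Attempt): boolean {
  const npmArgs = ["run", "cli", "--", "run", "all", "--runId", String(runId)];
  for (let i = 1; i <= maxAttempts; i++) {
    if (attempt(npmArgs) === 0) return true;
  }
  return false;
}

// Demo with a fake attempt that fails twice and then succeeds.
let calls = 0;
const flaky: Attempt = () => (++calls < 3 ? 1 : 0);
console.log(runAllWithRetry(1, 5, flaky), calls); // prints: true 3
```

In real use, `attempt` could wrap Node's `spawnSync("npm", npmArgs, { stdio: "inherit" }).status` from `node:child_process`. That `status` is `null` when the process is killed by a signal, so a wrapper should treat `null` as a failure.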