Chris Estreich e2bdce0961 Update benchmark/prompts/cpp.md 9 months ago
prompts e2bdce0961 Update benchmark/prompts/cpp.md 9 months ago
src f108dfaeb8 Evals 9 months ago
.env.local.sample f108dfaeb8 Evals 9 months ago
Dockerfile f108dfaeb8 Evals 9 months ago
README.md f108dfaeb8 Evals 9 months ago
entrypoint.sh f108dfaeb8 Evals 9 months ago
package-lock.json f108dfaeb8 Evals 9 months ago
package.json f108dfaeb8 Evals 9 months ago
tsconfig.json f108dfaeb8 Evals 9 months ago

README.md

Benchmark Harness

Configure environment variables (OpenRouter, PostHog, etc.):

cp .env.local.sample .env.local
# Update ENV vars as needed.
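
If you repeat the setup, a plain `cp` overwrites any values already filled in. A minimal sketch of a guard against that is shown here; the `init_env` helper is hypothetical and not part of this repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): create .env.local from the
# sample only when it does not exist yet, so existing values are preserved.
init_env() {
  local sample="${1:-.env.local.sample}"
  local target="${2:-.env.local}"
  if [ -f "$target" ]; then
    echo "$target already exists, leaving it untouched"
  else
    cp "$sample" "$target"
    echo "created $target from $sample"
  fi
}
```

Call `init_env` from the directory containing `.env.local.sample`, then edit `.env.local` as usual.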

Build and run a Docker image with the development environment needed to run the benchmarks (C++, Go, Java, Node.js, Python & Rust):

npm run docker:start

Run an exercise:

npm run docker:benchmark -- -e exercises/javascript/binary

Select and run an exercise:

npm run cli

Select and run an exercise for a specific language:

npm run cli -- run rust

Run all exercises for a language:

npm run cli -- run rust all

Run all exercises:

npm run cli -- run all

Run all exercises using a specific runId (useful for retrying after an unexpected error):

npm run cli -- run all --runId 1
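
Retrying can be automated with a small wrapper. The sketch below assumes the CLI exits with a non-zero status when a run fails, which this README does not confirm; `retry_run` is a hypothetical helper, not something the harness provides:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): re-run a command until it
# succeeds or the attempt limit is reached.
# Assumption: the wrapped command exits non-zero on failure.
retry_run() {
  local max_attempts="$1"
  shift
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    echo "retrying (attempt $attempt of $max_attempts)..." >&2
  done
}

# Example: retry the full run up to 3 times, reusing the same runId.
# retry_run 3 npm run cli -- run all --runId 1
```

Every attempt passes the same `--runId`, which is the retry case the option is described for.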