Chris Estreich e2bdce0961 Update benchmark/prompts/cpp.md 9 months ago

Benchmark Harness

Configure environment variables (OpenRouter, PostHog, etc.):

cp .env.local.sample .env.local
# Update ENV vars as needed.
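
After copying the sample file, it can help to confirm that every key it lists has a value in `.env.local`. The sketch below is not part of the harness; it is a minimal, assumed helper that compares two dotenv-style texts. The function names and the variable names in the example (`OPENROUTER_API_KEY`, `POSTHOG_API_KEY`) are hypothetical, so check `.env.local.sample` for the real keys.

```typescript
// Parse KEY=value lines from dotenv-style text, skipping blanks and comments.
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=value line
    vars.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return vars;
}

// Return the keys that appear in the sample but are absent or empty locally.
function missingEnvKeys(sampleText: string, localText: string): string[] {
  const local = parseEnv(localText);
  return Array.from(parseEnv(sampleText).keys()).filter(
    (key) => !local.get(key),
  );
}

// Hypothetical variable names, for illustration only.
const sample = "OPENROUTER_API_KEY=\nPOSTHOG_API_KEY=\n";
const local = "# local overrides\nOPENROUTER_API_KEY=sk-example\n";
console.log(missingEnvKeys(sample, local)); // [ 'POSTHOG_API_KEY' ]
```

In practice the two strings would come from reading `.env.local.sample` and `.env.local` from disk.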

Build a Docker image and start a container with the development environment needed to run the benchmarks (C++, Go, Java, Node.js, Python, and Rust):

npm run docker:start

Run a single exercise by path:

npm run docker:benchmark -- -e exercises/javascript/binary

Select and run an exercise:

npm run cli

Select and run an exercise for a specific language:

npm run cli -- run rust

Run all exercises for a language:

npm run cli -- run rust all

Run all exercises:

npm run cli -- run all

Run all exercises using a specific runId (useful for retrying after an unexpected error):

npm run cli -- run all --runId 1
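
The retry use case above can be sketched as a small loop that re-issues the same command with an unchanged runId until it succeeds. This is an illustrative assumption, not code from the harness: `retryArgs` and `runWithRetries` are hypothetical names, and the runner is injected so the example stays self-contained.

```typescript
// Arguments for `npm run cli -- run all --runId <id>`, as in the command above.
function retryArgs(runId: number): string[] {
  return ["run", "cli", "--", "run", "all", "--runId", String(runId)];
}

// Call `runner` with the same arguments until it reports exit code 0 or
// `maxAttempts` is reached. Returns the number of attempts used, or -1 if
// every attempt failed.
function runWithRetries(
  runId: number,
  maxAttempts: number,
  runner: (args: string[]) => number,
): number {
  const args = retryArgs(runId);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (runner(args) === 0) return attempt;
  }
  return -1;
}

// Stand-in runner that fails twice, then succeeds. A real runner might wrap
// Node's child_process.spawnSync("npm", args) and return its exit status.
let calls = 0;
const fakeRunner = (_args: string[]): number => (++calls < 3 ? 1 : 0);
console.log(runWithRetries(1, 5, fakeRunner)); // 3
```

Keeping the runId fixed across attempts is the point of the sketch; how the harness treats a repeated runId is not described here beyond the note above.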