Chris Estreich e2bdce0961 Update benchmark/prompts/cpp.md 10 months ago
prompts e2bdce0961 Update benchmark/prompts/cpp.md 10 months ago
src f108dfaeb8 Evals 10 months ago
.env.local.sample f108dfaeb8 Evals 10 months ago
Dockerfile f108dfaeb8 Evals 10 months ago
README.md f108dfaeb8 Evals 10 months ago
entrypoint.sh f108dfaeb8 Evals 10 months ago
package-lock.json f108dfaeb8 Evals 10 months ago
package.json f108dfaeb8 Evals 10 months ago
tsconfig.json f108dfaeb8 Evals 10 months ago

README.md

Benchmark Harness

Configure environment variables (OpenRouter, PostHog, etc.):

cp .env.local.sample .env.local
# Update ENV vars as needed.
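
Running the plain `cp` a second time overwrites any edits already made to `.env.local`. The sketch below guards against that. The `init_env` function is a hypothetical helper, not part of this repository, and its default file names simply mirror the command above.

```shell
# Hypothetical helper (not part of the repo): copy the sample env file
# only when no local copy exists, so existing edits are never overwritten.
init_env() {
  local sample="${1:-.env.local.sample}"
  local target="${2:-.env.local}"
  if [ -f "$target" ]; then
    echo "$target already exists, leaving it untouched"
  else
    cp "$sample" "$target"
    echo "created $target from $sample"
  fi
}
```

Calling `init_env` with no arguments from the benchmark directory behaves like the `cp` above on a fresh checkout and does nothing on later runs.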

Build and run a Docker image with the development environment needed to run the benchmarks (C++, Go, Java, Node.js, Python & Rust):

npm run docker:start

Run an exercise:

npm run docker:benchmark -- -e exercises/javascript/binary

Select and run an exercise:

npm run cli

Select and run an exercise for a specific language:

npm run cli -- run rust

Run all exercises for a language:

npm run cli -- run rust all

Run all exercises:

npm run cli -- run all

Run all exercises under a specific runId (useful for retrying after an unexpected error):

npm run cli -- run all --runId 1
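
Retrying under the same runId can be automated with a small wrapper. This is a minimal sketch, assuming that re-invoking the command with an unchanged runId is safe to repeat, which the note above implies but does not spell out. The `retry` function and the `RETRY_DELAY` variable are hypothetical names introduced for illustration.

```shell
# Hypothetical helper (not part of the repo): re-run a command until it
# succeeds or the attempt budget is exhausted.
# Usage: retry <max_attempts> <command> [args...]
retry() {
  local max_attempts="$1"
  shift
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # RETRY_DELAY is an assumed knob: seconds to wait between attempts.
    sleep "${RETRY_DELAY:-5}"
  done
}

# Example invocation (commented out so sourcing this file has no side effects):
# retry 3 npm run cli -- run all --runId 1
```

The wrapper only looks at the command's exit status, so it helps only if the CLI exits non-zero when an unexpected error occurs.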