Hugging Face leaderboard integration

audiobench can publish benchmark runs to a Hugging Face dataset, and a Gradio Space can read that dataset and render it as a public leaderboard.

1) Log in once to Hugging Face

hf auth login
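
If you prefer to stay in Python, huggingface_hub exposes the same login flow (it prompts for a token and stores it for later calls):

from huggingface_hub import login

login()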

By default, audiobench push auto-targets:

<your-username>/audiobench-leaderboard-submissions

and creates that dataset repo on first upload.
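
Under the hood this amounts to "resolve the logged-in username, then make sure the dataset repo exists". A minimal sketch with huggingface_hub, given as an assumption about the behaviour rather than the actual audiobench code:

from huggingface_hub import create_repo, whoami

# Default target: <your-username>/audiobench-leaderboard-submissions
repo_id = f"{whoami()['name']}/audiobench-leaderboard-submissions"

# exist_ok=True makes this a no-op once the repo has been created.
create_repo(repo_id, repo_type="dataset", exist_ok=True)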

2) Deploy the Space app

This repo ships a ready-to-deploy Space app in:

  • spaces/leaderboard/app.py
  • spaces/leaderboard/requirements.txt
  • spaces/leaderboard/README.md

Create a Hugging Face Space (Gradio SDK), copy those files into the Space repo, and set:

  • AUDIOBENCH_LEADERBOARD_DATASET = your dataset repo id
  • HF_TOKEN only if the dataset is private
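
The shipped app in spaces/leaderboard/app.py is the reference implementation; the sketch below only illustrates the general shape of such a Space: read the dataset named by AUDIOBENCH_LEADERBOARD_DATASET and show submissions in a table (the column choices here are illustrative, not what the real app displays):

import json
import os

import gradio as gr
from huggingface_hub import HfApi, hf_hub_download

DATASET = os.environ["AUDIOBENCH_LEADERBOARD_DATASET"]
TOKEN = os.environ.get("HF_TOKEN")  # only set for private datasets

def load_rows():
    api = HfApi(token=TOKEN)
    rows = []
    for path in api.list_repo_files(DATASET, repo_type="dataset"):
        if not (path.startswith("submissions/") and path.endswith(".json")):
            continue
        local = hf_hub_download(DATASET, path, repo_type="dataset", token=TOKEN)
        with open(local) as fh:
            sub = json.load(fh)
        rows.append([sub.get("suite"), sub.get("model"), sub.get("run_hash")])
    return rows

with gr.Blocks() as demo:
    gr.Markdown("# audiobench leaderboard")
    gr.Dataframe(value=load_rows(), headers=["suite", "model", "run_hash"])

demo.launch()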

Optionally, set a default Space id so push output can link straight to the leaderboard:

export AUDIOBENCH_LEADERBOARD_SPACE=<org-or-user>/audiobench-leaderboard

3) Push benchmark runs

Run your benchmark, then upload:

audiobench run ab/sound-id --model heuristic-v0 --output results/sound-id.json
audiobench push results/sound-id.json --pretty-json

Useful push options (combined in the example after this list):

  • --repo <id>: override the auto-selected dataset repo
  • --space <id>: include Space URL in output
  • --notes "...": attach free-form notes to the submission
  • --tags "cpu,demo,zero-shot": attach comma-separated tags
  • --overwrite: replace an existing submission with the same run_hash
  • --dry-run: print the payload without uploading
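
For example, a dry run that combines several of these options (the org repo id below is just a placeholder):

audiobench push results/sound-id.json --repo myorg/audiobench-leaderboard-submissions --tags "cpu,demo" --notes "CPU-only smoke test" --dry-run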

Submission format

Every upload lands at:

submissions/<suite-with-/-replaced-by-__>/<run_hash>.json
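
For example, the ab/sound-id run from step 3 lands at submissions/ab__sound-id/<run_hash>.json.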

Each submission includes:

  • suite, revision, model, run_hash
  • payload_sha256 of the full run payload
  • suite-specific leaderboard metrics (weighted_recall / weighted_mean_wer / weighted_hallucination_rate, etc.)
  • when present, findings metadata (top_finding_status, validated_findings, top finding effect/q)
  • the original run payload for reproducibility audits
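
Because each submission carries both the original payload and its payload_sha256, a downstream audit can recompute the hash. The sketch below assumes the embedded payload lives under a "payload" key and was hashed as canonically serialized JSON; both are assumptions to verify against a real submission file:

import hashlib
import json

from huggingface_hub import HfApi, hf_hub_download

repo = "your-username/audiobench-leaderboard-submissions"  # replace with your dataset repo id
api = HfApi()
paths = [p for p in api.list_repo_files(repo, repo_type="dataset")
         if p.startswith("submissions/") and p.endswith(".json")]

local = hf_hub_download(repo, paths[0], repo_type="dataset")
with open(local) as fh:
    submission = json.load(fh)

# The serialization must match whatever audiobench hashed; sorted keys and
# compact separators are an assumption here.
payload_bytes = json.dumps(submission["payload"], sort_keys=True, separators=(",", ":")).encode()
print(hashlib.sha256(payload_bytes).hexdigest() == submission["payload_sha256"])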