Hire engineers who solve your problems.

HackerRank and CodeSignal show you who can solve puzzles. Canary shows you who can do the job.

Candidate experience
Problem
🎫 CAR-523
⎇ PR #1098
Sandbox temp directories not cleaned up on failure
● In Progress

INC-3201: Disk usage on the execution host hit 94% on 2026-05-10, causing new runs to fail with "No space left on device".

Cause: orphaned temp dirs left behind by timed-out and crashed runs. cleanup() never runs when run_tests() raises.

Workspace
solution.py
tests.py
README.md
from sandbox import create_sandbox, write_files
from sandbox import run_tests, cleanup

def execute_run(req):
    sandbox = create_sandbox(req["run_id"])
    write_files(sandbox, req["code"])
    result = run_tests(sandbox, req["timeout"])
    cleanup(sandbox)  # only runs on success ↑
    return result
Tests 2/5
successful run returns output
cleanup after success
cleanup after timeout: Sandbox dir still exists
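One way a candidate might get the failing "cleanup after timeout" test to pass is to move cleanup() into a finally block. The sketch below is illustrative only: the real sandbox module is not shown on this page, so the four helper bodies (and the deterministic temp-dir naming) are hypothetical stand-ins, and only execute_run mirrors the code above.

```python
import os
import shutil
import tempfile


# Hypothetical stand-ins for the `sandbox` module; their real behavior is not shown.
def create_sandbox(run_id):
    path = os.path.join(tempfile.gettempdir(), f"sandbox-{run_id}")
    os.makedirs(path, exist_ok=True)
    return path


def write_files(sandbox, code):
    with open(os.path.join(sandbox, "solution.py"), "w") as f:
        f.write(code)


def run_tests(sandbox, timeout):
    # Assumption for the demo: a non-positive timeout simulates a timed-out run.
    if timeout <= 0:
        raise TimeoutError("run exceeded its time limit")
    return {"status": "passed"}


def cleanup(sandbox):
    shutil.rmtree(sandbox, ignore_errors=True)


def execute_run(req):
    sandbox = create_sandbox(req["run_id"])
    try:
        write_files(sandbox, req["code"])
        return run_tests(sandbox, req["timeout"])
    finally:
        # Runs whether the steps above return or raise.
        cleanup(sandbox)
```

With try/finally the exception from run_tests() still propagates to the caller, so a timeout is reported as before, but the temp directory is removed first.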

HackerRank tells you if they can code. Canary tells you if they can work.

01

Your bugs, not textbook puzzles

HackerRank and CodeSignal test whether candidates can reverse a linked list. Canary tests whether they can fix the thing that broke your system last week.

02

Same bar for every candidate

Solutions run against your hidden test suite in a sandboxed environment. No self-grading, no vibes — just a consistent, repeatable signal.

03

How they think, not just what they shipped

Algo platforms show you a score. Canary shows you the session — every edit, every AI prompt, every dead end they backed out of.

Get early access.

Onboarding teams in small batches. Drop your work email.