Senior SWE-Bench: open-source benchmark that assesses agents as senior engineers

by matt_d