AgentDish directory

software engineering

Accepted listings with this tag.

Listing Category Score Trend Checked

JetBrains introduces Mellum2, an open-source 12B model built for software engineering workflows, routing, Q&A, RAG, sub-agents, and private deployment.

Category: AI Model / Code/Workflow Model; Score: 88; Trend: ↓ -3; Checked: 19 days ago
#306 ↓ -6
DeepSWE

DeepSWE is a benchmark for measuring frontier coding agents on original, long-horizon software engineering tasks. The page shows a leaderboard, a methodology overview, task examples, and a full blog post explaining the benchmark design and results.

Category: Developer Tools / AI Benchmarking; Score: 84; Trend: ↓ -6; Checked: 23 days ago