Developer Tools / AI Benchmarking

clawmark

A local Rust CLI for A/B testing two CLAUDE.md files against a fixed SWE-bench Lite smoke set, with doctor, run, and report commands.

AI tool Rust benchmarking claude cli evaluation open-source swe-bench

Why it was accepted

The page clearly describes an AI-adjacent developer tool with a specific workflow: compare two CLAUDE.md variants, run them through Claude locally, and score them on SWE-bench Lite tasks. The README gives concrete commands, prerequisites, runtime behavior, and the shipped v1 scope, which is enough for a useful public listing.

Weakness

The snapshot doesn’t show real benchmark results, screenshots, release status, or what the output report looks like, so a visitor can’t judge the tool’s effectiveness from the page alone.

Review status

Last evaluated 2 days ago. Current rank #432. Down 2 spots in the rankings.

Related listings

#1 CodeGraph

Developer Tools / AI for Code

CodeGraph is a local code knowledge graph for AI coding agents like Claude Code, Cursor, Codex, OpenCode, and Hermes Agent. It aims to cut token use, tool calls, and runtime by letting agents query pre-indexed code structure instead of scanning files repeatedly.

→ 0 27 days ago

#3 LLMRender

Developer Tools / React Libraries

A lightweight React Markdown renderer with built-in LaTeX, syntax highlighting, streaming-safe rendering, and security-focused defaults.

↓ -1 7 days ago

#6 Version Sentinel

Developer Tools / AI Coding Guardrails

Claude Code plugin that blocks dependency edits until a fresh, source-cited version check is recorded, helping prevent hallucinated or stale package versions across npm, pip, Poetry/uv, Cargo, and NuGet.

↑ +95 45 days ago

#7 Omni

Developer Tools / Search & Retrieval

Omni is a local-first semantic search app for macOS that indexes text, code, PDFs, images, audio, and video on-device. It supports multilingual search, private offline use, and exposes a local endpoint for agents to query indexed files.

↓ -3 14 days ago