AgentDish directory

Trending

Listings with strong scores and recent movement.

Listing Category Score Trend Checked

A Visual Studio 2026 extension that connects Claude Code to the IDE, showing native diffs with accept/reject, selection context, compiler diagnostics, and a live session panel.

Developer Tools / IDE Extension 78 ↑ +6 4 days ago

An interactive explainer that teaches core AI and LLM evaluation metrics through playful visuals, including loss, perplexity, precision, recall, F1, accuracy, ROUGE, BLEU, and BERTScore.

Education / AI Metrics / Learning Resource 78 ↑ +6 5 days ago
#569 ↑ +6
Token-saviour

A routing skill that helps coding agents choose the most token-efficient tool for a task instead of reading whole files or dumping verbose output into context. It recommends tools like serena, graphify, rtk, caveman, or plain Read/Grep/Bash depending on the job.

Developer Tools / AI Developer Utilities 78 ↑ +6 5 days ago
#570 ↑ +6
brain-map-skill

An agent skill and Python builder that turns Markdown note folders from Obsidian or gbrain into an interactive HTML knowledge map with a force graph, growth timeline, filters, and note detail panels. The repo includes a prebuilt demo for quick inspection and supports use in Claude Code, OpenAI Codex, Cursor, and simila

Developer Tools / AI Agents 78 ↑ +6 7 days ago
#571 ↑ +6
MiroThinker

MiroThinker is a science-focused AI research app that emphasizes prediction, verification, and evidence-backed answers. The page also points to a MiroMind app and suggests use cases across finance, medicine, and regulation.

AI Research / Deep Research Agent 78 ↑ +6 8 days ago
#572 ↑ +6
OrchAPI

A set of small AI app experiments focused on “the use of the useless,” including AI Almanac, Forget, and Nickname Machine for history, closure, and naming moments.

AI Tools / Creative AI 78 ↑ +6 11 days ago
#573 ↑ +6
Hermex

Python library that automates ChatGPT and Gemini through the web UI, using a real Chrome browser to send prompts, upload files, and return responses as Python objects.

Developer Tools / Automation 78 ↑ +6 11 days ago

A local-first desktop environment for dispatching coding tasks to any model and any agent, with diff review, terminal streaming, task history, and per-project configuration.

Developer Tools / AI Desktop Apps 78 ↑ +6 11 days ago
#575 ↑ +6
Dropstone

Dropstone is a versioned coding agent runtime that routes work through open-weight models on US-hosted, no-retention infrastructure. The report explains its monthly re-baselining process, safety boundary, and cost model for Fast, Pro, and Heavy tiers.

Developer Tools / Code Assistant 78 ↑ +6 12 days ago

An interactive, sourced deep-dive showing how Stack Overflow’s Q&A format helped train coding-focused AI systems and what happened as AI reduced public question volume.

Developer Tools / Code Assistant 78 ↑ +6 13 days ago
#577 ↑ +6
Visr

Visr is a CLI tool that records full agent terminal trajectories, including commands, output, retries, and fixes, then returns a transcript link for reuse across sessions.

Developer Tools / AI Development Tools 78 ↑ +6 16 days ago

A blog post about verifiable RAG that benchmarks open-source NLI verifiers against Claude on RAGTruth and describes a Python library for sentence-level citation and claim verification.

AI / RAG / Verification & Hallucination Detection 78 ↑ +6 18 days ago
#579 ↑ +6
fuguUX

fuguUX is an AI-powered user testing and CX analysis tool that simulates real users on a website, surfaces UX issues, and generates reports with severity-rated findings, session records, and improvement roadmaps.

AI Product / UX Testing 78 ↑ +6 20 days ago

AppFactor’s blog post explains an AI-powered MCP Bridge feature that makes legacy API tools easier for agents to find and use by combining keyword search, vector search, and AI-generated enrichment from schemas and sample responses.

AI Development Tool / Agent tooling 78 ↑ +6 22 days ago
#581 ↑ +6
Repolog

Repolog scans a live website and produces a ranked audit covering on-page SEO, Core Web Vitals, security checks, and AI readiness for major AI search and assistant platforms.

AI-powered product / Website audit / SEO and security 78 ↑ +6 22 days ago

arXiv paper describing AVA, a GenAI platform for policy and development research built on 4,000+ World Bank reports. The abstract highlights multilingual support, evidence-based synthesis, citation verifiability, and reasoned abstention when queries cannot be supported.

AI Research / Trustworthy Generative AI 78 ↑ +6 23 days ago
#583 ↑ +6
Enough

Enough is a beta personal language system for planning, writing, reviewing, and translation. It supports local models and OpenRouter, and is aimed at users who want more control over their data while building a flexible personal knowledge workflow.

Writing / Copywriting 78 ↑ +6 23 days ago
#584 ↑ +6
Teleport-Env

Teleport-Env is an ultra-fast OS-level snapshot and rollback sandbox for autonomous coding agents, built with overlayfs and CRIU. It targets destructive agent testing, MCTS search loops, and reinforcement learning workflows that need rapid environment recovery.

Developer Tools / AI Agent Infrastructure 78 ↑ +6 23 days ago

A tool that analyzes local Claude Code logs to show where tokens, time, and cost go across a month of usage, including re-reading context, cache reads/writes, and subagent activity.

Developer Tools / AI Developer Tools 78 ↑ +6 23 days ago
#586 ↑ +6
SlopeAutoAcceptor

A macOS menu bar app that uses Apple Vision OCR to find approval buttons like Run, Fetch, or Retry and clicks them automatically for agent workflows.

Developer Tool / Automation 78 ↑ +6 24 days ago

A blog post describing ptc_runner’s MCP server and code-mode approach, where agents run short-lived untrusted programs inside a small Lisp REPL instead of Python or JavaScript.

AI Developer Tool / MCP Server 78 ↑ +6 25 days ago

An opinionated write-up on where multi-agent systems have and have not delivered value, with concrete comparisons across coding, images, CAD, and BIM, plus a description of Blade’s evidence-first architecture.

Writing / Copywriting 78 ↑ +6 25 days ago

An essay arguing that second-person system prompts create attentional overhead and identity fragmentation in agent design, and proposing first-person initialization as an alternative.

Writing / Copywriting 78 ↑ +6 27 days ago

AWS DevOps blog post explaining how to build a self-extending CLI using Amazon Bedrock, the Strands Agents SDK, and MCP. It outlines the workflow, technical pieces, and repository availability for readers who want to replicate the pattern.

Developer Tools / Code Assistant 78 ↑ +6 28 days ago