AgentDish directory

Research

Accepted listings with this tag.

Listing Category Score Trend Checked

An interactive dashboard that analyzes New York Times coverage since 2000 using the NYT Archive API, with views for reporters, beats, sections, subjects, geography, obituaries, and corrections.

Research / Data Visualization 89 ↑ +164 45 days ago
#64 ↓ -3
Slashspace

Slashspace is a local-first infinite canvas for AI chat and agentic work, designed for research, writing, development, and long-form thinking. It supports multiple models, MCP connections, local storage, and desktop apps for Mac, Windows, and Linux.

Productivity / AI Workspace 88 ↓ -3 2 days ago
#94 ↓ -3
CAD-Bench

An open benchmark and leaderboard for AI CAD agents, with 308 prompts across 20 categories and layered scoring for geometry, engineering, manufacturability, and cognition.

Research / Knowledge Work 88 ↓ -3 42 days ago

A research article from Applied Compute on how agentic, tool-using workloads differ from traditional LLM benchmarks, with production observations, workload profiles, and an open-source harness for replaying traces.

Research / Knowledge Work 87 ↓ -47 45 days ago

arXiv paper describing QUEST, an open family of deep research agents from 2B to 35B parameters, plus a synthetic-task training recipe and released models, data, and scripts.

Research / AI Agents 83 ↓ -3 25 days ago
#404 ↓ -3
wwwatch

A daily AI intelligence journal for builders, covering notable model, tooling, and release updates in a short sourced digest.

Research / Knowledge Work 83 ↓ -3 29 days ago
#408 ↓ -3
Physics AI

Physics AI is a physics homework and study tool that solves problems from photos or typed prompts, with step-by-step explanations, tutor mode, and visual breakdowns for diagrams and vectors.

Research / Knowledge Work 83 ↓ -3 30 days ago

A research report on the current MCP ecosystem, with live crawl numbers, verification rates, category breakdowns, and examples of both strong and weak MCP-positive sites.

Research / AI research 83 ↑ +57 45 days ago
#441 ↓ -2
BigTech AI News

Chrome extension that tracks major AI companies, pulls in AI news and research, and generates daily summaries with Gemini, including language-aware summaries and article deep dives.

Research / Knowledge Work 82 ↓ -2 10 days ago
#447 ↓ -2
Wanderwhim

AI-native workspace for writers, bloggers, and lifelong learners. It combines source collection, an idea map, AI-assisted exploration, and a focused writing editor designed to support long-form thinking.

Writing / Copywriting 82 ↓ -2 17 days ago
#481 ↑ +85
ShadowBrokers

AI-powered trade signal product for retail traders that turns financial news into ranked trade plans with entries, stops, targets, and tracked accuracy.

Research / Knowledge Work 82 ↑ +85 45 days ago
#495 ↑ +2
EuroMesh

A sourced model and short report exploring whether Europe could train a sovereign frontier AI model using public compute it already owns, with reproducible code, datasets, and a PDF report.

AI Research / Analysis / Reports 81 ↑ +2 5 days ago
#496 ↑ +2
Foyer

Foyer is a local dashboard for watching AI coding agents work, with a narrated current-focus view and a research panel for reading sourced briefings while you wait.

Developer Tools / AI Developer Tools 81 ↑ +2 9 days ago

Agora-1 is a multi-agent world model from Odyssey that simulates shared real-time environments for up to four participants, human or AI, with a focus on gaming, robotics, reinforcement learning, and foundation model research.

AI Research / World Models 78 ↑ +6 32 days ago

PaperProfit explains an AI-assisted stock evaluation approach that combines fundamentals, technical signals, and qualitative analysis from transcripts and SEC filings into a weighted score.

Research / Knowledge Work 77 → 0 18 days ago

A research write-up on detecting AI agents through process differences in CAPTCHA and related cognitive tasks. It outlines the CogCAPTCHA30 approach, reports human-vs-model differences, and connects the findings to Roundtable’s Proof of Human product.

Research / Knowledge Work 77 → 0 21 days ago

arXiv paper on a self-speculative decoding framework for speeding up reasoning LLM inference on edge hardware, with hardware co-design and reported speedups.

Research / AI/ML Paper 77 → 0 22 days ago

A Steel blog post that dissects Claude Code’s deep-research workflow and contrasts it with other research systems. It also positions Steel as an open-source browser API for AI agents, making the page relevant for AI builders.

AI Tools / Browser API 76 ↓ -1 10 days ago

A GitHub research project documenting a long-form, multi-model analysis of LLM behavior across Claude, Gemini, ChatGPT, and Grok. The repo includes an executive summary, screenplay, technical white paper, and archive of logs and chat records.

AI Research / LLM Evaluation & Analysis 75 → 0 25 days ago

Anthropic research report on how people use Claude Code in practice, based on a large privacy-preserving analysis of session data. It covers task types, division of labor between user and model, and how domain expertise affects outcomes.

Developer Tools / Code Assistant 74 ↓ -1 2 days ago
#664 ↓ -1
LEVI

LEVI is a harness-first evolutionary framework for code and prompt optimization. It focuses on reducing LLM cost with diversity-preserving search, role-aware model routing, and a proxy benchmark, and presents comparative results against several existing systems.

Developer Tools / Code Assistant 74 ↓ -1 12 days ago

arXiv paper on distilling multi-agent debate into a single LLM with a two-stage fine-tuning pipeline. The abstract reports lower token use, comparable or better benchmark performance, and an analysis of agent-specific activation subspaces, with code linked from the page.

Research / AI/LLM Reasoning 74 ↓ -1 15 days ago

A GitHub research project that measures how gpt-4.1 responds when asked to pick a random number between 1 and 100, using 10,000 API calls and comparing the results to a uniform baseline.

AI Research / Model Behavior Analysis 74 ↓ -1 26 days ago

An open-source experiment that adds a small zero-initialized overlay layer to a frozen GPT-2 so its behavior can be adjusted at inference time without retraining the base model.

AI Developer Tool / Model Adaptation / Adapters 74 ↓ -1 45 days ago