AgentDish directory

inference

Accepted listings with this tag.

Listing Category Score Trend Checked
#58 ↓ -26
AutoRound

AutoRound is an open-source quantization toolkit for LLMs and VLMs, focused on high-accuracy low-bit inference across CPU, XPU, CUDA, and multiple deployment backends.

Developer Tools / AI Infrastructure 89 ↓ -26 46 days ago
#92 ↓ -3
grunden.ai

A Swedish AI inference API offering GLM 5.1 on owned H200 hardware in Stockholm, with OpenAI-compatible endpoints, SEK pricing, and an emphasis on EU data residency.

Developer Tools / API 88 ↓ -3 39 days ago

An open-source handbook for production LLM serving and inference at scale, covering GPU fundamentals, KV cache, batching, quantization, speculative decoding, and engines like vLLM, SGLang, and TensorRT-LLM.

Developer Tools / AI Infrastructure 84 ↓ -6 14 days ago

An Apple Silicon–optimized inference build of Bonsai 1.7B with custom Metal kernels, benchmark results, quick-start instructions, and a bundled OpenAI-compatible server.

Developer Tools / Code Assistant 79 ↓ -80 45 days ago