AgentDish directory
vLLM
Accepted listings with this tag.
| Listing | Category | Score | Trend | Checked | |
|---|---|---|---|---|---|
| #58 AutoRound: AutoRound is an open-source quantization toolkit for LLMs and VLMs, focused on high-accuracy low-bit inference across CPU, XPU, CUDA, and multiple deployment backends. | Developer Tools / AI Infrastructure | 89 | ↓ -26 | 46 days ago | Details |
| #82 tiny-vllm: Open-source C++ and CUDA LLM inference engine inspired by vLLM, with a teaching-focused course that walks through model serving, batching, KV cache, and attention kernels. | Developer Tools / AI Inference / LLM Serving | 88 | ↓ -3 | 21 days ago | Details |
| Google Developers Blog post about integrating DFlash, a diffusion-style speculative decoding framework, into the vLLM TPU ecosystem to improve LLM serving speed on TPU v5p. | Developer Tools / Code Assistant | 78 | ↓ -127 | 45 days ago | Details |
| #703 vLLM-Compile: A public slide deck about vLLM-compile, a project focused on bringing compiler optimizations to LLM inference and speeding up torch.compile for vLLM workflows. | Developer Tools / Code Assistant | 72 | ↓ -20 | 45 days ago | Details |