Index
All Logs
— June 2026 —
- 2026-06-17 - Study #45595 [KV Connector][Offloading] Avoid blocking the engine to flush offloads on idle
- 2026-06-16 - Study Bug in #45387 #45388 Scheduler deadlocks
- 2026-06-16 - PR #45679 Add Fence Tests and Learn Fence Principles
- 2026-06-13 - New AI Tech in WWDC 2026
- 2026-06-13 - Study Core AI KV Cache
- 2026-06-12 - Deep Study: Apple CoreAI-Torch compare with vllm
- 2026-06-12 - Texas Hold’em
- 2026-06-11 - Study storage_offload.cpp
- 2026-06-11 - Study Apple Core AI Pipeline
- 2026-06-11 - Study Apple Core AI
- 2026-06-08 - Study llmd-fs-backend
- 2026-06-08 - Study PVC Evictor
- 2026-06-05 - Fix vLLM Structured Output Runaway Whitespace #44619
- 2026-06-05 - Study llm-d KV Cache Deploy
- 2026-06-04 - Study OpenTelemetry Usage in LLM-D KV Cache
- 2026-06-04 - LLM-D KV Cache Routing and Querying
- 2026-06-03 - Study: vLLM Offload Metrics and RFC #44008
- 2026-06-03 - Study #44295: Optimization for Scheduling Delay Caused by Concurrent Same Prefix
- 2026-06-03 - KVConnector Event Logic
- 2026-06-02 - Simple study of LMCacheMPConnector implementation
- 2026-06-02 - Environment Research: nvcc, CUDA, and PyTorch Relationship
- 2026-06-02 - LMCache Integration Modules and Division of Labor
- 2026-06-01 - In-Depth Study of LM Cache Integration in vLLM
— May 2026 —
- 2026-05-29 - Simple Study of API Server, Engine, Scheduler, Worker Call Patterns
- 2026-05-28 - In-Depth Study of Offline Connector
- 2026-05-27 - vLLM SimpleCPUOffloadConnector Implementation
- 2026-05-27 - vLLM SimpleCPUOffloadConnector Process
- 2026-05-25 - Learning Fp8LinearMethod
- 2026-05-22 - DeepSeek V4 TP Size Exceeds N Groups Error #43182
- 2026-05-22 - GPU Programming Ladder
- 2026-05-22 - Memory Types and Processes
- 2026-05-21 - Fix Auto-Functionalized for Fused DeepSeek V4 QNorm ROPE PR #43058
- 2026-05-21 - Learning Gemini New Tips
- 2026-05-20 - Antigravity Experience and Agent Capability Analysis
- 2026-05-20 - Fix #43037 Gemma4 Tool Parser Multiple Tool Calls in Single Delta
- 2026-05-18 - Low precision data formats
- 2026-05-18 - Study PR #42686 Add patch for fullgraph compilation
- 2026-05-18 - Study PR #42885 Enable FULL cudagraph capture for TRITON_MLA decode
- 2026-05-18 - Fix bug #41469 XPU platform missing AWQ dequantize CUDA kernel
- 2026-05-16 - Study PR #42631 IR Op Priority Optimization
- 2026-05-16 - Apple AI Signals and Agent Environment
- 2026-05-16 - vLLM RFC: Model Implementation Redesign
- 2026-05-16 - Kelly Criterion and Betting Paradox