Index

All Logs

— June 2026 —

2026-06-17 - Study #45595 [KV Connector][Offloading] Avoid blocking the engine to flush offloads on idle
2026-06-16 - Study Bug in #45387 and #45388: Scheduler Deadlocks
2026-06-16 - PR #45679 Add Fence Tests and Learn Fence Principles
2026-06-13 - New AI Tech in WWDC 2026
2026-06-13 - Study Core AI KV Cache
2026-06-12 - Deep Study: Apple CoreAI-Torch compare with vllm
2026-06-12 - Texas Hold’em
2026-06-11 - Study storage_offload.cpp
2026-06-11 - Study Apple Core AI Pipeline
2026-06-11 - Study Apple Core AI
2026-06-08 - Study llmd-fs-backend
2026-06-08 - Study PVC Evictor
2026-06-05 - Fix vLLM Structured Output Runaway Whitespace #44619
2026-06-05 - Study llm-d KV Cache Deploy
2026-06-04 - Study OpenTelemetry Usage in llm-d KV Cache
2026-06-04 - llm-d KV Cache Routing and Querying
2026-06-03 - Study: vLLM Offload Metrics and RFC #44008
2026-06-03 - Study #44295: Optimization for Scheduling Delay Caused by Concurrent Same Prefix
2026-06-03 - KVConnector Event Logic
2026-06-02 - Simple study of LMCacheMPConnector implementation
2026-06-02 - Environment Research: nvcc, CUDA, and PyTorch Relationship
2026-06-02 - LMCache Integration Modules and Division of Labor
2026-06-01 - In-Depth Study of LMCache Integration in vLLM

— May 2026 —

2026-05-29 - Simple Study of API Server, Engine, Scheduler, Worker Call Patterns
2026-05-28 - In-Depth Study of Offline Connector
2026-05-27 - vLLM SimpleCPUOffloadConnector Implementation
2026-05-27 - vLLM SimpleCPUOffloadConnector Process
2026-05-25 - Learning Fp8LinearMethod
2026-05-22 - DeepSeek V4 TP Size Exceeds N Groups Error #43182
2026-05-22 - GPU Programming Ladder
2026-05-22 - Memory Types and Processes
2026-05-21 - Fix Auto-Functionalized for Fused DeepSeek V4 QNorm ROPE PR #43058
2026-05-21 - Learning Gemini New Tips
2026-05-20 - Antigravity Experience and Agent Capability Analysis
2026-05-20 - Fix #43037 Gemma4 Tool Parser Multiple Tool Calls in Single Delta
2026-05-18 - Low precision data formats
2026-05-18 - Study PR #42686 Add patch for fullgraph compilation
2026-05-18 - Study PR #42885 Enable FULL cudagraph capture for TRITON_MLA decode
2026-05-18 - Fix bug #41469 XPU platform missing AWQ dequantize CUDA kernel
2026-05-16 - Study PR #42631 IR Op Priority Optimization
2026-05-16 - Apple AI Signals and Agent Environment
2026-05-16 - vLLM RFC: Model Implementation Redesign
2026-05-16 - Kelly Criterion and Betting Paradox