Complex AI Systems

Cache Saver

A modular, plug-and-play framework for high-level LLM inference optimizations. Cache Saver uses a namespace-aware, list-valued cache to reuse LLM samples without compromising statistical integrity or reproducibility, reducing inference costs by ~25% and CO₂ emissions by ~35% on average.
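To illustrate the idea, here is a minimal sketch in Python of a namespace-aware, list-valued cache: each prompt maps to a list of sampled responses, and a namespace (e.g. one per run or experiment) tracks how many cached samples it has consumed, so no sample is reused within the same namespace while other namespaces may reuse them. The class and method names here are hypothetical illustrations, not Cache Saver's actual API.

```python
from collections import defaultdict

class ListValuedCache:
    """Minimal sketch (hypothetical names; not Cache Saver's actual API).

    Each prompt maps to a list of sampled responses. A namespace tracks
    how many cached samples it has already consumed per prompt, so no
    sample is seen twice within a namespace (keeping samples i.i.d.
    within a run), while other namespaces may reuse them instead of
    issuing new LLM calls.
    """

    def __init__(self, model):
        self.model = model                  # callable: prompt -> fresh sample
        self.samples = defaultdict(list)    # prompt -> list of cached samples
        self.consumed = defaultdict(int)    # (namespace, prompt) -> count used

    def sample(self, namespace: str, prompt: str) -> str:
        i = self.consumed[(namespace, prompt)]
        if i >= len(self.samples[prompt]):
            # This namespace has exhausted the cached samples: query the model.
            self.samples[prompt].append(self.model(prompt))
        self.consumed[(namespace, prompt)] = i + 1
        return self.samples[prompt][i]


if __name__ == "__main__":
    import random
    cache = ListValuedCache(lambda p: f"{p}-{random.random():.3f}")
    a = [cache.sample("run-1", "Q") for _ in range(3)]  # 3 fresh model calls
    b = [cache.sample("run-2", "Q") for _ in range(3)]  # 0 calls: reuses run-1's
    assert a == b
```

Because a namespace never sees the same cached sample twice, downstream statistics (e.g. majority voting over n samples) behave as if the samples were drawn fresh, which is what makes the reuse statistically safe across runs.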

Research Goal

Efficient and reproducible LLM inference