A comprehensive overview of research projects and systems.
Developing environments where multiple models and non-model components are integrated to work together efficiently, with research targeting the optimization of agentic workloads through advanced scheduling, caching, and resource management.
A modular, plug-and-play framework for high-level LLM inference optimizations. Cache Saver uses a namespace-aware list-valued cache to ensure statistical integrity and reproducibility, reducing inference costs by ~25% and CO2 emissions by ~35% on average.
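The namespace-aware list-valued idea can be sketched in a few lines (the class name and API below are illustrative, not Cache Saver's actual interface): each prompt maps to a growing list of sampled responses, samples are shared across namespaces to save inference cost, and within a namespace each request consumes the next unseen entry so repeated draws in one experiment remain statistically distinct.

```python
from collections import defaultdict

class ListValuedCache:
    """Illustrative namespace-aware list-valued cache (not Cache Saver's API).

    Samples are shared across namespaces (saving repeated inference cost),
    but within a namespace each request consumes the next unseen list entry,
    so repeated draws in one experiment stay distinct and statistics remain
    valid and reproducible.
    """

    def __init__(self, sample_fn):
        self.sample_fn = sample_fn        # invoked only on a cache miss
        self.store = defaultdict(list)    # prompt -> list of cached samples
        self.cursor = defaultdict(int)    # (namespace, prompt) -> entries consumed

    def sample(self, namespace, prompt):
        i = self.cursor[(namespace, prompt)]
        if i >= len(self.store[prompt]):  # miss: generate a fresh sample
            self.store[prompt].append(self.sample_fn(prompt))
        self.cursor[(namespace, prompt)] = i + 1
        return self.store[prompt][i]
```

Two experiments each asking once for the same prompt trigger only one model call, while a second draw within either experiment triggers a fresh one.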
Research on leveraging LLMs for code-related tasks. SQL-of-Thought introduces a multi-agent framework for text-to-SQL with guided error correction, achieving state-of-the-art results on Spider. LoRACode uses parameter-efficient fine-tuning for code embeddings, reducing trainable parameters to under 2%.
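The sub-2% figure follows from LoRA's parameter arithmetic: adapting a frozen d_in × d_out weight with rank-r factors trains r·(d_in + d_out) parameters instead of d_in·d_out. A quick sketch with illustrative dimensions (not LoRACode's actual configuration):

```python
def lora_trainable_fraction(d_in, d_out, rank):
    """Fraction of parameters that are trainable when a frozen d_in x d_out
    weight matrix is adapted with low-rank factors A (d_in x rank) and
    B (rank x d_out): rank*(d_in + d_out) params instead of d_in*d_out."""
    return rank * (d_in + d_out) / (d_in * d_out)

# A 768x768 projection adapted at rank 4 trains roughly 1% of its weights.
```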
IOLM-DB makes LLM-enhanced database queries practical through query-specific model optimization. Rather than using expensive general-purpose LLMs for all queries, we create specialized lightweight models tailored to specific analytical tasks, enabling scalable processing of millions of rows.
Traditional benchmarks fail to characterize learned systems that overfit to static workloads. We propose new benchmarks that measure adaptability through descriptive statistics and outlier behavior rather than averages, to fairly evaluate the costs and benefits of learned database components.
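As a minimal sketch of reporting distributions rather than averages, the summary below profiles a latency sample by its spread and tail; the 1.5×IQR outlier rule is an illustrative choice, not this project's exact methodology:

```python
import statistics

def latency_profile(samples):
    """Summarize a latency distribution by spread and tail behavior
    instead of its mean. Outliers are flagged with the common 1.5*IQR
    rule (an illustrative choice)."""
    xs = sorted(samples)
    q1, median, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    outliers = [x for x in xs if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
    p99 = xs[min(len(xs) - 1, (99 * len(xs)) // 100)]
    return {"median": median, "iqr": iqr, "p99": p99, "outliers": outliers}
```

A learned component that looks competitive on mean latency can still be exposed by a heavy tail or a cluster of outliers after a workload shift.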
Learned indexes promise significant performance gains over traditional B-trees, but struggle with updates. We are developing concurrently updateable learned index structures that maintain the performance benefits of learned models while supporting efficient insertions and deletions.
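The core idea can be sketched in a few lines: a model predicts a key's position in sorted data, and the model's maximum training error bounds a local correction search. This is a deliberately simplified, read-only toy; real learned indexes use piecewise models, and the update support described above is exactly what this sketch lacks.

```python
class LinearLearnedIndex:
    """Toy read-only learned index: a least-squares line maps key -> position
    in a sorted array; the maximum prediction error observed at build time
    bounds the local search window at lookup."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys) or 1.0
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var
        self.intercept = mean_p - self.slope * mean_k
        # worst-case training error defines the correction window
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        pos = round(self.slope * key + self.intercept)
        return min(len(self.keys) - 1, max(0, pos))

    def lookup(self, key):
        p = self._predict(key)
        lo = max(0, p - self.err)
        hi = min(len(self.keys), p + self.err + 1)
        for i in range(lo, hi):          # bounded correction search
            if self.keys[i] == key:
                return i
        return -1
```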
A GNN training framework designed to handle large graphs efficiently by eliminating remote accesses during sampling. It actively repartitions the graph and dynamically adjusts gradient weights, achieving up to 13x speedup over state-of-the-art systems.
A graph processing system built in collaboration with the University of Toronto that automatically fine-tunes configuration and synthesizes optimized components. It supports efficient updates at arbitrary granularity and is expanding to support mining anomalies and insights from streaming graph data.
Research on engineering secure, incentive-aligned decentralized platforms. Our work addresses DAO governance through computational decision support workflows and applies requirements-driven design to prediction markets, setting new standards for early-stage requirements engineering in the Web3 era.
A dedicated blockchain node architected specifically for analytics. It maintains parity with traditional nodes while integrating a robust analytics API, circumventing ETL rigidity and offering better trust models and performance.
SkyPulse transforms satellite imagery into actionable intelligence for infrastructure monitoring, disaster management, and traffic analysis. Developed in collaboration with Armasuisse, it integrates multi-source data (Sentinel, Twitter/X, Telegram), leverages LLMs and image-to-text models for data fusion, and provides a conversational agent for real-time querying and visualization.
Hurricane uses adaptive work partitioning and task cloning to handle skewed workloads, spreading data across all nodes to improve CPU and storage utilization.
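Task cloning can be illustrated with a speculative-execution sketch: run identical copies of a task and keep whichever finishes first. This is a simplification of Hurricane's actual mechanism, and the function name is illustrative:

```python
import concurrent.futures as cf

def run_with_cloning(task, clones=2):
    """Speculative-execution sketch: submit identical clones of a task and
    return the first result, cancelling clones that have not yet started."""
    with cf.ThreadPoolExecutor(max_workers=clones) as pool:
        futures = [pool.submit(task) for _ in range(clones)]
        done, pending = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        for f in pending:
            f.cancel()                    # best-effort: only unstarted clones cancel
        return next(iter(done)).result()
```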
Tesseract is a distributed graph mining system that executes static algorithms on dynamic graphs. It introduces a change detection algorithm to find exact modifications and decomposes streams into per-update tasks, achieving millions of updates per second with low latency.
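The decomposition step can be sketched as a diff between consecutive graph snapshots, emitting one independent task per detected change; this is an illustrative simplification, not Tesseract's actual incremental algorithm:

```python
def decompose_stream(old_edges, new_edges):
    """Detect the exact edge-level changes between two graph snapshots and
    emit one independent task per update, ready for parallel execution."""
    old, new = set(old_edges), set(new_edges)
    return ([("add", e) for e in sorted(new - old)] +
            [("remove", e) for e in sorted(old - new)])
```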
Chaos scales graph processing from secondary storage to multiple machines in a cluster using streaming partitions, processing graphs with one trillion edges on small commodity clusters.
Hailstorm is a filesystem and storage substrate for LSM-based distributed databases. It pools storage within racks and offloads compaction to remote nodes.