Our paper on Chaos has been accepted at SOSP 2015 (Symposium on Operating Systems Principles) in Monterey, California!
SOSP is one of the most prestigious venues in systems research, and I'll be presenting our work there.
About Chaos
Chaos scales graph processing out to multiple machines in a cluster while keeping the graph on secondary storage.
Key Innovations
- Partitioning for sequential storage access rather than for locality and load balance, which results in much lower pre-processing times
- Uniform random distribution of graph data across the machines in the cluster rather than placement for data locality
- Work stealing for load balancing at runtime
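To give a feel for the last two ideas, here is a toy sketch in Python. It is not the Chaos implementation: the function names `place_chunks` and `run_with_stealing`, the fixed seed, and the one-unit-per-step cost model are all assumptions made up for illustration. It only shows that uniform random placement spreads data evenly without any locality analysis, and that idle machines stealing work can even out a skewed load.

```python
import random
from collections import Counter, deque


def place_chunks(num_chunks, num_devices, seed=0):
    """Assign each chunk to a device chosen uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.randrange(num_devices) for _ in range(num_chunks)]


def run_with_stealing(work_per_machine):
    """Simulate machines that each process one work unit per step.

    Before each step, every idle machine steals one unit from the machine
    that currently holds the most work. Returns the number of steps taken.
    This is a simplified model, not the protocol used by Chaos.
    """
    queues = [deque(range(n)) for n in work_per_machine]
    steps = 0
    while any(queues):
        # Idle machines steal from the most loaded machine, if it can spare a unit.
        for q in queues:
            if not q:
                victim = max(queues, key=len)
                if len(victim) > 1:
                    q.append(victim.pop())
        # Every machine that has work processes one unit.
        for q in queues:
            if q:
                q.popleft()
        steps += 1
    return steps


if __name__ == "__main__":
    counts = Counter(place_chunks(32000, 32))
    print("chunks per device: min", min(counts.values()), "max", max(counts.values()))
    print("steps with stealing:", run_with_stealing([8, 0, 0, 0]))
```

In this model, a load of 8 units sitting entirely on one of four machines finishes in 2 steps with stealing, versus 8 steps if the owner had to work alone.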
Performance
On 32 machines, Chaos takes only 1.61 times longer to process a graph 32 times larger than on a single machine. The system also handled a graph with 1 trillion edges, representing 16 TB of input data, which is a significant milestone for graph processing on a commodity cluster.
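As a rough back-of-the-envelope reading of those numbers, the short calculation below derives a few figures from them. The derived values are my own arithmetic, not results reported in the paper, and the bytes-per-edge figure assumes decimal terabytes (1 TB = 10^12 bytes).

```python
machines = 32          # cluster size from the result above
graph_scale = 32       # the graph is 32 times larger than the single-machine one
slowdown = 1.61        # runtime relative to the single-machine run

# Work done per unit of time, relative to a single machine.
throughput_gain = graph_scale / slowdown

# Fraction of ideal scaling (ideal would keep the runtime at 1.0x).
scaling_efficiency = throughput_gain / machines

# Average input size per edge, assuming 1 TB = 10**12 bytes.
bytes_per_edge = (16 * 10**12) / 10**12

print(f"throughput gain: {throughput_gain:.1f}x")
print(f"scaling efficiency: {scaling_efficiency:.0%}")
print(f"bytes per edge: {bytes_per_edge:.0f}")
```

So the cluster gets through roughly 20 times as much graph per unit of time as one machine, at about 16 bytes of input per edge.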
Looking forward to presenting in Monterey!
