Our paper "Rock You Like a Hurricane: Taming Skew in Large Scale Analytics" has been accepted at EuroSys 2018 in Porto, Portugal!
About Hurricane
Hurricane is a high-performance analytics platform engineered to manage skewed data distributions automatically.
Core Innovation
The system partitions work adaptively, based on the load that nodes observe at runtime. An overloaded node can spawn clones of a task, each processing a subset of that task's data, so the degree of parallelism adjusts dynamically.
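The cloning idea can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions, not Hurricane's implementation: the function name `run_with_cloning`, the `clone_threshold` parameter, and the "backlog per clone" load signal are all hypothetical, chosen to show how a task with a large input gains clones while a small task does not.

```python
from collections import deque


def run_with_cloning(chunks, max_workers, clone_threshold):
    """Simulate one task whose input is a shared bag of chunks.

    Hypothetical model: in each step, every active clone removes one chunk
    from the bag. If the remaining backlog exceeds `clone_threshold` chunks
    per active clone and a worker is still free, one more clone is started.
    Returns (steps_taken, peak_number_of_clones).
    """
    bag = deque(chunks)
    clones = 1
    peak = 1
    steps = 0
    while bag:
        # Each active clone processes one chunk from the shared bag.
        for _ in range(min(clones, len(bag))):
            bag.popleft()
        steps += 1
        # Assumed load signal: backlog per clone observed at runtime.
        if clones < max_workers and len(bag) > clone_threshold * clones:
            clones += 1
            peak = max(peak, clones)
    return steps, peak


if __name__ == "__main__":
    # A heavily loaded task (100 chunks) next to a light one (5 chunks).
    print(run_with_cloning(range(100), max_workers=4, clone_threshold=8))
    print(run_with_cloning(range(5), max_workers=4, clone_threshold=8))
```

In this model the heavy task grows to the worker limit and finishes in far fewer steps than it would alone, while the light task never crosses the threshold and stays at a single clone.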
Technical Approach
Data is spread across all nodes and can be retrieved in a decentralized manner, which keeps load balanced across tasks and shortens completion times.
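A minimal sketch of this idea follows. The text above does not state the placement policy, so round-robin placement is an assumption here, and the helper names `spread_chunks` and `fetch_next` are hypothetical. The point is only that every node holds a near-equal share and that a worker can find its next chunk by probing nodes directly, without a central directory.

```python
def spread_chunks(chunk_ids, num_nodes):
    """Place chunks round-robin so every node holds a near-equal share.

    Round-robin is an assumed policy used for illustration only.
    """
    nodes = [[] for _ in range(num_nodes)]
    for i, chunk in enumerate(chunk_ids):
        nodes[i % num_nodes].append(chunk)
    return nodes


def fetch_next(nodes, start):
    """Probe nodes in order, beginning at index `start`.

    Returns (chunk, node_index) for the first non-empty node, or None when
    every node is empty. No central coordinator is consulted.
    """
    n = len(nodes)
    for offset in range(n):
        idx = (start + offset) % n
        if nodes[idx]:
            return nodes[idx].pop(), idx
    return None


if __name__ == "__main__":
    nodes = spread_chunks(list(range(10)), num_nodes=4)
    print([len(node) for node in nodes])  # share held by each node
    print(fetch_next(nodes, start=2))     # a worker probing from node 2
```

Because any worker or task clone can pull from any node, adding clones to a loaded task does not require moving its data first.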
Performance
In our evaluation, Hurricane significantly outperforms existing systems on both uniform and skewed datasets by keeping CPU and storage utilization high.
The research addresses fundamental challenges in cluster computing frameworks stemming from load imbalance and limited parallelism caused by data skew and varying processing times.
