Distributed Graph Computation Meets Machine Learning
- Wencong Xiao,
- Jilong Xue,
- Youshan Miao,
- Zhen Li,
- Cheng Chen,
- Ming Wu,
- Wei Li,
- Lidong Zhou
IEEE Transactions on Parallel and Distributed Systems | 2020, Vol. 31(7): pp. 1588-1604
TuX2 is a new distributed graph engine that bridges graph computation and distributed machine learning. TuX2 inherits the benefits of an elegant graph computation model, efficient graph layout, and balanced parallelism to scale to billion-edge graphs, while being extended and optimized for distributed machine learning: it supports heterogeneity in the data model, Stale Synchronous Parallel scheduling, and a new Mini-batch, Exchange, GlobalSync, and Apply (MEGA) programming model. TuX2 further introduces a hybrid vertex-cut graph optimization and supports various consistency models for fault tolerance in machine learning. We have developed a set of representative distributed machine learning algorithms in TuX2, covering both supervised and unsupervised learning. Compared to implementations on distributed machine learning platforms, writing these algorithms in TuX2 takes only about 25 percent of the code, because our graph computation model hides the detailed management of data layout, partitioning, and parallelism from developers. An extensive evaluation of TuX2, using large datasets with up to 64 billion edges, shows that TuX2 outperforms PowerGraph and PowerLyra, two state-of-the-art distributed graph engines, by an order of magnitude, while beating two state-of-the-art distributed machine learning systems by at least 60 percent.
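To make the MEGA stage structure concrete, below is a minimal single-machine sketch of how an algorithm can decompose into Mini-batch, Exchange, GlobalSync, and Apply phases. This is our own illustration, not TuX2's actual API: the `exchange`, `apply_vertex`, and `global_sync` callback names, the toy graph, and the SGD matrix-factorization update are all assumptions made for clarity.

```python
# Hypothetical sketch of the MEGA (Mini-batch, Exchange, GlobalSync, Apply)
# stage decomposition, illustrated with a toy SGD matrix factorization.
# Names and structure are illustrative assumptions, not TuX2's real API.
import random

DIM, LR = 8, 0.01
users = {u: [random.random() for _ in range(DIM)] for u in range(4)}
items = {i: [random.random() for _ in range(DIM)] for i in range(4)}
edges = [(0, 1, 5.0), (1, 2, 3.0), (2, 3, 4.0), (3, 0, 2.0)]  # (user, item, rating)

def exchange(u_vec, i_vec, rating):
    """Per-edge stage: compute gradients for both endpoint vertices."""
    err = sum(a * b for a, b in zip(u_vec, i_vec)) - rating
    grad_u = [err * b for b in i_vec]
    grad_i = [err * a for a in u_vec]
    return grad_u, grad_i, err * err

def apply_vertex(vec, grad):
    """Per-vertex stage: fold an accumulated gradient into vertex state."""
    return [a - LR * g for a, g in zip(vec, grad)]

def global_sync(losses):
    """Global stage: aggregate a value across all edges/vertices."""
    return sum(losses) / len(losses)

for epoch in range(10):
    batch_losses = []
    for start in range(0, len(edges), 2):             # Mini-batch stage
        for u, i, r in edges[start:start + 2]:
            gu, gi, loss = exchange(users[u], items[i], r)  # Exchange stage
            users[u] = apply_vertex(users[u], gu)           # Apply stage
            items[i] = apply_vertex(items[i], gi)
            batch_losses.append(loss)
    print(f"epoch {epoch}: mse = {global_sync(batch_losses):.4f}")  # GlobalSync
```

The point of the decomposition is that per-edge work (Exchange), per-vertex state updates (Apply), and global aggregation (GlobalSync) are cleanly separated, which is what lets a distributed engine partition, parallelize, and synchronize them for the user.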