ProNE: Fast and Scalable Network Representation Learning

  • Jie Zhang,
  • Yuxiao Dong,
  • Yan Wang,
  • Jie Tang,
  • Ming Ding

International Joint Conference on Artificial Intelligence (IJCAI)

Recent advances in network embedding have revolutionized the field of graph and network mining. However, (pre-)training embeddings for very large-scale networks is computationally challenging for most existing methods. In this work, we present ProNE—a fast, scalable, and effective model whose single-threaded version is 10–400× faster than efficient network embedding benchmarks running with 20 threads, including LINE, DeepWalk, node2vec, GraRep, and HOPE. As a concrete example, single-threaded ProNE requires only 29 hours to embed a network of hundreds of millions of nodes, whereas LINE takes weeks and DeepWalk takes months even with 20 threads. To achieve this, ProNE first initializes network embeddings efficiently by formulating the task as sparse matrix factorization. It then enhances the embeddings by propagating them in the spectrally modulated space. Extensive experiments on networks of various scales and types demonstrate that ProNE is both effective and significantly more efficient than the aforementioned baselines. In addition, ProNE’s embedding enhancement step can also be generalized to improve other models at little extra cost, e.g., offering >10% relative performance gains for the baselines above.
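
To make the two steps concrete, below is a minimal Python sketch of the pipeline the abstract describes, not the authors' implementation: the function name `prone_sketch`, the use of a first-order row-normalized proximity matrix, and the plain neighbor-averaging propagation (standing in for the paper's spectrally modulated propagation) are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds


def prone_sketch(adj, dim=32, steps=5, mu=0.2):
    """Simplified two-step pipeline:
    1) initialize embeddings via truncated SVD of a sparse proximity matrix;
    2) smooth them by propagating over the normalized graph.
    The propagation here is a plain low-pass averaging filter; the paper
    instead modulates the graph spectrum before propagating."""
    deg = np.asarray(adj.sum(axis=1)).ravel().astype(float)
    deg[deg == 0] = 1.0
    d_inv = sp.diags(1.0 / deg)

    # Step 1: sparse matrix factorization of a simple first-order
    # proximity matrix (a stand-in for the paper's proximity design).
    prox = d_inv @ adj
    u, s, _ = svds(prox, k=dim)
    emb = u * np.sqrt(s)

    # Step 2: propagate the embedding over the graph so that each node
    # mixes in structural information from its neighborhood.
    norm_adj = d_inv @ adj
    for _ in range(steps):
        emb = (1 - mu) * emb + mu * (norm_adj @ emb)
    return emb


# Usage on a random sparse graph with 1,000 nodes (illustrative only).
a = sp.random(1000, 1000, density=0.01, random_state=0, format="csr")
a = ((a + a.T) > 0).astype(float)   # symmetrize and binarize
emb = prone_sketch(a, dim=32)
print(emb.shape)                    # (1000, 32)
```

In the full method, the single averaging filter above is replaced by a band-pass modulation of the graph spectrum approximated with a polynomial expansion, which is what allows the enhancement step to be applied cheaply to embeddings produced by other models as well.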