Intelligent Cloud and Edge Group

Some ongoing projects:

Some open-source projects:

  • NNFusion: A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.
  • Rammer: A DNN compiler technology that can generate an efficient static spatio-temporal schedule for a DNN at compile time to minimize scheduling overhead.
  • Roller: A fast and efficient tensor compiler for DNNs that can generate efficient kernels in seconds with a construction-based approach.
  • SparTA: An end-to-end optimization system that harvests the speedup available from model sparsity.
  • Antares: An automatic engine for multi-platform kernel generation and optimization. It supports CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL for CPU/GPU, OpenCL for AMD/NVIDIA, and Android CPU/GPU backends.
  • Tutel: An optimized Mixture-of-Experts implementation.