AIBench Training: Balanced Industry-Standard AI Training Benchmarking
- Fei Tang,
- Wanling Gao,
- Jianfeng Zhan,
- Chuanxin Lan,
- Xu Wen,
- Lei Wang,
- Chunjie Luo,
- Jiahui Dai,
- Zheng Cao,
- Xingwang Xiong,
- Zihan Jiang,
- Tianshu Hao,
- Fanda Fan,
- Fan Zhang,
- Yunyou Huang,
- Jianan Chen,
- Mengjia Du,
- Rui Ren,
- Chen Zheng,
- Daoyi Zheng,
- Haoning Tang,
- Kunlin Zhan,
- Biao Wang,
- Defei Kong,
- Minghe Yu,
- Chongkang Tan,
- Huan Li,
- Xinhui Tian,
- Yatao Li,
- Gang Lu,
- Junchao Shao,
- Zhenyu Wang,
- Xiaoyu Wang,
- Hainan Ye
Early-stage evaluations of a new AI architecture or system need affordable AI benchmarks, while relying on a few AI component benchmarks alone in later stages may lead to misleading conclusions. This paper proposes a balanced benchmarking methodology. Based on an exhaustive survey of Internet service AI domains, we identify and implement seventeen representative AI tasks with state-of-the-art models to guarantee the diversity and representativeness of the benchmarks, while keeping a benchmark subset to a minimum for affordability. Together with seventeen industry partners, we contribute by far the most comprehensive AI training benchmark suite. The evaluations show that: (1) AIBench Training outperforms MLPerf Training in the diversity and representativeness of model complexity, computational cost, convergence rate, computation and memory access patterns, and hotspot functions; (2) with respect to the full AIBench benchmarks, the subset reduces benchmarking cost by 54% while maintaining the primary workload characteristics; (3) the performance ranking shows that a single-purpose AI accelerator like the TPU with an optimized TensorFlow framework outperforms GPUs, while losing the latter's general support for a variety of AI models. The AIBench Training specifications, source code, testbed, and performance numbers are publicly available from the web site this http URL.
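One way a representative-yet-affordable subset like the one described above could be selected is by clustering the full benchmarks on their measured workload characteristics and keeping one workload per cluster. The sketch below is a minimal illustration of that idea in Python, not the paper's actual selection procedure: the workload names, feature metrics, numbers, and subset size are all hypothetical placeholders.

```python
# Hypothetical sketch: choose a benchmark subset by clustering workloads
# on normalized characteristics and keeping the workload nearest to each
# cluster centroid. All names and numbers below are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

workloads = [
    "image_classification", "object_detection", "translation",
    "speech_recognition", "recommendation", "text_summarization",
    "face_embedding",
]
# Rows: workloads; columns: made-up characteristic metrics, e.g.
# (compute intensity, memory-bandwidth pressure, epochs to converge).
features = np.array([
    [0.92, 0.35, 90],
    [0.88, 0.41, 120],
    [0.55, 0.70, 30],
    [0.60, 0.65, 45],
    [0.20, 0.90, 10],
    [0.50, 0.72, 25],
    [0.75, 0.50, 60],
])

# Normalize so no single metric dominates the distance computation.
X = StandardScaler().fit_transform(features)

k = 3  # hypothetical target subset size
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)

# From each cluster, keep the workload closest to the centroid as its
# representative, so the subset spans the observed behavior space.
subset = []
for c in range(k):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
    subset.append(workloads[members[np.argmin(dists)]])

print("Representative subset:", subset)
```

Under this kind of scheme, the subset's fidelity can be checked by comparing its aggregate characteristics (and benchmarking cost) against the full suite, which is the spirit of the 54% cost-reduction result reported above.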