Dongdong Zhang

Principal Research Manager


  • Jian Yang, Yuwei Yin, Shuming Ma, Haoyang Huang, Dongdong Zhang, Zhoujun Li, Furu Wei: Multilingual Agreement for Multilingual Neural Machine Translation. ACL/IJCNLP (2) 2021: 233-239
  • Weijia Xu, Shuming Ma, Dongdong Zhang, Marine Carpuat: How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation? ACL/IJCNLP (Findings) 2021: 4392-4400
  • Minheng Ni, Haoyang Huang, Lin Su, Edward Cui, Taroon Bharti, Lijuan Wang, Dongdong Zhang, Nan Duan: M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-Training. CVPR 2021: 3977-3986
  • Guanhua Chen, Shuming Ma, Yun Chen, Li Dong, Dongdong Zhang, Jia Pan, Wenping Wang, Furu Wei: Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders. EMNLP (1) 2021: 15-26
  • Weijia Xu, Yuwei Yin, Shuming Ma, Dongdong Zhang, Haoyang Huang: Improving Multilingual Neural Machine Translation with Auxiliary Source Languages. EMNLP (Findings) 2021: 3029-3041
  • Jian Yang, Shuming Ma, Dongdong Zhang, Juncheng Wan, Zhoujun Li, Ming Zhou: Smart-Start Decoding for Neural Machine Translation. NAACL-HLT 2021: 3982-3988
  • Jian Yang, Juncheng Wan, Shuming Ma, Haoyang Huang, Dongdong Zhang, Yong Yu, Zhoujun Li, Furu Wei: Learning to Select Relevant Knowledge for Neural Machine Translation. NLPCC (1) 2021: 79-91
  • Qiaolin Xia, Haoyang Huang, Nan Duan, Dongdong Zhang, Lei Ji, Zhifang Sui, Edward Cui, Taroon Bharti, Ming Zhou: XGPT: Cross-modal Generative Pre-Training for Image Captioning. NLPCC (1) 2021: 786-797
  • Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei: DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders. CoRR abs/2106.13736 (2021)
  • Jian Yang, Shuming Ma, Dongdong Zhang, Shuangzhi Wu, Zhoujun Li, Ming Zhou: Alternating Language Modeling for Cross-Lingual Pre-Training. AAAI 2020: 9386-9393
  • Shuming Ma, Dongdong Zhang, Ming Zhou: A Simple and Effective Unified Encoder for Document-Level Machine Translation. ACL 2020: 3505-3511
  • Jian Yang, Shuming Ma, Dongdong Zhang, Zhoujun Li, Ming Zhou: Improving Neural Machine Translation with Soft Template Prediction. ACL 2020: 5979-5989
  • Shuangzhi Wu, Dongdong Zhang, Ming Zhou: Effective Soft-Adaptation for Neural Machine Translation. NLPCC (2) 2019: 254-264
  • Pengcheng Yang, Fuli Luo, Shuangzhi Wu, Jingjing Xu, Dongdong Zhang: Learning Unsupervised Word Mapping via Maximum Mean Discrepancy. NLPCC (1) 2019: 290-302
  • Shuangzhi Wu, Dongdong Zhang, Zhirui Zhang, Nan Yang, Mu Li, Ming Zhou: Dependency-to-Dependency Neural Machine Translation. IEEE ACM Trans. Audio Speech Lang. Process. 26(11): 2132-2141 (2018)
  • Jian Yang, Shuangzhi Wu, Dongdong Zhang, Zhoujun Li, Ming Zhou: Improved Neural Machine Translation with Chinese Phonologic Features. NLPCC (1) 2018: 303-315
  • Hany Hassan, Anthony Aue, Chang Chen, Vishal Chowdhary, Jonathan Clark, Christian Federmann, Xuedong Huang, Marcin Junczys-Dowmunt, William Lewis, Mu Li, Shujie Liu, Tie-Yan Liu, Renqian Luo, Arul Menezes, Tao Qin, Frank Seide, Xu Tan, Fei Tian, Lijun Wu, Shuangzhi Wu, Yingce Xia, Dongdong Zhang, Zhirui Zhang, Ming Zhou: Achieving Human Parity on Automatic Chinese to English News Translation. CoRR abs/1803.05567 (2018)
  • Shuangzhi Wu, Dongdong Zhang, Nan Yang, Mu Li, Ming Zhou: Sequence-to-Dependency Neural Machine Translation. ACL (1) 2017: 698-707
  • Shuangzhi Wu, Ming Zhou, Dongdong Zhang: Improved Neural Machine Translation with Source Syntax. IJCAI 2017: 4179-4185
  • Shuangzhi Wu, Dongdong Zhang, Shujie Liu, Ming Zhou: Modeling Indicative Context for Statistical Machine Translation. NLPCC 2017: 224-232
  • Shuangzhi Wu, Dongdong Zhang, Ming Zhou, Tiejun Zhao. Efficient Disfluency Detection with Transition-based Parsing. ACL (1) 2015: 495-503
  • Qiang Li, Mu Li, Dongdong Zhang, Jingbo Zhu. Research on Example-Based Phrase Pairs in Statistical Machine Translation. NLPCC 2016.
  • Lei Cui, Ming Zhou, Qiming Chen, Dongdong Zhang, Mu Li. Machine Translation with Real-Time Web Search. AAAI 2014: 23-29
  • Lei Cui, Dongdong Zhang, Shujie Liu, Qiming Chen, Mu Li, Ming Zhou, Muyun Yang. Learning Topic Representation for SMT with Neural Networks. ACL (1) 2014: 133-143
  • Hailong Cao, Dongdong Zhang, Mu Li, Ming Zhou, Tiejun Zhao. A Lexicalized Reordering Model for Hierarchical Phrase-based Translation. COLING 2014: 1144-1153
  • Hailong Cao, Dongdong Zhang, Ming Zhou, Tiejun Zhao. Soft Dependency Matching for Hierarchical Phrase-based Machine Translation. COLING 2014: 2227-2236
  • Bo Wang, Ming Zhou, Shujie Liu, Mu Li, Dongdong Zhang. Woodpecker: An Automatic Methodology for Machine Translation Diagnosis with Rich Linguistic Knowledge. J. Inf. Sci. Eng. 30(5): 1407-1424 (2014)
  • Dongdong Zhang, Shuangzhi Wu, Nan Yang, Mu Li. Punctuation Prediction with Transition-based Parsing. ACL 2013.
  • Lei Cui, Dongdong Zhang, Shujie Liu, Mu Li, Ming Zhou. Collective Corpus Weighting and Phrase Scoring for SMT using Graph-based Random Walk. NLPCC 2013.
  • Lei Cui, Xilun Chen, Dongdong Zhang, Shujie Liu, Mu Li, Ming Zhou. Multi-Domain Adaptation for SMT Using Multi-Task Learning. EMNLP 2013.
  • Lei Cui, Dongdong Zhang, Shujie Liu, Mu Li, and Ming Zhou, Bilingual Data Cleaning for SMT using Graph-based Random Walk, ACL 2013.
  • Seung-Wook Lee, Dongdong Zhang, Mu Li, Ming Zhou, Hae-Chang Rim: Translation Model Size Reduction for Hierarchical Phrase-based Statistical Machine Translation. ACL (2) 2012: 291-295
  • Nan Yang, Mu Li, Dongdong Zhang, Nenghai Yu: A Ranking-based Approach to Word Reordering for Statistical Machine Translation. ACL (1) 2012: 912-920
  • Yang Feng, Dongdong Zhang, Mu Li, Qun Liu: Hierarchical Chunk-to-String Translation. ACL (1) 2012: 950-958
  • Yang Feng, Dongdong Zhang, Qun Liu. Prepositional Phrase Reordering for Hierarchical Phrase-Based Translation. Journal of Chinese Information Processing. 2012: 26(1)
  • Lei Cui, Dongdong Zhang, Mu Li, Ming Zhou. Function Word Generation in Statistical Machine Translation Systems. MT Summit 2011.
  • Nan Duan, Mu Li, Dongdong Zhang, Ming Zhou: Mixture Model-based Minimum Bayes Risk Decoding using Multiple Machine Translation Systems. COLING 2010: 313-321
  • Lei Cui, Dongdong Zhang, Mu Li, Ming Zhou, Tiejun Zhao. A Joint Rule Selection Model for Hierarchical Phrase-Based Translation. ACL 2010
  • Lei Cui, Dongdong Zhang, Mu Li, Ming Zhou, Tiejun Zhao. Hybrid Decoding: Decoding with Partial Hypotheses Combination over Multiple SMT Systems. COLING 2010
  • Mu Li, Yinggong Zhao, Dongdong Zhang, Ming Zhou: Adaptive Development Data Selection for Log-linear Model in Statistical Machine Translation. COLING 2010: 662-670
  • Mu Li, Nan Duan, Dongdong Zhang, Chi-Ho Li, Ming Zhou: Collaborative Decoding: Partial Hypothesis Re-ranking Using Translation Consensus between Decoders. ACL/IJCNLP 2009: 585-592
  • Tong Xiao, Mu Li, Dongdong Zhang, Jingbo Zhu, Ming Zhou: Better Synchronous Binarization for Machine Translation. EMNLP 2009: 362-370
  • Dongdong Zhang, Chi-ho Li, Nan Duan, Shujie Liu, Mu Li, and Ming Zhou, The Evaluation Technical Report of Chinese-to-English Machine Translation System from Microsoft Research Asia, CWMT, 2009
  • Dongdong Zhang, Mu Li, Nan Duan, Chi-Ho Li, Ming Zhou: Measure Word Generation for English-Chinese SMT Systems. ACL 2008: 89-96
  • Chi-Ho Li, Hailei Zhang, Dongdong Zhang, Mu Li, Ming Zhou. An empirical study in source word deletion for phrase-based statistical machine translation. In Proc. Third Workshop on Statistical Machine Translation, 2008.
  • Ming Zhou, Bo Wang, Shujie Liu, Mu Li, Dongdong Zhang, Tiejun Zhao: Diagnostic Evaluation of Machine Translation Systems Using Automatically Constructed Linguistic Check-Points. COLING 2008: 1121-1128
  • Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova, Mei Yang, Bill Dolan, Mu Li, Chi-Ho Li, Dongdong Zhang, Long Jiang, and Ming Zhou, The MSR-MSRA MT System for NIST Open Machine Translation 2008 Evaluation, in The 2008 NIST Open Machine Translation Evaluation Workshop, 2008
  • Chi-Ho Li, Minghui Li, Dongdong Zhang, Mu Li, Ming Zhou, Yi Guan: A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation. ACL 2007
  • Dongdong Zhang, Mu Li, Chi-Ho Li, Ming Zhou: Phrase Reordering Model Integrating Syntactic Knowledge for SMT. EMNLP-CoNLL 2007: 533-540
  • Dongdong Zhang, Jianzhong Li, Kimutai Kimeli, Weiping Wang. Sliding Window based Multi-Join Algorithms over Distributed Data Streams. The 22nd IEEE International Conference on Data Engineering (ICDE), 2006.
  • Weiping Wang, Jianzhong Li, Dongdong Zhang, Longjiang Guo. An Algorithm for Continuous J-A Queries Processing over Data Streams based on Sliding Windows. Journal of Software, 2006.
  • Dongdong Zhang, Jianzhong Li, Weiping Wang, Longjiang Guo, Chunyu Ai. Processing Frequent Items over Distributed Data Streams. Web Technologies Research and Development 7th Asia-Pacific Web Conference (APWEB), Shanghai, China, 2005. Lecture Notes in Computer Science 3399 Springer 2005: 523-529.
  • Dongdong Zhang, Jianzhong Li, Weiping Wang, Longjiang Guo, Chunyu Ai. Reducing Communication Overhead over Distributed Data Streams by Filtering Frequent Items. Journal of Digital Information Management. 2005, Vol. 3, No. 2.
  • Dongdong Zhang, Jianzhong Li, Weiping Wang, Longjiang Guo. Algorithms for Storing and Aggregating Historical Streaming Data. Journal of Software. 2005.
  • Jianzhong Li, Longjiang Guo, Dongdong Zhang, Weiping Zhang. Processing Algorithms for Predictive Aggregate Queries over Data Streams. Journal of Software, 2005, 16(7): 1252-1261.
  • Dongdong Zhang, Jianzhong Li, Weiping Wang, Longjiang Guo. Distributed Compound-Data Streams Processing. Journal of Computer Research and Development. 2004, Vol. 41, No. 10, pp: 1780-1785.
  • Jianzhong Li, Dongdong Zhang. Algorithms for Dynamically Adjusting the Sizes of Sliding Windows. Journal of Software, 2004, Vol. 15, No. 12, pp: 1800-1814.
  • Dongdong Zhang, Jianzhong Li, Weiping Wang, Jinbao Li, Longjiang Guo. Processing Distributed Compound-Data Streams. The East-European Conference on Advances in Databases and Information Systems (ADBIS), Budapest, Hungary, 2004. (local proceedings)
  • Longjiang Guo, Jianzhong Li, Weiping Wang, Dongdong Zhang. Predictive Continuous Aggregate Queries over Data Streams, Journal of Computer Research and Development, 2004, Vol. 41, No. 10, pp: 1690-1695.
  • Weiping Wang, Jianzhong Li, Dongdong Zhang, Longjiang Guo. Processing Sliding Window Join Aggregate in Continuous Queries Over Data Streams. The East-European Conference on Advances in Databases and Information Systems (ADBIS). Budapest, Hungary, 2004. Lecture Notes in Computer Science 3255 Springer 2004: 348-363.
  • Weiping Wang, Jianzhong Li, Xu Wang, Dongdong Zhang, Longjiang Guo. Evaluating Stream and Disk Join in Continuous Queries. Grid and Cooperative Computing: Third International Conference (GCC), Wuhan, China. Lecture Notes in Computer Science 3251 Springer, 2004: 823-826.
  • Dongdong Zhang, Jianzhong Li, Zhaogong Zhang, Weiping Wang, Longjiang Guo. Dynamic Adjustment of Sliding Windows over Data Streams. Advances in Web-Age Information Management: 5th International Conference (WAIM), Dalian, China, 2004. Lecture Notes in Computer Science 3129 Springer 2004: 24-33.
  • Weiping Wang, Jianzhong Li, Dongdong Zhang, Longjiang Guo. Periodically Updated Sliding Window Join Algorithm Over Data Streams. Journal of Harbin Institute of Technology, 2004, Vol.36, No.10.
  • Weiping Wang, Jianzhong Li, Dongdong Zhang, Longjiang Guo, Xu Wang. A Parallel Method for Processing Continuous Queries on Data Streams, Journal of Computer Research and Development, 2004, Vol. 41, No. 10 (Suppl.), pp: 603-609.
  • Dongdong Zhang, Jianzhong Li, Weiping Wang, Longjiang Guo. Algorithms for Aggregating History of Time-Series Data Streams. Computer Science, 2003, Vol. 30, No. 10 (Suppl. A), pp: 291-295.
  • Jianzhong Li, Dongdong Zhang, Yanqiu Zhang. Join Algorithm Based on Tertiary Storage. Journal of Software. 2003, Vol. 14, No. 5, pp: 947-954.
  • Weiping Wang, Jianzhong Li, Dongdong Zhang, Longjiang Guo. Research of Timestamp-based Sliding Window Join Algorithm Over Data Stream. Computer Science, 2003, Vol. 30, No. 10 (Suppl. A), pp: 174-177.
  • Dongdong Zhang, Jianzhong Li, Hong Gao. Aggregate Index Tree: An Approach for Range Sum Queries. Computer Science, 2002, Vol. 29, No. 8 (Suppl. A), pp: 132-134. (Best paper in NDBC’02)
  • Dongdong Zhang, Jianzhong Li, Yanqiu Zhang. Join Algorithms Based on Tertiary Storage. Computer Science, 2001, Vol. 28, No. 8 (Suppl. A).


Machine Translation

This book is intended as an introduction to the field of machine translation, covering as many as possible of the mainstream research methods and related resources in the history of machine translation research. The book has seven chapters in three parts. Part One (Chapters 1-2) introduces the history, research overview, and basic knowledge of machine translation. Part Two (Chapters 3-4) discusses in detail the theory and implementation of statistical machine translation methods. Part Three (Chapters 5-7) focuses on the latest advances in applying deep learning to machine translation research, including the basics of deep learning and the different methods of applying it to machine translation. Each chapter is followed by extended reading for readers who wish to delve deeper.