{"id":1018299,"date":"2024-03-26T21:24:21","date_gmt":"2024-03-27T04:24:21","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=1018299"},"modified":"2024-07-14T19:24:12","modified_gmt":"2024-07-15T02:24:12","slug":"changho-hwang-pursuing-long-term-research-takes-constant-self-persuasion","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/changho-hwang-pursuing-long-term-research-takes-constant-self-persuasion\/","title":{"rendered":"Changho Hwang: Pursuing long-term research takes constant self-persuasion"},"content":{"rendered":"\n
“Before I became an intern at Microsoft Research Asia (MSR Asia), my knowledge of the institute came from a paper on ResNet (Residual Network),” said Changho Hwang. “In that paper, researchers at MSR Asia introduced the idea of ‘residual learning’ and made ResNet a milestone in the development of computer vision technology.”
Hwang’s first impression of MSR Asia was that it was a place where top innovators conducted cutting-edge technological research.
During the second year of his PhD studies, Hwang became an intern at MSR Asia on the recommendation of his supervisor at the Korea Advanced Institute of Science and Technology (KAIST). After two internships at the lab, one during the winter of 2018 and one during the summer of 2019, Hwang developed a new understanding of MSR Asia. He decided that upon graduation, his career goal would be to join MSR Asia and further pursue forward-looking technological research. Hwang said, “At the time, some of my classmates and colleagues had introduced me to other laboratories and companies, but my internship experiences made MSR Asia a clear choice for me. I preferred the working environment and research atmosphere here. This place enabled me to focus on the areas I was really interested in.”
According to Hwang, what’s most attractive about MSR Asia is that it always does the right thing in the right way. MSR Asia never blindly follows technology trends. Rather, it sets unique strategies and research directions and always looks at the bigger picture while focusing on cutting-edge technologies.
The people Hwang worked with during his internships and the diverse research directions at MSR Asia were also important reasons behind his decision to join the lab. MSR Asia boasts a group of extremely professional yet convivial researchers. Hwang’s mentor during his internships was highly approachable and offered him a great deal of freedom and solid academic support for his research. His colleagues were also warm and helpful both in and out of the office, which helped Hwang feel at home despite being abroad. Furthermore, among the cutting-edge research endeavors undertaken by MSR Asia, Hwang discovered not only research areas and projects that matched his expertise in electrical engineering but also a multitude of interdisciplinary research directions that offered researchers opportunities to expand the breadth and depth of their academic pursuits. Therefore, after completing his doctorate in 2022, Hwang quickly decided to join MSR Asia and became a member of the Networking Infrastructure Group. He currently holds a position as a researcher at MSR Asia – Vancouver.

During his internship, Hwang was assigned to a team tasked with optimizing the performance of the GPUs that support the operation of artificial intelligence (AI) models. At the time, Hwang’s mission was clear: to find a new way to improve the throughput and utilization of AI systems through software-hardware co-design. However, scientific research is often a long journey, and many studies do not yield immediate results. As an advocate of long-term research, Hwang did not see himself as a mere passerby on the research team. Instead, he continued to work with the team for two years after returning to school. The team’s subsequent research results won the Best Paper Award at the MLArchSys 2022 conference.

Paper title: Towards GPU-driven Code Execution for Distributed Deep Learning

Paper link: https://chhwang.github.io/pubs/mlarchsys22_hwang.pdf

With the development of large models, GPUs have become increasingly crucial for training and deploying AI models, and the performance and utilization efficiency of GPUs directly affect AI development. Upon joining MSR Asia as a researcher, Hwang continued to focus on this area, except that now he was a project leader rather than a mere participant.

In Hwang’s view, today’s most advanced deep learning applications require a large number of GPUs working in parallel to provide sufficient computing power. However, the efficiency of communication between GPUs and CPUs is a limiting factor for the performance of AI models. This is because the CPU plays the role of chief commander in the prevailing CPU-driven communication mode of AI systems: each CPU is responsible for assigning tasks to multiple GPUs, but message transmission between them incurs considerable delay, leading to low efficiency in task execution and wasted GPU resources.
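To make this bottleneck concrete, the sketch below illustrates the conventional CPU-orchestrated pattern in minimal CUDA terms: the host launches each compute step, waits for it to finish, and then issues the GPU-to-GPU transfer itself. This is not code from Hwang’s work; the kernel, buffer names, and sizes are hypothetical placeholders.

```cpp
// Minimal sketch of the CPU-orchestrated pattern described above (hypothetical
// kernel and buffers, not from the paper). The CPU sits on the critical path:
// it must wait for each kernel before it can command the next transfer.
#include <cuda_runtime.h>

__global__ void compute_step(float* data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= 2.0f;  // stand-in for one step of a real model
}

void cpu_orchestrated_step(float* buf_gpu0, float* buf_gpu1, int n,
                           cudaStream_t stream) {
  cudaSetDevice(0);
  compute_step<<<(n + 255) / 256, 256, 0, stream>>>(buf_gpu0, n);
  // Host-side wait: the CPU cannot issue the transfer until the kernel ends.
  cudaStreamSynchronize(stream);
  // The CPU commands the GPU0 -> GPU1 copy on the GPUs' behalf.
  cudaMemcpyPeerAsync(buf_gpu1, 1, buf_gpu0, 0, n * sizeof(float), stream);
  cudaStreamSynchronize(stream);
}
// In a GPU-driven design such as the one described next, the kernel itself
// would trigger the transfer (for example, through a DMA engine controlled by
// the GPU), removing these CPU round trips from the critical path.
```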
In his research, Hwang’s goal was to enable the GPU to command itself, thereby improving communication efficiency. To this end, he and his colleagues in the group designed a GPU-driven code execution system, along with a DMA engine that can be driven directly by the GPU. This allows GPUs to directly handle the communication that used to require CPU commands, reducing communication latency in AI systems and improving the utilization of GPU computing resources. The new method frees up the CPU resources that earlier communication modes kept occupied, allowing CPUs to focus on their own work and GPUs to schedule tasks autonomously while doing what they do best: providing high computational performance for AI models. This research demonstrated that an AI system built on distributed GPUs is capable of having the GPUs manage task scheduling on their own. The paper on this research was accepted at the NSDI 2023 conference.

Enhancing AI system performance: Honing achievements through progressive research