Research Focus: Week of April 24, 2023

已发布

Microsoft Research Focus 14 edition, week of April 24, 2023

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

AWARD

Microsoft researcher Kalai awarded 2022 ACM Prize in Computing

Yael Tauman Kalai, a senior principal researcher at Microsoft Research, has been awarded the 2022 ACM Prize in Computing. Kalai was recognized for breakthroughs in verifiable delegation of computation and fundamental contributions to cryptography. According to the award announcement, “Kalai’s contributions have helped shape modern cryptographic practices and provided a strong foundation for further advancements.”

The ACM Prize in Computing (opens in new tab) recognizes early-to-mid-career computer scientists whose research contributions have fundamental impact and broad implications.

Spotlight: Blog post

MedFuzz: Exploring the robustness of LLMs on medical challenge problems

Medfuzz tests LLMs by breaking benchmark assumptions, exposing vulnerabilities to bolster real-world accuracy.

Among the multiple accomplishments cited for the award, Kalai has developed methods for producing succinct proofs that certify the correctness of any computation. This method enables a weak device to offload any computation to a stronger device in a way that enables the results to be efficiently checked for correctness. Such succinct proofs have been used by blockchain companies to certify transaction validity, thereby overcoming key obstacles in blockchain scalability and enabling faster and more reliable transactions.

Kalai was also cited for her breakthrough work on the security of the “Fiat-Shamir paradigm,” a general technique for eliminating interaction from interactive protocols. This paradigm is extensively utilized in real-world applications, including the most prevalent digital signature scheme (ECDSA), which is used by all iOS and Android mobile devices.


NEW RESEARCH

Empowering Azure Storage with RDMA

High performance and highly reliable storage are fundamental requirements of public clouds. Given the wide adoption of disaggregated storage in the cloud, networking is essential for enabling high performance and high reliability. Microsoft’s Azure cloud service uses remote direct memory access (RDMA) as its transport and aims to enable it for both storage frontend traffic (between compute virtual machines and storage clusters) and backend traffic (within a storage cluster) to fully realize its benefits. As compute and storage clusters may be located in different datacenters within an Azure region, RDMA needs to be supported at regional scale.

In a new paper: Empowering Azure Storage with RDMA, Microsoft Azure and Microsoft Research report on their intra-region RDMA deployment to support storage workloads in Azure. The high complexity and heterogeneity of Azure infrastructure creates challenges, such as the problem of interoperability between different types of RDMA network interface cards. Several changes were made to the network infrastructure to address these challenges. Today, around 70% of traffic in Azure is RDMA and intra-region RDMA is supported in all Azure public regions. This helps achieve significant disk I/O performance improvements and CPU core savings.


NEW RESEARCH

LIDA: Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models

Systems that support users in the automatic creation of visualizations must address several subtasks—understand the semantics of data; enumerate relevant visualization goals; and generate visualization specifications. In a new paper: LIDA: Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models, researchers from Microsoft pose visualization generation as a multi-stage generation problem and argue that well-orchestrated pipelines based on large language models (LLMs) and image generation models (IGMs) are suitable to addressing these tasks.

LIDA is a novel tool for generating grammar-agnostic visualizations and infographics. It is comprised of four modules—a summarizer that converts data into a rich but compact natural language summary; a goal explorer that enumerates visualization goals given the data; a visgenerator that generates, evaluates, refines, executes, and filters visualization code; and an infographer module that yields data-faithful stylized graphics using IGMs. LIDA provides a python API and a hybrid user interface (direct manipulation and multilingual natural language) for interactive chart, infographics and data story generation.


NEW RELEASE

Announcing DeepSpeed-Chat: Easy, fast, affordable RLHF Training of ChatGPT-like models at all scales

Microsoft’s AI at Scale initiative has released DeepSpeed-Chat (opens in new tab), an easy, fast, and low-cost open-source solution for reinforcement learning from human feedback (RLHF) training that can create high-quality ChatGPT-like models ranging in size from a few to hundreds of billions of parameters. DeepSpeed-Chat provides complete RLHF training experience with a single click. It combines the prowess of DeepSpeed-Inference and DeepSpeed-Training to offer 15x faster throughput than the previous state of the art, while also supporting model sizes that are up to 8x larger on the same hardware. With DeepSpeed-Chat, practitioners can train an OPT-13B ChatGPT-like model in under 1.5 hours or a massive 175B model in a day on a modest GPU cluster. For those who don’t have a GPU cluster handy, DeepSpeed-Chat enables practitioners to train up to a 13B model on a single GPU, or at $300 to train on Azure Cloud. 


NEWS

Gov4git: Decentralized community governance to fuel open-source projects

Communal open-source projects have helped build countless applications for sourcing and sharing information like bug details and scientific data, as well as decentralized planning, design and policymaking. 

But the lack of a standardized and secure governance solution prevents many open-source projects from getting started—and holds them back when they get too big to be managed through ad-hoc methods. These small communities often resort to external mechanisms to manage their projects and protect them from malicious actors.

Microsoft Research and Protocol Labs, an open-source R&D company, are collaborating to develop Gov4git, a decentralized, git-native protocol with configurable governance rules to help launch more open-source projects and communities and support their growth.

Gov4git comes with many of the transparency, decentralization, and security benefits of blockchains while also harnessing the power of formal governance to avoid costly approaches to validation and dispute resolution. 

Git is the worldwide standard for version control and management of collaborative software development projects. Gov4git is designed as a secure and cost-effective framework solution which can be tailored to the specific needs of any one community and deployed by non-technical users anywhere where access to git is present. Gov4git can strengthen the security of such communities against the risks of malicious actors posing as collaborators with the intent to negatively impact community maintenance.

相关论文与出版物

继续阅读

查看所有博客文章