Research Focus: Week of December 4, 2023

Published

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

Research Focus
December 6th, 2023

Leveraging Large Language Models for Automated Proof Synthesis in Rust

Formal verification can probably guarantee the correctness of critical system software, but the high proof burden has long hindered its wide adoption. Recently, large language models (LLMs) have shown success in code analysis and synthesis. In a recent paper: Leveraging Large Language Models for Automated Proof Synthesis in Rust, researchers from Microsoft present a combination of LLMs and static analysis to synthesize invariants, assertions, and other proof structures for a Rust-based formal verification framework called Verus.

In a few-shot setting, GPT-4 demonstrates impressive logical ability in generating postconditions and loop invariants, especially when analyzing short code snippets. However, GPT-4 does not consistently retain and propagate the full context information needed by Verus, a task that can be straightforwardly accomplished through static analysis. Based on these observations, the researchers developed a prototype based on OpenAI’s GPT-4 model. This prototype decomposes the verification task into multiple smaller ones, iteratively queries GPT-4, and combines its output with lightweight static analysis. Evaluating the prototype with a developer in the automation loop on 20 vector-manipulating programs showed that it significantly reduces human effort in writing entry-level proof code.

Spotlight: AI-POWERED EXPERIENCE

Microsoft research copilot experience

Discover more about research at Microsoft through our AI-powered experience

Don’t Forget the User: It’s Time to Rethink Network Measurements

The goal of network measurement is to characterize how and how well a network is performing. This has traditionally meant a focus on the bits and bytes — low-level network metrics such as latency and throughput, which have the advantage of being objective but are limited in representativeness and reach. In a recent paper: Don’t Forget the User: It’s Time to Rethink Network Measurements, researchers from Microsoft argue that users also provide a rich and largely untapped source of implicit and explicit signals that could complement and expand the coverage of traditional measurement methods. Implicit feedback leverages user actions to indirectly infer network performance and the resulting quality of user experience. Explicit feedback leverages user input, typically provided offline, to expand the reach of network measurement, especially for newer ones.

The researchers analyze example scenarios, including capturing implicit feedback through user actions such as the user (un)muting the mic or turning on/off the camera in a large-scale conferencing service. These techniques complement existing measurement methods and open a broad set of research directions, ranging from rethinking measurements tools, to designing user-centric networked systems and applications.


Ghana 3D international telemedicine proof of concept study

A real-time 3D telemedicine system – leveraging Holoportation™ communication technology – was used to facilitate consultations with complex reconstructive patients prior, during, and after an overseas surgical collaboration. The system was used in a proof-of-concept clinic in November 2022 between Canniesburn Plastic Surgery Unit, UK, and the National Reconstructive Plastic Surgery and Burns Centre, Korle Bu Teaching Hospital, Ghana.

Four patients in Ghana were followed through their patient journey (mandibular ameloblastoma, sarcoma thigh, maxillary tumor, sarcoma back). A new report: Ghana 3D Telemedicine International MDT: A Proof-of-concept study details the responses of 13 participants (4 patients, 4 Ghana clinicians, 5 UK clinicians) completed feedback on the 3D multidisciplinary team (MDT). Outcome measures were rated highly with satisfaction 84.31/100, perceived benefit 4.54/5, overall quality 127.3/ 147, and usability 83.2/100. This data shows close alignment with that previously published on high income countries.

This novel technology has potential to enhance overseas surgical visits in low-to-middle income countries through improved planning, informed discussion with patients, expert consensus on complex cases, and fostering engagement with professionals who may be thousands of miles away.


Related publications

Continue reading

See all blog posts