Research Focus: Week of November 22, 2023


Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.


NEW RESEARCH

PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation

Dynamic sparsity is a technique used in machine learning to reduce computational and memory requirements while maintaining or improving performance. This can be particularly useful when computational resources are limited, such as on embedded devices or mobile platforms. However, efficiently supporting dynamic sparse computation is challenging, since the concrete sparsity of tensors is known only at runtime. As a result, state-of-the-art sparsity-aware deep learning solutions are restricted to pre-defined, static sparsity patterns due to significant overheads associated with preprocessing.

In a new paper: PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation, researchers from Microsoft propose a deep-learning compiler for dynamic sparsity. Permutation Invariant Transformation (PIT) uses a novel tiling mechanism to transform multiple sparsely located micro-tiles into a GPU-efficient dense tile without changing the computation results, thus achieving both high GPU utilization and low coverage waste. Given a model, PIT first finds feasible PIT rules for all its operators and generates efficient GPU kernels accordingly. At runtime, with the novel SRead and SWrite primitives, PIT rules can be executed rapidly to support dynamic sparsity in an online manner. Extensive evaluation on diverse models shows that PIT can accelerate dynamic sparsity computation by up to 5.9x (average 2.43x) over state-of-the-art compilers.
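The idea of gathering sparsely located pieces into a dense tile without changing the result can be illustrated with a small sketch. This is not PIT's compiler or its GPU kernels: it is a pure-Python toy that works on whole rows rather than micro-tiles, and the function names `sread` and `swrite` are hypothetical helpers that only borrow their names from the SRead and SWrite primitives mentioned above. It relies on the fact that the rows of a matrix product can be computed in any order, so the non-zero rows can be packed together, multiplied densely, and scattered back.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sread(a):
    """Hypothetical gather step: pack the non-zero rows of `a` into a dense tile.

    Returns the dense tile and the original index of each packed row.
    """
    idx = [i for i, row in enumerate(a) if any(v != 0 for v in row)]
    return [a[i] for i in idx], idx


def swrite(tile, idx, n_rows, n_cols):
    """Hypothetical scatter step: put each result row back at its original index."""
    out = [[0] * n_cols for _ in range(n_rows)]
    for r, i in enumerate(idx):
        out[i] = tile[r]
    return out


def sparse_matmul(a, b):
    """Multiply only the non-zero rows of `a`, then restore the original layout."""
    tile, idx = sread(a)
    dense_result = matmul(tile, b) if tile else []
    return swrite(dense_result, idx, len(a), len(b[0]))


# Which rows are zero is only known once the data arrives (dynamic sparsity).
A = [[0, 0, 0],
     [1, 2, 3],
     [0, 0, 0],
     [4, 0, 5]]
B = [[1, 0],
     [0, 1],
     [2, 2]]

# The packed computation gives the same answer as the full dense product.
assert sparse_matmul(A, B) == matmul(A, B)
```

In this toy, the dense product runs over two rows instead of four. The sparsity pattern is discovered at runtime inside `sread`, so no preprocessing for a fixed pattern is needed.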


NEW RESEARCH

TongueTap: Multimodal Tongue Gesture Recognition with Head-Worn Devices

Mouth-based interfaces are a promising new approach enabling silent, hands-free and eyes-free interaction with wearable devices. However, interfaces sensing mouth movements are traditionally custom-designed and placed near or within the mouth.

TongueTap synchronizes multimodal electroencephalogram (EEG), photoplethysmogram (PPG), inertial measurement unit (IMU), eye tracking and head tracking data from two commercial headsets to facilitate tongue gesture recognition using only off-the-shelf devices on the upper face. In a new paper: TongueTap: Multimodal Tongue Gesture Recognition with Head-Worn Devices, researchers from Microsoft classify eight closed-mouth tongue gestures with 94% accuracy, offering an invisible and inaudible method for discreet control of head-worn devices. Moreover, the research showed that the IMU alone differentiates eight gestures with 80% accuracy and a subset of four gestures with 92% accuracy. The researchers built a dataset of 48,000 gesture trials across 16 participants, allowing TongueTap to perform user-independent classification. The findings suggest tongue gestures can be a viable interaction technique for VR/AR headsets and wearables without requiring novel hardware.


NEW RESEARCH

Ranking LLM-Generated Loop Invariants for Program Verification

Synthesizing inductive loop invariants is fundamental to automating program verification. In a new paper: Ranking LLM-Generated Loop Invariants for Program Verification, researchers from Microsoft demonstrate that large language models (LLMs), such as GPT-3.5 or GPT-4, are capable of synthesizing loop invariants for a class of programs in a zero-shot setting, yet require several samples to generate the correct invariants. This can lead to a large number of calls to a program verifier, or present the user of an interactive verification tool with multiple incorrect suggestions before a valid invariant is established.

To address this issue, the researchers propose a re-ranking approach for the generated results of LLMs, including a newly designed ranker that can distinguish between correct inductive invariants and incorrect attempts based on the problem definition. The ranker is optimized as a contrastive ranker. Experimental results demonstrate that this re-ranking mechanism significantly improves the ranking of correct invariants among the generated candidates, leading to a notable reduction in the number of calls to a verifier.
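A minimal sketch of why re-ranking reduces verifier calls, under loud assumptions: the loop, the candidate invariants, and the `score` values below are all made up for illustration, with the scores standing in for the output of a learned ranker. The `verify` function is also a stand-in that only checks a predicate on enumerated loop states; it is not an inductive proof and not a real program verifier.

```python
def loop_states(n):
    """Enumerate (i, s) at each loop head of: i = 0; s = 0; while i < n: s += i; i += 1"""
    i, s = 0, 0
    states = [(i, s)]
    while i < n:
        s += i
        i += 1
        states.append((i, s))
    return states


def verify(candidate, states):
    """Stand-in verifier: the predicate must hold on every enumerated state."""
    return all(candidate["pred"](i, s) for i, s in states)


def calls_until_verified(candidates, states):
    """Try candidates in order; return how many verifier calls were needed."""
    for n_calls, candidate in enumerate(candidates, start=1):
        if verify(candidate, states):
            return n_calls
    return None  # no candidate verified


# Candidates in the order they were sampled (hypothetical); "score" is a
# made-up placeholder for what a trained ranker might assign.
candidates = [
    {"text": "s >= i",              "pred": lambda i, s: s >= i,                "score": 0.2},
    {"text": "s == i",              "pred": lambda i, s: s == i,                "score": 0.1},
    {"text": "s == i * (i - 1) / 2", "pred": lambda i, s: s == i * (i - 1) // 2, "score": 0.9},
]

states = loop_states(5)

# Re-rank by score, highest first, before calling the verifier.
ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)

unranked_calls = calls_until_verified(candidates, states)
ranked_calls = calls_until_verified(ranked, states)
```

With these made-up scores, the correct invariant moves from third place to first, so the verifier is called once instead of three times. The benefit depends entirely on how well the ranker separates correct invariants from incorrect ones.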


NEW RESEARCH

Assessing the limits of zero-shot foundation models in single-cell biology

The success of foundation models such as GPT has sparked growing interest in their application to single-cell biology. Models like Geneformer and scGPT have emerged with the promise of serving as versatile tools for this specialized field. However, the efficacy of these models in zero-shot settings, where they are used without any fine-tuning or further training, remains an open question. This matters because practical constraints often preclude fine-tuning. For example, many biological problems are inherently exploratory and are intended to generate hypotheses for further experimentation. In such settings, labels that could serve as targets for downstream fine-tuning may not be known or may be biased. In other computational biology domains, including microscopy images and protein sequences, zero-shot evaluation is routine practice for this reason. It is not yet an established standard for single-cell foundation models, where evaluation practices are still emerging.

In a new paper: Assessing the limits of zero-shot foundation models in single-cell biology, researchers from Microsoft present a rigorous evaluation of the zero-shot performance of these proposed single-cell foundation models. They assess their utility in tasks such as cell type clustering and batch effect correction, and evaluate the generality of their pretraining objectives. Research results indicate that both Geneformer and scGPT exhibit limited reliability in zero-shot settings and often underperform compared to simpler methods. These findings serve as a cautionary note for the deployment of proposed single-cell foundation models and highlight the need for more focused research to realize their potential.


NEW RESEARCH

Confidential Consortium Framework: Secure Multiparty Applications with Confidentiality, Integrity, and High Availability

Confidentiality, integrity protection, and high availability – abbreviated to CIA – are essential properties for trustworthy data systems. However, the rise of cloud computing and the growing demand for multiparty applications make building modern CIA systems more challenging than ever.

In response, researchers from Microsoft present: Confidential Consortium Framework: Secure Multiparty Applications with Confidentiality, Integrity, and High Availability, a general-purpose foundation for developing secure stateful CIA applications. Confidential Consortium Framework (CCF) combines centralized compute with decentralized trust, supporting deployment on untrusted cloud infrastructure and transparent governance by mutually untrusted parties. CCF leverages hardware-based trusted execution environments for remotely verifiable confidentiality and code integrity. This is coupled with state machine replication backed by an auditable immutable ledger for data integrity and high availability. CCF enables each service to bring its own application logic, custom multiparty governance model, and deployment scenario, decoupling the operators of nodes from the consortium that governs them. CCF is open-source and available now at https://github.com/microsoft/CCF.
