NEW RESEARCH
Anonymous Tokens with Stronger Metadata Bit Hiding from Algebraic MACs
Protecting the web from malicious activities such as bots or DoS attacks is an important goal. Researchers and practitioners have identified different approaches to balance user experience and security. For example, anonymous tokens allow an issuer to ensure that a user has been vetted while also protecting the user’s privacy. However, in some cases, the issuance or absence of a token can inform an adversary about the strategies used to distinguish honest users from bots or attackers.
In a recent paper: Anonymous Tokens with Stronger Metadata Bit Hiding from Algebraic MACs, researchers from Microsoft show how they designed an anonymous token protocol between a client and an issuer (also a verifier) that enables the issuer to support its fraud detection mechanisms while preserving users’ privacy.
NEW RESEARCH
Survival Instinct in Offline Reinforcement Learning
On many benchmark datasets, offline reinforcement learning (RL) can produce well-performing and safe policies, even when trained with “wrong” reward labels, such as those that are zero everywhere or are negatives of the true rewards. This phenomenon cannot be easily explained by offline RL’s return maximization objective. Moreover, it gives offline RL a degree of robustness that is uncharacteristic of its online RL counterparts, which are known to be sensitive to reward design.
In a new paper: Survival Instinct in Offline Reinforcement Learning, researchers from the University of Washington and Microsoft demonstrate that this surprising robustness property is attributable to an interplay between the notion of pessimism in offline RL algorithms and a certain bias implicit in common data collection practices. This work shows that pessimism endows the agent with a “survival instinct” – an incentive to stay within the data support in the long term, while the limited and biased data coverage further constrains the set of survival policies. The researchers argue that the survival instinct should be taken into account when interpreting results from existing offline RL benchmarks and when creating new ones. This research suggests a new paradigm for RL, whereby an agent is “nudged” to learn a desirable behavior with imperfect reward but purposely biased data coverage.
NEW RESEARCH
Nimble: Rollback Protection for Confidential Cloud Services
Cloud providers today offer confidential computing services in which virtual machines (VMs) support trusted execution environments (TEEs) that isolate a customer’s code from other code (including the hypervisor). TEEs offer security properties such as memory confidentiality and execution integrity, even if the provider is compromised. However, TEEs provide only volatile state storage, not persistent state storage, so if a TEE crashes or is maliciously restarted, its data can be lost.
A common way that TEEs today avoid such data loss is to persist an encrypted version of their data in a fault-tolerant cloud storage system such as Azure Table Storage or Cosmos DB. While authenticated encryption ensures that unauthorized parties cannot see the sensitive data or change its contents, encryption does not prevent a compromised provider from returning encryptions of old data. This is known as a “rollback attack,” in which an attacker can return an application running in a TEE to a previous state, potentially one that is vulnerable to attacks or that causes the application to perform incorrect actions.
In a recent paper, Nimble: Rollback Protection for Confidential Cloud Services, researchers from Microsoft and academic colleagues introduce Nimble, a cloud service that helps applications running in TEEs detect rollback attacks.
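To make the rollback scenario concrete, here is a minimal Python sketch of why authentication alone does not guarantee freshness. It is not Nimble's protocol, which this post does not describe. The key, the `seal` and `unseal` helpers, and the idea of a trusted counter kept somewhere the provider cannot reset are all assumptions made for illustration, and encryption is omitted so that only the integrity tag is shown.

```python
import hashlib
import hmac

KEY = b"demo-key-held-inside-the-tee"  # hypothetical secret, for illustration only


def seal(counter: int, state: bytes) -> bytes:
    """Authenticate (counter, state) so storage cannot forge or alter it."""
    body = counter.to_bytes(8, "big") + state
    tag = hmac.new(KEY, body, hashlib.sha256).digest()
    return tag + body


def unseal(blob: bytes, expected_counter: int) -> bytes:
    """Verify the tag, then reject any blob whose counter is not the expected one."""
    tag, body = blob[:32], blob[32:]
    expected_tag = hmac.new(KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected_tag):
        raise ValueError("tampered blob")
    counter = int.from_bytes(body[:8], "big")
    if counter != expected_counter:
        raise ValueError("rollback detected")
    return body[8:]


# The application writes two versions of its state.
old_blob = seal(1, b"balance=100")
new_blob = seal(2, b"balance=40")

# An honest store returns the latest version.
print(unseal(new_blob, expected_counter=2))

# A compromised store replays the old, still validly authenticated blob.
try:
    unseal(old_blob, expected_counter=2)
except ValueError as err:
    print(err)
```

The old blob passes the integrity check, so only the comparison against a trusted counter exposes the replay. Keeping that counter trustworthy when the TEE itself can be restarted is the hard part, and this sketch does not address it.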
NEW RESEARCH
Improving machine learning force fields for molecular dynamics simulations with fine-grained force metrics
Machine learning force fields (MLFFs) provide a cost-effective alternative to ab initio molecular dynamics (MD) simulations – a computational method used in theoretical chemistry and materials science to simulate the behavior of molecules and materials at the atomic level. While they typically produce only small errors on the test set, MLFFs inherently encounter generalization and robustness issues during MD simulations.
In a recent paper: Improving machine learning force fields for molecular dynamics simulations with fine-grained force metrics, researchers from Microsoft propose to alleviate those issues by using global force metrics together with fine-grained metrics, broken down by element and by conformation, to systematically evaluate MLFFs for every atom and every conformation of a molecule. Such force metrics can directly examine MLFFs without running costly MD simulations, reducing the computational cost of MLFF evaluation.
The researchers show that accurate force prediction by MLFFs across all atom types and all possible conformations plays a crucial role in their usefulness in MD simulations. In addition, they designed continued learning and fine-tuning approaches to improve the performance of MLFFs.
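As a rough illustration of the difference between a global and a fine-grained force metric, the Python sketch below computes a mean absolute error (MAE) over all force components and then the same error grouped by element. The function names and the toy data are hypothetical, and the paper's exact metric definitions may differ.

```python
from collections import defaultdict


def force_mae(pred, true):
    """Global metric: MAE over every force component of every atom."""
    errs = [abs(p - t) for fp, ft in zip(pred, true) for p, t in zip(fp, ft)]
    return sum(errs) / len(errs)


def force_mae_by_element(elements, pred, true):
    """Fine-grained metric: the same MAE, computed separately per element."""
    buckets = defaultdict(list)
    for el, fp, ft in zip(elements, pred, true):
        buckets[el].extend(abs(p - t) for p, t in zip(fp, ft))
    return {el: sum(e) / len(e) for el, e in buckets.items()}


# Toy conformation: one carbon and two hydrogens, forces as (x, y, z) tuples.
elements = ["C", "H", "H"]
true_forces = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
pred_forces = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

print(force_mae(pred_forces, true_forces))
print(force_mae_by_element(elements, pred_forces, true_forces))
```

In this toy case the global MAE is about 0.33, while the per-element view shows an error of 0.5 on hydrogen and none on carbon. The same grouping could be applied per conformation by calling `force_mae` on each conformation separately.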
NEW RESEARCH
Project Rumi: Multimodal paralinguistic prompting for LLMs
Large language models (LLMs) are algorithms that process and generate natural language, which can be used to create powerful new productivity tools. However, LLMs may not fully reflect the context and nuances of a conversation. Their performance depends in part on the quality and specificity of the user’s input, or prompt. User input data is a lexical entry, which lacks paralinguistic information (intonation, gestures, facial expressions, etc.) that may convey a speaker’s intentions. This can lead to misinterpretation, misunderstanding, or inappropriate responses from the LLM.
Conveying unspoken meaning and intention is an essential component in the next generation of AI interaction. To improve the quality of the underlying communication, researchers from Microsoft are developing a system called Project Rumi, which incorporates paralinguistic input into prompt-based interactions with LLMs. This system leverages separately trained vision and audio-based models to detect and analyze non-verbal cues extracted from data streams, assessing sentiment from cognitive and physiological data in real time. This multimodal, multi-step architecture integrates with all pretrained text-based LLMs to provide additional information on the user’s sentiment and intention that is not captured by text-based models.
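The general idea of attaching paralinguistic information to a text prompt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ParalinguisticCue` type, the `augment_prompt` helper, the confidence threshold, and the bracketed annotation format are all hypothetical, and the post does not say how Project Rumi actually encodes cues for the LLM.

```python
from typing import List, NamedTuple


class ParalinguisticCue(NamedTuple):
    """Hypothetical output of a separately trained audio or vision model."""
    source: str        # e.g. "audio" or "vision"
    sentiment: str     # label produced by that model
    confidence: float  # model confidence in [0, 1]


def augment_prompt(user_text: str, cues: List[ParalinguisticCue],
                   min_confidence: float = 0.5) -> str:
    """Prefix the user's text with the cues that are confident enough."""
    kept = [c for c in cues if c.confidence >= min_confidence]
    if not kept:
        return user_text
    notes = "; ".join(f"{c.source}: {c.sentiment}" for c in kept)
    return f"[Non-verbal context: {notes}]\n{user_text}"


cues = [
    ParalinguisticCue("audio", "frustrated", 0.8),
    ParalinguisticCue("vision", "neutral", 0.3),
]
print(augment_prompt("Fine, ship it.", cues))
```

Because the annotation is plain text placed in front of the prompt, a sketch like this would work with any text-based LLM without retraining it.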