{"id":989523,"date":"2023-12-11T07:00:00","date_gmt":"2023-12-11T15:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=989523"},"modified":"2024-01-04T07:08:27","modified_gmt":"2024-01-04T15:08:27","slug":"neurips-2023-highlights-breadth-of-microsofts-machine-learning-innovation","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/neurips-2023-highlights-breadth-of-microsofts-machine-learning-innovation\/","title":{"rendered":"NeurIPS 2023 highlights breadth of Microsoft&#8217;s machine learning innovation"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1.png\" alt=\"Research Focus: NeurIPS\nDecember 11, 2023\" class=\"wp-image-990225\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1.png 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-343x193.png 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-240x135.png 240w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-640x360.png 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-960x540.png 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-1280x720.png 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<p>Microsoft is proud to sponsor the <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/event\/neurips-2023\/\" target=\"_blank\" rel=\"noreferrer noopener\">37th Conference on Neural Information Processing Systems<\/a> (NeurIPS 2023). This interdisciplinary forum brings together experts in machine learning, neuroscience, statistics, optimization, computer vision, natural language processing, life sciences, natural sciences, social sciences, and other adjacent fields. We are pleased to share that Microsoft has over 100 accepted papers and is offering 18 workshops at NeurIPS 2023.\u00a0<\/p>\n\n\n\n<p>This year\u2019s conference includes three papers from Microsoft that were chosen for oral presentations, which feature groundbreaking concepts, methods, or applications, addressing pressing issues in the field. Additionally, our spotlight posters, also highlighted below, were carefully curated by conference organizers for their novelty, technical rigor, and potential to significantly impact the landscape of machine learning. 
This blog post celebrates those achievements.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"oral-presentations\">Oral Presentations<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"bridging-discrete-and-backpropagation-straight-through-and-beyond\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/bridging-discrete-and-backpropagation-straight-through-and-beyond\/\" target=\"_blank\" rel=\"noreferrer noopener\">Bridging Discrete and Backpropagation: Straight-Through and Beyond<\/a><\/h3>\n\n\n\n<p>Gradient computations are pivotal to deep learning&#8217;s success, yet they predominantly depend on backpropagation, a technique limited to continuous variables. The paper <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/bridging-discrete-and-backpropagation-straight-through-and-beyond\/\" target=\"_blank\" rel=\"noreferrer noopener\">Bridging Discrete and Backpropagation: Straight-Through and Beyond<\/a> tackles this limitation. It introduces ReinMax, extending backpropagation&#8217;s capability to estimate gradients for models incorporating discrete variable sampling. In the study&#8217;s extensive experiments, ReinMax demonstrates consistent and significant performance gains over the state of the art. More than just a practical solution, the paper sheds light on existing deep learning practices. It shows that the &#8216;Straight-Through&#8217; method, once considered merely a heuristic trick, is actually a viable first-order approximation for the general multinomial case. 
Correspondingly, ReinMax achieves second-order accuracy in this context without the complexities of second-order derivatives, thus having negligible computation overheads.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"the-minerl-basalt-competition-on-learning-from-human-feedback\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/the-minerl-basalt-competition-on-learning-from-human-feedback\/\" target=\"_blank\" rel=\"noreferrer noopener\">The MineRL BASALT Competition on Learning from Human Feedback<\/a><\/h3>\n\n\n\n<p>The growth of deep learning research, including its incorporation into commercial products, has created a new challenge: How can we build AI systems that solve tasks when a crisp, well-defined specification is lacking? To encourage research on this important class of techniques, researchers from Microsoft led <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/the-minerl-basalt-competition-on-learning-from-human-feedback\/\" target=\"_blank\" rel=\"noreferrer noopener\">The MineRL BASALT Competition on Learning from Human Feedback<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, an update to a contest <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/proceedings.mlr.press\/v176\/shah22a\/shah22a.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">first launched in 2021<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> by researchers at the University of California-Berkeley and elsewhere. 
The challenge of this competition was to complete fuzzy tasks from English-language descriptions alone, with emphasis on encouraging different ways of learning from human feedback as an alternative to a traditional reward signal.&nbsp;<\/p>\n\n\n\n<p>The researchers designed a suite of four tasks in Minecraft for which writing hardcoded reward functions would be difficult. These tasks are defined by natural language: for example, &#8220;create a waterfall and take a scenic picture of it&#8221;, with additional clarifying details. Participants must train a separate agent for each task. Agents are then evaluated by humans who have read the task description.<\/p>\n\n\n\n<p>The competition aimed to encourage development of AI systems that do what their designers intended, even when the intent cannot be easily formalized. Besides allowing AI to solve more tasks, this can also enable more effective regulation of AI systems and progress on value alignment problems, in which the specified objectives of an AI agent differ from those of its users.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"related\">Related<\/h4>\n\n\n\n<div class=\"annotations \" data-bi-aN=\"citation\">\n\t<ul class=\"annotations__list card depth-16 bg-body p-4 \">\n\t\t<li class=\"annotations__list-item\">\n\t\t\t\t\t\t<span class=\"annotations__type d-block text-uppercase font-weight-semibold text-neutral-300 small\">Publication<\/span>\n\t\t\t<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/towards-solving-fuzzy-tasks-with-human-feedback-a-retrospective-of-the-minerl-basalt-2022-competition\/\" target=\"_self\" class=\"annotations__link font-weight-semibold text-decoration-none\" data-bi-type=\"annotated-link\" aria-label=\"Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition\" data-bi-aN=\"citation\" data-bi-cN=\"Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 
Competition\">\n\t\t\t\tTowards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition&nbsp;<span class=\"glyph-append glyph-append-chevron-right glyph-append-xsmall\"><\/span>\n\t\t\t<\/a>\n\t\t\t\t\t<\/li>\n\t<\/ul>\n<\/div>\n\n\n\n<div style=\"height:15px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"decodingtrust-a-comprehensive-assessment-of-trustworthiness-in-gpt-models\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/decodingtrust-a-comprehensive-assessment-of-trustworthiness-in-gpt-models\/\">DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models<\/a><\/h3>\n\n\n\n<p>DecodingTrust is a comprehensive evaluation platform that aims to answer the question: How trustworthy are generative pre-trained transformer (GPT) models? In&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/decodingtrust-a-comprehensive-assessment-of-trustworthiness-in-gpt-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models<\/a>, researchers focus specifically on GPT-4, GPT-3.5, and a series of open LLMs. They consider diverse perspectives, including&nbsp;<em>toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness.<\/em><\/p>\n\n\n\n<p>The researchers\u2019 evaluations identified previously unpublished vulnerabilities relating to trustworthiness. The team worked with Microsoft product groups to confirm that the potential vulnerabilities identified do not impact current customer-facing services. This is true in part because finished AI applications apply a range of mitigation approaches to address potential harms that may occur at the model level of the technology. 
They also shared their findings with GPT\u2019s developer, OpenAI, which has noted the potential vulnerabilities in the system cards for relevant models.<\/p>\n\n\n\n<p>This research aims to encourage others in the research community to utilize and build upon this work, potentially pre-empting adversaries who would exploit vulnerabilities to cause harm. To facilitate collaboration, the benchmark code is very extensible and easy to use: a single command is sufficient to run the complete evaluation on a new model.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"spotlight-posters\">Spotlight Posters<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"differentially-private-approximate-near-neighbor-counting-in-high-dimensions\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/differentially-private-approximate-near-neighbor-counting-in-high-dimensions\/\" target=\"_blank\" rel=\"noreferrer noopener\">Differentially Private Approximate Near Neighbor Counting in High Dimensions<\/a><\/h3>\n\n\n\n<p>Differential privacy (DP) is a widely used tool for preserving the privacy of sensitive personal information. It allows a data structure to provide approximate answers to queries about the data it holds, while ensuring that the removal or addition of a single database entry does not significantly affect the outcome of any analysis.<\/p>\n\n\n\n<p>Range counting (counting the number of data points falling into a given query ball) under differential privacy has been studied extensively. However, current algorithms for this problem come with challenges. One class of algorithms suffers from an additive error that is a fixed polynomial in the number of points. Another class of algorithms allows for polylogarithmic additive error, but the error grows exponentially in the dimension. 
To achieve the latter, the problem is relaxed to allow a \u201cfuzzy\u201d definition of the range boundary, e.g., a count of the points in a ball of radius <em>r<\/em> might also include points in a ball of radius <em>cr<\/em> for some <em>c<\/em> &gt; 1.<\/p>\n\n\n\n<p>In <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/differentially-private-approximate-near-neighbor-counting-in-high-dimensions\/\" target=\"_blank\" rel=\"noreferrer noopener\">Differentially Private Approximate Near Neighbor Counting in High Dimensions<\/a>, researchers present an efficient algorithm that offers a sweet spot between these two classes. The algorithm has an additive error that is an arbitrarily small power of the data set size, depending on how fuzzy the range boundary is, as well as a small (1 + o(1)) multiplicative error. Crucially, the amount of noise added has no dependence on the dimension. This new algorithm introduces a variant of Locality-Sensitive Hashing, utilizing it in a novel manner.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"exposing-attention-glitches-with-flip-flop-language-modeling\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/exposing-attention-glitches-with-flip-flop-language-modeling\/\" target=\"_blank\" rel=\"noreferrer noopener\">Exposing Attention Glitches with Flip-Flop Language Modeling<\/a><\/h3>\n\n\n\n<p>Why do large language models sometimes output factual inaccuracies and exhibit erroneous reasoning? 
The brittleness of these models, particularly when executing long chains of reasoning, seems to be an inevitable price to pay for their advanced capabilities in coherently synthesizing knowledge, pragmatics, and abstract thought.<\/p>\n\n\n\n<p>To help make sense of this fundamentally unsolved problem, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/exposing-attention-glitches-with-flip-flop-language-modeling\/\" target=\"_blank\" rel=\"noreferrer noopener\">Exposing Attention Glitches with Flip-Flop Language Modeling<\/a> identifies and analyzes the phenomenon of attention glitches, in which the Transformer architecture\u2019s inductive biases intermittently fail to capture robust reasoning. To isolate the issue, the researchers introduce flip-flop language modeling (FFLM), a parametric family of synthetic benchmarks designed to probe the extrapolative behavior of neural language models. This simple generative task requires a model to copy binary symbols over long-range dependencies, ignoring the tokens in between. This research shows how Transformer FFLMs suffer from a long tail of sporadic reasoning errors, some of which can be eliminated using various regularization techniques. The preliminary mechanistic analyses show why the remaining errors may be very difficult to diagnose and resolve. The researchers hypothesize that attention glitches account for some of the closed-domain errors occurring in natural LLMs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"in-context-learning-unlocked-for-diffusion-models\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/in-context-learning-unlocked-for-diffusion-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">In-Context Learning Unlocked for Diffusion Models<\/a><\/h3>\n\n\n\n<p>An emergent behavior of large language models (LLMs) is the ability to learn from context, or <em>in-context learning<\/em>. 
With a properly designed prompt structure and in-context learning, LLMs can combine the pre-training of multiple language tasks and generalize well to previously unseen tasks. While in-context learning has been extensively studied in natural language processing (NLP), its applications in the field of computer vision are still limited.<\/p>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/in-context-learning-unlocked-for-diffusion-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">In-Context Learning Unlocked for Diffusion Models<\/a> presents Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images and text guidance, this model understands the underlying task and performs the same task on a new query image following the text guidance. To achieve this, the researchers propose a vision-language prompt that can model a wide range of vision-language tasks, and a diffusion model that takes it as input. The diffusion model is trained jointly over six different tasks using these prompts. The resulting Prompt Diffusion model is the first diffusion-based vision-language foundation model capable of in-context learning. It demonstrates high-quality in-context generation on the trained tasks and generalizes to new, unseen vision tasks with their respective prompts. This model also shows compelling text-guided image editing results.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"optimizing-prompts-for-text-to-image-generation\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/optimizing-prompts-for-text-to-image-generation\/\" target=\"_blank\" rel=\"noreferrer noopener\">Optimizing Prompts for Text-to-Image Generation<\/a><\/h3>\n\n\n\n<p>Generative foundation models, including language models and text-to-image models, can be prompted to follow user instructions. 
Well-designed prompts can guide text-to-image models to generate amazing images. However, high-performing prompts are often model-specific and misaligned with user input. Instead of relying on laborious human engineering, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/optimizing-prompts-for-text-to-image-generation\/\" target=\"_blank\" rel=\"noreferrer noopener\">Optimizing Prompts for Text-to-Image Generation<\/a> proposes prompt adaptation, a general framework that automatically adapts original user input to model-preferred prompts.<\/p>\n\n\n\n<p>The researchers use reinforcement learning to explore better prompts with a language model. They define a reward function that encourages the policy network (i.e., language model) to produce prompts that yield more aesthetically pleasing images while preserving the original user intentions. Experimental results on Stable Diffusion show that this method outperforms manual prompt engineering in terms of both automatic metrics and human preference ratings. Reinforcement learning further boosts performance, especially on out-of-domain prompts.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"pareto-frontiers-in-neural-feature-learning-data-compute-width-and-luck\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/pareto-frontiers-in-neural-feature-learning-data-compute-width-and-luck\/\" target=\"_blank\" rel=\"noreferrer noopener\">Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck<\/a><\/h3>\n\n\n\n<p>Algorithm design in deep learning can appear to be more like \u201chacking\u201d than an engineering practice. There are numerous architectural choices and training heuristics, which can often modulate model performance and resource costs in unpredictable and entangled ways. 
As a result, when training large-scale neural networks (such as state-of-the-art language models), algorithmic decisions and resource allocations are primarily empirically driven, involving the measurement and extrapolation of&nbsp;<em>scaling laws<\/em>. A precise mathematical understanding of this process remains elusive, and the process cannot be explained by statistics or optimization in isolation.<\/p>\n\n\n\n<p>In <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/pareto-frontiers-in-neural-feature-learning-data-compute-width-and-luck\/\" target=\"_blank\" rel=\"noreferrer noopener\">Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck<\/a>,&nbsp;researchers from Microsoft, Harvard, and the University of Pennsylvania explore these algorithmic intricacies and tradeoffs through the lens of a single synthetic task: the finite-sample sparse parity learning problem. In this setting, the above complications are not only evident, but also provable: intuitively, due to the task\u2019s computational hardness, a neural network needs a sufficient&nbsp;<em>combination<\/em>&nbsp;of resources (\u201cdata \u00d7 model size \u00d7 training time \u00d7 luck\u201d) to succeed. This research shows that standard algorithmic choices in deep learning give rise to a&nbsp;<em>Pareto frontier<\/em>, in which successful learning is \u201cbought\u201d with interchangeable combinations of these resources. 
The researchers also show that algorithmic improvements on this toy problem can transfer to the real world, improving the data efficiency of neural networks on small tabular datasets.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"pde-refiner-achieving-accurate-long-rollouts-with-neural-pde-solvers\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/pde-refiner-achieving-accurate-long-rollouts-with-neural-pde-solvers\/\" target=\"_blank\" rel=\"noreferrer noopener\">PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers<\/a><\/h3>\n\n\n\n<p>Time-dependent partial differential equations (PDEs) are ubiquitous in science and engineering. The high computational cost of traditional solution techniques has spurred increasing interest in deep neural network-based PDE surrogates. The practical utility of such neural PDE solvers depends on their ability to provide accurate, stable predictions over long time horizons, which is a notoriously hard problem.<\/p>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/pde-refiner-achieving-accurate-long-rollouts-with-neural-pde-solvers\/\" target=\"_blank\" rel=\"noreferrer noopener\">PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers<\/a> presents a large-scale analysis of common temporal rollout strategies, identifying the neglect of non-dominant spatial frequency information, often associated with high frequencies in PDE solutions, as the primary pitfall limiting stable, accurate rollout performance. Motivated by recent advances in diffusion models, the researchers developed PDE-Refiner, a novel model class that enables more accurate modeling of all frequency components via a multistep refinement process. They validate PDE-Refiner on challenging benchmarks of complex fluid dynamics, demonstrating stable and accurate rollouts that consistently outperform state-of-the-art models, including neural, numerical, and hybrid neural-numerical architectures. 
They also demonstrate that PDE-Refiner greatly enhances data efficiency, since the denoising objective implicitly induces a novel form of spectral data augmentation. Finally, PDE-Refiner\u2019s connection to diffusion models enables an accurate and efficient assessment of the model\u2019s predictive uncertainty, allowing researchers to estimate when the surrogate becomes inaccurate.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"should-i-stop-or-should-i-go-early-stopping-with-heterogeneous-populations\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/should-i-stop-or-should-i-go-early-stopping-with-heterogeneous-populations\/\" target=\"_blank\" rel=\"noreferrer noopener\">Should I Stop or Should I Go: Early Stopping with Heterogeneous Populations<\/a><\/h3>\n\n\n\n<p>Randomized experiments are the gold-standard method for determining causal effects, whether in clinical trials to evaluate medical treatments or in A\/B tests to evaluate online product offerings. However, randomized experiments often need to be stopped prematurely when the treatment or test causes an unintended harmful effect. Existing methods that determine when to stop an experiment early are typically applied to the data in aggregate and do not account for treatment effect heterogeneity.<\/p>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/should-i-stop-or-should-i-go-early-stopping-with-heterogeneous-populations\/\" target=\"_blank\" rel=\"noreferrer noopener\">Should I Stop or Should I Go: Early Stopping with Heterogeneous Populations<\/a> examines the early stopping of experiments for harm on heterogeneous populations. The paper shows that current methods often fail to stop experiments when the treatment harms a minority group of participants. The researchers use causal machine learning to develop Causal Latent Analysis for Stopping Heterogeneously (CLASH), the first broadly applicable method for heterogeneous early stopping. 
They demonstrate CLASH\u2019s performance on simulated and real data and show that it yields effective early stopping for both clinical trials and A\/B tests.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"survival-instinct-in-offline-reinforcement-learning\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/survival-instinct-in-offline-reinforcement-learning-2\/\" target=\"_blank\" rel=\"noreferrer noopener\">Survival Instinct in Offline Reinforcement Learning<\/a><\/h3>\n\n\n\n<p>In offline reinforcement learning (RL), an agent optimizes its performance given an offline dataset. <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/survival-instinct-in-offline-reinforcement-learning-2\/\" target=\"_blank\" rel=\"noreferrer noopener\">Survival Instinct in Offline Reinforcement Learning<\/a> presents a novel observation: on many benchmark datasets, offline RL can produce well-performing and safe policies even when trained with \u201cwrong\u201d reward labels, such as those that are zero everywhere or are negatives of the true rewards. This phenomenon cannot be easily explained by offline RL\u2019s return maximization objective. Moreover, it gives offline RL a degree of robustness that is uncharacteristic of its online RL counterparts, which are known to be sensitive to reward design.<\/p>\n\n\n\n<p>This research demonstrates that this surprising robustness property is attributable to an interplay between the notion of pessimism in offline RL algorithms and a certain bias implicit in common data collection practices. This work shows that this pessimism endows the agent with a \u201csurvival instinct\u201d, i.e., an incentive to stay within the data support in the long term, while the limited and biased data coverage further constrains the set of survival policies. 
The researchers argue that the survival instinct should be taken into account when interpreting results from existing offline RL benchmarks and when creating future ones.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"timewarp-transferable-acceleration-of-molecular-dynamics-by-learning-time-coarsened-dynamics\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/timewarp-transferable-acceleration-of-molecular-dynamics-by-learning-time-coarsened-dynamics\/\" target=\"_blank\" rel=\"noreferrer noopener\">Timewarp: Transferable Acceleration of Molecular Dynamics by Learning Time-Coarsened Dynamics<\/a><\/h3>\n\n\n\n<p>Molecular dynamics (MD) is a well-established technique for simulating physical systems at the atomic level. When performed accurately, it provides unrivalled insight into the detailed mechanics of molecular motion, without the need for wet lab experiments. MD is often used to compute equilibrium properties, which requires sampling from an equilibrium distribution such as the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/en.wikipedia.org\/wiki\/Boltzmann_distribution\" target=\"_blank\" rel=\"noreferrer noopener\">Boltzmann distribution<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. However, many important processes, such as binding and folding, occur over timescales of milliseconds or beyond, and cannot be efficiently sampled with conventional MD. 
Furthermore, new MD simulations need to be performed from scratch for each molecular system studied.<\/p>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/timewarp-transferable-acceleration-of-molecular-dynamics-by-learning-time-coarsened-dynamics\/\" target=\"_blank\" rel=\"noreferrer noopener\">Timewarp: Transferable Acceleration of Molecular Dynamics by Learning Time-Coarsened Dynamics<\/a> presents an enhanced sampling method that uses a normalizing flow as a proposal distribution in a Markov chain Monte Carlo method targeting the Boltzmann distribution. The flow is trained offline on MD trajectories and learns to make large steps in time, simulating the molecular dynamics of&nbsp;10<sup>5<\/sup> to 10<sup>6<\/sup> fs. Crucially, Timewarp is transferable between molecular systems: the researchers show that, once trained, Timewarp generalizes to unseen small peptides (2-4 amino acids), exploring their metastable states and providing wall-clock acceleration when sampling compared to standard MD. This new method constitutes an important step towards developing general, transferable algorithms for accelerating MD.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We\u2019re proud to have 100+ accepted papers at NeurIPS 2023, plus 18 workshops. Several submissions were chosen as oral presentations and spotlight posters, reflecting groundbreaking concepts, methods, or applications. 
Here\u2019s an overview of those submissions.<\/p>\n","protected":false},"author":42183,"featured_media":990225,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[1],"tags":[],"research-area":[13556,13562,13547],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[243984],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-989523","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-research-area-computer-vision","msr-research-area-systems-and-networking","msr-locale-en_us","msr-post-option-blog-homepage-featured"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199560,199565,199571,851467],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[144902,144931,862206],"related-projects":[],"related-events":[968280],"related-researchers":[],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-960x540.png\" class=\"img-object-cover\" alt=\"RF NeurIPS Edition December 11, 2023\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-960x540.png 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-300x169.png 300w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-343x193.png 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-240x135.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-640x360.png 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1-1280x720.png 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/12\/RFNeurIPS-BlogHeroFeature-1400x788-1.png 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"December 11, 2023","formattedExcerpt":"We\u2019re proud to have 100+ accepted papers at NeurIPS 2023, plus 18 workshops. Several submissions were chosen as oral presentations and spotlight posters, reflecting groundbreaking concepts, methods, or applications. 
Here\u2019s an overview of those submissions.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/989523","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/42183"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=989523"}],"version-history":[{"count":26,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/989523\/revisions"}],"predecessor-version":[{"id":991278,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/989523\/revisions\/991278"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/990225"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=989523"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=989523"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=989523"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=989523"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=989523"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=989523"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/e
n-us\/research\/wp-json\/wp\/v2\/msr-locale?post=989523"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=989523"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=989523"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=989523"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=989523"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}