AWARD
Lester Mackey awarded prestigious MacArthur Fellowship
Microsoft Research congratulates principal researcher Lester Mackey, who has been awarded a 2023 MacArthur Fellowship in recognition of his pioneering statistical and machine learning techniques to solve data science problems with real-world relevance.
The fellowship award, which recognizes “talented individuals who have shown extraordinary originality and dedication in their creative pursuits and a marked capacity for self-direction,” comes with a stipend intended to enable recipients to exercise their own creative instincts for the benefit of human society.
Mackey’s research focuses on improving efficiency and predictive performance in computational statistical analysis of very large data sets. His work has included designing a method for more accurately predicting disease progression rates in patients with ALS, or Lou Gehrig’s disease.
NEW RESEARCH
Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
Algorithm design in deep learning can appear to be more like “hacking” than an engineering practice. There are numerous architectural choices and training heuristics, which can modulate model performance and resource costs in unpredictable and entangled ways. As a result, when training large-scale neural networks (such as state-of-the-art language models), algorithmic decisions and resource allocations are foremost empirically driven, involving the measurement and extrapolation of scaling laws. A precise mathematical understanding of this “black box” of deep learning remains elusive: its behavior cannot be explained by statistics or optimization theory in isolation.
In a new paper: Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck, researchers from Microsoft, Harvard, and the University of Pennsylvania explore these algorithmic intricacies and tradeoffs through the lens of a single synthetic task: the finite-sample sparse parity learning problem. In this setting, the above complications are not only evident but provable: intuitively, due to the task’s computational hardness, a neural network needs a sufficient combination of resources (“data × model size × training time × luck”) to succeed. The research shows that standard algorithmic choices in deep learning give rise to a Pareto frontier, in which successful learning is “bought” with interchangeable combinations of these resources. The researchers also show that algorithmic improvements on this toy problem can transfer to the real world, improving the data-efficiency of neural networks on small tabular datasets.
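For readers who want a concrete feel for the task itself, here is a minimal sketch (not the paper’s code) of finite-sample sparse parity learning in PyTorch: the label is the XOR of k hidden coordinates of a random d-bit input. Every parameter below (d, k, sample size, width, optimizer settings) is an illustrative assumption, not a setting from the paper; varying them, and the random seed, is exactly what traces out the data × model size × training time × luck tradeoff.

```python
# Minimal sketch of the finite-sample sparse parity task (illustrative only).
import torch
import torch.nn as nn

torch.manual_seed(0)                   # "luck": outcomes vary across seeds

d, k, n_train = 50, 3, 2000            # input bits, parity size, data budget
idx = torch.randperm(d)[:k]            # hidden relevant coordinates

def sample(n):
    X = torch.randint(0, 2, (n, d)).float()
    y = X[:, idx].sum(dim=1) % 2       # label = parity of the k hidden bits
    return X * 2 - 1, y                # map {0,1} inputs to +/-1

X_tr, y_tr = sample(n_train)
X_te, y_te = sample(5000)

width = 256                            # model size: one resource on the frontier
model = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(5000):               # training time: another resource
    opt.zero_grad()
    loss = loss_fn(model(X_tr).squeeze(1), y_tr)
    loss.backward()
    opt.step()

with torch.no_grad():
    acc = ((model(X_te).squeeze(1) > 0).float() == y_te).float().mean()
print(f"test accuracy: {acc:.3f}")     # near 0.5 = failure, near 1.0 = success
```

With too little of any one resource, test accuracy tends to stay near chance; with a sufficient combination, it jumps toward 1.0, which is the frontier behavior the paper makes precise.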
NEW RESEARCH
Analyzing the Impact of Cardinality Estimation on Execution Plans in Microsoft SQL Server
Cardinality estimation (CE), the task of estimating the number of output rows for a SQL expression, is a challenging problem in query optimization. Due to the richness of SQL operators, the limited data statistics available during query optimization, and the need to keep optimization time and resource use low, today’s query optimizers typically rely on simplifying assumptions, such as independence between predicates. These assumptions can lead to large CE errors for query sub-expressions and, in turn, significantly sub-optimal execution plans.
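To see why such assumptions matter, here is a minimal sketch, in Python rather than SQL, of how the classic attribute-value-independence assumption produces CE errors on correlated columns. The table, columns, and correlation are hypothetical, and real optimizers (including SQL Server’s) work from histograms and other statistics rather than exact scans; the sketch only illustrates the mechanism.

```python
# Illustrative sketch: independence assumption vs. correlated predicates.
import random

random.seed(0)
n = 100_000
# Two perfectly correlated columns: city determines state.
rows = [("Seattle", "WA") if random.random() < 0.1 else ("Austin", "TX")
        for _ in range(n)]

# True cardinality of: WHERE city = 'Seattle' AND state = 'WA'
true_card = sum(1 for c, s in rows if c == "Seattle" and s == "WA")

# Estimate from per-column selectivities under the independence assumption:
sel_city = sum(1 for c, _ in rows if c == "Seattle") / n
sel_state = sum(1 for _, s in rows if s == "WA") / n
est_card = sel_city * sel_state * n    # multiplies selectivities together

print(f"true: {true_card}, estimated: {est_card:.0f}")
# Roughly 10,000 true rows vs. ~1,000 estimated: a 10x underestimate from a
# single predicate pair. Such errors compound across join sub-expressions.
```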
Prior studies evaluated the impact of CE on plan quality using sets of Select-Project-Join queries on the PostgreSQL DBMS. In a recent paper: Analyzing the Impact of Cardinality Estimation on Execution Plans in Microsoft SQL Server, researchers from Microsoft broaden the scope of the empirical study in significant ways. They conduct their study using Microsoft SQL Server, a widely used database management system (DBMS) with a state-of-the-art query optimizer. Their evaluation quantifies the importance of accurate CE using: (a) complex SQL queries involving aggregation and nested sub-queries, which are common in data analytics; (b) both row-store and column-store physical designs; and (c) common query runtime techniques, such as bitmap filtering, that have the potential to mask the impact of errors in CE.
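For intuition on item (c), the following is a minimal sketch of the general idea behind bitmap filtering in a hash join: a compact bitmap built from the join’s build side prunes probe-side rows early, so even a plan chosen under inaccurate CE processes far fewer rows. This is illustrative only, not SQL Server’s implementation, and the tables are hypothetical.

```python
# Illustrative sketch of bitmap filtering in a hash join.
def hash_join_with_bitmap(build_keys, probe_rows, nbits=1 << 16):
    # Build phase: a hash table plus a compact bitmap of seen key hashes.
    table = set(build_keys)
    bitmap = bytearray(nbits // 8)
    for k in build_keys:
        h = hash(k) % nbits
        bitmap[h // 8] |= 1 << (h % 8)

    # Probe phase: the bitmap cheaply rejects most non-matching rows
    # before the more expensive hash-table lookup.
    out, probed = [], 0
    for key, payload in probe_rows:
        h = hash(key) % nbits
        if not bitmap[h // 8] & (1 << (h % 8)):
            continue                    # filtered: never reaches the join
        probed += 1
        if key in table:
            out.append((key, payload))
    return out, probed

build = range(100)                              # small dimension table
probe = [(k, None) for k in range(100_000)]     # large fact table
matches, probed = hash_join_with_bitmap(build, probe)
print(len(matches), probed)  # only a few hundred of 100,000 rows pass the
                             # bitmap (hash collisions included); 100 match
```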
Findings from this study that are relevant to researchers and engineers of database systems include: (i) the impact of accurate CE on plan quality is significant on both row-store and column-store physical designs; (ii) bitmap filtering noticeably masks the impact of inaccurate CE; and (iii) for most queries, accurate CE for a small subset of the query’s sub-expressions is sufficient to ensure that plan quality is not degraded relative to using accurate CE for all sub-expressions.