EVENT
Register now for Research Forum on September 3
Discover what’s next in the world of AI at Microsoft Research Forum, an event series that explores recent research advances, bold new ideas, and important discussions with the global research community.
In Episode 4, you’ll learn about the latest multimodal AI models, advanced benchmarks for AI evaluation and model self-improvement, and an entirely new kind of computer for AI inference and hard optimization. Discover how these research breakthroughs and more can help advance everything from weather prediction to materials design.
Your one-time registration includes access to our live chat with researchers on the event day and additional resources to dive into the research.
Episode 4 will air Tuesday, September 3 at 9:00 AM Pacific Time.
Spotlight: Blog post
Eureka: Evaluating and understanding progress in AI
How can we rigorously evaluate and understand state-of-the-art progress in AI? Eureka is an open-source framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. Learn more about the extended findings.
NEW RESEARCH
Towards Effective AI Support for Developers: A Survey of Desires and Concerns
Talking to customers provides important insights into their challenges as well as what they love. This helps identify innovative and creative ways of solving problems (without creating new ones) and guards against ruining workflows that customers actually like. However, many AI-related development tools are currently being built without consulting developers.
In a recent paper: Towards Effective AI Support for Developers: A Survey of Desires and Concerns, researchers from Microsoft explore developers’ perspectives on AI integration in their workflows. The study, a survey of 791 Microsoft developers, reveals developers’ top desires for AI assistance along with their major concerns. The findings help the researchers identify key areas where AI can enhance productivity and ways to address developers’ concerns. They also provide actionable insights for product teams and leaders to create AI tools that truly support developers’ needs.
NEW RESEARCH
SuperBench: Improving Cloud AI Infrastructure Reliability with Proactive Validation
Cloud service providers have used geographical redundancies in hardware to ensure availability of their cloud infrastructure for years. However, for AI workloads, these redundancies can inadvertently lead to hidden degradation, also known as “gray failure.” This can reduce end-to-end performance and conceal performance issues, which complicates root cause analysis for failures and regressions.
In a recent paper: SuperBench: Improving Cloud AI Infrastructure Reliability with Proactive Validation, Microsoft researchers and Azure cloud engineers introduce a proactive validation system specifically for AI infrastructure that mitigates hidden degradation caused by hardware redundancies. The paper, which won a “best paper” award at USENIX ATC, outlines SuperBench’s comprehensive benchmark suite, which can evaluate individual hardware components and represents most real AI workloads. The system includes a validator, which learns benchmark criteria to clearly pinpoint defective components, and a selector, which balances validation time against issue-related penalties to choose the optimal time to run validation with a tailored subset of benchmarks. Testbed evaluation and simulation show SuperBench can increase the mean time between incidents by up to 22.61x. SuperBench has been successfully deployed in Azure production, validating hundreds of thousands of GPUs over the last two years.
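To make the selector's trade-off concrete, here is a minimal toy sketch in Python. It is not SuperBench's actual algorithm: the benchmark names, the probabilities, the penalty values, and the simple rule "run a benchmark only if its expected avoided penalty exceeds its run time" are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    # All fields are hypothetical illustration values, not SuperBench data.
    name: str
    minutes: float           # time the benchmark takes to run
    defect_prob: float       # assumed chance it catches a defect on this node
    incident_penalty: float  # assumed cost (in minutes) if the defect reaches a job


def select_benchmarks(candidates):
    """Keep benchmarks whose expected avoided penalty exceeds their run time."""
    return [
        b for b in candidates
        if b.defect_prob * b.incident_penalty > b.minutes
    ]


candidates = [
    Benchmark("gpu_gemm", minutes=5.0, defect_prob=0.02, incident_penalty=600.0),
    Benchmark("nccl_allreduce", minutes=10.0, defect_prob=0.01, incident_penalty=600.0),
    Benchmark("disk_io", minutes=3.0, defect_prob=0.001, incident_penalty=600.0),
]

selected = select_benchmarks(candidates)
print([b.name for b in selected])
```

With these made-up numbers, only the first benchmark is worth its run time; the real system, as the paper describes, also learns its pass/fail criteria and chooses when to validate.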
NEW RESEARCH
Virtual Voices: Exploring Individual Differences in Written and Verbal Participation in Meetings
A key component of team performance is participation among group members. Workplace meetings provide a common stage for such participation. But with the shift to remote work, many meetings are conducted virtually. In such meetings, chat offers an alternate avenue of participation, in which attendees can synchronously contribute to the conversation through writing.
In a recent paper: Virtual Voices: Exploring Individual Differences in Written and Verbal Participation in Meetings, researchers from Microsoft and external colleagues explore factors influencing participation in virtual meetings, drawing on individual differences (status characteristics theory), psychological safety perceptions, and group communication. Results of the paper, published in the Journal of Vocational Behavior, reveal nuances by self-identified gender and job level. Women engaged more in chat, while men verbally participated more frequently, as measured using meeting telemetry. Further, men highest in job level verbally contributed the most in virtual meetings, whereas women highest in job level used the chat the most frequently. Regarding the type of chats sent, women used emoji reactions more often than men, and men sent more attachments than women. Additionally, psychological safety moderated the relationship between job level and overall chat participation, such that employees low in job level with high perceptions of psychological safety sent more chats than their counterparts. This study provides insights into communication patterns and the impact of psychological safety on participation in technology-mediated spaces.