News & features
Eureka: Evaluating and understanding progress in AI
| Vidhisha Balachandran, Jingya Chen, Neel Joshi, Besmira Nushi, Hamid Palangi, Eduardo Salinas, Vibhav Vineet, James Woffinden-Luey, and Safoora Yousefi
How can we rigorously evaluate and understand state-of-the-art progress in AI? Eureka is an open-source framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.
Understanding social biases through the text-to-image generation lens
| Ranjita Naik and Besmira Nushi
Gender, race, and age disparities in AI-generated images persist. This AIES 2023 study on text-to-image models shows that even basic prompts can lead to underrepresentation, which calls for responsible bias mitigation strategies.
Creating better AI partners: A case for backward compatibility
| Besmira Nushi and Ece Kamar
Artificial intelligence technologies hold great promise as partners in the real world. They’re in the early stages of helping doctors administer care to their patients and lenders determine the risk associated with loan applications, among other examples. But what happens…
In the news | Microsoft Research Blog
Creating better AI partners: A case for backward compatibility
Traditional metrics of an AI component's performance are not sufficient when people use the AI technology to accomplish tasks.