News & features
Eureka: Evaluating and understanding progress in AI
Vidhisha Balachandran, Jingya Chen, Neel Joshi, Besmira Nushi, Hamid Palangi, Eduardo Salinas, Vibhav Vineet, James Woffinden-Luey, and Safoora Yousefi
How can we rigorously evaluate and understand state-of-the-art progress in AI? Eureka is an open-source framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. Learn more about the extended findings.
Direct Nash Optimization: Teaching language models to self-improve with general preferences
This talk discusses teaching language models to self-improve using a preference oracle like GPT-4, framing it as a two-player game to find an optimal policy at a Nash equilibrium, and achieving state-of-the-art win rates against GPT-4 Turbo on benchmarks such…
Tracing the path to self-adapting AI agents
Ching-An Cheng, Adith Swaminathan, and Allen Nie
Introducing Trace, Microsoft and Stanford University’s novel AI optimization framework, now available as a Python library. Trace adapts dynamically and optimizes a wide range of applications from language models to robot control.
Abstracts: July 18, 2024
Gretchen Huizinga and Arindam Mitra
Senior Researcher Arindam Mitra introduces AgentInstruct. Using raw data sources, the automated multi-agent framework can create diverse, high-quality synthetic data at scale for the post-training of small and large language models.
In the news | Microsoft News Center
Why AI sometimes gets it wrong — and big strides to address it
Around the time GPT-4 was making headlines for acing standardized tests, Microsoft researchers and collaborators were putting other AI models through a different type of test — one designed to make the models fabricate information.
Introducing AutoGen Studio: A low-code interface for building multi-agent workflows
Victor Dibia, Gagan Bansal, Jingya Chen, Suff Syed, Adam Fourney, Erkang (Eric) Zhu, Chi Wang, and Saleema Amershi
AutoGen Studio, built on Microsoft’s flexible open-source AutoGen framework for orchestrating AI agents, provides an intuitive, user-friendly interface that enables developers to rapidly build, test, customize, and share multi-agent AI solutions with little or no coding.
In the news | WIRED
Chatbot teamwork makes the AI dream work
I’ve been playing this week with AutoGen, an open source software framework for AI agent collaboration developed by researchers at Microsoft and academics at Pennsylvania State University, the University of Washington, and Xidian University in China. The software taps OpenAI’s…
In the news | WIRED
Many Chatbots Make Light Work 🤝🤖 🎉
Turning to a friend or coworker can make tricky problems easier to tackle. Now it looks like having AI chatbots team up with each other can make them more effective. I’ve been playing this week with AutoGen, an open source…
Microsoft Research Forum Episode 3: Globally inclusive and equitable AI, new use cases for AI, and more
“We’re at the very early stage of generative AI and the impacts it will have on work. This is a fast-moving field, and there’s an immense opportunity to take control of the agenda and build truly globally equitable AI systems”,…