Publication Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead Vidhisha Balachandran, Jingya Chen, Lingjiao Chen, Shivam Garg, Neel Joshi, Yash Lara, John Langford, Besmira Nushi, Vibhav Vineet, Yue Wu, Safoora Yousefi MSR-TR-2025-16 | March 2025 Published by Microsoft
Publication Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents Vardaan Pahuja, Yadong Lu, Corby Rosset, Boyu Gou, Arindam Mitra, Spencer Whitehead, Yu Su, Ahmed Awadallah February 2025
Publication Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs Ching-An Cheng, Allen Nie, Adith Swaminathan NeurIPS 2024 | December 2024
Publication Phi-4 Technical Report Marah I Abdin, Jyoti Aneja, Harkirat Behl, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Michael Harrison, Russell J. Hewett, Mojan Javaheripi, Piero Kauffmann, James R. Lee, Yin Tat Lee, Yuanzhi Li, Weishung Liu, Caio CT Mendes, Anh Nguyen, Eric Price, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Xin Wang, Rachel Ward, Yue Wu, Dingli Yu, Cyril Zhang, Yi Zhang MSR-TR-2024-57 | December 2024 Published by Microsoft
Publication Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models Vibhav Vineet, Xin Wang, Neel Joshi 2024 Neural Information Processing Systems | December 2024
Publication Challenges in Human-Agent Communication Gagan Bansal, Jennifer Wortman Vaughan, Saleema Amershi, Eric Horvitz, Adam Fourney, Hussein Mozannar, Victor Dibia, Daniel S. Weld MSR-TR-2024-53 | December 2024 Published by Microsoft
Publication Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks Adam Fourney, Gagan Bansal, Hussein Mozannar, Cheng Tan, Eduardo Salinas, Erkang (Eric) Zhu, Friederike Niedtner, Grace Proebsting, Griffin Bassman, Jack Gerrits, Jacob Alber, Peter Chang, Ricky Loynd, Robert West, Victor Dibia, Ahmed Awadallah, Ece Kamar, Rafah Hosn, Saleema Amershi MSR-TR-2024-47 | November 2024 Published by Microsoft
Publication The Belief State Transformer Edward S. Hu, Kwangjun Ahn, Qinghua Liu, Haoran Xu, Manan Tomar, Ada Langford, Dinesh Jayaraman, Alex Lamb, John Langford ICLR 2025 | October 2024
Publication Maia-2: A Unified Model for Human-AI Alignment in Chess Zhenwei Tang, Difan Jiao, Reid McIlroy-Young, Jon Kleinberg, Siddhartha Sen, Ashton Anderson NeurIPS 2024 | September 2024
Publication EUREKA: Evaluating and Understanding Large Foundation Models Vidhisha Balachandran, Jingya Chen, Neel Joshi, Besmira Nushi, Hamid Palangi, Eduardo Salinas, Vibhav Vineet, James Woffinden-Luey, Safoora Yousefi MSR-TR-2024-33 | September 2024 Published by Microsoft