{"id":1048506,"date":"2024-07-15T09:00:00","date_gmt":"2024-07-15T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1048506"},"modified":"2024-07-08T07:37:39","modified_gmt":"2024-07-08T14:37:39","slug":"rubicon-evaluating-conversations-between-humans-and-ai-systems","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/rubicon-evaluating-conversations-between-humans-and-ai-systems\/","title":{"rendered":"RUBICON: Evaluating conversations between humans and AI systems"},"content":{"rendered":"\n

This paper has been accepted at the 1st ACM International Conference on AI-powered Software (AIware 2024), co-located with FSE 2024. AIware is the premier international forum on AI-powered software.<\/p>\n\n\n\n

\"Rubicon<\/figure>\n\n\n\n

Generative AI has redefined the landscape of AI assistants in software development, with innovations like GitHub Copilot providing real-time, chat-based programming support. As these tools increase in sophistication and domain specialization, assessing their impact on user interactions becomes more challenging. Developers frequently question whether modifications to their AI assistants genuinely improve the user experience, as indicated in a recent paper.<\/p>\n\n\n\n

\n\t