{"id":1155826,"date":"2025-11-16T20:21:04","date_gmt":"2025-11-17T04:21:04","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=1155826"},"modified":"2025-11-16T20:21:06","modified_gmt":"2025-11-17T04:21:06","slug":"ui-evol-compute-use-agents-act-on-knowledge","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/ui-evol-compute-use-agents-act-on-knowledge\/","title":{"rendered":"UI-Evol: Computer-use Agents Act on Knowledge"},"content":{"rendered":"\n
Computer-use agents are AI systems that autonomously navigate and interact with software applications through graphical user interfaces (GUIs), and they are emerging as a new capability in artificial intelligence. By working with the same visual interfaces that people use, they can perform complex tasks on behalf of users, from filling out forms to managing workflows.<\/p>\n\n\n\n
Despite their promise, these agents perform poorly in practice. They typically draw on external knowledge (information retrieved from the web that describes how to navigate the interfaces in question) and use it to interpret what\u2019s on the screen and adapt to different environments. However, they often fail to translate this knowledge into successful action, a problem researchers call the \u201cknowledge\u2013action gap.\u201d<\/p>\n\n\n\n
A recent study shows that even when the instructions are 90% correct, agents perform tasks successfully only 41% of the time. This disconnect between having the needed information and effectively applying it, illustrated at the top of Figure 1, can lead to a frustrating user experience.<\/p>\n\n\n\n