{"id":1155843,"date":"2025-11-24T10:00:00","date_gmt":"2025-11-24T18:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/fara-7b-an-efficient-agentic-model-for-computer-use\/"},"modified":"2025-12-11T07:31:44","modified_gmt":"2025-12-11T15:31:44","slug":"fara-7b-an-efficient-agentic-model-for-computer-use","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/fara-7b-an-efficient-agentic-model-for-computer-use\/","title":{"rendered":"Fara-7B:\u00a0An Efficient Agentic Model for\u00a0Computer Use"},"content":{"rendered":"\n

Pushing the frontiers of computer-use agents with an open-weight, ultra-compact model, optimized for real-world web tasks<\/h3>\n\n\n\n
\"Three<\/figure>\n\n\n\n

In 2024, Microsoft introduced small language models (SLMs) to customers, starting with the release of Phi models on Microsoft Foundry, as well as deploying Phi Silica on Copilot+ PCs powered by Windows 11. Today, we are pleased to announce Fara-7B<\/strong>, our first agentic SLM<\/strong> designed specifically for computer use.<\/p>\n\n\n\n

Unlike traditional chat models that generate text-based responses, Computer Use Agent (CUA) models like Fara-7B leverage computer interfaces, such as a mouse and keyboard, to complete tasks on behalf of users. With only 7 billion parameters, Fara-7B achieves state-of-the-art performance within its size class and is competitive with larger, more resource-intensive agentic systems that depend on prompting multiple large models. Fara-7B\u2019s small size now makes it possible to run CUA models directly on devices. This results in reduced latency and improved privacy, as user data remains local.<\/p>\n\n\n\n

Fara-7B is an experimental release, designed to invite hands-on exploration and feedback from the community. Users can build and test agentic experiences beyond pure research\u2014automating everyday web tasks like filling out forms, searching for information, booking travel, or managing accounts. We recommend running Fara-7B in a sandboxed environment, monitoring its execution, and avoiding sensitive data or high-risk domains. Responsible use is essential as the model continues to evolve.<\/p>\n\n\n\n

Fara-7B operates by visually perceiving a webpage and taking actions such as scrolling, typing, and clicking on directly predicted coordinates. It does not rely on separate models to parse the screen, nor on any additional information like accessibility trees, and thus uses the same modalities as humans to interact with the computer. To train Fara-7B, we developed a novel synthetic data generation pipeline for multi-step web tasks, building on our prior work (AgentInstruct<\/a>). This data generation pipeline draws from real web pages and tasks sourced from human users.<\/p>\n\n\n\n

Video 1: A demo of a shopping scenario with Fara-7B through Magentic-UI. Fara-7B is asked to purchase an Xbox SpongeBob controller. It completes the task, stopping at every Critical Point to get input and approval from the user before proceeding.<\/figcaption><\/figure>\n\n\n\n
Video 2: A demo of Fara-7B finding relevant information online and summarizing it through Magentic-UI. We ask Fara-7B to find and summarize the latest three issues in the Microsoft\/Magentic-UI repository on GitHub.<\/figcaption><\/figure>\n\n\n\n