{"id":1146050,"date":"2025-07-30T20:49:30","date_gmt":"2025-07-31T03:49:30","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=1146050"},"modified":"2025-07-30T20:50:34","modified_gmt":"2025-07-31T03:50:34","slug":"ui-e2i-synth","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/ui-e2i-synth\/","title":{"rendered":"UI-E2I-Synth: Building a More Realistic and Challenging Open-Source Grounding Benchmark for Computer Use Agents"},"content":{"rendered":"\n

Editor's note: In recent years, large models have been applied ever more deeply to multimodal interaction, placing new demands on the \u201cunderstanding\u201d side of human-computer interaction. For Computer Use Agents in particular, a key problem is accurately interpreting a user's natural-language instruction and mapping it to the corresponding element in a complex GUI (the \u201cGUI Grounding\u201d task). Existing public datasets, however, struggle to cover the full range of difficulties found in real usage. In response, Microsoft Research Asia proposed UI-E2I-Synth, which both builds a large-scale, high-quality synthetic dataset and introduces UI-I2E-Bench, a more challenging and realistic benchmark, systematically addressing three key problems in the current task. The paper has been accepted to ACL 2025.<\/p>\n\n\n\n


\n\n\n\n

As multimodal agents are applied to increasingly complex human-computer interaction tasks, the ability of large models to understand and operate graphical user interfaces (GUIs) is becoming critical. GUI Grounding, that is, accurately locating the interface element in a screenshot that corresponds to a user's natural-language instruction, has become one of the core challenges in building Computer Use Agents.<\/p>\n\n\n\n

However, existing public training datasets are limited in scale, and manually annotating high-quality instructions is expensive, which has held back progress on the task. Current research faces three key problems that remain underexplored:<\/p>\n\n\n\n

1. Distorted element-to-screen ratio<\/strong>: In existing benchmarks such as ScreenSpot, interface elements are noticeably large relative to the screen, unlike the high-resolution, small-element interfaces of real usage. This may lead to model performance being overestimated.<\/p>\n\n\n\n

2. Imbalanced element-type distribution<\/strong>: Different types of GUI elements (text buttons, checkboxes, and so on) have different visual styles and interaction patterns. Existing benchmarks, however, are dominated by text and icon elements and offer little coverage of other types.<\/p>\n\n\n\n

3. Weak understanding of implicit instructions<\/strong>: Real users tend to phrase instructions based on an intuitive sense of an element's function or position rather than quoting the text visible on screen. Such \u201cimplicit instructions\u201d place higher demands on the understanding and reasoning abilities of vision-language models (VLMs).<\/p>\n\n\n\n

To address these problems, the researchers proposed UI-E2I-Synth, an end-to-end method for large-scale data synthesis, and used a semi-automatic process to build a new benchmark, UI-I2E-Bench. The paper has been accepted to ACL 2025.<\/p>\n\n\n\n

UI-E2I-Synth: Advancing GUI Grounding with Large-Scale Instruction Synthesis<\/strong><\/p>\n\n\n\n

Paper: https:\/\/arxiv.org\/abs\/2504.11257<\/span><\/a><\/p>\n\n\n\n

GitHub: https:\/\/github.com\/microsoft\/FIVE-UI-Evol\/tree\/main<\/span><\/a><\/p>\n\n\n\n

UI-I2E-Bench leaderboard: https:\/\/microsoft.github.io\/FIVE-UI-Evol<\/span><\/a><\/p>\n\n\n\n

Figure 1: Compared with the existing ScreenSpot benchmark, UI-I2E-Bench has element-to-screen ratios closer to those seen on a typical 1080p display, is annotated by how implicit each instruction is, and covers a richer, more balanced set of element types.<\/figcaption><\/figure>\n\n\n\n

Building the full chain from data to model: synthesis method, benchmark, and VLM performance gains<\/h2>\n\n\n\n

At its core, UI-E2I-Synth uses a large language model (such as GPT-4o) rather than human annotators to synthesize realistic instructions of varying complexity. The pipeline follows a \u201cdivide and conquer\u201d principle and runs in three steps:<\/p>\n\n\n\n

The first step is raw data collection and parsing<\/strong>. Screenshot-metadata pairs are collected from multiple platforms, including web pages, Windows, and Android, and a heuristic parser extracts reliable attributes such as element type, content, and bounding box. This mitigates hallucination and ensures the structure is reliable and usable for downstream modeling. The second step is referring expression (RE) generation<\/strong>. Combining Set-of-Marks screenshots with the element attributes, GPT-4o generates both explicit and implicit referring expressions to increase instruction diversity. The final step is instruction synthesis<\/strong>. GPT-4o is used again to simulate user behavior and generate specific action parameters (such as a click or the text to type), which are combined with the referring expressions to synthesize the final realistic user instructions.<\/p>\n\n\n\n
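The three steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Element` fields, the prompt wording, and the `llm` callable (a stand-in for a model such as GPT-4o) are all hypothetical, and the Set-of-Marks screenshot input is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Element:
    # Attributes read from platform metadata rather than produced by a model
    elem_type: str
    content: str
    bbox: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def parse_metadata(raw_nodes: List[dict]) -> List[Element]:
    # Step 1 (hypothetical heuristic): keep only nodes with a non-empty box
    elements = []
    for node in raw_nodes:
        left, top, right, bottom = node['bbox']
        if right > left and bottom > top:
            elements.append(
                Element(node['type'], node.get('text', ''), (left, top, right, bottom))
            )
    return elements


def generate_referring_expressions(elem: Element, llm: Callable[[str], str]) -> Dict[str, str]:
    # Step 2: request one explicit and one implicit referring expression
    explicit = llm(f'Refer to the {elem.elem_type} by its visible text: {elem.content}')
    implicit = llm(
        f'Refer to the {elem.elem_type} by its function or position, without quoting its text'
    )
    return {'explicit': explicit, 'implicit': implicit}


def synthesize_instruction(elem: Element, referring_expression: str, llm: Callable[[str], str]) -> str:
    # Step 3: let the model propose an action, then join it with the expression
    action = llm(f'Propose a user action (such as click or type) for a {elem.elem_type}')
    return f'{action} {referring_expression}'
```

Passing the model in as a plain callable keeps the sketch testable with a stub function and makes no claim about any particular model API.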

Figure 2: The three stages of UI-E2I-Synth<\/figcaption><\/figure>\n\n\n\n

Building on the synthesized data, the researchers introduced a new benchmark, UI-I2E-Bench, through careful manual correction. The benchmark includes richer annotation dimensions, including element type, element-to-screen ratio, and degree of instruction implicitness, offering further insight for future GUI Grounding model development.<\/p>\n\n\n\n
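To make the element-to-screen ratio concrete, here is a minimal sketch. It assumes the ratio is the bounding-box area divided by the screenshot area. The post does not give the exact formula, so the paper's definition may differ, and the function name is hypothetical.

```python
from typing import Tuple


def element_to_screen_ratio(bbox: Tuple[int, int, int, int], screen_size: Tuple[int, int]) -> float:
    # Assumed definition: element area / screenshot area (both in pixels)
    left, top, right, bottom = bbox
    width, height = screen_size
    if width <= 0 or height <= 0:
        raise ValueError('screen_size must be positive')
    area = max(0, right - left) * max(0, bottom - top)
    return area / (width * height)


# The same 24x24 icon takes up a far smaller share of a 1920x1080 screenshot
# than of a 480x270 one, which is why low-resolution benchmarks look easier.
icon = (100, 100, 124, 124)
ratio_1080p = element_to_screen_ratio(icon, (1920, 1080))
ratio_small = element_to_screen_ratio(icon, (480, 270))
```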

Using 9.9 million instructions generated by UI-E2I-Synth, the researchers trained UI-I2E-VLM-4B and UI-I2E-VLM-7B. Evaluation shows that UI-I2E-VLM-7B achieves strong results on UI-I2E-Bench, ScreenSpot, and ScreenSpot-Pro while using less training data (only 72% of what OS-Atlas uses). UI-I2E-VLM shows a clear advantage on implicit instructions and small elements (low element-to-screen ratio), and improves recognition of long-tail element types such as icons and input fields. The study also finds that existing benchmarks such as ScreenSpot may overestimate model performance because of their lower difficulty.<\/p>\n\n\n\n

Table 1: With UI-E2I-Synth, the researchers synthesized high-quality training data and developed the UI-I2E-VLM model series, achieving leading GUI Grounding performance<\/figcaption><\/figure>\n\n\n\n

Using UI-I2E-Bench's fine-grained annotations, the researchers ran a further diagnostic analysis of model performance. By instruction type, the model improves markedly on implicit instructions, whereas earlier work underestimated instruction complexity. For example, the leading model OS-Atlas-7B trails by 12.1 percentage points on this dimension. OmniParser, which relies on GPT-4o, stands out on implicit instructions but is only average on explicit ones; its main bottleneck is locating tiny and long-tail elements. By element-to-screen ratio, the analysis shows that the smaller the element, the more sharply accuracy drops, which indicates that small elements and high-resolution images are essential to a benchmark. With richer training data and more input image tokens, UI-I2E-VLM does better at locating small elements. By element type, existing models show a clear weakness on long-tail categories such as \u201cicon\u201d and \u201cinput field\u201d. UI-E2I-Synth addresses this by balancing the element-type distribution, which improves the model's handling of these categories.<\/p>\n\n\n\n
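A per-dimension breakdown like the one described above can be sketched as follows. It assumes a prediction counts as correct when the predicted click point falls inside the ground-truth bounding box, a common convention for grounding benchmarks that the post does not spell out. The sample field names (`pred`, `bbox`, `instruction_type`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def point_in_bbox(point: Tuple[float, float], bbox: Tuple[float, float, float, float]) -> bool:
    # Assumed correctness rule: the predicted point lies inside the target box
    x, y = point
    left, top, right, bottom = bbox
    return left <= x <= right and top <= y <= bottom


def accuracy_by(samples: List[dict], key: str) -> Dict[str, float]:
    # Group samples by one annotation dimension and report accuracy per group
    hits = defaultdict(int)
    totals = defaultdict(int)
    for sample in samples:
        group = sample[key]
        totals[group] += 1
        if point_in_bbox(sample['pred'], sample['bbox']):
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


samples = [
    {'instruction_type': 'explicit', 'pred': (50, 20), 'bbox': (10, 10, 110, 40)},
    {'instruction_type': 'implicit', 'pred': (500, 500), 'bbox': (10, 10, 110, 40)},
    {'instruction_type': 'implicit', 'pred': (15, 15), 'bbox': (10, 10, 110, 40)},
]
breakdown = accuracy_by(samples, 'instruction_type')
```

The same function works for any annotated dimension, for example an element-type label or a bucketed element-to-screen ratio, by changing `key`.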

Table 2: Using the rich annotations in UI-I2E-Bench, the researchers carried out a finer-grained analysis of model results<\/figcaption><\/figure>\n\n\n\n

Reshaping evaluation standards for the future of intelligent interaction<\/h2>\n\n\n\n

Together, UI-E2I-Synth and UI-I2E-Bench not only address key problems in GUI Grounding training and evaluation, but also lay the groundwork for deploying future intelligent assistants in real GUI environments.<\/p>\n\n\n\n

By examining the relationship between instruction diversity and UI complexity, the researchers have opened new room for improving multimodal models and given the field a more challenging and more informative benchmark.<\/p>\n","protected":false},"excerpt":{"rendered":"

[…]<\/p>\n","protected":false},"author":34512,"featured_media":1146071,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-content-parent":1012650,"msr_hide_image_in_river":null,"footnotes":""},"research-area":[13556],"msr-locale":[268881],"msr-post-option":[],"class_list":["post-1146050","msr-blog-post","type-msr-blog-post","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-zh_cn"],"msr_assoc_parent":{"id":1012650,"type":"lab"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/1146050","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-blog-post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/34512"}],"version-history":[{"count":7,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/1146050\/revisions"}],"predecessor-version":[{"id":1146072,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/1146050\/revisions\/1146072"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1146071"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1146050"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1146050"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1146050"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/
en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1146050"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}