{"id":144711,"date":"2020-02-25T11:36:42","date_gmt":"2000-03-27T00:12:25","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/group\/internet-media\/"},"modified":"2022-07-05T20:40:04","modified_gmt":"2022-07-06T03:40:04","slug":"internet-media","status":"publish","type":"msr-group","link":"https:\/\/www.microsoft.com\/en-us\/research\/group\/internet-media\/","title":{"rendered":"Intelligent Multimedia Group"},"content":{"rendered":"
The Intelligent Multimedia (IM) group aims to build seamless yet efficient multimedia applications and services through breakthroughs in fundamental theory and innovations in algorithms and systems. We address the problems of intelligent multimedia content sensing, processing, analysis, and services, as well as the generic scalability issues of multimedia computing systems. Our current research focuses on video analytics to support intelligent cloud and intelligent edge media services. Research interests include, but are not limited to, object detection, tracking, semantic segmentation, human pose estimation, person re-ID, action recognition, depth estimation, SLAM, scene understanding, and multimodality analysis.<\/p>\n<\/div>\n<\/div>\n
Video is the biggest form of big data and carries an enormous amount of information. We leverage computer vision and deep learning to develop both cloud-based and edge-based intelligence engines that turn raw video data into insights for a range of applications and services. Target application scenarios include video augmented reality, smart home surveillance, business intelligence (retail stores, offices), public security, and video storytelling and sharing. We take a human-centric approach, with significant effort focused on understanding humans, their attributes, and their behaviors. Our research has contributed to a number of video APIs offered in Microsoft Cognitive Services (https:\/\/www.microsoft.com\/cognitive-services<\/span><\/a>), Azure Media Analytics Services, Windows Machine Learning, Office Media (Stream\/Teams), and Dynamics\/Connected Store.<\/p>\n – Video API R&D: three technologies (intelligent motion detection, face detection\/tracking, face redaction) deployed in Microsoft Cognitive Services and Azure Media Services (2016) – Human pose estimation (2019.5) and object tracking (2019.10) technologies developed and released as vision skills on the Windows Machine Learning platform – Speech denoising technologies deployed in Microsoft Stream 1.0 (GA, 2020.6) and 2.0 (Internal Preview, 2020.12) – Multi-object tracking (FairMOT), multi-view 3D pose estimation (VoxelPose), and person re-ID technologies shipped in the Microsoft Dynamics\/Connected Store product (2020 and ongoing) – Screen content understanding (element detection\/screen tree) technologies shipped in Microsoft\u2019s mobile robotic process automation (RPA) product (2020 and ongoing)<\/p>\n<\/div>\n
\n\u2022 Announcing: Motion detection for Azure Media Analytics<\/span><\/a> (2016)
\n\u2022 Announcing face and emotion detection for Azure Media Analytics | Azure Blog and Updates | Microsoft Azure<\/span><\/a> (2016)
\n\u2022 Announcing Face Redaction for Azure Media Analytics | Azure Blog and Updates | Microsoft Azure<\/span><\/a> (2016)
\n\u2022 Redact faces with Azure Media Analytics | Microsoft Docs<\/span><\/a><\/p>\n
\n\u2022 Microsoft Releases Windows Vision Skills Preview for Easy Access to Computer Vision<\/span><\/a>
\n\u2022 NuGet Gallery | Microsoft.AI.Skills.Vision.ObjectTrackerPreview 0.0.0.3<\/span><\/a><\/p>\n
\n\u2022 Extracting Ultra-Clear Speech from Noisy Video: Speech Enhancement Model PHASEN Joins Microsoft Video Services<\/span><\/a><\/p>\n
\n\u2022 From FairMOT to VoxelPose: A Look at Microsoft\u2019s Latest Human-Centric Visual Understanding Results<\/span><\/a><\/p>\n
\n Cross View Fusion for 3D Human Pose Estimation<\/span><\/a>
\n Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach<\/span><\/a><\/div>\n
\n SPM-Tracker<\/span><\/a> Siamese-network-based tracker<\/span><\/a> (a comprehensive PyTorch-based toolbox that supports a series of Siamese-network-based tracking methods such as SiamFC \/ SiamRPN \/ SPM)
\n A Simple Baseline for One-Shot Multi-Object Tracking<\/span><\/a> (2.2K stars)<\/div>\n