{"id":162558,"date":"2012-01-01T00:00:00","date_gmt":"2012-01-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/conditional-regression-forests-for-human-pose-estimation\/"},"modified":"2018-10-16T20:29:21","modified_gmt":"2018-10-17T03:29:21","slug":"conditional-regression-forests-for-human-pose-estimation","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/conditional-regression-forests-for-human-pose-estimation\/","title":{"rendered":"Conditional Regression Forests for Human Pose Estimation"},"content":{"rendered":"<div class=\"asset-content\">\n<p>Random forests have been successfully applied to various high level computer vision tasks such as human pose estimation and object segmentation. These models are extremely efficient but work under the assumption that the output variables (such as body part locations or pixel labels) are independent. In this paper, we present a conditional regression forest model for human pose estimation that incorporates dependency relationships between output variables through a global latent variable while still maintaining a low computational cost. We show that the incorporation of a global latent variable encoding torso orientation, or human height, etc., can dramatically increase the accuracy of body joint location prediction. Our model also allows efficient and seamless incorporation of prior knowledge about the problem instance such as the height or orientation of the human subject which can be available from the problem context or via a temporal model. We show that our method significantly outperforms state-of-the-art methods for pose estimation from depth images. The conditional regression model proposed in the paper is general and can be applied to other problems where random forests are used.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Random forests have been successfully applied to various high level computer vision tasks such as human pose estimation and object segmentation. These models are extremely efficient but work under the assumption that the output variables (such as body part locations or pixel labels) are independent. In this paper, we present a conditional regression forest model [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13562],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-162558","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-computer-vision","msr-locale-en_us"],"msr_publishername":"IEEE","msr_edition":"Proc. CVPR","msr_affiliation":"","msr_published_date":"2012-01-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"Proc. CVPR","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"206230","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"skt_cvpr2012_tr.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/skt_cvpr2012_tr.pdf","id":206230,"label_id":0},{"type":"file","title":"skt_cvpr2012.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/skt_cvpr2012.pdf","id":206229,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":206230,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/skt_cvpr2012_tr.pdf"},{"id":206229,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/skt_cvpr2012.pdf"}],"msr-author-ordering":[{"type":"text","value":"Min Sun","user_id":0,"rest_url":false},{"type":"user_nicename","value":"pkohli","user_id":33269,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=pkohli"},{"type":"user_nicename","value":"jamiesho","user_id":32162,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=jamiesho"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[171004,170652],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":171004,"post_title":"Decision Forests","post_name":"decision-forests","post_type":"msr-project","post_date":"2012-07-25 01:35:22","post_modified":"2017-06-06 12:09:49","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/decision-forests\/","post_excerpt":"Decision Forests for Computer Vision and Medical Image Analysis A. Criminisi and J. Shotton Springer 2013, XIX, 368 p. 143 illus., 136 in color. ISBN 978-1-4471-4929-3 \u00a0","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171004"}]}},{"ID":170652,"post_title":"Human Pose Estimation for Kinect","post_name":"human-pose-estimation-for-kinect","post_type":"msr-project","post_date":"2011-01-25 09:18:30","post_modified":"2022-09-07 10:53:34","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/human-pose-estimation-for-kinect\/","post_excerpt":"Kinect for Xbox 360 and Windows makes you the controller by fusing 3D imaging hardware with markerless human-motion capture software. Our group investigates such software. Mixing computer vision, graphics, and machine learning techniques, we look at how to build algorithms that can learn to recognize human poses quickly and reliably. Images Traditional RGB image Image from new depth sensing camera Body parts inferred by our recognition algorithm 3D body part position proposals Related Press Binary&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/170652"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/162558","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/162558\/revisions"}],"predecessor-version":[{"id":528122,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/162558\/revisions\/528122"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=162558"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=162558"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=162558"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=162558"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=162558"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=162558"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=162558"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=162558"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=162558"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=162558"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=162558"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=162558"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=162558"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=162558"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=162558"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=162558"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}