{"id":153616,"date":"2006-01-01T00:00:00","date_gmt":"2006-01-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/object-class-recognition-at-a-glance\/"},"modified":"2018-10-16T20:14:25","modified_gmt":"2018-10-17T03:14:25","slug":"object-class-recognition-at-a-glance","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/object-class-recognition-at-a-glance\/","title":{"rendered":"Object Class Recognition at a Glance"},"content":{"rendered":"

This video shows our real-time object class recognition system at work. Object class recognition is a very challenging problem. The dif\ufb01culty lies in capturing the variability of appearance and shape of different objects belonging to the same class, while avoiding confusing objects from different classes. However, state of the art algorithms such as [2] are capable of delivering high classi\ufb01cation accuracy at interactive speed when dealing with a limited number of classes (around ten). Following the texton-based modeling approach in [2] we have developed an application for real-time segmentation and recognition of objects placed on a table top (\ufb01gure 1). The system comprises two steps: object segmentation and classi\ufb01cation. First, each object region is separated from the table top. This happens by running a patch-based classi\ufb01er which discriminates between the class \u201ctable\u201d and everything else. This technique is very different from more conventional background subtraction and is robust with respect to shadows, light changes and camera shake or motion. Second, once all the non-table connected regions have been extracted, they are classi\ufb01ed as belonging to one of \ufb01fteen object classes using the same discriminative technique. Each classi\ufb01er is a random forests discriminative model [1], [3] using pixel difference features. The classi\ufb01er is learned similarly to [2] to achieve maximum generalisation with high ef\ufb01ciency and is designed to be invariant both to rotation and to small changes in scale. Our features are computed on both the RGB image (thus providing information about appearance) and on the binary segmentation mask (thus capturing information about object shape). In the learnt random decision trees each node is hence associated to either appearance or shape. Figure 2 illustrates such an example. The use of shape features is a key component of the recognition classi\ufb01er since the shape information provided signi\ufb01cantly boosts the accuracy (by more than 10%). Our algorithm runs on 320 \u00d7 240 images at up to 20 frames per second, with an overall accuracy of around 90%. Training our discriminative class models for 15 classes from 600 training images takes only about ten minutes.<\/p>\n","protected":false},"excerpt":{"rendered":"

This video shows our real-time object class recognition system at work. Object class recognition is a very challenging problem. The dif\ufb01culty lies in capturing the variability of appearance and shape of different objects belonging to the same class, while avoiding confusing objects from different classes. However, state of the art algorithms such as [2] are […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13556],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-153616","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us"],"msr_publishername":"","msr_edition":"Proc. Conf. Computer Vision and Pattern Recognition (CVPR) -- Video Track","msr_affiliation":"","msr_published_date":"2006-01-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"209197","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"Criminisi_cvpr2005_video.zip","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/Criminisi_cvpr2005_video.zip","id":209197,"label_id":0},{"type":"file","title":"Criminisi_cvpr2006_video.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/Criminisi_cvpr2006_video.pdf","id":209196,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":209197,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/Criminisi_cvpr2005_video.zip"},{"id":209196,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/Criminisi_cvpr2006_video.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"jwinn","user_id":32457,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=jwinn"},{"type":"user_nicename","value":"antcrim","user_id":31055,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=antcrim"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[169717],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":169717,"post_title":"Image Understanding","post_name":"image-understanding","post_type":"msr-project","post_date":"2008-10-07 05:23:23","post_modified":"2023-05-15 09:56:35","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/image-understanding\/","post_excerpt":"At Microsoft Research in Cambridge we are developing new machine vision algorithms for automatic recognition and segmentation of many different object categories. We are interested in both the supervised and unsupervised scenarios.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/169717"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/153616"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/153616\/revisions"}],"predecessor-version":[{"id":524639,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/153616\/revisions\/524639"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=153616"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=153616"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=153616"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=153616"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=153616"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=153616"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=153616"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=153616"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=153616"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=153616"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=153616"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=153616"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=153616"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=153616"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=153616"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=153616"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}