{"id":1131261,"date":"2025-02-22T17:41:31","date_gmt":"2025-02-23T01:41:31","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=1131261"},"modified":"2025-02-22T17:41:32","modified_gmt":"2025-02-23T01:41:32","slug":"deeper-evaluation-of-a-single-cell-foundation-model","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/deeper-evaluation-of-a-single-cell-foundation-model\/","title":{"rendered":"Deeper evaluation of a single-cell foundation model"},"content":{"rendered":"
Large-scale foundation models, which are pre-trained on massive, unlabelled datasets and subsequently fine-tuned on specific tasks, have recently achieved unparalleled success on a wide array of applications, including in healthcare and biology (opens in new tab)<\/span><\/a>. The success of these models has showcased the power of leveraging generalizable features and contextual understanding to improve a model\u2019s performance. Single-cell bidirectional encoder representations from transformers (scBERT) by Yang et al.7 (opens in new tab)<\/span><\/a><\/sup>\u00a0is one of several recently developed foundation models to learn representations of single-cell RNA-sequencing data. Yang et al. pre-trained their model on 1.12\u2009million cells to impute masked gene-expression values and characterize the performance of their model on a fine-tuning task to annotate cell types. We reproduce their results, and provide additional baselines and ablation studies (that is, remove components of the model\u2019s architecture or training process) to develop a deeper understanding of their results and the potential benefits and limitations of single-cell foundation models.<\/p>\n","protected":false},"excerpt":{"rendered":" Large-scale foundation models, which are pre-trained on massive, unlabelled datasets and subsequently fine-tuned on specific tasks, have recently achieved unparalleled success on a wide array of applications, including in healthcare and biology. The success of these models has showcased the power of leveraging generalizable features and contextual understanding to improve a model\u2019s performance. Single-cell bidirectional […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13553],"msr-publication-type":[193715],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[269148,269142],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[267945],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1131261","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-medical-health-genomics","msr-locale-en_us","msr-post-option-approved-for-river","msr-post-option-include-in-river"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2024-12-12","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":0,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"doi","viewUrl":"false","id":"false","title":"https:\/\/doi.org\/10.1038\/s42256-024-00949-w","label_id":"243106","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/www.nature.com\/articles\/s42256-024-00949-w","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Rebecca Boiarsky","user_id":0,"rest_url":false},{"type":"text","value":"Nalini M. Singh","user_id":0,"rest_url":false},{"type":"text","value":"Alejandro Buendia","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Ava Amini","user_id":40432,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ava Amini"},{"type":"text","value":"Gad Getz","user_id":0,"rest_url":false},{"type":"text","value":"David Sontag","user_id":0,"rest_url":false}],"msr_impact_theme":[],"msr_research_lab":[199563,849856],"msr_event":[],"msr_group":[703342],"msr_project":[],"publication":[],"video":[],"download":[],"msr_publication_type":"article","related_content":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1131261","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1131261\/revisions"}],"predecessor-version":[{"id":1131264,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1131261\/revisions\/1131264"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1131261"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=1131261"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=1131261"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1131261"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=1131261"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=1131261"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=1131261"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=1131261"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=1131261"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1131261"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1131261"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=1131261"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=1131261"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=1131261"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1131261"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1131261"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}