{"id":1010823,"date":"2024-02-29T10:30:27","date_gmt":"2024-02-29T18:30:27","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=1010823"},"modified":"2024-02-29T10:30:27","modified_gmt":"2024-02-29T18:30:27","slug":"flame-a-small-language-model-for-spreadsheet-formulas","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/flame-a-small-language-model-for-spreadsheet-formulas\/","title":{"rendered":"FLAME: A Small Language Model for Spreadsheet Formulas"},"content":{"rendered":"

Spreadsheets are a vital tool for end-user data management.
\nUsing large language models for formula authoring assistance in these environments can be difficult, as these models are expensive to train and challenging to deploy due to
\ntheir size (up to billions of parameters). We present FLAME,
\na transformer-based model trained exclusively on Excel formulas that leverages domain insights to achieve competitive
\nperformance while being substantially smaller (60M parameters) and training on two orders of magnitude less data. We
\ncurate a training dataset using sketch deduplication, introduce
\nan Excel-specific formula tokenizer, and use domain-specific
\nversions of masked span prediction and noisy auto-encoding
\nas pre-training objectives. We evaluate FLAME on formula
\nrepair, formula completion, and similarity-based formula retrieval. FLAME can outperform much larger models, such as
\nthe Davinci (175B) and Cushman (12B) variants of Codex
\nand CodeT5 (220M), in 10 of 14 evaluation settings for the
\nrepair and completion tasks. For formula retrieval, FLAME
\noutperforms CodeT5, CodeBERT, and GraphCodeBERT.<\/p>\n","protected":false},"excerpt":{"rendered":"

Spreadsheets are a vital tool for end-user data management. Using large language models for formula authoring assistance in these environments can be difficult, as these models are expensive to train and challenging to deploy due to their size (up to billions of parameters). We present FLAME, a transformer-based model trained exclusively on Excel formulas that […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13556],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-field-of-study":[246694,255019],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1010823","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-field-of-study-artificial-intelligence","msr-field-of-study-spreadsheets"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2024-2","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"Association for the Advancement of Artificial Intelligence","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2024\/02\/flame-cr.pdf","id":"1010826","title":"flame-cr","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[{"id":1010826,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2024\/02\/flame-cr.pdf"}],"msr-author-ordering":[{"type":"text","value":"Harshit Joshi","user_id":0,"rest_url":false},{"type":"text","value":"Abishai Ebenezer","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Jos\u00e9 Cambronero","user_id":40531,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Jos\u00e9 Cambronero"},{"type":"user_nicename","value":"Sumit Gulwani","user_id":33755,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Sumit Gulwani"},{"type":"user_nicename","value":"Aditya Kanade","user_id":42066,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Aditya Kanade"},{"type":"user_nicename","value":"Vu Le","user_id":39174,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Vu Le"},{"type":"text","value":"Ivan Radicek","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Gust Verbruggen","user_id":41605,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Gust Verbruggen"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[663303],"msr_project":[],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1010823"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1010823\/revisions"}],"predecessor-version":[{"id":1010829,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1010823\/revisions\/1010829"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1010823"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=1010823"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=1010823"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1010823"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=1010823"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=1010823"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=1010823"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=1010823"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=1010823"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1010823"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=1010823"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=1010823"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=1010823"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1010823"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1010823"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}