{"id":168346,"date":"2015-06-01T00:00:00","date_gmt":"2015-06-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/efficient-synthesis-of-probabilistic-programs\/"},"modified":"2018-10-16T20:16:15","modified_gmt":"2018-10-17T03:16:15","slug":"efficient-synthesis-of-probabilistic-programs","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/efficient-synthesis-of-probabilistic-programs\/","title":{"rendered":"Efficient Synthesis of Probabilistic Programs"},"content":{"rendered":"<div class=\"asset-content\">\n<p>We show how to automatically synthesize probabilistic programs from real-world datasets. Such a synthesis is feasible due to a combination of two techniques: (1) We borrow the idea of \u201csketching\u201d from synthesis of deterministic programs, and allow the programmer to write a skeleton program with \u201choles\u201d. Sketches enable the programmer to communicate domain-specific intuition about the structure of the desired program and prune the search space, and (2) we design an efficient Markov Chain Monte Carlo (MCMC) based synthesis algorithm to instantiate the holes in the sketch with program fragments. Our algorithm efficiently synthesizes a probabilistic program that is most consistent with the data.<\/p>\n<p>A core difficulty in synthesizing probabilistic programs is computing the likelihood <em>L<\/em> (<em>P | D<\/em>) of a candidate program <em>P<\/em> generating data <em>D<\/em>. We propose an approximate method to compute likelihoods using mixtures of Gaussian distributions, thereby avoiding expensive computation of integrals. The use of such approximations enables us to speed up evaluation of the likelihood of candidate programs by a factor of 1000, and makes Markov Chain Monte Carlo based search feasible. We have implemented our algorithm in a tool called PS KETCH, and our results are encouraging\u2014 PSKETCH is able to automatically synthesize 16 non-trivial realworld probabilistic programs.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We show how to automatically synthesize probabilistic programs from real-world datasets. Such a synthesis is feasible due to a combination of two techniques: (1) We borrow the idea of \u201csketching\u201d from synthesis of deterministic programs, and allow the programmer to write a skeleton program with \u201choles\u201d. Sketches enable the programmer to communicate domain-specific intuition about [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13556,13560],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-168346","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-programming-languages-software-engineering","msr-locale-en_us"],"msr_publishername":"ACM - Association for Computing Machinery","msr_edition":"Programming Language Design and Implementation (PLDI)","msr_affiliation":"","msr_published_date":"2015-06-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"204322","msr_publicationurl":"","msr_doi":"10.1145\/2737924.2737982","msr_publication_uploader":[{"type":"file","title":"paper.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/paper-5.pdf","id":204322,"label_id":0},{"type":"file","title":"PLDI.mp4","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/PLDI.mp4","id":204321,"label_id":0},{"type":"doi","title":"10.1145\/2737924.2737982","viewUrl":false,"id":false,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":204322,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/paper-5.pdf"},{"id":204321,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/PLDI.mp4"}],"msr-author-ordering":[{"type":"user_nicename","value":"adityan","user_id":30829,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=adityan"},{"type":"text","value":"Sherjil Ozair","user_id":0,"rest_url":false},{"type":"user_nicename","value":"sriram","user_id":33711,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=sriram"},{"type":"text","value":"Deepak Vijaykeerthy","user_id":0,"rest_url":false},{"type":"user_nicename","value":"t-dvijay","user_id":0,"rest_url":false}],"msr_impact_theme":[],"msr_research_lab":[199562],"msr_event":[],"msr_group":[144939],"msr_project":[171174],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":171174,"post_title":"R2: A Probabilistic Programming System","post_name":"r2-a-probabilistic-programming-system","post_type":"msr-project","post_date":"2013-07-16 23:44:21","post_modified":"2017-06-14 09:01:38","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/r2-a-probabilistic-programming-system\/","post_excerpt":"What is R2? R2 is a probabilistic programming system that uses powerful techniques from program analysis and verification for efficient Markov Chain Monte Carlo (MCMC) inference. The language that is used to describe probabilistic models in R2 is based on C#.R2 compiles the given model into executable code to generate samples from the posterior distribution. The inference algorithm currently implemented in R2 is a variation of the Metropolis-Hastings sampling algorithm. Getting R2 Click on this&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171174"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168346"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168346\/revisions"}],"predecessor-version":[{"id":525130,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168346\/revisions\/525130"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=168346"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=168346"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=168346"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=168346"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=168346"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=168346"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=168346"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=168346"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=168346"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=168346"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=168346"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=168346"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=168346"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=168346"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=168346"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=168346"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}