{"id":157306,"date":"2009-01-01T00:00:00","date_gmt":"2009-01-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/tolerating-latency-in-replicated-state-machines-through-client-speculation\/"},"modified":"2018-10-16T21:53:29","modified_gmt":"2018-10-17T04:53:29","slug":"tolerating-latency-in-replicated-state-machines-through-client-speculation","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/tolerating-latency-in-replicated-state-machines-through-client-speculation\/","title":{"rendered":"Tolerating Latency in Replicated State Machines Through Client Speculation"},"content":{"rendered":"<p>Replicated state machines are an important and widely studied methodology for tolerating a wide range of faults. Unfortunately, while replicas should be distributed geographically for maximum fault tolerance, current replicated state machine protocols tend to magnify the effects of high network latencies caused by geographic distribution. In this paper, we examine how to use speculative execution at the clients of a replicated service to reduce the impact of network and protocol latency. We first give design principles for using client speculation with replicated services, such as generating early replies and prioritizing throughput over latency. We then describe a mechanism that allows speculative clients to make new requests through replica-resolved speculation and predicated writes. We implement a detailed case study that applies this approach to a standard Byzantine fault tolerant protocol (PBFT) for replicated NFS and counter services. 
Client speculation trades in 18% maximum throughput to decrease the effective latency under light workloads, letting us speed up run time on single-client micro-benchmarks 1.08\u201319\u00d7 when the client is co-located with the primary. On a macro-benchmark, reduced latency gives the client a speedup of up to 5\u00d7.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Replicated state machines are an important and widely studied methodology for tolerating a wide range of faults. Unfortunately, while replicas should be distributed geographically for maximum fault tolerance, current replicated state machine protocols tend to magnify the effects of high network latencies caused by geographic distribution. In this paper, we examine how to use speculative execution [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13547],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-157306","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"USENIX","msr_edition":"The 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI 
'09)","msr_affiliation":"","msr_published_date":"2009-01-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"207819","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"nsdi09.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/nsdi09.pdf","id":207819,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":207819,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/nsdi09.pdf"}],"msr-author-ordering":[{"type":"text","value":"Benjamin Wester","user_id":0,"rest_url":false},{"type":"text","value":"James Cowling","user_id":0,"rest_url":false},{"type":"text","value":"Edmund B. Nightingale","user_id":0,"rest_url":false},{"type":"text","value":"Peter M. 
Chen","user_id":0,"rest_url":false},{"type":"text","value":"Jason Flinn","user_id":0,"rest_url":false},{"type":"text","value":"Barbara Liskov","user_id":0,"rest_url":false},{"type":"user_nicename","value":"edn","user_id":31714,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=edn"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[144936],"msr_project":[],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/157306"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/157306\/revisions"}],"predecessor-version":[{"id":540006,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/157306\/revisions\/540006"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=157306"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=157306"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=157306"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=157306"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=157306"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\
/wp-json\/wp\/v2\/msr-product-type?post=157306"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=157306"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=157306"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=157306"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=157306"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=157306"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=157306"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=157306"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=157306"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=157306"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}