Denoised smoothing: Provably defending pretrained classifiers against adversarial examples

Published February 11, 2021

Editor's note: This post and its research are the result of the collaborative efforts of a team of researchers comprising former Microsoft Research Engineer Hadi Salman, CMU PhD student Mingjie Sun, Researcher Greg Yang, Partner Research Manager Ashish Kapoor, and CMU Associate Professor J. Zico Kolter.

It's been well-documented that subtle modifications to the inputs of image classification systems can lead to bad predictions. Take, for example, a model trained to classify images of an elephant. The model easily classifies an image of the animal grazing in a grassy field. Now, if just a few pixels in that image are maliciously altered, you can get a very different, and wrong, prediction despite the image appearing unchanged to the human eye. Sensitivity to such input perturbations, which are known as adversarial examples, raises security and reliability issues for the vision-based systems that we deploy in the real world. To tackle this challenge, recent research has revolved around building defenses against such adversarial examples. However, most of these adversarial defenses, such as randomized smoothing, require specifically training a classifier with a custom objective, which can be computationally expensive.
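To make the idea of an adversarial example concrete, here is a minimal sketch, not the setup used in the paper: a toy linear classifier on two-feature inputs, attacked with a gradient-sign perturbation in the spirit of the pixel-level attacks described above. The model, function names, and numbers are all hypothetical and chosen only for illustration.

```python
import numpy as np

def predict(w, x):
    # Toy linear classifier: class 1 if the score w.x is positive, else class 0.
    return int(w @ x > 0)

def gradient_sign_perturb(w, x, eps):
    # For a linear score, the gradient w.r.t. the input is just w.
    # Step each coordinate by +/- eps in the direction that pushes the
    # score away from the currently predicted class.
    grad_sign = np.sign(w)
    direction = -grad_sign if predict(w, x) == 1 else grad_sign
    return x + eps * direction

w = np.array([1.0, -0.5])          # hypothetical trained weights
x = np.array([0.3, 0.2])           # clean input, score 0.2 -> class 1
x_adv = gradient_sign_perturb(w, x, eps=0.25)
print(predict(w, x), predict(w, x_adv))  # prints "1 0": a small shift flips the label
```

Each coordinate moves by at most 0.25, yet the prediction flips from class 1 to class 0; the same effect in a high-dimensional image can be spread so thinly across pixels that the change is invisible to a human.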
