{"id":554805,"date":"2018-12-05T08:02:13","date_gmt":"2018-12-05T16:02:13","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=554805"},"modified":"2023-03-23T14:30:46","modified_gmt":"2023-03-23T21:30:46","slug":"chasing-convex-bodies-and-other-random-topics-with-dr-sebastien-bubeck","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/chasing-convex-bodies-and-other-random-topics-with-dr-sebastien-bubeck\/","title":{"rendered":"Chasing convex bodies and other random topics with Dr. S\u00e9bastien Bubeck"},"content":{"rendered":"
Senior Researcher S\u00e9bastien Bubeck<\/p><\/div>\n
Dr. S\u00e9bastien Bubeck<\/a> is a mathematician and a senior researcher in the Machine Learning and Optimization group<\/a> at Microsoft Research. He\u2019s also a self-proclaimed \u201cbandit\u201d who claims that, despite all the buzz around AI, it\u2019s still a science in its infancy. That\u2019s why he\u2019s devoted his career to advancing the mathematical foundations behind the machine learning algorithms behind AI.<\/p>\n Today, Dr. Bubeck explains the difficulty of the multi-armed bandit problem in the context of a parameter- and data-rich online world. He also discusses a host of topics from randomness and convex optimization to metrical task systems and log n competitiveness to the surprising connection between Gaussian kernels and what he calls some of the most beautiful objects in mathematics.<\/p>\n S\u00e9bastien Bubeck: You want science to be based on scientific facts. And that\u2019s what I like about the theoretical computer science community is that they have well-established, long-standing open problems, and if you do make progress on those problems, then people will listen to you, rightfully, because it means you have some new techniques, something that people, before, couldn\u2019t do.<\/p>\n Host: You\u2019re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. I\u2019m your host, Gretchen Huizinga.<\/strong><\/p>\n Host: Dr. S\u00e9bastien Bubeck is a mathematician and a senior researcher in the Machine Learning and Optimization group at Microsoft Research. He\u2019s also a self-proclaimed \u201cbandit\u201d who claims that, despite all the buzz around AI, it\u2019s still a science in its infancy. That\u2019s why he\u2019s devoted his career to advancing the mathematical foundations behind the machine learning algorithms behind AI.<\/strong><\/p>\n Today, Dr. 
Bubeck explains the difficulty of the multi-armed bandit problem in the context of a parameter- and data-rich online world. He also discusses a host of topics from randomness and convex optimization to metrical task systems and log n competitiveness to the surprising connection between Gaussian kernels and what he calls some of the most beautiful objects in mathematics. That and much more on this episode of the Microsoft Research Podcast.<\/strong><\/p>\n Host: S\u00e9bastien Bubeck, welcome to the podcast.<\/strong><\/p>\n S\u00e9bastien Bubeck: Thank you.<\/p>\n Host: You\u2019re a senior researcher in the newly-minted, or relatively newly-minted, Machine Learning and Optimization group at MSR which used to be called the Theory Group and now there\u2019s a couple of sub-groups under this new umbrella, so, give us a quick picture of the org chart and who does what and then tell us how we can best frame the work you do. What gets you up in the morning?<\/strong><\/p>\n S\u00e9bastien Bubeck: Sure. So, I will start with the new organization that we have here at Microsoft Research in Redmond. So, I guess, a little bit more than a year ago, we started a new lab which is called Microsoft Research AI, for artificial intelligence. So, this is following, of course, all the hype and the real scientific excitement around machine learning. So, we created Microsoft Research AI and some of the members of the former Theory Group decided to join this new endeavor and that\u2019s when we decided to create the Machine Learning and Optimization group. And then the other half of the group went to do the Algorithms Group in just the basic Microsoft Research lab in Redmond.<\/p>\n Host: So, is that actually a group, the Algorithms Group?<\/strong><\/p>\n S\u00e9bastien Bubeck: There is a group called the Algorithms Group. So, of course, in Machine Learning and Optimization, we also do a lot of algorithms. 
In fact, it is the main focus of our group, but specifically around machine learning.<\/p>\n Host: So, you\u2019re in the Machine Learning and Optimization side of this group.<\/strong><\/p>\n S\u00e9bastien Bubeck: Yes.<\/p>\n Host: And you also work on algorithms. But what gets you, particularly, up in the morning? What questions are you asking, what problems are you solving?<\/strong><\/p>\n S\u00e9bastien Bubeck: So, I guess, in general, I am really passionate about online decision making, or decision making in general. I think, you know, no matter what you think will happen in the future, humans or robots or AI, there will always be a need for making decisions. And the questions in mathematics that surround decision making, I find them really fascinating, and they are still in their infancy. So, there is really a lot to be done there. And somehow, traditionally, pure mathematicians have not looked at those questions so much. And so there is really this gap between beautiful mathematics that has been developed, say, for physics or, slightly more recently, for computer science, but I would argue that decision making, online decision making, is just as important today as those more traditional topics.<\/p>\n Host: How does your group, doing very theoretical work on the math of machine learning, interact with other groups at Microsoft Research, or even the discipline, writ large?<\/strong><\/p>\n S\u00e9bastien Bubeck: Right. So, you know, these days in the media, we hear a lot about artificial intelligence. As much as we would want to have it, an artificial intelligence, that, you know, makes recommendations, that tells us what we should cook for the next day, you know, helps us sleep better, whatever\u2026 it doesn\u2019t exist yet. And what this means is that we are still very much working on the foundations. So, we have a couple of techniques that do work great, including gradient descent, convolutional neural networks, things like that.
They really do work very well. And naturally, big companies are investing a lot in those and trying to, you know, engineer around them as much as possible, get the full stack, get the hardware ready, you know, all of those things. But there are many, many other things that we would want to do that are out of reach for the current techniques. So, what we really need to do is develop fundamentally new tools. And that is what our group is doing.<\/p>\n Host: Well, we couldn\u2019t talk about the mathematics of machine learning and sequential decision making without talking about the multi-armed bandit problem. It\u2019s known, as you say to be simple \u2013 I\u2019m making air-quotes here \u2013 and yet incredibly rich. So, explain this essential problem and tell us what\u2019s new in the research right now.<\/strong><\/p>\n S\u00e9bastien Bubeck: Right. So, the multi-armed bandit problem is actually the first real research problem that I looked at now, a little bit more than a decade ago. So, it\u2019s a mathematical question that conceptualizes the problem of exploration versus exploitation. So, you know, in our everyday life, we have to deal with this problem, you know, do we order from the same restaurant that we know we love, or do we try this new restaurant that we don\u2019t really know yet, whether it\u2019s good or not? So, maybe I should explain where does the name come from?<\/p>\n Host: That\u2019s a good idea.<\/strong><\/p>\n S\u00e9bastien Bubeck: So, in American slang, a one-armed bandit is just a slot machine at the casino. And you can imagine that in the multi-armed bandit, you are facing, you know, many of those one-armed bandits and for every coin that you have, you have to choose which bandit you are going to put your coin in and, you know, pull the arm. So, we refer to the different actions as different arms of the bandit, OK? So, that\u2019s where the name comes from. And it really got popularized in the 1950s by Herbert Robbins. 
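The casino setup he describes maps directly to code. Below is a minimal sketch of one standard strategy, epsilon-greedy (a textbook approach, not one Bubeck names here), played against three arms with made-up win probabilities:

```python
import random

def epsilon_greedy(true_probs, n_rounds=10000, eps=0.1, seed=0):
    """Multi-armed bandit with Bernoulli arms: with probability eps pull a
    random arm (explore), otherwise pull the arm with the best empirical
    mean so far (exploit)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    pulls = [0] * n_arms
    wins = [0.0] * n_arms
    total_reward = 0.0
    for _ in range(n_rounds):
        if 0 in pulls or rng.random() < eps:
            arm = rng.randrange(n_arms)  # explore
        else:
            # exploit the arm with the best observed win rate
            arm = max(range(n_arms), key=lambda a: wins[a] / pulls[a])
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        pulls[arm] += 1
        wins[arm] += reward
        total_reward += reward
    return total_reward, pulls

# three "slot machines" with hidden win probabilities
total, pulls = epsilon_greedy([0.2, 0.5, 0.7])
```

With enough rounds, the arm with probability 0.7 accumulates the bulk of the pulls; the reward sacrificed on the rounds spent exploring the weaker arms is exactly the "regret" the theory studies.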
And at the time, it\u2019s actually quite interesting, the multi-armed bandit problem was viewed as a negative result in the sense that in the 1930s, people developed classical statistics, so people like Fisher or Neyman, etc. So, they developed classical statistics and then they wanted, very naturally, to move to online statistics. So, what happens if the data is not given to you all at once, but it\u2019s coming to you sequentially? And then they realized that the questions were really tricky, really difficult. And then Robbins came along and said, \u201cLook at this multi-armed bandit problem, it\u2019s so simple, yet we have no idea how to solve it.\u201d So, it was more viewed as a negative problem. But then what happened is that in the early 2000s, a huge, very important application appeared on the scene which is ad placement on the internet. So, ad placement on the internet is exactly a bandit problem. You can imagine each user coming to your website as, you know, one time-step of your problem, and then for that user, you have to select which ad to present. So, each ad is an arm of your bandit and then when you present the ad, you see whether the person clicks on it. You see whether the treatment was efficient or not, and then you move on to the next one. So, it\u2019s an important new application of the multi-armed bandit. So, now, what has happened since then? So, what has happened is that now we really care about applications where the number of actions, the number of arms is huge. It is very, very large. Another application, in fact, is website display, so you can imagine that you have many different ways to display your website and, you know, users are going to prefer certain ways. But this is a combinatorial problem, you know, whether you place the ad at the top or at the bottom of the page, whether you use the color green or the color red.
So, you have many, many possible choices which leads to a combinatorial explosion of possibilities, and in turn, this means that you have to really think very deeply about how to leverage the experience that you had with a certain arm to other arms. You have to understand, what are the correlations between those different arms. So, this is what modern multi-armed bandits is about.<\/p>\n (music plays)<\/strong><\/p>\n Host: Let\u2019s talk about a project you\u2019re involved in on the foundations of optimization. So, optimization methods are, as you call them, the engine of machine learning algorithms. Give us an overview of the research directions in this arena, particularly convex optimization, and explain why they\u2019re important to the theory behind machine learning algorithms.<\/strong><\/p>\n S\u00e9bastien Bubeck: So, today\u2019s optimization problems in machine learning are really characterized by two key features. One of them is that they deal with optimization problems in very high dimension. So, by that, what I mean is, the number of parameters of your model, these are the number of variables that you are going to do optimization over. We are not talking about optimizing a single parameter in dimension one, or, you know, two or three parameters in dimension two or three, which are easy to visualize for humans. We\u2019re talking about millions, tens of millions, probably billions in the future. That\u2019s the space we\u2019re dealing with in optimization today. So that\u2019s one of the big obstacles. The other one is just the sheer amount of data. So, the data is so huge that you have to be very careful about the operations that you do with them. You cannot, you know, visit your whole data set a million times. It\u2019s just not possible. You have to do that more parsimoniously. So, this begs the question, in turn, when you go back to the optimization literature, on how to make the most of the calculations that you have already done.
And what are the basic calculations that you can do to optimize a function? One of the basic things you can do is to calculate its derivative, its gradient. And now, you are asking the question, given the gradients that you have observed so far, how to best combine them to calculate the next gradient. So, this optimal usage of your information, it has a very long history in optimization, in particular in the Russian school, and they developed optimal methods in the 70s and the 80s and we are only now starting to really understand the fundamental concept behind it.<\/p>\n Host: Really?<\/strong><\/p>\n S\u00e9bastien Bubeck: Yes. So, it\u2019s actually a quite interesting story. At the end of the 70s, Arkadi Nemirovski, who is the leading figure in the Russian school in optimization, he discovered that when the function that you are optimizing is smooth, you can actually do better than the simple gradient descent technique. You can leverage the past information in a better way. So, he designed an accelerated technique, which later, Yurii Nesterov, who is now a legendary figure in machine learning, simplified dramatically and he made a very practical algorithm out of it, which is now called Nesterov\u2019s Momentum Technique. So, this momentum technique was first designed in the context of convex optimization as a way to obtain an optimal method. And nowadays, it\u2019s one of the most practical techniques for training deep neural networks which are non-convex functions and, you know, we don\u2019t really understand why momentum is working there, but that\u2019s the story of that technique.<\/p>\n Host: So, there are things you do that you don\u2019t totally understand, but they\u2019re working, so you keep doing them.<\/strong><\/p>\n S\u00e9bastien Bubeck: That\u2019s right. But what our group is actually doing is trying to understand why they are working. 
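The contrast between plain gradient descent and a momentum method can be seen on a toy problem. This is an illustrative sketch using heavy-ball-style momentum (a close cousin of Nesterov's scheme, with made-up step sizes), not the accelerated method itself:

```python
def grad(p):
    # gradient of the smooth convex function f(x, y) = 0.5 * (x**2 + 25 * y**2)
    return (p[0], 25.0 * p[1])

def gradient_descent(p0, step=0.03, iters=200):
    # plain gradient descent: step against the current gradient only
    x, y = p0
    for _ in range(iters):
        gx, gy = grad((x, y))
        x, y = x - step * gx, y - step * gy
    return x, y

def momentum_descent(p0, step=0.03, beta=0.8, iters=200):
    # heavy-ball momentum: a velocity term reuses past gradients
    x, y = p0
    vx, vy = 0.0, 0.0
    for _ in range(iters):
        gx, gy = grad((x, y))
        vx, vy = beta * vx - step * gx, beta * vy - step * gy
        x, y = x + vx, y + vy
    return x, y

start = (10.0, 1.0)
plain = gradient_descent(start)
heavy = momentum_descent(start)
```

On this ill-conditioned quadratic, the momentum iterate ends far closer to the minimum at the origin, which is the kind of speedup the accelerated methods make precise.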
So, in particular, for this momentum technique, we worked a lot on it, and a couple of years ago, with my good collaborator, Yin Tat Lee, who is now an Assistant Professor at the University of Washington, we found a way to understand momentum from a geometric perspective. And this is very useful because when you start to be able to draw a picture, you get better intuition. So, this was, at least for us, a very important step towards better understanding momentum.<\/p>\n Host: You have two big areas of interest. Well, you have a lot of areas of interest, let\u2019s be honest. But you\u2019ve noted that two big areas of your interest are the interplay between convexity and randomness in optimization, and also inference problems on random graphs. So, I wonder if you could just unpack those a little bit. Why are you interested in them? What do they address? Why are they important enough for someone with a brain like yours to tackle?<\/strong><\/p>\n S\u00e9bastien Bubeck: So, as I said before, nowadays, we really deal with very high dimensional spaces, and one key technique to explore those high dimensional spaces is to use randomness. And in classical convex optimization, randomness was not really a tool that people were looking at. They were more focused on trying to understand, what can you do with deterministic methods? So, it seems to me that there was, and there still is to some extent, a real opportunity to bring in those randomized algorithm tools into convex optimization and hopefully get a couple of breakthroughs, you know, on the way. So that was for the first area. So, the second area, actually, was one of my areas of focus for a couple of years, which comes from the observation that, nowadays, many data sets come in the form of a graph. And what you want to do is you want to do basic statistics with it. What are, you know, the basic things you want to compute with those graphs?
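One family of growth models relevant here is preferential attachment, where each new node links to an existing node with probability proportional to its degree. A toy sketch, starting from an explicit seed graph (the model variant and parameters are illustrative, not the ones from Bubeck's papers):

```python
import random

def preferential_attachment(seed_edges, n_new, seed=0):
    """Grow a graph: each new node attaches to an existing node chosen with
    probability proportional to its current degree ("rich get richer")."""
    rng = random.Random(seed)
    # endpoint list in which node i appears deg(i) times,
    # so a uniform draw is a degree-proportional draw
    endpoints = [v for e in seed_edges for v in e]
    edges = list(seed_edges)
    next_node = max(endpoints) + 1
    for _ in range(n_new):
        target = rng.choice(endpoints)  # degree-proportional sampling
        edges.append((next_node, target))
        endpoints.extend((next_node, target))
        next_node += 1
    return edges

# two different seed graphs on four nodes: a star and a path
star = [(0, 1), (0, 2), (0, 3)]
path = [(0, 1), (1, 2), (2, 3)]
g_star = preferential_attachment(star, n_new=500)
g_path = preferential_attachment(path, n_new=500)
```

Comparing degree statistics of `g_star` and `g_path` across runs gives a feel for the seed-sensitivity result discussed below: the two seeded processes can look persistently different.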
To my surprise, there were some questions that seemed really fundamental to me that were not looked at. So, there are a couple of models which are very well-known now in random graph theory and in network analysis to explain, say, how the Facebook graph grew, or, you know, how the Twitter graph is growing. And one basic question that I wondered about is, okay, you have those evolution processes, but they start with some seed graphs. So, the seed is not empty. It\u2019s not like, you know, Facebook started with one user, it was actually seeded with hundreds of users, or maybe more. So, what I was curious about is that there was this discrepancy between the reality, where people were designing, very carefully, a good seed graph in order to enhance the growth of the network, and the mathematical analysis, which was thinking about, what if the seed is empty? So, the first question that I asked was, is a theory without a seed relevant to the practice with a seed? And what we proved is that it\u2019s not! That, you know, if you have two different seeds, you can get two completely different graph evolution processes.<\/p>\n Host: Quelle surprise!<\/strong><\/p>\n S\u00e9bastien Bubeck: Yeah. It was actually a surprise, yeah! We didn\u2019t expect it. We thought that maybe, you know, in the very large sample regime, at the end of time, once everything has settled down, then maybe the seed doesn\u2019t matter anymore. But that\u2019s not the case.<\/p>\n Host: Interesting. You have a blog called, \u201cI\u2019m a bandit\u201d and it deals with random topics in optimization, probability and statistics.<\/strong><\/p>\n S\u00e9bastien Bubeck: That\u2019s right.<\/p>\n Host: So, first of all, I\u2019m glad you\u2019re not knocking over banks. Uhh\u2026 But the big topic here is bandit convex optimization. Talk about this in general, and then talk about your polynomial time algorithm for bandit convex optimization. We\u2019ll go kind of deep here. 
I\u2019d like you to get as technical as you like, but also, we\u2019ll nod to the fact that you\u2019ve got a great video on this, on the website, and three big blog posts on it that kind of go deep into the weeds, yeah?<\/strong><\/p>\n S\u00e9bastien Bubeck: Yes. So, bandit convex optimization is the following problem. So, as I said before, the new research on the multi-armed bandit problem is really interested in the case where there are many, many actions. So now you can ask, okay, what about the limit case where there is actually an infinite number of actions, a continuum of actions? So, then, of course, if you don\u2019t make any assumptions about how the rewards are shaped over this set of actions, there is no hope. You cannot hope to find, you know, what is the best action. But as we discussed already, one of the very basic assumptions, which has found a lot of application in machine learning, is convexity. So, then the question becomes, in these bandit scenarios, where you have a continuum of actions, and the loss function is shaped as a convex function, what can you do? Can you still recover the optimal guarantee that we had before, with just two or three actions? So, this was an open problem that was published in 2004. And when I joined Microsoft Research five years ago, the Theory Group, and some of the members in the Theory Group, in particular Ofer Dekel, who is now the manager of the Machine Learning and Optimization group, and Yuval Peres, who is a distinguished probabilist, they were talking and thinking about this problem. And when I joined, you know, I was coming with my own expertise on bandits and I was really excited to join the group and start thinking about the questions that those guys cared about. And we were very fortunate that we made good progress on it.
So, our first stride into the problem was, in fact, to show something a little bit unique in machine learning which was, we showed that the problem is solvable, but we didn\u2019t actually give a technique to solve it. So, that was both interesting and disappointing.<\/p>\n Host: \u201cThat don\u2019t make so sense!\u201d<\/strong><\/p>\n S\u00e9bastien Bubeck: Yes. It was strange. We were all surprised by this result, to some extent. But also, very excited because it meant that there is something to be discovered there. The problem is solvable, but it is certainly not solvable with the current techniques that we have.<\/p>\n Host: I want you to drill in there for a second, because you\u2019re actually saying something that baffles me. You could solve it, but you don\u2019t know how you solved it?<\/strong><\/p>\n S\u00e9bastien Bubeck: So, it\u2019s more as follows. There was this very concrete question: can you get sqrt{T} regret for bandit convex optimization? What we showed is that the answer to this question is, yes. Yes, it is, in principle, possible to get sqrt{T} regret for bandit convex optimization, but we did not give an algorithm that actually achieves this regret guarantee. So, then the question remained, can you actually get a polynomial time algorithm, like an efficient algorithm, something that will actually work on real data, and that gets the sqrt{T}-type behavior. So, that was the state of this problem when Yin Tat Lee came for an internship with me in the summer of 2016 and we started thinking about this question. And, again, we were lucky to come up with a new approach based on kernels. So, kernel methods have a long and distinguished history in machine learning. In fact, they were the most popular technique before deep learning overtook the field. And what we found is a connection between bandit-type information and kernels. That there really is an interplay between the two. 
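As background on the kernel side: a Gaussian (RBF) kernel and a simple kernel-weighted estimator can be sketched in a few lines. This is a generic illustration of kernel methods, not the bandit algorithm from the paper:

```python
import math

def gaussian_kernel(x, y, bandwidth=0.3):
    # the Gaussian kernel: k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2))
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def kernel_smooth(xs, ys, query, bandwidth=0.3):
    """Nadaraya-Watson estimator: a kernel-weighted average of the observed
    values, weighting points near the query most heavily."""
    weights = [gaussian_kernel(x, query, bandwidth) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# samples of f(x) = x^2 on a grid; estimate the function at x = 1.0
xs = [i / 10.0 for i in range(-20, 21)]
ys = [x ** 2 for x in xs]
estimate = kernel_smooth(xs, ys, query=1.0)
```

The estimate lands near the true value f(1.0) = 1, biased slightly upward because a convex function is averaged over a neighborhood; shrinking the bandwidth reduces that smoothing bias.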
And while we were exploring this, to our own bafflement, we discovered that there is a connection between kernels and what are called Bernoulli Convolutions. So, Bernoulli Convolutions, they are mathematical objects that were defined in the 1930s that Paul Erdos, the famous mathematician, thought of as one of the most beautiful objects in mathematics. And these Bernoulli Convolutions, they are directly related to the Gaussian kernel in machine learning and they are exactly what we needed to solve bandit convex optimization. And it turned out that the world expert on this topic was, you know, my office mate, sitting next door, Yuval Peres. So, it was very convenient that I could go to him and ask him all the questions around this. And then later, Ronen Eldan joined us on this project, who was also one of the members of the team with which we proved that there is a solution, but we don\u2019t know, what is the solution!<\/p>\n Host: I just have the biggest grin on my face. I, I\u2026 Every podcast I hear things that delight me, and nobody can see my face!<\/strong><\/p>\n (music plays)<\/strong><\/p>\n Host: Let\u2019s talk about task systems. As a level set, tell us what they are, and what they are for, and then talk about metrical task systems and how they\u2019re different.<\/strong><\/p>\n S\u00e9bastien Bubeck: So, we have discussed, so far, the multi-armed bandit problem, which is a very fundamental and important problem, but it\u2019s missing a notion of a state. So, in most of the real-life problems that we care about, it\u2019s not like you take an action and it has no consequence on the future. In fact, when you take action, often the state of the world, or the state of the system that you are acting on, is evolving and you have to take that into account. So, task systems are a very fundamental mathematical modelization of this problem and it goes like this. 
So, in task systems, you have a finite state space \u2013 I\u2019m going to say that N is the size of the state space \u2013 and you are, again, going to play a sequential game, just like in multi-armed bandit. And at every time step, what you\u2019re presented with is a task. And what is a task? A task is a cost function on the state space. So, a task just tells you, for every state, how long it\u2019s going to take you to complete the task if you\u2019re sitting in that state. So, now, in the game, you are sitting in a certain state, you know, because of what you have done in the past. You see the new task coming at you. And the question is, to which new state do you move to complete the task? Now, of course, stated like this, you would just move to the state that has the minimal completion time for the task. But that\u2019s where the metrical part comes into play, the metrical task system. So, the metrical part is that states, they can be far away from each other. It\u2019s not like you can easily move, you know, from a state where, say, you know no mathematics to a state where you are a world-expert in mathematics. This is going to cost you a lot to make this movement, you know, to change your own state. So, now you can think of the state space as a metric space, meaning that there is a notion of distance between two states. And when you\u2019re sitting in a certain state and you get your new task, when you move to a new state, you are going to pay the distance to travel there, plus the completion time. So, these metrical task systems were introduced in the 80s in the theoretical computer science community. And, as I defined it, it\u2019s a very discrete problem. And, as expected, people used discrete mathematics techniques to try to solve this problem. So, the big conjecture, the big open problem in that sub-field, is the following. 
So, you say that an algorithm for the metrical task system problem is competitive if it has the following property: no matter what the task sequence is, the algorithm\u2019s performance, the total cost of the algorithm, is going to be comparable to the best it could have done had it known in hindsight what was the sequence of tasks to be presented. That\u2019s what it means to be competitive. It basically means you can see into the future. So, we\u2019re going to say that you are alpha competitive if, not only you are comparable to the best you could have done, but you are at most paying alpha times what the best could have been. So, you know, maybe you want to be two competitive or three competitive. And the conjecture in that field is that you can be log n competitive for any metrical task system. So, we are not very far from proving this conjecture. We are basically a polynomial factor off. But the important thing is that people were really looking at this problem from a discrete mathematics perspective. And what we brought in a year ago with my great collaborators, Michael Cohen, Yin Tat Lee and James Lee is that we looked at it from a convex optimization perspective. So, we viewed this very discrete problem in continuous space and continuous time. And once you make that modification, then it\u2019s very natural to again talk about gradient descent, and in fact, this older strategy which is called mirror descent, and using this new continuous strategy, we were able to refine the guarantees that we know about this problem. So, this was for the discrete metrical task system. Now there is a convex version of the metrical task system. Because just as, for discrete multi-arm bandits, you know, there is a lot of interest into going into the case where there is huge set of actions, set of arms, it\u2019s the same thing for metrical task systems. You might be interested in a case where the set of states is, in fact, infinite. 
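In symbols, the competitiveness notion described here (standard in the online-algorithms literature) reads:

```latex
% An algorithm ALG for a metrical task system on an n-point metric space
% is \alpha-competitive if, for every task sequence \sigma,
\operatorname{cost}_{\mathrm{ALG}}(\sigma)
  \;\le\; \alpha \cdot \operatorname{cost}_{\mathrm{OPT}}(\sigma) + c,
% where OPT is the optimal offline strategy chosen in hindsight and c is
% a constant independent of \sigma. The conjecture Bubeck refers to is
% that \alpha = O(\log n) is achievable on every n-point metric space.
```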
So then, if N is infinity, you know log n is also infinity, and you are not competitive. So that\u2019s the question\u2026<\/p>\n Host: Dang it!<\/strong><\/p>\n S\u00e9bastien Bubeck: Yes, yes indeed. Indeed. This is very disappointing. And, so the question is, just in metrical task systems, when the tasks are convex functions over the state space, can you be competitive at all? So, this was an open problem from the early 90s, from Friedman and Linial, and what they proved is that in dimension two, you can be competitive. Very restrictive, I mean, certainly not talking about the millions of dimensions that we opened with. And what we just proved recently, we just put out the paper a month ago, we proved that, indeed, you can be competitive for convex metrical task systems in any dimension. But the competitive ratio that you get is going to depend on the dimension. And a great open question is whether you can get a competitive ratio that scales polynomially in the dimension. That seems very difficult.<\/p>\n Host: All the rest of it seems very difficult also. And this then, as you reference, solves a 30-year-old problem.<\/strong><\/p>\n S\u00e9bastien Bubeck: Right. So, in this literature, they don\u2019t talk about convex metrical task systems, instead they talk about these problems that they call chasing convex bodies. So, in chasing convex bodies, you are moving a point in Euclidean space, or let\u2019s say you are moving a point with D parameters. And at every time-step, you are given a convex set of constraints and what you have to do is you have to move your set of parameters inside this convex set. So, you have to chase the convex set, you have to go inside the convex set. And you see this sequence of convex sets that are given to you and you keep moving. And then again, you want to be competitive, so you want to compare yourself to the best you could have done in hindsight had you known the sequence of convex bodies right from the start.
And so, indeed, we do show that you can be competitive to chase convex bodies.<\/p>\n (music plays)<\/strong><\/p>\n Host: So, going philosophical a little bit here. We talked about some of the issues with what we call the current machine learning culture versus the culture of theoretical computer science, which is more traditional. What are some of the things we ought to be paying attention to right now? Is there anything that keeps you up at night? Anything we ought to be thinking about?<\/strong><\/p>\n S\u00e9bastien Bubeck: Yeah, sure. So, I personally believe that science is a fundamentally slow enterprise. It takes time to digest new concepts. It takes time to really recognize whether something that looks new on the face of it is actually new, or it\u2019s just you know a slightly different point of view, a slightly different perspective. Our time is a little bit different from, you know, say, two decades ago, and one reason is just because the amount of data that we have. So, we have this influx of data, and people want to do something with it. So, this, I believe, is not going away, and there will be a need for, you know, data scientists and people who really think deeply about how to use and how to leverage information that you can acquire from data. So, that will not change. What I worry a little bit more about is the culture which is centered around claims which have no, as far as I can tell, no mathematical basis. You want science to be based on scientific facts. And that\u2019s what I like about the theoretical computer science community is that they have well-established, long-standing open problems, and if you do make progress on those problems, then people will listen to you, rightfully, because it means you have some new techniques, something that people, before, couldn\u2019t do. So, there are pros and cons for these very well-defined open problems. 
The cons is, of course, that those open problems, they are not solving burning practical questions that we need to resolve right now. So, then, you know, you come up with a solution and it\u2019s not like people are going to implement your algorithm right away. So, these are maybe not the most important questions if you really want to get new algorithms right away. But on the other hand, progress on these questions, this is deep progress. This is progress where you really develop new tools which, hopefully, maybe a couple of years from now, some more practical methods are going to emerge out of it.<\/p>\n Host: Tell us a bit about your history, academic and otherwise. It\u2019s always great to hear personal anecdotes about where people came from and how they ended up where they are. What\u2019s your story?<\/strong><\/p>\n S\u00e9bastien Bubeck: Sure. So, I grew up in a small town around Strasbourg in France next to Germany. Ummm\u2026<\/p>\n Host: What\u2019s it called?<\/strong><\/p>\n S\u00e9bastien Bubeck: It\u2019s called Oberhausbergen, which means \u201cthe house at the top of the hill.\u201d And then, you know, there\u2019s a city next to it which is Mittelhausbergen, \u201cthe house in the middle of the hill.\u201d And then there is Niederhausbergen, \u201cthe house at the bottom of the hill.\u201d So, these are the three towns where, you know, I was hanging out until I was 18. So during those first 18 years, yeah, I didn\u2019t know anything about mathematics, science\u2026 but, then at 18, I went into this thing which is called Prepas, so that\u2019s famous in France, that\u2019s where you train to go into the Grandes Ecoles, Engineering Grandes Ecoles, or, you know, Ecoles Normales Superieures, which is where I went, where you learn how to do mathematics. So that\u2019s where I really fell in love with mathematics.
I always remember the very first lecture: I immediately thought, OK, this is the first time that somebody\u2019s saying something that really resonates with me. So, two years of Pr\u00e9pas, one year at the \u00c9cole Normale Sup\u00e9rieure, and then I moved on to do my PhD. What I thought at the time I wanted to do was mathematics of the brain, but then I realized that there was no such thing. But, as I realized that, I also noticed this field of machine learning, which seemed very exciting. So, I started to learn more about it and then I decided to do a PhD in machine learning and, specifically, in multi-armed bandit problems. So, that\u2019s what I did in France.<\/p>\n Host: So how did you end up at Microsoft Research?<\/strong><\/p>\n S\u00e9bastien Bubeck: After my PhD, I did one year in Barcelona for my postdoc, which was just amazing, scientifically and otherwise, and then, with my wife, we moved to Princeton, New Jersey, where I was an Assistant Professor in the OR department. I stayed there for three years. And then I did a semester at the Simons Institute in Berkeley and that\u2019s where I met Yuval Peres, and that\u2019s also where I discovered the field of theoretical computer science which, again, I fell in love with. And that\u2019s really where I could see, you know, my second calling, I\u2019d say. After mathematics, really, theoretical computer science. I thought the people in the field were amazing. I thought the questions they were looking at were fantastic. I thought their work ethic was very refreshing. So, that\u2019s where I met Yuval Peres, and after that, the rest is history. That\u2019s why I joined Microsoft Research. I came to visit, they made me an offer, and I decided to join.<\/p>\n Host: So as of two days ago, you have a pretty cool announcement to make.<\/strong><\/p>\n S\u00e9bastien Bubeck: Right. Our paper, joint work with researchers at Inria in France and, once again, Yin Tat Lee, just got awarded the best paper at NIPS. 
So, it\u2019s pretty exciting.<\/p>\n Host: Tell us a little bit about the paper and what it\u2019s adding to the literature in the field.<\/strong><\/p>\n S\u00e9bastien Bubeck: Right, so it\u2019s, in fact, tying back to one of my main areas of interest, which is convexity and randomness. So, the question we were looking at is a problem of distributed convex optimization. What happens when the objective function that you want to optimize, which depends on your data, is not all in a single place, but the data is distributed among many servers? Then how do you optimize both the communication between the different servers as well as the work that each server is doing? So, that\u2019s the question we were looking at. Of course, we were very far from being the first ones to look at it\u2026 there are literally hundreds, if not thousands, of papers written on this topic. But, somewhat surprisingly, it seems that nobody had really looked at optimal methods, meaning provably optimal, with matching upper and lower bounds. In classical serial optimization, this was done by the Russian school, Arkadi Nemirovski, Yurii Nesterov, and those people. But they didn\u2019t look so much at the distributed scenario. And one reason, I think, is because in this distributed scenario, in fact, randomness is key. Randomization is very, very important. And that\u2019s what our paper proves: if you use randomization, then you can get optimal guarantees.<\/p>\n Host: That\u2019s awesome. Congratulations.<\/strong><\/p>\n S\u00e9bastien Bubeck: Thank you.<\/p>\n Host: S\u00e9bastien, as we close, I always ask my guests to offer some parting advice to our listeners, whether it be in the form of inspiration for future research or challenges that remain to be solved. And your recent paper kind of tees that up a little bit. 
What would you say to researchers who might just be getting interested in this field?<\/strong><\/p>\n S\u00e9bastien Bubeck: Maybe good advice to any starting researcher is to become an expert in a very narrow field, or even a narrow question, but really become the world expert on that question, and that will yield rewards for a very long time. That\u2019s what I did with the multi-armed bandit problem, and I think this was very successful. You know, later on I could see connections with this problem that I wouldn\u2019t have made if I didn\u2019t know all of this about the multi-armed bandit problem. Now, in terms of research directions, there are really many, many questions emerging. One of them we already mentioned is the chasing convex bodies problem. We showed that you can be competitive, but our dependency on the dimension is very bad. It\u2019s exponential in the dimension. So, a great open question, but a difficult one, is whether the dependency can be polynomial. Another open question is in metrical task systems: to actually solve the long-standing conjecture of whether you can be log n competitive. This is a fantastic question that relates to how you define a good notion of entropy for general metric spaces. It\u2019s really a very deep and beautiful question. We don\u2019t know how to solve this. Related to this latest best paper award at NIPS, in fact, you can think about the multi-armed bandit version of this problem, which is exactly distributed bandits. This problem is basically entirely open, and I believe it\u2019s a very rich question and, in fact, it ties into a broader question of how do you think about multi-agent problems? And this, yeah, is a very rich field yet to emerge.<\/p>\nRelated:<\/h3>\n
\n
\nEpisode Transcript<\/h3>\n