{"id":660594,"date":"2020-05-27T03:00:26","date_gmt":"2020-05-27T10:00:26","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=660594"},"modified":"2020-06-18T07:29:11","modified_gmt":"2020-06-18T14:29:11","slug":"harvesting-randomness-haibrid-algorithms-and-safe-ai-with-dr-siddhartha-sen","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/harvesting-randomness-haibrid-algorithms-and-safe-ai-with-dr-siddhartha-sen\/","title":{"rendered":"Harvesting randomness, HAIbrid algorithms and safe AI with Dr. Siddhartha Sen"},"content":{"rendered":"
Dr. Siddhartha Sen is a Principal Researcher in MSR's New York City lab, and his research interests are, if not impossible, at least impossible sounding: optimal decision making, universal data structures, and verifiably safe AI.

Today, he tells us how he's using reinforcement learning and HAIbrid algorithms to tap the best of both human and machine intelligence and develop AI that's minimally disruptive, synergistic with human solutions, and safe.

Sid Sen: I feel like we're a little too quick to apply AI, especially when it comes to deep neural networks, just because of how effective they are when we throw a lot of data and computation at them. So much so that we might even be overlooking what the best human baseline is. We might not even be comparing against the best human baseline. So, a lot of what this agenda is trying to do is trying to push the human limit as far as it can and then kind of integrate AI where it makes sense to do that.

Host: You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga.

Host: Dr. Siddhartha Sen is a Principal Researcher in MSR's New York City lab, and his research interests are, if not impossible, at least impossible sounding: optimal decision making, universal data structures, and verifiably safe AI. Today, he tells us how he's using reinforcement learning and HAIbrid algorithms to tap the best of both human and machine intelligence, and develop AI that's minimally disruptive, synergistic with human solutions, and safe. That and much more on this episode of the Microsoft Research Podcast.

Host: Sid Sen, welcome to the podcast.

Sid Sen: Thank you.

Host: So fun to have you here today! You're a Principal Researcher at the MSR lab in New York City, which is a lab with a really interesting history and mission. But since research is constantly moving and evolving, why don't you tell us what's new in the Big Apple from your perspective? What big questions are you asking these days and what big problems are you trying to solve as a unit there?

Sid Sen: Our lab is really unique. We have four main disciplines represented, in addition to what I represent, which is systems, and machine learning and systems. We have a machine learning group. We have a computational social science group. We have an economics and computation group, and we have a FATE group, which is this new group on fairness, accountability, transparency and ethics.
And, you know, I would say that, if I had to sum up the kind of work we're doing, a lot of us work on decision-making: decision-making with opportunities and under certain kinds of constraints. And I think a lot of the work we're talking about today is trying to understand where computation, and in particular artificial intelligence, can help us and where it's not so strong, and where, you know, the more human approach is more appropriate, or is even better, and then trying to figure out a way for these things to work together. So that's going to be a very core kind of theme in what we're going to talk about.

Host: The word hybrid is going to come up a lot in our conversation today, but in different ways with different spellings. So, let's start with your own hybrid nature, Sid, and talk about what gets you up in the morning. Tell us about your personal research passions and how you bring what I would call hybrid vigor to the research gene pool.

Sid Sen: So, I think I've always been a bit interdisciplinary in everything I've done. And ever since I've, kind of, grown up, I've always been able to do a lot of different things reasonably well. And so, you can imagine I always questioned myself in terms of how much depth I had in anything, right? I even questioned myself throughout my PhD. When I went to do my PhD, I was co-advised in more systems-oriented work as well as theoretical work. I had two different advisors and I essentially ended up doing two separate PhDs because I couldn't quite get those two disciplines to gel well together. It's very hard to do that. And you know, a lot of students have asked me, oh, I want to do what you did, and I said yes, that's a nice idea in principle, but it's not so easy to synergize two different areas. Since then, I think I've figured out how to do that, how to leverage the breadth I have and the appreciation I have for different disciplines and make these things work together. I figured out how to synergize ideas from different disciplines to solve bigger problems, problems like, you know, how do we use AI responsibly in our systems? How do we leverage what humans are good at, or how do we keep things safe? I think these kinds of problems are not problems you can solve with expertise in one area and one field.

Host: Right.

Sid Sen: And so, a lot of what I figured out how to do is how to bring these different ideas together and work with different colleagues to bring these solutions out.

Host: Well your current research mission is to, and I quote, optimize cloud infrastructure decisions with AI in a way that's, as you call it, minimally disruptive, synergistic with human solutions, and safe. And to do this, you've got three correlated research agendas which I'll have you dig into shortly. But before we do that, I want to go upstream a bit and talk about the conceptual framework for your work, which we might call reinforcement learning in real life, or RL IRL.
Give us a rationale for this framework and then we'll get technical on how you are going about it.

Sid Sen: I think a lot of AI is going to be this kind of gradual integration into the background of our lives, into the systems that we use. So, I think it's important to first see how much we can learn and how much we can do without disrupting existing systems, just by kind of observing what they're doing now. Without having to change them so much, what can we do to learn as much as we can from what they are doing, and to even reason about what would happen if we tried different things, or if we used AI in different ways, without actually changing what they're doing? So that's why I want to take this minimally disruptive approach.

Host: Mmm-hmm.

Sid Sen: Then, at the same time, I think it's important to understand where AI should fit in, because if you just come barging in with an AI solution, that is not something that is so easily digestible by existing systems and, you know, processes that we have in place. And so understanding where the AI is good, where it should fit in, where the human solutions are good, and finding a way that they can complement each other is something that, to me, is critical to ensuring that you can gradually make this change. And then finally, when you do make the change, you know, how do you maintain people's trust and how do you keep things safe? You know, I don't think AI and humans are ever going to be held to the same standards. Even if an AI system statistically affects or harms fewer people than a human operator would, it's still going to be held to a different kind of standard.

Host: Right.

Sid Sen: And so, I think it's ultra, ultra-important, once we make this kind of change, to try to keep things safe.

Host: Well, let's talk in turn now about the three research agendas, or roadmaps we might call them, that you're following on the road to optimal decisions online. So, the first one is something you call harvesting randomness. And this one takes on the minimally disruptive challenge. So why don't you start by giving us a level set on the problem, and then tell us all about harvesting randomness and how the idea of counterfactual evaluation is gaining traction in reasoning about systems.

Sid Sen: This project kind of came out of a simple observation. One thing I observed was that, whenever we want to try something new, or figure out how something new is going to work, we often have to deploy it and try it out. This is what we call an A/B test. And people use it all over the place. They use it in medical trials, they use it online. At any given time, when you're using, you know, Bing or Google or any service, you're seeing something different than what I'm seeing, because they're continuously trying out different things by deploying these live A/B tests. And the reason that works is because they randomize what people see. And because they randomize what people see, they can evaluate how good different alternative solutions are, and avoid any kind of biases or confounding issues that might come into play. And I realized that randomness is all over the place.
We randomize stuff all the time: when we load balance requests, when we decide where to replicate or put data, when we decide what to do if too many people are trying to do the same thing at the same time. We make these randomized decisions all the time. And I thought, what if I could harvest all that randomness to do all of these experiments without actually having to do the experiments? That's, in essence, what counterfactual evaluation or counterfactual reasoning is: what if I had done this? Can I answer that question without actually running a big A/B test and exposing the world to it? Because that's risky and that's costly.

Host: Right.

Sid Sen: And so, the theory behind this project was that I would go around to all these systems that are already doing all this randomized stuff and, using a slightly different set of statistical techniques, maybe answer questions about what would have happened if I had tried this other policy. And that way, I could, without disrupting or changing the system, reason about what would have happened if I had changed things. It turns out that it's not as easy to do that as we thought it would be. And the reason is, all these systems that are randomized are very complicated and stateful systems. We don't really have good enough practical techniques to understand them. But what we have found is that, in the systems that surround us, there's a lot of information similar to the kind of information you would get from randomization that's already just sitting there. I call that implicit feedback, and this is something that we're trying to leverage. It's a very simple concept in systems. When you wait for something to happen, and we do that all the time in systems, you know, we're waiting for some event to happen, if you wait for five minutes, it turns out you find out what would have happened if you waited for four minutes or three minutes or two minutes. Or if you provide a resource to someone, you give them ten cores, or ten virtual machines, you might find out what would have happened if you gave them nine or eight or seven, because you gave them more than those numbers. And so, because of this kind of implicit structure, we're able to look at a lot of the decisions that existing systems are making now, and we're able to say, oh look, here are situations where they're actually making decisions where we could answer the counterfactual question of what would have happened if they had made smaller decisions. Can we leverage that feedback to learn more from that existing data and then, using that information, come up with better policies for making those decisions? So, it's not the same as the natural randomness that we started out initially envisioning we could harvest, but it's similar because it gives us the same kind of information.
And now that we have a technique for harvesting this extra feedback, we can basically take this feedback, use probabilities to weight things in the right way so that we get what I call an unbiased view of the world, and this way we can find a better policy to put in there without actually changing the system that much at all.

Host: So, what kind of results are you getting with this? Are you using it or deploying it in places where it would matter?

Sid Sen: Yeah. So one of the teams that we worked with for a while is the team that deals with unhealthy machines in Azure. What happens when you can't contact a machine? It's unresponsive. What do you do? How long do you wait for it? How long should I wait for that machine to come back alive, or should I just go ahead and reboot it, recycle it and put it back into the pipeline of what we call our fabric controller, which takes machines and recycles them and gets them ready for production use? So that's a decision where we are able to harvest existing data that they have to find a better policy for choosing how long to wait. Another example is, whenever you ask Azure for a certain number of virtual machines, you say, hey, give me a hundred machines, they'll actually allocate more than a hundred, because they're not sure if the hundred will succeed and how long it will take for them to deliver those hundred to you. So, they might allocate a hundred and twenty, and they'll just give you the first hundred that get started up the fastest. And so here's an opportunity where we can harvest that information, because of this over-allocation, that implicit information that's in there, to do something that's much more optimal and save on the amount of extra, unnecessary work that they have to do when they're spinning up these large clusters of virtual machines.

Host: Let's say you wait for ten minutes and something happens, but you also know what happened at nine minutes, at eight minutes, at seven minutes, right? So, this is how you were explaining it. Is it always going to happen then at ten minutes? I know that I'll wait for nine... if I just wait one more minute, I'll be fine, instead of having to go to the expense of rebooting?

Sid Sen: No. Yeah, I think that's a great point. We might not know. And one thing that's unique about the feedback we're getting is, it depends on what happens in the real world. I might wait for five minutes and nothing happened. In that situation, I only learn what would have happened if I had waited less than that amount of time, but I don't learn anything about what would have happened if I had waited longer. Whereas sometimes I wait for five minutes and the machine comes up in two minutes, and now I know everything. I know what would have happened if I had waited for one minute or two minutes or ten minutes or nine minutes or whatever it is. So, there's kind of an asymmetry in the kind of feedback you get back. And this is a place where existing counterfactual techniques fall short: they are not able to deal with the way the amount of feedback you get depends on the outcome of what actually plays out in the real world.
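To make the kind of implicit, one-sided feedback Sid describes concrete, here is a minimal Python sketch built around a hypothetical log of timeout decisions for unresponsive machines. The episode format, the reboot cost, and the naive averaging are illustrative assumptions, not the team's actual estimator.

```python
# Minimal sketch (hypothetical, not the Azure team's actual estimator) of
# harvesting implicit feedback from logged timeout decisions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Episode:
    chosen_wait: float               # minutes the system actually waited
    recovery_time: Optional[float]   # when the machine came back; None if it never did within chosen_wait

REBOOT_COST = 10.0  # assumed cost (in minutes) of giving up and recycling the machine

def counterfactual_cost(ep: Episode, candidate_wait: float) -> Optional[float]:
    """Cost we *would* have paid under candidate_wait, if the log determines it."""
    if ep.recovery_time is not None:
        # Machine recovered: the outcome is known for every candidate timeout.
        if candidate_wait >= ep.recovery_time:
            return ep.recovery_time              # we would have caught the recovery
        return candidate_wait + REBOOT_COST      # we would have given up too early
    # Machine never recovered within chosen_wait: the outcome is only known
    # for candidates no longer than what we actually waited (the asymmetry).
    if candidate_wait <= ep.chosen_wait:
        return candidate_wait + REBOOT_COST
    return None  # censored: waiting longer might or might not have helped

def evaluate_policy(log: list[Episode], candidate_wait: float) -> float:
    """Average cost of a fixed timeout over the episodes where it is determined.
    Note: restricting to determined episodes is exactly where bias can creep in;
    the 'weight things in the right way' step is about correcting for that."""
    costs = [c for ep in log if (c := counterfactual_cost(ep, candidate_wait)) is not None]
    return sum(costs) / len(costs) if costs else float("inf")

log = [Episode(5.0, 2.0), Episode(5.0, None), Episode(5.0, 4.5)]
for w in (2.0, 3.0, 5.0):
    print(f"wait {w} min -> estimated cost {evaluate_policy(log, w):.2f}")
```

The censoring branch in `counterfactual_cost` is the asymmetry Sid describes, and correcting the selection effect in `evaluate_policy` is where the statistical work he mentions next comes in.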
And so, this is where we have to do most of our innovation on the statistical and reinforcement learning side, to develop the counterfactual techniques that work for this scenario.

Host: Right.

Sid Sen: And I should say that this work is something we see come up in many different system decisions across our cloud infrastructure. Any time we're allocating any resource, whether that resource is time, machines, memory, cores, there's this implicit information that we should be harvesting. It's like a crop that we're not harvesting, and there's an opportunity cost to not doing that. And so part of the goal of this work is to show people that there is that opportunity there, but also to show them how we can leverage this in a principled way, so that our system designers know how they can design their systems in a way that allows them to continuously evolve and optimize them. And a lot of what this harvesting randomness work is showing you how to do is, how do I, in a statistically correct way, collect data and improve my system, collect more data and improve my system, and keep doing this? Because the way we're doing it now is not actually fully correct.

(music plays)

Host: No one can ever see the grin that I have on my face. Also, on a podcast, people can't see how things are spelled, and in the case of your second research agenda, it actually matters. You're developing what you call HAIbrid algorithms, here we come on the HAIbrid. And that's spelled H-A-I-b-r-i-d. So, what are HAIbrid algorithms, Sid, and how do they move the ball forward in the world of data structure research?

Sid Sen: So, I always get a snicker or two when people see the spelling of this, and I don't know if that's because it's lame or if it's because they like it, but it's spelled H-A-I because the H stands for human and the AI stands for artificial intelligence. And so, the idea of these algorithms is to basically find the right synergy between human and AI solutions, to find the right balance, the right way to use AI. And I do strongly believe in that, because I feel like we're a little too quick to apply AI, especially when it comes to deep neural networks, just because of how effective they are when we throw a lot of data and computation at them. So much so that we might even be overlooking what the best human baseline is. We might not even be comparing against the best human baseline. So, a lot of what this agenda is trying to do is trying to push the human limit as far as it can and then kind of integrate AI where it makes sense to do that.

Host: Mmm-hmm.

Sid Sen: And we're starting very classical. We're starting as simple as it gets. We're looking at data structures.
You know, there was a lot of buzz a couple of years ago when folks at Google and MIT were able to replace classical data structures with very simple neural networks. A lot of what data structures are used for is to organize and store data. Oftentimes you use them by giving them some kind of key and telling them, hey, can you look up this piece of information? Find it for me and give it back to me. And they observed that the key, and its position in the data, look like training data for a machine learning model. So why don't we just shove it into a machine learning model and make it answer the question for us? And they showed that if the data set is fixed, if it doesn't change, then you can train a simple layered neural network to answer that question for you, and it can do it way faster, using way less memory than, let's say, a classical structure like a B-tree. The problem with a lot of that was that the human baselines they were comparing against were not nearly the best baselines you would use. You wouldn't use this big fat B-tree to find something in a sorted data set that doesn't change. You would use something very simple, like maybe a binary search or some multi-way search, something that is way more efficient and doesn't require much space or much time. And so, when I saw this work, I thought about every human baseline they were comparing against and I felt bad for those human baselines. I said, oh, that's not fair! These human baselines weren't designed for that, they were designed for something much more. They were designed for dynamic, changing workloads and data sets, and that's why they are all bloated and inefficient: they are keeping space so that they can accommodate future stuff, right? They are not designed for this particular thing you're looking at. And so, we set out to design what we thought was the best human data structure for this problem. We call it, a little bit presumptuously, a universal index structure. And the idea of this universal index structure is that it's going to basically give us the best performance all the time. More concretely, what that means is that I'm going to look at what the workload looks like now, and then I'm going to try to metamorphose myself into the data structure that's perfect for that workload, and then I'm going to give that to you. So, it sounds like an impossible thing, but you know, Gretchen, I think a lot of the work I do is impossible-sounding at a high level. And I kind of like that, because one of the reasons I've been able to appreciate my hybrid view of things is that I come up with a crazier idea or approach than most people would. So, something like a universal data structure to me sounds like an impossibility. But then what you can do is you can say, okay, suppose I had access to some kind of oracle, some supreme being that gave me, for free, the things that I wanted. Then maybe I could solve this problem. So that's kind of how I plan my research.
I say, okay, if someone told me what the current workload was, and if someone told me, for this workload, what the best structure to use is, and if someone told me how to go from what I have now to this best structure, and I got all these answers for free, well, then I would solve my problem and I would just use it.

Host: Right.

Sid Sen: I'd have my universal data structure right there. So now we go about trying to build these oracles, each of which is hard. But when it's a hard oracle, you can break it down into sub-oracles and make a roadmap for solving each sub-oracle, right? And then the problems slowly get easier and easier until they are tractable enough for you to actually solve. And then you can put the pieces back together and you have a solution to the higher-level problem.

Host: All right... So, how are you doing that?

Sid Sen: Right. How are we doing that? So, what we're doing now is we're taking a black box approach. We're taking existing data structures without changing any of them, all the classical stuff that people use: hash tables, trees, skip lists, radix trees, and we're saying, how can I move efficiently from one of those structures to another? So, what we've developed is an efficient transition mechanism that allows us to take data from one of these structures and gradually move it to another structure if we feel like that other structure is going to be better at handling the current workload. We use some ideas from machine learning to profile these data structures, to try to understand what regimes they are good at. Okay? So, we are using ML a little bit here and there when we think it makes sense. We're trying to understand the space of when these data structures are good. Once we understand that space, then we're trying to come up with an efficient way to move from one of these data structures to the other. And so, the big innovation of this work is in coming up with that transition mechanism. It's a way of sorting the data that you are working with and gradually moving the data from one structure to another in a way that you can do piecemeal, where you don't lose your work, where you don't affect the correctness of the queries that you're trying to respond to (people are going to be continuously asking your system to answer questions while you are doing all of this) and in a way that makes it worthwhile. We try to make sure that this transition mechanism is efficient enough that we're actually getting a benefit from it. And if we find that we're just flopping around all over the place, if we keep moving from data structure A to B back to A, back to C, well, we check for that, and if that's happening we back off and say, no, let's not do that. This is not worth it right now. And so, this transition mechanism is where most of the innovation has happened. It's a simple transition mechanism because it just intelligently uses the existing APIs provided by these data structures to do the transitioning between them. And later on, in the future, I hope that we can open up the black box a little bit, and by understanding these data structures from the inside, maybe we can do something even more intelligent and more efficient.
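Here is a minimal sketch of that black-box transition idea, assuming hypothetical HashIndex, SortedIndex, and TransitioningIndex classes. It is not the actual universal index structure, only an illustration of moving data piecemeal between two structures through their public APIs while continuing to answer queries correctly.

```python
# Illustrative sketch (not the universal index structure itself) of a black-box
# transition mechanism: migrate keys in small batches while serving lookups.
import bisect

class HashIndex:
    def __init__(self): self._d = {}
    def insert(self, k, v): self._d[k] = v
    def lookup(self, k): return self._d.get(k)
    def pop_batch(self, n):
        batch = list(self._d.items())[:n]
        for k, _ in batch:
            del self._d[k]
        return batch
    def __len__(self): return len(self._d)

class SortedIndex:
    def __init__(self): self._keys, self._vals = [], []
    def insert(self, k, v):
        i = bisect.bisect_left(self._keys, k)
        self._keys.insert(i, k); self._vals.insert(i, v)
    def lookup(self, k):
        i = bisect.bisect_left(self._keys, k)
        return self._vals[i] if i < len(self._keys) and self._keys[i] == k else None

class TransitioningIndex:
    """Serves queries correctly while data moves piecemeal from old to new."""
    def __init__(self, old, new, batch_size=64):
        self.old, self.new, self.batch_size = old, new, batch_size
    def step(self):
        # Move one small batch per step so migration never blocks queries for long.
        for k, v in self.old.pop_batch(self.batch_size):
            if self.new.lookup(k) is None:      # don't overwrite a newer write
                self.new.insert(k, v)
        return len(self.old) == 0               # True once the transition is done
    def insert(self, k, v): self.new.insert(k, v)   # new writes go to the target
    def lookup(self, k):
        v = self.new.lookup(k)
        return v if v is not None else self.old.lookup(k)
```

The wrapper only calls `insert`, `lookup`, and a batched removal, which is the sense in which the mechanism treats each structure as a black box; deciding when a transition is worthwhile (and when to back off) is the separate profiling question Sid describes.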
Host: Let's talk about a specific example of that while we're on the topic of HAIbrid algorithms. One example of some very current work you are doing involves using the game of chess as a model system for bridging the gap between what you call superhuman AI and human behavior. So, this is really interesting to me on several levels. Tell us about it.

Sid Sen: Yeah, so whereas the universal index structure was trying to say, let's take a pause and not use ML just yet, let's see how far I can push the human designs and find a way to switch between them and use all of them in some fluid way so that I can do as good a job as I can, the idea then would be, how can we take that strong human baseline and improve on it with AI? And I think the jury is still out; it's still unclear to me which one is the stronger force there and how they'll work together. In chess, it's a little different, because we actually know, today, that chess engines and AI are far stronger than any human player can be. So, chess is an example of a task where AI has surpassed human play. And yet, we still continue to play it because it's fun, right? A lot of times you'll find that when AI does better than humans, it will just take over. It just takes over that job from whoever was doing it before, whether it was a human or a human-designed algorithm or heuristic. But chess is a game where that hasn't really happened, because we still play the game, we enjoy it so much. And so that's why I think it's an interesting case study. The problem with chess is that the way humans play and the way engines play are very, very different. So, they've kind of diverged. And it's not really fun to play the engine, you just lose. You just lose all the time. So now, interestingly, people are using the engines to train themselves, which is kind of an interesting...

Host: Oh...

Sid Sen: ...situation, because it's almost like this subtle overtaking by AI. It's like our neural networks, our brains, are now being exposed to answers that the engine is giving us without explanation. The engine just tells you, this is the best move here, right? It's figured it out. It's explored the game tree. It knows enough to tell you this is the best move, and you're like, oh, okay, and then you try to reason, why is that the best move? Oh, I think I kind of get it. Okay. So, a lot of the top chess players today spend a lot of time using the engine to train their own neural nets inside their brains. So, I actually think that our brains, our neural nets, are a product of everything we put in there. And right now, one of the things they are getting as input are answers from the engine. So that's the subtle way that AI is seeping into your brain and into your life.
But going back to what we did, we basically realized that there's this gap between how the engines play and how humans play, and when there's this gap, maybe the AI engine should be helping us get better, right? But right now, we don't have that kind of engine. These engines just tell us what the best move is. They don't have what I would call a teaching ability. And so, what we're trying to do in this project is bridge that gap. We're trying to come up with an engine that can teach humans at different levels, that can understand how a weak player plays, understand how a strong player plays, and try to suggest things that they could do at different levels. So, we looked at all the existing chess engines out there, ones that are not based on AI, and ones that are based on AI. And we found that if you try to use these engines to predict what humans will do at different levels, they either predict kind of uniformly well across all of the humans, or their predictions get better as the humans get better, which means that none of them really understands what different humans at different levels play like.

Host: Hmm.

Sid Sen: So what we did was we took the neural network framework that's underneath it, and we repurposed it to try to predict what humans would do, and we were able to develop engines that are good at predicting human moves at every different level of play. So this is, I think, a small building block in getting at this idea of a real AI teacher, someone who can sit with you, understand where you're at, and then, from there, maybe suggest things that you might try, that are appropriate for your level.

Host: Let's talk about your third research agenda and it's all about safeguards. The metaphors abound: training wheels, bumpers on a bowling alley, a parent. Explain the problem this research addresses, in light of the reinforcement learning literature to date, and then tell us what you're doing in this space to ensure that the RL policies that my AI proposes are actually safe and reliable.

Sid Sen: Yeah, so safety is something that has come up in all of the things we've talked about. It comes up repeatedly in everything we've done, in the harvesting randomness agenda: after harvesting information and coming up with a better policy, every time we try to deploy something with a product team or group, they always have some safeguard in place, some guardrail in place. If this happens, page me, or if this happens, shut this off, or things like that. It comes up even in the universal data structure we've talked about. What happens when I keep flip-flopping around these different data structures and I'm wasting all this time and my performance is going down? Well, maybe I need to put a brake on things. It happens in the chess work as well, because one of the things that the chess teacher is trying to do is prevent the human from walking into a trap, or walking down a path of moves that they can't handle because they are not strong enough yet as a player to handle it. So, you can reason about safe moves versus unsafe moves. And because I just saw this cropping up all over the place, I realized that it's time to take a step back and reason about this and formalize it.
So, this is something that systems people do, and which I think they are good at doing: they see people doing all kinds of ad hoc stuff and they say, oh, this is an opportunity to come up with a new abstraction. And so, safeguards are that new abstraction. I think the sign of a good abstraction is that everyone is already using it, but they don't know it. They don't give it a name. And so, if everyone is already doing something like this in their own ad hoc ways, what we're trying to do is extract this component, which we're calling a safeguard, and ask, what is its role? To me, a safeguard is something that sits alongside any artificially intelligent system that you have, treats it like a black box, and protects it from making bad decisions by occasionally overriding its decisions when it thinks they are about to violate some kind of safety specification. So, what is a safety specification? Well, it can be anything. It can be a performance guarantee that you want. It can be a fairness guarantee. Maybe it can even be a privacy guarantee. We haven't explored all of the realms; right now, we've been focusing on performance guarantees. And I think the point is that, if you have this component sitting outside the AI system, it doesn't necessarily need to understand what that AI system is doing. It just needs to observe what its decisions are and, as long as it has an idea like the safety specification, it can say, oh, I think what you are going to do now is going to violate the safety spec, so I'm going to override it. But if I see in the future that you are doing a better job, or you are within the safety realm, then I'm going to back off and I'm not going to change your decisions. So, the safeguard is like this living, breathing component that itself might use artificial intelligence, and it can use AI to adapt. It can say, oh, I see that the system is doing a good job, okay, I'm going to back off. I don't need to provide that much of a safety buffer. But the moment I see that the system is not doing a good job, then maybe I need to clamp down, and the safeguard will say, okay, I need a big safety buffer around you. And that's why I like the analogy of parenting, because I think that's what we do with kids a lot. In the beginning, you childproof your house, you put training wheels on bikes and you constrain and restrict a lot of what they can do, and then slowly you observe that now they are getting bigger. They are falling down less, they're able to handle corners and edges more, and you start removing the bumpers. You might start raising the training wheels and things like that. But maybe you go to a new house or a new environment, or you're on vacation, and then maybe you need to clamp down a little bit more, right? Because you realize that, oh, they're in a new environment now, and they might start hurting themselves again. And so, that's the inspiration behind this idea of a safeguard that adapts to the actual system.

Host: Okay. So, I want to drill in again on the technical side of this, because I get the metaphor. How are you doing that, technically?

Sid Sen: Yeah.
So, what we're doing now is we're taking some example systems from the previous work we talked about and we're hand-coding safeguards based on the safety specs of those systems. Someone might say, well, how do you even know what a safety spec should be? What we found is that people usually know what the safety spec is. Most of the teams we work with usually know what things they check for, what things they monitor that trigger actions to happen, like the guardrails that they have in place. So, most of the teams we've talked to check for, say, six different things, and if those six different things exceed certain thresholds, they take some big action. So, we can usually derive the safety spec for those systems by talking to the product team and looking at what they do right now. Once we get that, we hand-code and design a safeguard, and we use tools from program verification to prove that the safeguard we code up satisfies what we call an inductive safety invariant, which means that it satisfies some safety property and every action that safeguard takes will always continue to be safe. So as long as you're in the safe zone, no matter what you do, as long as the safeguard is active and in place, you will stay in the safe zone. And so, we can use tools from program verification to write these safeguards and prove that they satisfy the safety invariant, and then what we do is we add a buffer to that safeguard. Think of the analogy of adding more padding to the bumpers on the corners of tables, or putting the training wheels a little lower. I can use the same verification approach to verify the safety of the safeguard plus some kind of buffer. And then what I'll do is use AI to adapt that buffer depending on how well the system is doing. So now I've guaranteed to you that everything is going to be safe, even with this kind of buffer, and I can shrink and grow this buffer. You're doing well? The buffer will be really small, which is good for a lot of systems, because if you keep that small buffer, it allows systems to be aggressive. And when you're aggressive, you can optimize better. You can use more resources. You can push the systems to their limits a little more, which is good for you. But if you're in a situation where you're not doing well, or there's some uncertainty, or the environment has changed, then we increase that safety buffer. And the whole time, you're still guaranteed that everything is going to be correct. So that's what we have now. What I really want to do, again, because no project is worth it if it's not impossible-sounding, is to automatically synthesize these safeguards. We're coding them by hand now. I want to use fancy program synthesis magic to...

Host: I was just going to ask you if that was next.

Sid Sen: Yeah. Yeah. So that would be ideal. I didn't want to have to sit and do these things by hand. I want to get a safety spec and some structured information about the application, and I want to automatically synthesize the safeguard and then show you that it is... and prove that it's correct.
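The following Python sketch shows only the shape of the safeguard-as-wrapper abstraction, with hypothetical names, metrics, and thresholds. The real safeguards are hand-coded per system and proved against an inductive safety invariant with program verification tools; nothing below is verified, it just illustrates the override-and-adapt behavior described above.

```python
# Minimal sketch (hypothetical names, not a verified artifact) of a safeguard:
# observe a black-box policy's decisions, override any that would leave a
# buffered safe region, and adapt the buffer to how well the policy is doing.
class Safeguard:
    def __init__(self, safe_limit, fallback_action, init_buffer=0.2):
        self.safe_limit = safe_limit          # safety spec, e.g. a latency budget
        self.fallback = fallback_action       # known-safe action to substitute
        self.buffer = init_buffer             # extra margin that shrinks or grows
        self.overrides = 0

    def _violates(self, predicted_metric):
        # Override if the predicted metric would exceed the buffered limit.
        return predicted_metric > self.safe_limit * (1 - self.buffer)

    def filter(self, action, predicted_metric):
        """Pass the policy's action through, or substitute the safe fallback."""
        if self._violates(predicted_metric):
            self.overrides += 1
            return self.fallback
        return action

    def update(self, observed_metric):
        # Adapt the buffer, like a parent removing or adding bumpers.
        if observed_metric <= 0.8 * self.safe_limit:
            self.buffer = max(0.05, self.buffer * 0.9)   # doing well: loosen
        else:
            self.buffer = min(0.5, self.buffer * 1.5)    # near the edge: clamp down

# Usage: wrap an arbitrary (black-box) policy's decisions.
guard = Safeguard(safe_limit=100.0, fallback_action="reboot")
chosen = guard.filter(action="wait_10_more_minutes", predicted_metric=95.0)
guard.update(observed_metric=90.0)
```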
So, we're working with an amazing colleague in the program synthesis and program verification world who's actually the director of our MSR India lab, Sriram Rajamani.

Host: I had him on the podcast! He's amazing!

Sid Sen: Oh, not only is he amazing, he's maybe the nicest person I know... like, in the world! He's working with us on this and it's a lot of fun. I love working on projects where I love working with the collaborators. In fact, I find that these days I tend to pick projects more based on the collaborators than the topic sometimes. But this is one where both the topic and the collaborators are a complete thrill and pleasure. I do systems and AI stuff and reinforcement learning, he understands program synthesis and verification, and we have someone who understands causal inference and statistics. With these three disciplines, we're hoping that we can come up with a way to automatically synthesize these safeguards, so that any system that is using AI and that has an idea of what safety means for it can leverage this to come up with one.

(music plays)

Host: Well speaking of collaboration, that's a big deal at Microsoft Research and you're doing some really interesting work with a couple of universities. And I don't want to let you go before we talk at least briefly about two projects: one with NYU related to what systems folks call provenance, and another with Harvard trying to figure out how to delete private data from storage systems in order to comply with GDPR. Both fall under this idea of responsible AI, which is a huge deal at Microsoft right now. So, without getting super granular, because we have a couple more questions I want to cover with you, give us a Snapchat Story version of these two projects, as well as how they came about, real quick.

Sid Sen: Okay. Yeah, so with NYU, we're working with two professors there, Professor Jinyang Li and Professor Aurojit Panda, and two amazing students, on what we're calling ML provenance. The idea is, can we find and explain the reasons behind a particular decision made by an ML model? There's been a lot of work in trying to do this, and I think what we've done that's different is we formulated the problem in a different way. We've said, what if there were different sources of data that go into training this model? If that was the case, and you have all these different sources, you can look at a particular decision made by the machine learning model and ask, okay, what were the sources that contributed to that decision? It turns out you can formulate a new machine learning problem that uses those sources as features and, as its label, the decision. And we can use statistical techniques to narrow down and zoom in on which of the sources actually caused that decision. So, this is going to be useful for doing things like detecting poisoning attacks, which is a very common security problem people look at in machine learning models. With our technique, you can find the data source that caused that decision, or maybe even, in some cases, the actual data point that caused it.
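Here is a toy sketch of that reformulation on hypothetical data: which sources were included becomes the feature vector, and the model's decision on a query point becomes the label. It retrains a model for every subset of sources purely to make the idea visible; the actual NYU/MSR approach avoids that by reusing work already done during training, as Sid notes next.

```python
# Toy sketch (hypothetical data, not the NYU/MSR system) of ML provenance:
# "which sources were used" are the features, the model's decision is the label.
import itertools
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Three hypothetical training sources; source 2 is "poisoned" (labels flipped).
sources = []
for s in range(3):
    X = rng.normal(size=(50, 2)) + s
    y = (X[:, 0] + X[:, 1] > 2 * s).astype(int)
    if s == 2:
        y = 1 - y                      # the poisoned source
    sources.append((X, y))

query = np.array([[2.0, 2.0]])         # the decision we want to explain

rows, labels = [], []
for mask in itertools.product([0, 1], repeat=3):
    if sum(mask) == 0:
        continue
    X = np.vstack([sources[i][0] for i in range(3) if mask[i]])
    y = np.concatenate([sources[i][1] for i in range(3) if mask[i]])
    if len(set(y)) < 2:
        continue                       # need both classes to fit a classifier
    model = LogisticRegression(max_iter=1000).fit(X, y)
    rows.append(mask)                  # feature: which sources were included
    labels.append(int(model.predict(query)[0]))  # label: the decision made

# Which source's inclusion moves the decision?
for s in range(3):
    with_s = [l for m, l in zip(rows, labels) if m[s]]
    without_s = [l for m, l in zip(rows, labels) if not m[s]]
    print(f"source {s}: mean decision with={np.mean(with_s):.2f}, "
          f"without={np.mean(without_s):.2f}")
```

A source whose inclusion consistently flips the decision (here, the deliberately poisoned one) stands out, which is the intuition behind using this formulation to detect poisoning attacks.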
Host: Wow.

Sid Sen: And we can do this without training too many models. The coolest thing about it, I think, is, again, this idea of harvesting existing work: I think we found a way to do all of this by just leveraging the existing training work that's already being done to train these models. So that's the NYU work. The work that we're doing with Harvard is with Professor James Mickens and a student of his, and there, what we're trying to understand is this question of, can we really delete a user's private data from a system? Let's take a simple, classical storage system as an example. Suppose all the person does is go in and say, insert this data item into your storage system. What happens? And what we found, which is kind of fascinating to see, is just how many different things one item you insert into a storage system touches. You have all this transient state created. You affect these global parameters and state, and then you put the actual data into a data structure. That's the easy part, right?

Host: Right.

Sid Sen: But there are all these other kinds of side effects that happen. Maybe some of those statistics are used to train some kind of machine learning model. There are all kinds of things that can happen.

Host: Right.

Sid Sen: So, we're trying to track all that, and in systems we have this notion of taint tracking, which is, imagine you put a color on the data and you see where the color spreads. We have some techniques for doing that in systems already. But what we're trying to understand is, how do I measure how much I care about all the things I've touched? Right? These taint tracking approaches, if you touch something, they say, oh, it's tainted. But if I add one private number of mine to, like, a thousand other numbers, do I really care about what I've done there? What we're asking in this work is also, how do I reason about how much of my privacy is being leaked, which actually has connections to differential privacy. How do I reason about how sensitive the state I've affected is to my input? And once I figure that out, and I decide, okay, I don't care about this stuff because it really hasn't been affected enough, but I care about this stuff because I've affected it in a meaningful way, then how do I go about deleting that data? It's not so clear how we can, in a generic way, allow the entire system to fully and completely delete what it needs to delete when it comes to that user's private data. So, another work in progress.

Host: Well, it's about now that I always ask what could possibly go wrong, Sid. And I do this because I want to know that you have an eye to the risks, as well as the rewards, inherent in research.
So while you frame a lot of your work in terms of AI playing nicely with humans or collaborating with humans or helping humans, is there anything you can see down the line, or maybe are afraid you can't see down the line, so that's even worse, that gives you cause for concern or keeps you up at night? And if so, what are you doing about it?

Sid Sen: I think that AI solutions will always be held to a different standard than human operators or human solutions. I do worry about what happens when even a system that we deploy that has, let's say, this kind of safeguard in place, if something goes wrong, how do you react to that? What kind of policies do you put in place to determine what to do in those situations? You know, what happens when one of these AI systems actually hurts a person? And this has happened in the past, right? But when it does happen, it's interpreted very differently than if the cause for that accident or incident was driven by a human. In a lot of the applications we've looked at, if the safeguard gets violated a little bit here and there, it's actually okay, and we actually can leverage that, and in the work we do, we do indeed leverage that. But what if it's not okay? Right? What if it's never okay for even the slightest violation to happen? How do we, in those situations, still learn, still do what we need to do, without the risk? So that's something that does worry me, because it makes me realize that there are some things where you just don't want to replace a human. But I do believe that this kind of hybrid approach we talked about will ensure that both sides have an important role to play. I think what's happening is that AI is showing us that we're not playing the right role. And so, what we need to do is just adjust the role. I'm not worried about AI replacing creativity and elegance. I mean, a lot of people are worried about that. I think there's just too much elegance and beauty in the kinds of solutions humans come up with, and it's one of the reasons why I spend a lot of time learning from and using human solutions in all of the AI work that I do.

Host: I happen to know that you didn't begin your career doing hybrid systems reinforcement learning research in New York City for MSR, so what's your story, Sid? How did you get where you are today, doing cutting-edge research in the city that never sleeps?

Sid Sen: That's a good question. I grew up in the Philippines, but I had a lot of exposure to the US and other countries. I grew up in an international environment, and I went to college at MIT in Cambridge, Massachusetts, and I remember seeing a student solve a problem that we had been assigned in a way that he wasn't told how to solve it. And it kind of blew my mind, which is a little sad if you think about it. I was like, why did you do it that way when they told us to do it this way? You know, you would have figured it out if you did it that way, why did you do it that way? But that resulted in a new idea that led to a paper that he submitted.
And it was, I think, at that point that I realized my professors are not just giving me problem sets, that the stuff they're teaching me is stuff they invented, and that we have this ability to innovate new ways of doing things, even new ways of doing existing things, and I had not really realized or appreciated that until that point. All this time I'd been thinking of my professors as just people who gave me problem sets, but now when I go back and look at them, I realize that I was taught by these amazing superstars that I kind of took for granted. And so, after I left college, I went to Microsoft to work in Windows Server, actually, for three years. I worked in a product group as a developer, but I always ended up going back to thinking about the algorithms and the ideas behind what we were doing. And, you know, I was fortunate enough to have a good boss who let me work on researchy-type things. So, I would actually visit MSR and talk to MSR people, as an engineer from a product team, and ask them questions, and I always had this respect for them and I always put them on a pedestal in my mind. I think this inspired a more creative pursuit, and so that's why I went back to grad school. And I think the reason why I ended up at MSR after doing my PhD is because MSR was the one place where they really appreciated my interdisciplinary, and slightly weird, background. Right? The fact that I did some systems stuff, the fact that I did some theory stuff, the fact that I worked in a product group for three years, they actually appreciated a lot of those things. And I felt like this is a place where all that stuff will be viewed as a strength rather than, oh, you're kind of good at these things, but are you really good at one of these things more than the others? You know, that kind of thing, which is something that does come up as an issue in academia, and even in our labs we struggle with it, at hiring people who are at the boundary of disciplines, who are cross-cutting... Because it's not so easy to evaluate them. How do you compare such a person to another candidate who is an expert in one area, right? How do we put a value on this kind of interdisciplinary style versus this deeper, more siloed style of research? So, it's not an easy question. It's been a recurring theme in my life, I would say.

Host: Sid, what's one interesting thing that we don't know about you that has maybe impacted your life or career? And even if it didn't impact anything in your life or career, maybe it's just something interesting about you that people might not suspect?

Sid Sen: So, I was not a very typical Indian kid growing up. I grew up in the Philippines and partly in India, and I spent a lot of time doing hip-hop and break-dancing, things that drove my parents a little crazy, I would say. They would let me do it as long as my grades didn't suffer. When I came to MIT, I saw these people doing a style of dance where they didn't have to prepare anything. It was just like... do you know how square-dancing works?

Host: Yeah.

Sid Sen: Um, right?
You just follow a caller who calls all of the moves, and, you know, I joined that dance group and I eventually became the president of that group, and that's actually where I met my wife. She was a PhD student at the time.

Host: In a square-dancing group?

Sid Sen: No. It was a salsa version of that kind of dance. It's called Casino Rueda. It's like square-dancing, but it's a caller-based dance where the caller calls all the moves. So, I would call these moves with hand signals and all that, and everyone does it and makes these beautiful formations and patterns, but it's all salsa. So that was a very important part of my time at MIT. And I do appreciate the university a lot because it gave me the ability to have a balanced life there. So that's something non-work related that people don't usually expect. I guess something work-related that people may not know about me is that I don't really like technology! I'm not a very good user or adopter of technology. I'm always kind of behind. In fact, I think I use it when it becomes embarrassing that I don't know about it, and this has happened repeatedly in the past. So, I do a lot of things because I need to for my work, but I think this actually works to my advantage. I don't like technology... I don't like writing thousands and thousands of lines of code. I like thinking more about how to do things minimally. Finding the simplest and most elegant route is really important to me. One thing that I have a big passion for is teaching. If I'm honest with myself, I probably should be at a university, because I love working with students and mentoring them. I actually feel that might be my greatest strength compared to the other things that I do. And one of the things you need to be able to do well, if you want to teach a student, is explain something in the simplest way. And of course, it's a lot easier to explain something in a simple way if that thing is simple to begin with, right? Not only is it easier to explain, it's easier to code up. It's usually easier to program. And that means that it's less likely to have bugs and other issues in it. So, there's a lot of value to simplicity and it's something I think about all the time. It really has permeated everything that I do.

Host: Well, it's time to predict the future, or at least dream about it. So, if you're wildly successful, what will you be known for at the end of your career? In other words, how do you hope your research will have made an impact, and what will we be able to do at that point that we wouldn't have been able to do before?

Sid Sen: Wow, that's a good one. That's a tough one to answer. Let me tell you two things that I think would make me happy if, at the end of my career, they happened. If our world is run by AI systems that operate in a harmonious way with humans, whose safety we are so assured of that we take it for granted, and if a part of that can be attributed to the work that I've done, that would make me happy. So that is one thing.
The other thing, though, which may be even more important to me, is being remembered as someone who was a good teacher and a good mentor. If, for example, my ideas are being taught to undergraduates, or even better to high school students, that to me is a sign of a greater impact on education; the further down you go, the more fundamental, the more basic, the more simple the material has to be. I've had a bit of a taste of that. During my PhD, I did some work on very classical data structures and came up with some new ones, and those are being taught to undergraduates now and are in textbooks. That kind of thing makes me really happy. I hope that, by the end of my career, big chunks of my work, or even parts of it, will be taught to undergraduates or, even better, to high school students. To me, that would make me super happy.

Host: Sid Sen, thank you for joining us today on the podcast. It's been terrific.

Sid Sen: Thank you, Gretchen. I really appreciate it.

To learn more about Dr. Siddhartha Sen, and how researchers are working to optimize decision making in real-world settings, visit Microsoft.com/research.