{"id":488363,"date":"2018-05-30T07:53:25","date_gmt":"2018-05-30T14:53:25","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=488363"},"modified":"2020-04-23T15:12:48","modified_gmt":"2020-04-23T22:12:48","slug":"making-intelligence-intelligible-dr-rich-caruana","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/making-intelligence-intelligible-dr-rich-caruana\/","title":{"rendered":"Making intelligence intelligible with Dr. Rich Caruana"},"content":{"rendered":"
Dr. Rich Caruana, Principal Researcher. Photo courtesy of Maryatt Photography.
In the world of machine learning, there's been a notable trade-off between accuracy and intelligibility. Either the models are accurate but difficult to make sense of, or easy to understand but prone to error. That's why Dr. Rich Caruana, Principal Researcher at Microsoft Research, has spent a good part of his career working to make the simple more accurate and the accurate more intelligible.

Today, Dr. Caruana talks about how the rise of deep neural networks has made understanding machine predictions more difficult for humans, and discusses an interesting class of smaller, more interpretable models that may help to make the black box nature of machine learning more transparent.

Rich Caruana: I trained a neural net on this data, and it turned out the neural net I trained was the most accurate model anyone could train on this data set. And then there was a question, you know, would it actually be safe for us to go ahead and use this neural net on real patients? And I said, "You know, I don't think we should use this on real people. And that's because it's a black box. We don't actually understand what it's doing."

Host: You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga.

Host: In the world of machine learning, there's been a notable trade-off between accuracy and intelligibility. Either the models are accurate but difficult to make sense of, or easy to understand but prone to error. That's why Dr. Rich Caruana, Principal Researcher at Microsoft Research, has spent a good part of his career working to make the simple more accurate and the accurate more intelligible.

Host: Today, Dr. Caruana talks about how the rise of deep neural networks has made understanding machine predictions more difficult for humans, and discusses an interesting class of smaller, more interpretable models that may help to make the black box nature of machine learning more transparent. That and much more on this episode of the Microsoft Research Podcast.

Host: Rich Caruana, welcome to the podcast.

Rich Caruana: Thank you.

Host: Great to have you here today.

Rich Caruana: Great to be here.

Host: You're a principal researcher at MSR working with a broad set of AI researchers who have a lot of Venn diagram overlap, but you don't necessarily, as you say, cluster to a formal group. But give our listeners an idea of the big research questions you're asking, the big problems you're trying to solve, what gets you up in the morning?

Rich Caruana: Sure. Yeah, there's a group of us who don't necessarily cluster with other activities in the lab. First of all, I'm a machine learning researcher. I've been doing machine learning since grad school in the 90s. And I do a lot of machine learning for healthcare. So, machine learning where the problem of interest has to do with predicting your risk for certain illnesses, or their treatment. But I do a lot of other things as well. I do some stuff in machine learning that's just trying to make machine learning better. And then I work on some other applications, like machine learning being used for ecological purposes. For… you know, we do a lot of work in ornithology where we're trying to track birds and things like that.
Host: Let's talk a little bit more about machine learning and the models therein. There's historically been a trade-off between accuracy and intelligibility, or interpretability. Could you explain what that means and why it matters…

Rich Caruana: Sure, sure.

Host: …both to scientists and to us?

Rich Caruana: So, there are simple learning methods. You know, think of linear regression, logistic regression. These are very simple methods. They're applicable to many different things. They're pretty intelligible, because you can just sort of see, well, it's this weight times this feature, plus this weight times this feature. Unfortunately, there's a lot of complex models that can't be represented so simply. You really need a much more complex function of the features. And that would be modern methods like boosted trees, random forests, neural nets, now deep neural nets. So, these things generate much more complex functions, and unfortunately, the penalty you pay for the extra accuracy is, they're much harder to open up and understand. I mean, a modern, deep neural net can be hundreds of layers deep, and it can have tens or hundreds of millions of weights inside the model, all of which have been tuned by the learning process on some data set that might itself have, you know, a million or ten million examples in it. You're never going to wrap your head around ten million or a hundred million weights. And yet, all of these things work together to represent the function that the thing has learned and is going to use to make predictions. So that's much harder to understand than linear regression, which was just a single weight times one of the features, plus another weight times another feature. So, in some domains, if you don't have much data, or if the problem's not very hard, linear or logistic regression is perfectly adequate. But there are lots of problems that we're working on now, since we have so much data, where you really need something that's much more complex, much more accurate. And unfortunately, you tend to lose intelligibility when you use those models. And there's a lot of work being done now in trying to figure out, how do we open up these black boxes to get some idea of what's going on inside?
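To make that contrast concrete, here is a minimal, hypothetical sketch in Python, using scikit-learn and a bundled dataset purely as a stand-in for "some tabular data": the intelligible model is a short list of readable weights, while the more flexible boosted ensemble spreads what it has learned across thousands of tree nodes.

```python
# Hypothetical illustration (not from the interview): the same data fit with an
# intelligible linear model and with a boosted tree ensemble, using scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

data = load_breast_cancer()                      # stand-in for "some tabular data"
X = StandardScaler().fit_transform(data.data)
y = data.target

# Intelligible model: one readable weight per feature ("this weight times this feature").
simple = LogisticRegression(max_iter=1000).fit(X, y)
for name, weight in zip(data.feature_names, simple.coef_[0]):
    print(f"{name:25s} weight = {weight:+.3f}")

# More flexible model: hundreds of trees and thousands of split rules,
# with no single place to read off what it has learned.
flexible = GradientBoostingClassifier(n_estimators=300).fit(X, y)
n_nodes = sum(tree[0].tree_.node_count for tree in flexible.estimators_)
print(f"Boosted ensemble: {flexible.n_estimators} trees, {n_nodes} decision nodes in total.")
```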
Host: Let's talk about the black box nature of machine learning models. And um, it's been a given in the industry for a couple of different reasons. Why would we run into the black box?

Rich Caruana: Yeah, no, that's a good question. So, if you think back thirty, forty years ago in AI, when we were building things like expert systems, these were human-engineered systems that were trying to do something very intelligent. But they were being crafted by humans. And because of that, you could actually understand exactly how they were supposed to work. Most AI now is based on machine learning. And machine learning, the way it works is, you collect a big data set, possibly very complex. It might have ten thousand features in it and a hundred thousand records in it, or ten million records in it. You would have no idea how to manually craft a very accurate rule for this sort of problem domain. So instead, what we do is, we give that data set to what is, essentially, a statistics engine that sits there and looks for every pattern it can find in those examples that seems relevant for what you want to predict. And then it ends up learning this complex function that captures all of those statistical regularities that are in the data set. Now the problem is, you didn't understand all the statistical regularities in the data set to begin with. And it's not like the machine learning told you what it was doing when it trained the model. It just, sort of, translates from, here's, uh, ten million records in a database, to here's ten million weights in a model, and you kind of have to use the model to make predictions. So, there's really not been much opportunity to understand what's been captured by the machine learning process. In fact, it's kind of a miracle that it works so well. They learn things, in some cases, that look like things that are going on inside human brains. In other cases, they learn patterns that we didn't know about, but once you see the pattern, you realize that's incredibly useful, new knowledge. In other cases, they learn the standard things, which you would have hoped that they would learn. And they maybe just learned it better than humans know those statistics. And then in a few cases, unfortunately, they learn things which are true patterns in the data, but they're mistakes for how you're going to use the model. And that's where you'd love to be able to see what's going on inside these models. One, so you can learn from the good things that they've figured out, because there might be, like, new science, new discoveries, hidden in the data. But also, you'd love to be able to find the smaller number of things where they've made some mistakes that could make the model less accurate, or possibly even dangerous, if you were to deploy it. So, you'd love to be able to see what's going on inside the models, to find these mistakes, and then have a chance at fixing them.

Host: I know that we've talked about the distillation model approach to solving some of these problems. And you've added transparency to that. Can you unpack that a bit?

Rich Caruana: Right. So, we tend to use transparency to mean that it's a model that we really can understand, like we can open it up. Instead of a black box, it's like a clear box, or something like that.

Host: Lucite box.

Rich Caruana: Yeah, yeah, yeah, exactly. Lucite box. I like that. So, transparency tends to mean we can see what's going on inside, and you've got to be careful with this. So, you know, a neural net that has a hundred million weights in it – if you give me all those weights and you show me how they're connected, I could, painfully, slowly, simulate exactly what it's doing. It's not that we don't know what's going on inside. I mean, we write the code for these things, so we know exactly what function is being computed. It's just not transparent in the sense that it's being represented in a way that's not a good match to how humans think. So, we think of it as not being transparent just because it's not practical for us to understand it. And then what we do is we try to, somehow, open up that box. There are lots of cases where you can't do that. And instead, what we sometimes do is we… we've started training an intelligible model, something that's not as complex as a neural net, to mimic the neural net. So, if we can train an intelligible model to mimic what was learned by the neural net, and then open up the intelligible model, that could tell us something about what's going on inside the neural net. Maybe we'll still use the neural net at runtime to make predictions. But this gives us a sort of window into the black box, into that complex neural net. And maybe we can understand that.
(music plays)

Host: You mentioned something about the proprietary nature, or intellectual property, as the other reason there's a black box. What's that about?

Rich Caruana: Right, so sometimes it's not the case that the black box model is super complex, like some massive neural net. Sometimes, it's just a black box because it's protected by IP. So, many people will have heard of this model that is used for recidivism predictions. So, this model was created by a company, and the model is a pay-for-use model. And the model is just not something that's known to us, because we're not allowed to know. By law, it's something the company owns, and the courts have, several times, upheld the right of the company to keep this model private. So maybe you're a person who this model has just predicted is at high risk of committing another crime, and because of that, maybe you're not going to get parole. And you might say, "Hey, I think I have a right to know why this model predicts that I'm high-risk." And so far, the courts have upheld the right of the company that created the model to keep the model private and not to tell you in detail why you're being predicted as high or low risk. Now, there are good reasons for this. You don't necessarily want people to be able to game the model. And in other cases, you really want to protect the company who went to the expense and risk of generating this model. But that's a very complex question.

Host: Let's switch over to the medical field right now. About this same issue of collecting data, giving you a result that is, like, anyone who knows anything would look at that result and say, "That can't be right!" Tell us that story.
Rich Caruana: So, back when I was a graduate student, my advisor, Tom Mitchell, who's well-known in the machine learning community, asked me to train a neural net on a pneumonia data set that he was working with, with a number of other colleagues. So, I trained a neural net on this data, and I guess I got lucky. It turned out the neural net I trained was the most accurate model anyone could train on this data set. And then there was a question, you know, would it actually be safe for us to go ahead and use this neural net on real patients? And I said, "You know, I don't think we should use this neural net on real people. And that's because it's a black box. We don't actually understand what it's doing." And what had me concerned was a friend who was at another university, who was training on the same data set, but he was using rule-based learning, and he learned a rule one night that if you have a history of asthma, it lowers your chance of dying from pneumonia. That is, asthma seems to be protective, you know, for pneumonia. And he's like, "Rich, what do you think that means?" And so, we took it to the next meeting. There were real MDs involved in this project. And the MDs said, "Wow, we consider asthma to be a serious risk factor for people who now have pneumonia, so we don't want your model to predict that asthma's good for you. That's a bad thing." But we can kind of see how that might be a real pattern in the data. You know, asthmatics are paying attention to how they're breathing. They probably notice the symptoms of pneumonia earlier than most people would. You know, they have a healthcare professional who probably treats their asthma, so they're already plugged into healthcare. So, they're going to get an appointment, and they're going to get earlier diagnosis that, in fact, what they have is not an asthma-related problem, but that they've got pneumonia. And because they're considered high-risk patients, they're actually going to get really high-quality, you know, aggressive treatment. They're going to get it sooner than other people because they're paying more attention. And it turns out, if you've got an infection, there's nothing better than getting, like, rapid diagnosis and rapid treatment. So, it turns out the asthmatics actually have almost half the chance of dying compared to the non-asthmatics. It's not because the asthma is good for them. But it's because they get to healthcare faster and they get to treatment faster. And the problem is that the rule-based system learned this rule that asthma is good for you. I assume the neural net that I trained on exactly the same data learned the same pattern, that asthma looks good for you. But if we're going to use that model to go out and intervene in your healthcare, and we deny the asthmatics the sort of rapid, high-quality care that made them low-risk, then we'll actually be, possibly, harming asthmatics. And what I told them was, I said, my neural net probably has this asthma problem in it. That doesn't worry me, because I can probably figure out how to make that problem go away. What really scares me is, what else did the neural net learn that's similarly risky? But the rule-based system didn't learn it, and therefore I don't have a warning that I have this other problem in the neural net. And now, flash-forward to the present. We now have a new class of interpretable models that we've been developing here at Microsoft for the last seven years that is just as accurate as this neural net. So, this trade-off between accuracy and intelligibility, for certain kinds of problems, now is gone. We can get all of that accuracy out of this new kind of model. But it's just as easy to understand as that rule-based system.

Host: As linear regression?

Rich Caruana: As linear regression. In fact, in many ways it's better than linear regression from an interpretability point of view. And it turns out, when we trained the model on that data set, it of course learns that asthma is good for you, as we expected. That was the first thing I checked. It learns that heart disease and chest pain are good for you.

Host: Because you also get quick, aggressive treatment.
Rich Caruana: Exactly. If you had a heart attack, say a year or two ago, and now you wake up in the morning and your breathing feels a little funny, your chest doesn't feel quite right, within an hour, you're in an ambulance or you're at the ER. And even at the ER, you get pushed to the head of the line. You know, you're on a bed really fast, and they're trying to figure out if you're having a heart attack and what they should do. And then they have good news. It's like, "Oh, you're not having a heart attack. It's just pneumonia!" And then, because you're still at higher risk if you've had heart disease in the past, they'll give you very aggressive treatment again. So, it turns out, the model thinks that heart disease is even better for you than asthma, and that's because the heart disease patients are the ones who get to the hospital fastest if they're having a problem. And then they get high-quality treatment, and that treatment is usually effective.

Host: OK, so tell us about the model you're referring to that's "fixed" this problem.

Rich Caruana: Sure. So, what we've done is we've taken a class of models called generalized additive models. These are often called GAMs. Statisticians invented these in the late 80s. So, these are models that predate a lot of machine learning, and these models were created in part because statisticians wanted models that would be, you know, complex enough to fit the data, but interpretable. But statisticians tend to fit models with a lot of emphasis on smoothness and simplicity. Like, a statistician's basic way of thinking is, "I will not add that term to my model unless I'm 95 percent confident it really has to be there."

Host: Right.

Rich Caruana: Right? So, these are the people who brought us t-tests and confidence intervals and things like that. And they take them very seriously. What we've done now is, we've taken this class of generalized additive models that statisticians invented long ago, and we're now using, kind of, machine learning steroids. You know, modern computer science, modern machine learning, to fit them in a way where they don't have to be as simple and smooth as they used to be. By letting them be more complex, we're able to get a lot more accuracy out of them than statisticians were getting out of them. And then, by getting more accuracy out of them, they're turning out to be more interpretable, because they're just doing a better job of modeling the real phenomena in the data. It's just the same class being done a new way that is making them reach levels of performance, both accuracy and interpretability, that they hadn't achieved before.
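Here is a toy sketch of the idea (squared-error regression for simplicity, with hypothetical helper names, and not Microsoft's actual implementation): an additive model in which every feature gets its own shape function, fitted by cyclically boosting tiny one-feature trees, so each feature's contribution can be pulled out and inspected on its own.

```python
# Toy sketch of a boosted generalized additive model (GAM): the prediction is a sum
# of per-feature shape functions f_1(x_1) + f_2(x_2) + ..., each one built from many
# tiny trees that only ever see their own feature.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_gam(X, y, n_rounds=200, learning_rate=0.1):
    """Fit y ~ f_1(x_1) + f_2(x_2) + ... with cyclic squared-error boosting."""
    n_samples, n_features = X.shape
    shape_functions = [[] for _ in range(n_features)]   # the trees that make up each f_j
    prediction = np.zeros(n_samples)
    for _ in range(n_rounds):
        for j in range(n_features):                     # round-robin over features
            residual = y - prediction                   # what the additive model hasn't explained yet
            tree = DecisionTreeRegressor(max_depth=2)   # depth-limited, sees feature j only
            tree.fit(X[:, [j]], residual)
            prediction += learning_rate * tree.predict(X[:, [j]])
            shape_functions[j].append(tree)
    return shape_functions

def shape_function(shape_functions, j, grid, learning_rate=0.1):
    """Evaluate feature j's learned contribution on a grid of values, e.g. to plot it."""
    grid = np.asarray(grid, dtype=float).reshape(-1, 1)
    return sum(learning_rate * tree.predict(grid) for tree in shape_functions[j])
```

Opening up a model like this just means plotting each feature's shape function, for example predicted risk versus age, or versus a history-of-asthma flag, which is how surprising terms like the asthma and heart-disease effects become visible and, if necessary, correctable.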
Host: I'm going to kind of go in different directions here for a second. But part of this is about the correlation-causation problem. Talk about that a little bit and how the research you're doing can help us, maybe, overcome some of these errors that we get.

Rich Caruana: Sure. And this correlation-causation question is the fundamental question in the kinds of models we're talking about. When we say having asthma, having heart disease, are good for you, those are true statistical patterns in the data. Like, it really is true that the patients with heart disease have a better chance of surviving.

Host: But not because of the heart disease…

Rich Caruana: But not because of the heart disease. It's because of the way those patients are treated in society. It's because the patient is worried. They take it seriously. The patient gets to healthcare, and then healthcare takes it very seriously, so they get very rapid treatment. We would love the model to be learning the causal effect. We would love the model to figure those things out. Suppose our model, even that neural net, was going to be used by an insurance company just to sort of decide what your chance of living or dying is – because if you're going to live, then they have to keep money in the bank to pay for your healthcare next year – well, that company might find it very interesting that heart disease patients actually have less chance of dying from pneumonia, and presumably more chance now of dying from something else. But an insurance company, since they're not going to intervene, they're less concerned about the causal question. They're just interested that it's predictively accurate. It's only when you're going to intervene in somebody's healthcare and possibly change things like time-to-care, aggressiveness of treatment, whether they're hospitalized or not hospitalized. It's only when you're going to do that that you're really interested in the causal questions.

(music plays)

Host: You talked to me about the "teacher-student" model and mimicry in machine learning and some of the new things that you're discovering. Can you talk to our listeners about that too?
Rich Caruana: Sure. Sure. So, um. Let's see. I'll go way back. A grad student at the University of Wisconsin-Madison in the early 90s, about the time I was doing my PhD, did this interesting thing where he trained a decision tree to mimic a neural net, and that's because a neural net wasn't very easy to understand, but a decision tree could be understood. About fifteen years ago, when I was at Cornell, we ended up using this trick for a different purpose. So, there we were training these massive ensembles of models. And ensemble just means that we have a thousand models, all of which vote to make a collective decision. And that often makes the models more accurate. The ensembles were so big they were too slow to use for many purposes. So, we wondered, is there a way that we could make them faster? And then what we did was we used this mimicry trick. We would train a small, fast model to mimic the predictions made by the very big ensemble. And to our surprise, we could often get that to work. That is, we could get a small model to be just as accurate as that massive ensemble. And then the small model would be a thousand times faster and a thousand times smaller. So that was great. And then, coming to the present, these deep neural nets just sort of keep getting deeper and bigger. When they were just three or four or five hidden layers, they weren't too crazy big or expensive to run. But then they started getting to be twenty layers, and then fifty layers, and then a hundred layers. They got to the point where they were massive. It turns out the deepest neural nets just wouldn't run fast enough to be able to do this. So, we started doing work where we would compress these very deep neural nets into shallower neural nets. So, we could actually… You know, in the lab, over the last month, we would train that monster deep neural net, which was super accurate. And then the following month, we would figure out how to compress it into a much smaller neural net that was going to be much faster, and would be something fast enough that we could actually run it in production and use it, and still get most of the accuracy of the monster model that was too big for us to use. So that's a trick that people now use to make neural nets – deep, very big deep neural nets – be more practical.

Host: Does it have a name?

Rich Caruana: I called it model compression. And Geoff Hinton, another well-known researcher in deep learning, called it distillation. I like to use the word compression when our goal is to make the model smaller and faster, but I like to use the word distillation when our goal is to take what was learned by Model A and somehow put it into Model B. So, I like Geoff Hinton's term distillation there. And now what we're doing is we're using distillation to make black box models more interpretable. And we do exactly this mimicry trick. So, you have trained some monster deep neural net. And what we'll do is, we'll try to train some form of interpretable model, like the models that we've been working on for the last seven years, to mimic that deep neural net. And then we open up our more interpretable model, because it's easy to see what's inside of it, and it can tell us a lot of things about what's going on inside that big black box model. It's not a perfect technique yet, but it… so far, it seems to work surprisingly well. Again, better than we expected.

Host: So, you're calling the big model the teacher, and the little model the student. And the student is learning to mimic the teacher.

Rich Caruana: That's exactly right. So, it's sometimes called teacher-student training. Mimicry, I think, is a good way to describe it.

Host: Well, it's kind of an apprentice or a mentor.

Rich Caruana: Yes, yes, yes.
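In code, the teacher-student trick is short. Here is a minimal, hypothetical sketch with scikit-learn stand-ins: a large random forest plays the opaque, expensive teacher, and a small decision tree is the student trained to reproduce the teacher's soft predictions rather than the original labels.

```python
# Minimal, hypothetical sketch of teacher-student mimicry with scikit-learn stand-ins:
# a big random forest is the opaque "teacher", a small tree is the "student".
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=20000, n_features=20, random_state=0)

# Teacher: large, accurate, slow-ish, hard to interpret.
teacher = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

# Soft targets: the teacher's predicted probabilities carry more information
# than the hard 0/1 labels (how confident it is, and where).
soft_targets = teacher.predict_proba(X)[:, 1]

# Student: small and fast, trained to reproduce the teacher rather than the raw labels.
student = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, soft_targets)

# The student can stand in for the teacher at prediction time (compression),
# or be opened up to see roughly what the teacher has learned (distillation).
agreement = ((student.predict(X) > 0.5) == teacher.predict(X)).mean()
print(f"Student agrees with the teacher on {agreement:.1%} of the training points.")
```

The same recipe works whether the goal is compression (a faster student) or interpretability (a student simple enough to open up); in practice the student is often trained on large amounts of additional data scored by the teacher.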
(music plays)

Host: Let's talk about MSR and the nature of research that's going on here. You and I talked a bit about the serendipity of being able to have some academic freedom, and the discovery process that comes out of that in sometimes surprising ways. Talk about the climate here, the nature of interdisciplinary research among researchers and teams and so on, and what you like about it.

Rich Caruana: Sure. So, I came to Microsoft Research, oh, maybe eight years ago. And, you know, I had to figure out, well, what are the things I want to work on here at MSR? And as an individual contributor, I have a lot of freedom in deciding what to work on. And one of the things that had always been bugging me was this problem I had run into in the 90s, where I trained the most accurate model on a medical data set, but then I was the one who said, no, we shouldn't use it. And that always kind of bothered me. It's like, if I'm going to do machine learning in healthcare, you know, I have to solve this problem so that we can use these very accurate models, you know, safely. So, I had some crazy idea, and I got an intern from Cornell, where I had been a professor, and explained the crazy idea. And the intern thought, "Oh, that's a cool idea. Let's work on that." And it just turned out to be better than we thought. And so now I've been working on this, say, seven years, and at some point, my boss even said, "Oh, are you still working on that?" And I said, "Yeah, I think it's important. I think it's valuable." And then fairness and transparency, out in the real world, suddenly became very important. And it turned out that we had what is maybe the best algorithm for being able to understand whether your model is biased, and for trying to fix that bias. So, because of that, it's suddenly gotten to be very important in the fairness and transparency community. And now I have two more interns working on it.

Host: Unbelievable.

Rich Caruana: And, you know, we're trying to get some developer resources to make better quality code so that we can release the code to the world. So, it's one of these things where, you know, a problem I stubbed my toe on in the late 90s seems to be bearing fruit, and now it's actually of much more interest than I would have imagined.

Host: In an area that you might not have imagined.

Rich Caruana: Exactly. So, in fact, one of the interns, Sarah, is an expert in fairness and transparency in machine learning. She also does machine learning for healthcare. But she has a lot of expertise in the fairness and transparency community. So now, you know, it could be that half of our research is now these sorts of models for fairness and transparency. And Microsoft is very interested in these questions about, how do we make sure that the models we're training and using aren't biased? Turns out, this is of growing importance and value in the company. And it's all, you know, kind of accidental…

Host: Following the path of the research line.

Rich Caruana: Exactly. Exactly. Not every project I do, by the way, you know, is so successful. Sometimes I have crazy ideas and they just turn out, eventually, to be crazy ideas. And I've been lucky in that I've had a lot of freedom, here at MSR, to sort of explore these different ideas. And, you know, a good number of them pan out. But sometimes they pan out in ways that we just didn't anticipate when we started them. And that's great. I mean, you know, good science often does that.

Host: I ask all the researchers that come in – you've actually already kind of answered what keeps you up at night. But is there anything else that you're thinking of, even now, that might fall into that same category?
Rich Caruana: Right. So, there is one thing, even related to this project, which falls into that category, which is that data can represent patterns that are true patterns in the world, but they're actually not good patterns for the way you plan to use the model. And sometimes, in some domains, the risk is low, and the value of accuracy is such that you're willing to go and do this, especially if you have a strong belief that the test set really does look like the real world you're going to deploy the model into. But in other domains, you know, healthcare for example, this is clearly risky. The cost is high, and the odds of existing healthcare having fooled your model in some ways, like by making it think that asthma and heart disease are good for you…

Host: Good for you.

Rich Caruana: …you know, are pretty high. Because healthcare is a well-tuned, large system that's been running for decades. So, they have a lot of expertise. And all the data you're going to get has been affected by them. So that means your model is going to learn some things that are wrong. And high accuracy on a test set isn't safe. And that's because the test set looks just like the training set. So, it actually gets rewarded with extra-high accuracy. So, you might look at the model and say, wow, it's superhuman accuracy, we should go use it. But it's superhuman because it's cheating a little bit. And if we actually fix these things, the model is now less accurate on the test data. But we believe it would be more accurate, and safer to use, in the real world. And that's because we don't think the test data is actually a perfect match for what the world is going to look like when we deploy it.

Host: Well, these are things to keep you up at night and not necessarily to give you nightmares, but to give more research opportunity to the next generation. As we close, do you have any parting wisdom, parting advice, for your 25-year-old self?

Rich Caruana: Oh, oh, that's interesting. Um. One thing that's good is, there used to be this saying back in the early days of AI: "You should always work on the part of the system that worries you the most, because there's a good chance that that's the fundamental bottleneck in the long run. And if you don't solve it early, it's just going to come back and haunt you later on." And I think in research, I've found, the thing that bothers me the most is often the best thing for me to work on. Someone once told me, it's just as hard to work on a hard problem as it is to work on an easy problem. So why not work on the hard problems? Like, why not work on ones that really matter? And I like to make sure that at least some of the things I'm doing are things that really worry me, and that I don't really have a good answer for. And every now and then you get lucky and you hit one of those out of the ballpark. For most of us, that only happens, you know, if we're lucky, once in our career, maybe twice in our career. So, it's really nice to make sure you're working on some things which just seem sort of fundamental and hard and challenging, and you're just not sure that there's any immediate progress or payoff that's going to come from it. So, I definitely encourage people to have a mix of things they're working on. At least one or two that they think are sort of fundamental.