Transcript
Doug Burger: These are people who just have the scientific and engineering discipline, but also that deep artistic understanding. And so, they create new things by arranging things differently in this bag. And when you see a design that's done well, it's beautiful. It's a beautiful thing.
Host: You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga.

Some of the world's leading architects are people that you've probably never heard of, and they've designed and built some of the world's most amazing structures that you've probably never seen. Or at least you don't think you have. One of these architects is Dr. Doug Burger, Distinguished Engineer at Microsoft Research NExT. And, if you use a computer, or store anything in the Cloud, you're a beneficiary of the beautiful architecture that he, and people like him, work on every day.

Today, in a fast-paced interview, Dr. Burger talks about how advances in AI and deep machine learning have placed new acceleration demands on current hardware and computer architecture, offers some observations about the demise of Moore's Law, and shares his vision of what life might look like in a post-CPU, post-von-Neumann computing world. That and much more on this episode of the Microsoft Research Podcast.

Host: Doug Burger, welcome to the podcast today.

Doug Burger: Thank you, great to be here.
Host: We're in for an acronym-rich recording session, I believe. I assume that our audience knows more about your world than I do, but since my mom listens, I'm going to clarify a couple of times. I hope that's okay with you.

Doug Burger: Whether it is acronym-heavy is TBD, sorry.
Host: And you are an SME. You are actually a Distinguished Engineer for MSR NExT. Tell us what that is.

Doug Burger: The title is a standard Microsoft title for somebody that is in a technology leadership position. MSR NExT is a branch of MSR, or an organization within MSR. So, when you then think about MSR, we have a lot of diversity. We have geographic diversity. We have disciplinary diversity. And we have some organizational diversity. And so, NExT is just a different organizational structure that tends to produce different outcomes. That's how I like to think of it.
Host: So, you do research in computer architecture. Give our listeners an overview of the problems you're trying to solve. What gets you up in the morning in computer architecture research?

Doug Burger: Great question. Formally, computer architecture is defined as the interface between hardware and software. An architecture is what the software is able to see of the hardware. So, if I build a chip, and a system, and a disk drive, and some memory, all of those things, they're just dead hardware until you do something with them. And to do something with them, you have to have a way to talk to them. And so, all of these things expose a way to talk to them. And that really is the architecture. You can think of it as the language of the computer at its very lowest level.
Host: Machine language?

Doug Burger: Machine language, right.
Host: And how that translates from people into the transistor.

Doug Burger: That's exactly right. There are actually a lot of layers in between the person and the transistor. And the architecture that I just described is one of those layers, more towards the bottom, but not at the bottom.
Host: Right. Speaking of the stuff that's at the bottom, transistors and devices and things like that, we've experienced a very long run of what we call Moore's Law.

Doug Burger: Yes, it's been wonderful.
Host: Which is transistors get smaller, but their power density stays constant. And some people, including you, have suggested that Moore's Law is probably coming to an end. Why do you think that is?

Doug Burger: Let me start with a little bit of context. At its heart, all of this digital computing we're doing is built on switches and wires. It's a light switch. You flick it, it turns on. You flick it, it turns off. That's really what a transistor is. What we do is we start with a wire and a transistor, which, remember, is just a switch. It's a fancy name for a switch. And then we start putting them together. We put a few more transistors together, and then you have a logic gate. Something that can say "and/or/not," right? Take a zero and turn it into a one, that's a NOT; two ones together become a one, and a one and a zero together become a zero, that's an AND. Just the very basic universal logic functions. Long ago, in the time of Babbage, we were building computers, that really didn't work, out of mechanical relays; then we had vacuum tubes, which at least were electronic. And then of course, the transistor was invented, and the integrated circuit. And these were a really big deal, because now it allowed you to miniaturize these things on a chip.
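To make that gate behavior concrete, here is a minimal sketch in Python. It is purely illustrative: the function names are invented for this example, and real gates are transistor circuits, not function calls.

```python
# Purely illustrative: modeling gates as tiny functions over bits (0/1).
def not_gate(a: int) -> int:
    # Take a zero and turn it into a one (and vice versa).
    return 1 - a

def and_gate(a: int, b: int) -> int:
    # Two ones together become a one; a one and a zero become a zero.
    return a & b

def nand_gate(a: int, b: int) -> int:
    # NAND is universal: any other logic function can be built from it.
    return not_gate(and_gate(a, b))

for a in (0, 1):
    for b in (0, 1):
        print(f"a={a} b={b}  AND={and_gate(a, b)}  NAND={nand_gate(a, b)}  NOT a={not_gate(a)}")
```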
Host: Sure.

Doug Burger: And then what Gordon Moore did, in his seminal paper in the mid-sixties, was point out that he expected these things to double in density, to be able to fit twice as many of them on a chip, on an integrated circuit, every year and a half or two years. And then he revised the law in 1975, and we've been doubling the density every couple of years since then. For 50 years. It's been a 50-year exponential, where the cost of computing has been dropping exponentially for 50 years.
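A quick back-of-the-envelope check of that 50-year exponential, assuming one density doubling every two years:

```python
# Assumption for illustration: one density doubling every two years, for 50 years.
years = 50
years_per_doubling = 2
doublings = years // years_per_doubling            # 25 doublings
density_factor = 2 ** doublings
print(f"{doublings} doublings -> ~{density_factor:,}x the transistors per chip")
# 25 doublings -> ~33,554,432x the transistors per chip
```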
Host: Sure.

Doug Burger: And that's what's given us this crazy society we have with, you know, the internet and computational science, and sequencing genomes for incredibly cheap compared to what they were. It's nuts.
Host: And so, are we nearing the end of this era?

Doug Burger: So, I get asked this a lot, and there are two versions of Moore's Law. There's the actual one that he published, which talks about the rate at which chips get denser. And then there's what I call the New York Times version, which is something loosely associated with an exponential in computing.
Host: Right.

Doug Burger: Like, performance is growing exponentially or…
Host: Well, and I… that's what we hear, right?

Doug Burger: …that's right. That's right, that's what people think. So, now, if you're on the precise version, which is where I am, because that's my field: the rate of that density increase has slowed down for the first time in 50 years. So, it hasn't stopped, but it's noticeably slowed. Now, maybe the chip manufacturers will pick the cadence back up again, and then I'll issue a mea culpa and say I was wrong. But I think we're in the end game. I think we're in a period right now where it's slowed. It may be that we're on a new cadence that is just slower, or it may be that the length of time between each new generation of semiconductor technology lengthens out. And the problem really is that we're running up against atomic limits.
Host: I was just going to say, you've defined these, you know, size limits in terms of atoms.

Doug Burger: Right.
Host: And how small they are now; if you halve it again and halve it again, we're pretty soon down to a couple of atoms.

Doug Burger: That's right.
Host: I can't even comprehend how small that is.

Doug Burger: Yes. And now, people have built single-atom or single-electron transistors, right? So, it's not a question of can we; it's more a question of can you build a chip that has 10 billion of these on it economically. It's really a question of economics, because we know we can build smaller devices, but can you make them work and sell them cheaply enough to actually be worth doing? The other thing to note is that, as we started to run into atomic limits when we were scaling down the transistors, the old standard, very regular structure of a transistor wasn't possible anymore. I mean, it used to be a very simple thing that had, you know, three bars of material with something sprayed on top, and wires connected to some of those terminals. Now, these things are actually very intricate 3-D structures. So, they're just getting more and more complex as we try to scale them down, and of course, that's to control quantum tunneling of electrons, so we're not leaking elec… when the switch is off, it should be off. And there've been challenges all along. And we always find interesting ways to deal with them, and people are doing incredibly creative things now. It's just getting really hard.
Host: I suspect, even as you look at this complexity of the transistors getting smaller and smaller, that you're looking at other ways to keep the progress…

Doug Burger: Absolutely.
Host: So that's what I want to ask you next: let's just assume that you're right. What are you doing in a parallel universe to keep the progress of computing power and speed and efficiency moving forward?

Doug Burger: We're now in an uncertain era where, you know, in one generation your cost gains might be smaller than expected, or your power gains might be smaller; you might not get any speed, and you're trading these things off. There are still lots of advances going on in the other parts of computing: things like memory density, flash memory, network speeds, optics… So, there are all the ancillary parts. People are also working on new types of computing. To try to bucketize what some of these might be: programmable biology is a super exciting one, right? Like, DNA is a computer. DNA is a stored program that can actually replicate itself.
Host: Yeah.

Doug Burger: But there's a program encoded in it, and there are lots of rules about it. And so, we're actually getting better at understanding those programming languages. They're more statistical, in that you're dealing with, you know, stuff interacting in a probabilistic way. So that's a whole different paradigm for computing that 50 years ago we didn't even know existed, and now we're actually starting to leverage. There's another one, which you could think of as just digital computing that's not silicon-based. So, people have been looking at carbon nanotubes, different materials… None of it looks very close to me. It looks like we're kind of 10 years away from any of it getting competitive. And of course, silicon has had so much investment.
Host: Right.

Doug Burger: Like, you know, hundreds of billions of dollars, if not trillions. Something that always worries me as a researcher, and I know I'm jumping around a little bit here, is this: if you're on technology X, and then there's technology Y that is not only better but will take you farther, but you don't get Y started early enough, X gets advanced far enough that the amount of money you need to bootstrap Y is just too high, and it never happens.
Host: Absolutely.

Doug Burger: Right, and so that's going to be really interesting to see play out when we think about silicon and post-silicon technologies. It may be that there are magical, much better things out there that we'll never achieve because we've invested too much in silicon.
Host: And that seems to be one of the drivers: the cost of something. I mean, if you made a compelling case for something and it was cheap, people would adopt it.

Doug Burger: Absolutely, yeah. So, these computing systems we have have become exponentially cheaper for many decades. They are also very general, you know, they do everything. And that's based on something called von Neumann computing.
Host: Right.

Doug Burger: And that's a paradigm, you know: you write software, it's run on a CPU. That's kind of the paradigm we've been with for a very long time. And as the silicon gains are slowing, the gains you get from that paradigm are also slowing, and so we're starting to see even the digital computing ecosystem fracture and diversify, because of the huge economic need to do more. Let me roll back a minute and get to the other bucket. So, there's neural computing then. These are also programmable (albeit learning) machines, and they're incredibly interesting. I mean, just profoundly interesting. We don't really understand how and why they work yet, despite all the progress we've made in digital AI.
Host: Yeah.

Doug Burger: There is something super magical there that may not even be understandable at the end of the day. I hope it will be; we don't know. And then of course, there's chemistry and… So, there are just all these other ways to compute. And of course, the nice thing about the paradigm we've been on is, all of the levels are deterministic. You know exactly what they are capable of; what they can express is bounded but very powerful (Turing complete, if you are a computer scientist), and so it's tractable. Each layer hides the complexity underneath, presenting you with a relatively simple interface that allows it to do all its wonderful stuff. And now things are getting more complex. And interesting, but also harder.
Host: Well, and I, as, you know, a non-scientist here, look at the simplicity of the interface, and the underneath part is intimidating to me in terms of trying to get my brain around it. But like you say, when you unpack what is in the box, people go "ah hah," you know, I don't have to ignore that man behind the curtain anymore.

Doug Burger: That's right.
Host: I am that man behind the curtain.

Doug Burger: Pay no attention… I didn't say that.
Host: Suddenly, I join the technopoly.

Doug Burger: That's right. Yeah, let me make a comment on that. These things are both incredibly simple and incredibly complicated at the same time. But they've got interfaces that are very clean, and the concepts are pretty simple: switch, AND gate, adder, add two numbers, binary arithmetic, right? It's just math with zeros and ones instead of zeros through nines. But then the number of things we do to optimize the system… it's insane. And the complexity of these things… they're some of the most complex things humans have ever built. I mean, you think about five billion switches on one chip; it's a small postage-stamp-size thing with five billion switches organized in a way to make stuff go really fast. I mean, that's amazing.
Host: Simon Peyton Jones said that computers, and software, and architecture are some of the most amazing structures people have ever built.

Doug Burger: They are amazing structures. And in the architecture field, when you're designing one of those interfaces and then deciding what you put in the chip underneath to support that interface, the cool thing is, it's unlike software, where it's much more open-ended… I mean, to me, when I write software, I feel too free. There are no guardrails. I can do anything. I have too much freedom. And when you're doing the hardware, you have a budget. You have an area budget. I have this many square millimeters of silicon. And you have to decide what to fill it with, and what the abstractions are you expose, and how to spend on performance optimization versus features. If you want to put something else in, you have to take something out. So, your bag is of a finite size, and you're trying to figure out how to fill up that bag with the right components, interconnected in the right way, to allow people to use it for lots of stuff. You want it to be general, you want it to be efficient, you want it to be fast, you want it to be simple for software to use… And so, all of those things you have to balance, so it's almost like an art rooted in science. There are a small number of people, and I don't count myself among them, who are the true artists. You know, Jim Keller is a very famous one who is active in the area. Bob Colwell, retired from Intel. Uri Weiser, also from Intel. I mean, these are some of the more recent examples, but these are people who just have the scientific and engineering discipline, but also that deep artistic understanding. And so, they create new things by arranging things differently in this bag. And when you see a design that is done well, it's beautiful. It's a beautiful thing.
(music plays)

Host: Let's talk about a project that you co-founded and co-led called Project Catapult. Tell me about Project Catapult.

Doug Burger: Before I moved to Microsoft, I had started some research with one of my PhD students, Hadi Esmaeilzadeh, who is a professor now at the University of California, San Diego. And, at the time, the community was moving towards multi-core. And there was, not a global consensus, but definitely it was the hot thing, and people were saying, if we just figure out the way to write parallel software, we'll just scale up to thousands of cores. And Hadi and I really… I said, this is, you know, when everyone is buying, it's time to sell.
Host: Right.

Doug Burger: And so, we wrote a paper that ended up getting published in 2011 and got some pretty high visibility. It didn't coin the term "dark silicon," but it popularized it. And the observation was, because the transistors aren't getting more power-efficient, we can keep increasing the number of cores, but we're not going to be able to power them. So even if you have parallel software, and you drive up the number of cores, the benefits you get are much lower than you've gotten historically. And what that really said to me is that that's a great avenue, but we're also going to need something else. And so, that something else started to be specialized computing, where you're optimizing hardware for specific workloads. And the problem with that is that building custom chips is expensive. And so, what you didn't want is, say, a Cloud, where people are renting computers from us, to have 47,000 different types of chips, and to try to manage that and have that be your strategy going forward. And so, we took a bet on this hardware called FPGAs. Now, to your acronym soup: it stands for field-programmable gate array. What they are is programmable hardware. So, you do a hardware design, and you flash it onto the FPGA chip. That's why they are called field-programmable, because you can change them on the fly, in the field. And as soon as you're done using it, you can change it to something else. You can actually change them every few seconds. So, what we ended up doing was to say, let's take a big bet on this technology and deploy it widely in our Cloud, and we'll start lowering important things into hardware on that fabric, which is on a pretty interesting system architecture too. And then that's going to be our general-purpose platform for hardware specialization. And then, once you have hardware designs that are being baked onto your FPGAs, you can take some of them, or all of them, and then go spin those off into custom chips when the economics are right.
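A rough sketch of the dark-silicon arithmetic; the per-generation numbers below are invented for illustration and are not from the 2011 paper:

```python
# Invented numbers for illustration: each generation doubles transistor count,
# but switching energy per transistor improves by only ~1.4x instead of 2x.
POWER_BUDGET = 1.0       # fixed chip power budget (normalized)
transistors = 1.0
energy_per_transistor = 1.0

for gen in range(1, 6):
    transistors *= 2.0
    energy_per_transistor /= 1.4
    full_chip_power = transistors * energy_per_transistor
    usable = min(1.0, POWER_BUDGET / full_chip_power)
    print(f"generation {gen}: can power ~{usable:.0%} of the chip at once")
# The powered fraction shrinks each generation; the rest is "dark silicon."
```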
Host: Right, interesting.

Doug Burger: So, it's sort of a way to dip your toe in the water, but also to get a very clean, homogeneous abstraction to get this whole thing started. And then, while stuff is evolving rapidly, or if its volumes are too small, you leave it on the programmable fabric. And if it becomes super high scale, so you want to optimize the economics, and/or it becomes super stable, you might harden it to save money or to get more performance.
Host: So, there's flexibility there that you otherwise wouldn't have.

Doug Burger: Yeah, the flexibility is a really key thing. And again, the FPGA chips had been used widely in telecom. They're very good at processing streams of data flowing through, quickly, and for testing out chips that you were going to build. But in the Cloud, nobody had really succeeded at deploying them at scale, not as a prototyping vehicle for acceleration, but as the actual deployment vehicle.
Host: Well, so what can you do now post-Catapult, or with Catapult, that you couldn't do on a CPU or a GPU?

Doug Burger: Well, let me first say that the CPUs and GPUs are amazing things that focus on different types of workloads well. The CPUs are very general. And what a CPU actually does well is it takes a small amount of data called the "working set," sitting (for those of you who are architecture geeks, in its registers and level-one data cache) and then it streams instructions by them, operating on those data. We call that temporal computing. If the data those instructions are operating on are too big, the CPU doesn't actually work very well. It's not super-efficient. Which is why, for example, for processing a high-bandwidth stream coming off the network, you need custom chips like NICs. Because the CPU, you know, if it has to issue a bunch of instructions to process each byte, and those bytes are coming in at 12.5 billion bytes a second, you know, that's a lot of threads.
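The per-byte budget is easy to work out. The 12.5 GB/s figure is from the interview; the 3 GHz clock rate below is an assumption for illustration:

```python
# The 12.5 GB/s figure is from the interview; the 3 GHz clock is an assumption.
bytes_per_second = 12.5e9          # a 100-gigabit network link
core_clock_hz = 3.0e9              # hypothetical 3 GHz CPU core
cycles_per_byte = core_clock_hz / bytes_per_second
print(f"one core has ~{cycles_per_byte:.2f} cycles to spend per byte")
# ~0.24 cycles per byte: even a few instructions per byte demands many cores.
```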
Host: Right.

Doug Burger: So, what the GPUs do well is something called SIMD parallelism, which stands for Single Instruction, Multiple Data, and the idea there is you have a bunch of tasks that are the same, all operating on similar, but not identical, data. So, you can issue one instruction, and that instruction ends up doing the same operation on, say, eight data items in parallel.
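A small illustration of the SIMD idea using NumPy, which dispatches one vectorized operation across all eight lanes at once (conceptually similar to, though not the same as, GPU hardware SIMD):

```python
import numpy as np

# One (vector) operation applied to eight similar-but-not-identical data items.
a = np.arange(8, dtype=np.float32)              # eight "lanes" of data
b = np.linspace(10, 17, 8, dtype=np.float32)    # eight more values, one per lane
c = a + b                                       # one add, eight results in parallel
print(c)
```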
Host: Okay.

Doug Burger: So that's the GPU model. And then the FPGAs are actually a transpose of the CPU model. So rather than pinning a small amount of data and running instructions through, on an FPGA we pin the instructions, and then run the data through.
Host: Interesting.

Doug Burger: I call that structural computing. Other people have called it spatial. I mean, both terms work. But the idea is, you take a computational structure, you know, a graph of operations, and you pin it, and you're just flowing data through it continuously. And so, the FPGAs are really good for those sorts of workloads. And so, in the Cloud, when we have functions that can fit on a chip, and you want to pin it and stream data through at high rates, it works really well, and it's a nice complement to the CPUs.
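Here is a toy sketch of that pin-the-structure, stream-the-data idea in Python. The stages are invented for illustration; a real FPGA design would express this graph as hardware, not function calls:

```python
# Invented stages; a real FPGA design would pin this graph in hardware.
def make_pipeline(*stages):
    """'Pin' a fixed graph of operations; data then streams through it."""
    def run(stream):
        for item in stream:
            for stage in stages:
                item = stage(item)
            yield item
    return run

# The structure is fixed up front...
pipeline = make_pipeline(lambda x: x * 2, lambda x: x + 1, lambda x: x % 7)

# ...and the data flows through it continuously.
for result in pipeline(range(5)):
    print(result)   # 1, 3, 5, 0, 2
```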
Host: Okay. So, Catapult is…?

Doug Burger: Catapult is our project code name for Microsoft's large-scale deployment of FPGAs in the Cloud. It sort of covers the boards and the system architecture. But it's really a research project.
Host: I was just going to say, is this now in research, is it beta, is it production? Where are you with it?

Doug Burger: In late 2015, Microsoft started shipping one of the Catapult FPGA boards in almost every new server that it bought. That's in Bing, Azure and other properties. And so, by this point, we've gone to very large scale. This stuff is deployed at ultra-large scale worldwide. We're one of the largest consumers of FPGAs on the planet. And so, of course, there are now teams all over the company that are using them to enhance their own services. When you are a customer using the accelerated networking feature, that speed that you're getting, which is a huge increase in the speed, both over what we had before, but also faster than any of our competitors' networks, is because the FPGA chip is actually re-writing the network flows with all of the policies that we have, to keep our customers secure, and keep their virtual private networks private, and make sure that everyone adheres to our policies. And so, it's inspecting the packets as they are flowing by at 50 gigabits, 50 billion bits a second, and then re-writing them to follow our rules and making sure that they obey the rules. If you try to do that on the CPUs, which is where we were doing it before, the CPUs are very general, they are programmable. They do a good job, but you use a lot of them to process flows at that rate. And so, the FPGAs are just a better alignment for that computational model.
(music plays)

Host: Talk about Brainwave, Project Brainwave.

Doug Burger: So, Brainwave… there's a big move towards AI now, as I think everybody listening will know. A very hot area… And in particular, it was spurred by this thing called deep learning, which I think many of your listeners will know, too. But what they figured out was that, with the deep models, you know, deep neural networks, if you add more data to train them, they get better; as you add more data and make them bigger, they get better. And they kind of marched through a lot of the traditional AI spaces, like machine translation, speech understanding, knowledge encoding, computer vision… replacing, in each domain, a dedicated set of algorithms that had been developed painstakingly over years. And that's really what spurred, I think, a lot of this huge movement, because it seemed to be: okay, there's something here. These things are very general, and if we can just make them bigger, and train on more and more data, we can do even more interesting things, which has largely been borne out. It's really interesting. And so, of course, that put a lot of pressure on the silicon, given the trends that we were discussing. And so now, there's a big investment in custom architectures for AI, machine learning, and, specifically, deep learning. And so, Brainwave is the architecture that we built in my team, working with partners in Bing and partners in Azure, to deploy to accelerate Microsoft services.
Host: Right.

Doug Burger: So, it's sort of our version of a deep neural network processor, what some people call a neural processing unit, or NPU. And for an organization like Bing, which is really compute-bound, they are trying to learn more and more so they can give you better answers, better-quality searches, show you what you need to see. And so, we've been able to deploy larger models now that run in the small amount of time that you're allowed to take before you have to return an answer to a user. And so, we've been running it at worldwide scale for some time now. And now, what we announced at Build is that we're bringing it to Azure for our customers to use. And also a private preview where customers, in their own servers, in their companies, can actually buy one of our boards with the Catapult architecture, and then pull AI models down from Azure and run them on their own site as well.
Host: Wow.

Doug Burger: So, they become an Azure endpoint, in some sense, that benefits from all of the AI that's sitting in Azure. One other thing about the Brainwave stack itself that's significant is that, right now, for inference, which is the asking of the questions, a lot of the other technologies use something called batching, where you have to glom, say, 64 different requests together and ship them as a package, and then you get all the answers back at once. The analogy I like to use is, if you're standing in a bank line and you're the second person, but there are 100 people in line, and the teller processes them all by taking all their IDs, and then asking them all what they want, and then, you know, withdrawing all the money, and then handing it to each person, you all finish at the same time, right? That's batching.
Host: I love it.

Doug Burger: It's great for throughput on these machines, but not so good for latency, so we are really pushing this real-time AI angle.
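The bank-line analogy in numbers; the service times below are assumptions for illustration, not Brainwave measurements:

```python
# Assumed numbers, not Brainwave measurements.
ms_per_request = 1.0        # compute time per request
batch_size = 64

# Batched: like the teller serving the whole line in lockstep, everyone
# in the batch waits until the entire batch is done.
batched_latency = batch_size * ms_per_request          # 64.0 ms for each request

# Real-time: each request is served as soon as it arrives.
realtime_latency = ms_per_request                      # 1.0 ms

print(f"batched:   every request waits ~{batched_latency:.0f} ms")
print(f"real-time: each request waits ~{realtime_latency:.0f} ms")
```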
Host: That leads me to a kind of philosophical question… How fast is fast enough?

Doug Burger: Well, how fast is fast enough really depends on what you are trying to do. So, for example, if you are taking multiple signals from around the web, and trying to figure out that, for example, there's an emergency somewhere, a couple of minutes might be fast enough; you know, a millisecond is not life and death. If you're doing interactive speech with somebody, actually, very small pauses matter for the quality of the experience.
Host: We've all experienced that in, you know, watching a television interview where there's latency between the question and the answer…

Doug Burger: Exactly.
Host: …and you often step on each other…

Doug Burger: That's right. Another good example, another piece of AI tech that Microsoft unveiled, was its HPU, which goes in the HoloLens headset, and that's also processing neural networks. That was a very, very amazing team in a different organization, the HoloLens organization, that built that chip, working with Ilan Spillinger's team. But that chip, if you think about the latency requirements: it's figuring out the environment around you so that it can hang virtual content that is stable as your eyes are darting around, so, you know, even a couple of milliseconds there is a problem. So really, how fast depends on what you're trying to do.
Host: What you're trying to do, yeah, yeah.

Doug Burger: So, speed is one component, and then the other is cost. You know, if you have billions of images, or millions of lines of text, and you want to go through and process it so that you can, you know, understand the questions that people commonly ask in your company, or you want to look for, you know, signs that this might be cancer in a bunch of radiological scans, then what really matters there is cost-to-serve. So, you have a sort of tradeoff between how fast it is, how much processing power you are going to throw at it right away, and how cheap it is to do one request.
Host: Right.

Doug Burger: The great thing about the Brainwave system is I think we're actually in a really good position on both. I think we're the fastest, and we are in the ballpark to be the cheapest.
(music plays)

Host: What was your path to Microsoft Research? How did you end up here?

Doug Burger: I was a professor at the University of Texas for ten years. I worked very closely with a gentleman named Steve Keckler. We were, sort of, two in a box, and we did this really fun and ambitious project called TRIPS, where we decided we wanted to build a better type of CPU. And I mean, Catapult was my first FPGA project, so I'm really actually a hardened-chip guy. And so, we came up with some ideas and said, hey, there's a better way to do CPUs. It's really a new architecture. You know, a new interface to the machine that uses different principles than the historical ones. And so, we raised a whole bunch of DARPA money, DARPA was super interested in the technology, built a team in the university, went through the grind, built a working chip. I mean, Steve was an expert in that space. He really led the design of the chip, did a phenomenal job. We worked together on the architecture and the compiler. So, we built this full working system, boards, chips, compilers, operating system, all based on these very new principles. And, academically, it was a pretty influential project and pretty high-profile. But we got to the end of that… it was too radical to push into industry. It was too expensive to push into a startup; you know, semiconductor startups weren't hot at the time. But after that, I really wanted to go and try to influence things more directly. And so, Microsoft came calling right around the time I was wondering what's next. And it just seemed like time for new challenges.
Host: So, your work has amazing potential, and that usually means we need to be thinking about both the potential for good and the potential not-so-good. So, is there anything about what you're doing that keeps you up at night?

Doug Burger: I'll tell you that I don't worry about the AI apocalypse at all. I think we're still tens or hundreds of thousands of times away from the efficiency of a biological system. And these things are really not very smart. Yes, we need to keep an eye on it, and on the ethics… I mean, frankly, I worry more about global warming and climate change. That's kind of my big thing that keeps me up at night. To the extent that our work makes computing more efficient and we can help solve big, important scientific problems, then that's great.
Host: So, as we close, what's the most exciting challenge, or set of challenges, that you see maybe on the horizon for people that are thinking of getting into research in hardware systems and computer architecture?

Doug Burger: Really, it's: find a North Star that you passionately believe in, and then just drive to it relentlessly. Everything else takes care of itself. And, you know, find that passion. Don't worry about the career tactics and, you know, which job should you take. Find the job that puts you on that path to something that you really think is transformational.
Host: Okay, so what might be one of those transformational end-goals?

Doug Burger: For me, and this is, I think, the first time I've talked about it, I think we're beginning to figure out that there is something more profound waiting out there than all of these heterogeneous accelerators for the post-von-Neumann age. So, we're at the end of Moore's Law. We're kind of in the silicon end-game, and von Neumann computing, meaning, you know, the stream of instructions I described, has been with us since von Neumann invented it, you know, in the forties, and it's been amazingly successful. But right now, we have von Neumann computing, and then a bunch of bolted-on accelerators, and then kind of a mess with people exploring stuff. I think there's a deeper truth there for that next phase, which, in my head, I'm calling structural computation. So that's what I'm thinking about a lot now. I'm starting to sniff out that I think there is something there that might be general and be that next big leap in computer architecture. And of course, the exciting thing is I could be totally wrong and it's just a mess, but that's the fun of it.
(music plays)

Host: Doug Burger, I wish I had more time with you. Thanks for coming in and sharing the depth of knowledge that you've got stacked in that brain.

Doug Burger: My pleasure. Thank you for the great questions, and to all your listeners.
To learn more about Dr. Doug Burger, and the latest research on acceleration architecture for next-generation hyperscale clouds, visit Microsoft.com/research.