Visualizing Data and Other Big Ideas with Dr. Steven Drucker

Published December 20, 2017

Dr. Steven Drucker, Principal Researcher

Episode 5 | December 20, 2017

In a wide-ranging interview, veteran Microsoft researcher Dr. Steven Drucker talks about his work in data visualization, the importance of clear communication in a world of complex algorithms and big data, and the long, slow work of big breakthroughs. He also offers some pro tips to aspiring researchers, and tells us why stand-up comedy is an important skill for computer scientists.

Transcript

Steven Drucker: It’s hard enough to make sense about small amounts of data, much less big data. There are so many problems with it. And each of them is like a speed bump. And I really want to lower some of those speed bumps and make it a little easier. But at the same time, if you lower them too much, you get a lot of people that are making false conclusions, misunderstanding and misapplying data.

Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research, and the scientists behind it. I’m your host, Gretchen Huizinga. Today I’m talking with veteran Microsoft researcher Dr. Steven Drucker about data visualization, his advice for aspiring research scientists, the long, slow work of big breakthroughs, and why standup comedy is an important skill for computer scientists. That, and much more, on this episode of the Microsoft Research Podcast.

Host: Tell me a little bit what your background is and how you actually landed doing what you’re doing.

Steven Drucker: Oh boy, I have a long and twisty path. I actually started out as a neuroscientist. I did my undergraduate work in neurosciences, and I was really inspired as a kid by the sort of bionic man, and prosthetics, and trying to hook up devices to the brain. And I got into robotics from that. And so there was – from neurosciences to robotics was a very logical progression. Then – maybe this is like going back too far. But at the time, my advisor didn’t get tenure. So I left. And I switched over to a different group that was doing computer graphics, because computer graphics and robotics are really similar. And so I did computer graphics. That’s what I did my PhD in. And I came out to Microsoft 22 years ago where there was a group that was looking at graphical environments for social communication. Did that for a number of years, and you know I was hoping that it was going to be the new media form. That this was going to be the new way to communicate, the new way to tell stories. And yet, it got to be very much sort of hit driven, iterative. Kind of like what Hollywood is. Here’s the next version of, you know, Tomb Raider 3, or Grand Theft Auto 5. You see this sort of iteration. They’re not taking risks. So, I got a little bit more interested in sort of just organizing and collecting of media. So, I did a lot of things with photos and photo collections, and tagging and organizing photo collections. Partly because in some ways, this was a problem that we were having at home. Well, from organizing and collecting media, it’s not a big leap to go to large collections of things, and large data, and understanding data and communicating with that data. And I’ve been doing that for the last 8 years or so.

Host: Give the listeners an elevator pitch of what you do at Microsoft Research.

Steven Drucker: Well, I’m in Microsoft Research. And I focus on data visualization, which is trying to understand complex data and communicate insights about that data to others, to take actions on it. And building tools so people can in turn understand that data themselves. So, less data analysis, more building tools so that other people can understand and communicate that data.

Host: You’re building tools for data analysis, though?

Steven Drucker: Exactly, yes. So – I mean, there were a lot of people that got hired on as a data analyst. And then they might be using, whether it’s programming, or existing tools to do that. And they have good skills in that. But I’m also really interested in kind of reaching directly to the public. I think – again, I think about myself as a communicator even more than an analyzer.

Host: When you say building tools, my mind goes directly to the product groups. But you’re in Microsoft Research. Where on the spectrum of building are you?

Steven Drucker: I’ve been in Microsoft Research for 22 years. And I’ve gone back and forth in that spectrum. And I think that’s part of why someone might go to Microsoft Research as opposed to going to an academic department. In Microsoft Research, we’re very bottom-up driven, so we come up with new ideas that aren’t necessarily what the customers are asking for. It can be very problematic if you just ask customers what they want. I don’t know if you’ve heard these old stories about how you drive cars? In the very first focus groups on driving a car, what people asked for were reins, where you pull on one side or the other, because that’s what people were accustomed to. So you can’t necessarily ask existing people for solutions to their problems. And I look at research to a certain extent as trying to break out of the innovator’s dilemma, where we’re trying to come up with new ideas from the bottom up. And sometimes they go into products, and sometimes they go into academic papers and sometimes they just go nowhere, and we drop them.

Host: Some people say that one reason technology is advancing so quickly, exponentially – if that’s where we go with it – is that we’re using tools now that are better than the tools before. And so each year, we use – leading to the question – what tools are you actually using to do your work? What do you find yourself – like you have a technological tool belt. What do you pull out the most to do the work you like to do on – like with SandDance. What did you make that with?

Steven Drucker: SandDance is built on top of JavaScript, and we use WebGL, which is the 3D graphics library if you’re doing – on a webpage. So, this is interesting. The tools that we use to build it are text editors and compilers. They’re not that different than what I was doing 25 years ago. So that’s kind of funny when you look at – my tool usage has not changed that much. That being said, we are standing on the shoulders of all these things. I don’t start from scratch each time. I actually really like the open source movement, in that we are – SandDance and most of the Power BI custom visuals are available as open source projects. We actually got the permission to open source it. And it’s fine to go ahead and do that. I think it’s great, because then we are all standing on everybody else’s shoulders. And then we’re combining them in different ways. And I think that is a huge change.

Host: Right. And I remember back in the day, intellectual property, IP, being this big thing, and it’s proprietary. We’re not going to let anyone peek in the kimono.

Steven Drucker: Yes.

Host: And it’s like, nobody’s thinking that way anymore.

Steven Drucker: It does feel like that. And I think it’s great. Because I can get people out of school. And they could be, oh, yeah, I can do this sort of thing. Instead of saying, well, when you come, you’re going to spend about three quarters of a year learning how to do this thing before you can actually make any contribution. And now it’s like, well, here’s this library out there that we can use. Because it’s really been pre-approved that we can use it. And we can port this library, and we’ll get something new.

Host: Which is awesome.

Steven Drucker: And that’s amazing.

Host: What advice would you give aspiring researchers? What knowledge base or skill sets do you think they need?

Steven Drucker: The way I look at that is, I don’t want everybody to be a computer scientist. I look at coding as this sort of, one of the new tools to have. I think learning to code is going to be fundamentally important for any job that you have, whether you’re going to be someone writing the software, or a lawyer, or – so that – there’s that. There’s also the style of algorithmic thinking. All of those things are important. Certainly, understanding data is a big challenge. And it’s not a challenge that’s going away. So, having some backgrounds… in programming enough to be able to do the data statistics. Machine-learning is hot right now. There might be an overhype. Maybe not. I mean, this is – we’re seeing results. So those knowledges are really important. I guess something that we haven’t discussed that we’ve discussed before is I really look at this computer as a tool that augments our abilities. And I think one of the things that we do bring in, that is harder to duplicate right now is the creativity. I like to see people that have a design sense, who are aesthetic – you know, can do things creatively and aesthetically in addition to technically. I think it’s really important for someone to have the skills to talk about what they’re doing, why it’s important, and make that accessible for a wide audience.

Host: Has that been a problem for computer science people before?

Steven Drucker: Oh, not at all. Yeah. I mean – we have periodically from MSR something called TechFest, where we show the rest of the company the things we’re working on. And that’s always fun, because you’re up there talking about this stuff. Some people still are like, oh my god, this is – you’re not really appealing to everybody here. But if you can explain your things in a way that gets people excited about it, and pulls them in, that helps everybody. So, as much as the technical abilities, I think this communication is as important. You know, I’m in research, so I go to conferences. And I speak at these conferences. A skill that’s really important there that’s surprising is stand-up comedy.

Host: Oh my gosh, yes.

Steven Drucker: To be able to do improv, and to be able to take a question, be able to understand that question, and respond to it.

Host: Let’s move onto data. To use a metaphor, we’re kind of swimming in it. And it seems like the problem now is more about how we make sense of it, how we find what we’re looking for, and what we make of it once we find it. So, what’s going on in your world that you’d contend is helping us?

Steven Drucker: Again, that’s a huge question. And it’s hard enough to make sense about small amounts of data, much less big data. I find that it’s really hard to make sense of data. You need a lot of skills and a lot of sort of experience and background in statistics and programming and shaping that data. There are so many problems with it. And each of them is like a speed bump. And I really want to lower some of those speed bumps and make it a little easier. But at the same time, if you lower them too much, you get a lot of people that are making false conclusions. I’m not even talking about fake news. But just sort of misunderstanding and misapplying data.

Host: So, let me make that concrete from a research perspective, because you’re a researcher. What kinds of things are you doing? Because I love what you’re saying, and I want that for me.

Steven Drucker: Well, I – one thing that we’ve done in my group in particular is worked on a research project that was showing data in – we call it unit visualization, where you show every piece of data organized in different ways to show different conclusions. And that was great as a research project. You give talks. But we also shipped that in Power BI as a custom visual. So anybody – actually, you can go to a website right now and you can try it for free. To try these ideas, load your own data set in there, and experiment with that, and look at these things, and examine your data in different ways.

Host: You’re professionally interested in how people communicate with data.

Steven Drucker: Yes. I definitely am.

Host: Unpack that.

Steven Drucker: One of the things I find about a lot of the academics that I work with, or a lot of the researchers, is that they are fairly good communicators to their own audience. And they’re not very good communicators – they don’t necessarily understand when people don’t understand them. You know, how could they not get this? And part of, I think, effective communication is to try to put this in ways that they – why I like visualizations – they can see it. They can ask questions about it. And you have a little bit of a dialogue about it. So, I think that’s one element, understanding where people don’t communicate, don’t understand. Our visual systems are highly developed, and we can see patterns. Oftentimes, we can see patterns when they’re not there. But we can start seeing relationships. And again, we’re wired that way. So, if we can hook into that and pique their interest and start having them understand that, that’s kind of what I’ve been focusing on. It’s interesting. Trying to do this in a podcast is very difficult, because this is inherently visual. There are auditory visualizations that can do that as well. But again, so much of our brain is devoted to understanding visual phenomena.

Host: We talked about the current tension between – that swirls around deep neural networks. Now, this is a topic that a lot of people don’t really understand. And I don’t necessarily think we have to unpack it here. But algorithms have become so complex that people can’t even understand them, let alone explain them. And you mentioned a recent ruling in the EU that says you can’t make a decision with an algorithm that a human can’t explain.

Steven Drucker: I think there’s a requirement to explain those decisions.

Host: Okay, so you can make a decision, but you better be able to explain it?

Steven Drucker: Exactly. And you think of how important that is if you’re going to be paroling someone, or hiring someone. Or firing someone. And you don’t want to say, well, the computer said to fire you, and so we’re firing you.

Host: Let’s back up. Let’s back up. Is that happening?

Steven Drucker: There’s parole. This has been a big argument recently about sort of algorithmic systems for recidivism. How likely is it for someone to…

Host: Repeat.

Steven Drucker: …commit the crime again? And they’re trained on data. And of course, the data might contain biases in it. It might be trained on a population that’s not the same population.

Host: Oh my gosh. That is such a huge thing.

Steven Drucker: It is.

Host: I mean, on so many levels.

Steven Drucker: Yes, exactly. And machine learning to me is amazing because we’ve had huge strides and progress over the last decade about doing things like understanding voice, and objects, and maybe we’ll have intelligent driving cars. But at the same time, a lot of it really is just looking at statistical relationships between input and output. And we don’t really know what’s going on inside those relationships. And sometimes there might be, again, erroneous data that leaks into it, or some small problems. And I actually see visualization playing an important role in this. It’s very, very hard to visualize these models. But maybe to try to understand where they’re working and where they’re not working, or give people an idea. Or perhaps we can explain most of the cases by a simpler, more visual model. And then sometimes we can’t explain the rest. And so what do we do when there are discrepancies? But I think this is one of these big areas right now, or big challenges.

Host: You talked about big areas and challenges. In my mind, I kind of envision researchers in their labs working for the next big breakthrough. Is that what you guys are doing?

Steven Drucker: On one hand, yes. We all want breakthroughs. We always want the big thing. But I look at a lot of the big things now, and like Facebook. Facebook is like the 7th or 8th iteration of a peer-to-peer friend network. Twitter, that was like instant messaging. And you already have all these emails. You have all of these things that are huge now, that aren’t exactly breakthroughs. But they’re big because the time was right, the circumstances were right. They had the right sauce, whatever it was that did that.

Host: And the technology that enabled it was right.

Steven Drucker: Yes, that’s right. Exactly. And you look at the iPad. There had been pen computing going on since the mid-80s. So these things have happened over and over and over again. And sometimes the time is right. Now, so what are those big breakthroughs? We often don’t know what those breakthroughs are until, you know, hindsight.

Host: That was a breakthrough, but I didn’t call it that when I was making it.

Steven Drucker: That’s right. And I think there is a rush towards some of these breakthroughs, whether it’s with augmented reality. There’s a lot of interest in that right now. Or self-driving cars. A lot of interest in that. I think we all kind of want that – oh, things were like this before, and I’ve done something to make it fundamentally different. Academic research and research in general is not usually like that. It’s usually very, very incremental and careful. And there’s this balance between that careful science, and the, oh my god, we’ve made something big. And we’ve got this rush to a new startup, and that’s really exciting. And I think you kind of want a little bit of the energy from both of those to go together.

Host: I love, love, love how you’ve said that, because it’s hard work.

Steven Drucker: Yeah.

Host: And it’s slow work. And it’s rarely the big bang.

Steven Drucker: The metaphor I usually like is standing on the beach when the waves are coming in. If you just stand there and you watch any single wave, it’s only a little bit further than the last wave. But if you’re standing way at the top, looking down, and looking at this over several hours, like oh my god, I was down there before. And now it’s way up here.

Host: It’s called the tide. Not the wave.

Steven Drucker: Exactly. Exactly. And when you are standing on that beach amongst doing that work, it’s very incremental. You’re just trying to get past those last little wavelets. And then again, when you step back, wow, this is the – and then you have to step back and examine, oh, that area’s about to be flooded. We should do something about that!

Host: On the spectrum of, let’s just say Microsoft, because that’s what we’re talking about. But you’ve got your research people that are the out-there-in-five-years. You’ve got your product people that are right now. And then you’ve got your end consumers. And so that spectrum of how you contribute to both the company and then the world, the concentric circles out. How do you envision – how do you internalize what you do? How does it make a difference?

Steven Drucker: Yeah, it changes at different times. Sometimes that’s – you shoot for that outside circle of the world, but along the way, you figure out, like here’s a stepping stone that I can publish, and here’s – so I’ll at least get this other audience of people working on it. And maybe here’s a product that I might be able to impact, and they’ll reach a different audience. And that’s sort of leading there. So sometimes it’s like, I’ve got this directed thing, and I’ll do those stepping stones along the way. Other times, it’s almost the reverse. It’s sort of like, someone has come up with this thing that they really want, and you can give them, you know, beyond that solution. And it was just the sort of short-term that takes you off in a new direction that you didn’t know before. I got into that sort of photo management stuff, partly because talking – because I had that problem. Other people were talking. We were talking about it at lunch, and we were seeing that. And this was before iPhoto, and before there were any of those organizers… Let’s focus on this. And we did this for a good five, six years. And that led to a whole flowering of different areas. But that really was in response to some short-term problems that I was thinking, I think this is big.

Host: And don’t you think sometimes that’s how a lot of big solutions started out as accidents?

Steven Drucker: Yes.

Host: I think I’m going to test for this, and it turns out that that medicine worked for that instead.

Steven Drucker: Yeah, in that field, and all the others… In fact, I think it was in Hamming’s Turing Award acceptance speech, he talks about how that was almost the kiss of death, because everybody then was expecting just big projects from him. And therefore, those thousand little tiny projects didn’t have a chance for any of them to bloom into something else. So, starting with this idea it’s going to be big can be deadly. You need to have this mix of, you know, these bigger things that people are collaborating on, and the little flowers blooming. So, it’s really hard to say, which circle are you aiming at right now, and fluidly moving…

Host: Right. Sometimes it’s a moonshot. And sometimes it’s like, I’m just going to do what, you know…

Steven Drucker: That’s right. Because maybe that’s going to lead somewhere else. And it’s going to lead to something big unexpectedly.

Host: Steven Drucker, thanks for coming in and sharing your passion and your stories and your projects, and everything else.

Steven Drucker: It’s been my pleasure.

Host: To learn more about Dr. Steven Drucker’s work in data visualization, and other great things going on in Microsoft Research, visit Microsoft.com/research.

[End of recording]