{"id":472518,"date":"2018-03-28T07:22:53","date_gmt":"2018-03-28T14:22:53","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=472518"},"modified":"2018-05-23T10:33:55","modified_gmt":"2018-05-23T17:33:55","slug":"when-psychology-meets-technology-with-dr-daniel-mcduff","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/when-psychology-meets-technology-with-dr-daniel-mcduff\/","title":{"rendered":"When Psychology Meets Technology with Dr. Daniel McDuff"},"content":{"rendered":"
Microsoft Researcher Dr. Daniel McDuff. Photography by Maryatt Photography.
Episode 17, March 28, 2018

One of the most intriguing areas of machine learning research is affective computing, where scientists are working to bridge the gap between human emotions and computers. It is here, at the intersection of psychology and computer science, that we find Dr. Daniel McDuff, who has been designing systems, from hardware to algorithms, that can sense human behavior and respond to human emotions.

Today, Dr. McDuff talks about why we need computers to understand us, outlines the pros and cons of designing emotionally sentient agents, explains the technology behind CardioLens, a pair of augmented reality glasses that can take your heart rate by looking at your face, and addresses the challenges of maintaining trust and privacy when we're surrounded by devices that want to know not just what we're doing, but how we're feeling.

Transcript

Daniel McDuff: We've developed a system that allows people to look at another individual and see physiological responses of that person. So it's data they wouldn't normally be able to see, but it's superimposed onto that other person so they can actually see their heart beating. They can see changes in stress, based on heart rate variability. And that's all sensed remotely. But you're giving the individual a new sensory channel that they can leverage…

(Music plays)

Host: You are listening to the Microsoft Research podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga.

(Music plays)

One of the most intriguing areas of machine learning research is affective computing, where scientists are working to bridge the gap between human emotions and computers. It is here, at the intersection of psychology and computer science, that we find Dr. Daniel McDuff, who has been designing systems, from hardware to algorithms, that can sense human behavior and respond to human emotions.

Today, Dr. McDuff talks about why we need computers to understand us, outlines the pros and cons of designing emotionally sentient agents, explains the technology behind CardioLens, a pair of augmented reality glasses that can take your heart rate by looking at your face, and addresses the challenges of maintaining trust and privacy when we're surrounded by devices that want to know not just what we're doing, but how we're feeling.

That and much more on this episode of the Microsoft Research podcast.

(Music plays)

Host: Daniel McDuff, welcome to the show today. Great to have you with us.

Daniel McDuff: It's great to be here.

Host: So you're in Human-Computer Interaction, or HCI, and you situate your research at the intersection of computer science and psychology. So, tell us in broad strokes about HCI and what you do.

Daniel McDuff: So the crux of what I do is teaching machines to understand people in a deeper way, and that involves capturing and responding to their emotional state. So, can we design a machine that really understands people, not just what they're saying, but how they're saying it and how they're behaving? And I think that's really fundamental to human-computer interaction, because so much of what we do as people is nonverbal.
It's not described in language. And a lot of computer systems don't understand that. That's the focus of my work: bringing that EQ to technology.

Host: EQ meaning Emotional Quotient?

Daniel McDuff: Yeah, that's a somewhat slang term, and it's used frequently in contrast to IQ, which is something that technology has a lot of. Technology can answer lots of questions very quickly because it has access to all of the information on the internet, but not much technology has EQ.

Host: No. Uhhh… Does any?

Daniel McDuff: I think we're starting to see the beginning of this. So you see social robots as a great example of systems which have some kind of personality they can express visually, with some basic facial expressions on a screen or using sounds or lights. Movies are a great example, too. R2-D2 is a system that doesn't have a face, but can still communicate emotions. Although that's fictional, we are starting to see systems in the real world that behave in a somewhat similar way.

Host: That's fascinating. I even think of the Wallace and Gromit animation, where Gromit only communicates with his eyes and his eyebrows, and yet you get almost everything that he wants to say through his eyes.

Daniel McDuff: Exactly. And we take a lot of inspiration from animations and animators. Because I study facial expressions, it's magical how some creators can show so much rich emotion just through a facial expression. And as we design systems that recognize those and exhibit them, there's a lot we can learn from that side of the world.

Host: I'm intrigued by the field of affective computing. And I understand it aims to bridge the gap between human emotions and computational technology. So, what is affective computing, what does it promise, and why do we need computers to understand us as human beings?

Daniel McDuff: At a high level, affective computing is designing systems that can read, interpret and respond to human emotion. And that sounds like a daunting task. There's a lot more we need to do in research, but we're starting to see real-world systems where this is true. So, systems that can read facial expressions, for instance, or understand someone's voice tone, or look at sentiment on Facebook posts or Twitter to understand the emotions that are being expressed. And this is kind of where the world is now, but in the future, we can imagine systems that use multimodal data, robotics systems that interact with us in an embodied way, that also sense this type of information. And that's kind of the target we're focused on.

Host: So, why do you think we need computers to understand us?

Daniel McDuff: I think it's fundamental to how we interact as human beings. And so, when we interact with a system that doesn't do those things that we take for granted, it can be off-putting. For instance, if a system doesn't realize that I'm getting frustrated with it, it can be more frustrating. It can even be upsetting. There's research showing that robots that can apologize are liked a lot more than robots that don't, even if they're no better at completing the task they were intending to complete.
So, it can really improve our relationship and our well-being, because it fundamentally improves the interaction we have with the devices around us.

Host: So, you coauthored an article called "Designing Emotionally Sentient Agents." And aside from the Hollywood connotations that phrase brings to mind, what should we understand about this research, Daniel, and what should we be excited about or concerned about?

Daniel McDuff: I think there's a lot to be excited about in homecare, in healthcare, in understanding human interaction even more. If we can design systems to mimic some of those things, it will deepen our understanding of how we as humans behave. There are a number of challenges that we need to overcome. One is how we sense this information. And sensors can be intrusive. You know, devices that are around us, listening for commands all the time, are starting to appear, and you can imagine in future there could be camera systems as well. So we need to think about the social norms that exist around the sensing side of things. Where does that data go? How is it stored? How do we know that the sensor's on? How do we control it? How do we stop it from recording when we want to? And then, how do we allow other people who are in our lives to not be sensed even if we're being sensed? Or if I invite someone into my home and I've got a device that's always listening or always watching, what does that mean for our social interaction? So, I think there are some challenges to overcome there, but there are also more philosophical challenges: how much do we teach computers of human emotion? Is it possible for a machine ever to feel emotion? What does that mean? And how should machines express emotion or respond to this information? We definitely don't want to design systems that are manipulative, or that lead people to believe the systems are more intelligent than they really are. If someone sees a system that appears emotional, they might think, wow, this is really, really intelligent, even if it's only expressing very basic behaviors. And that can be challenging, because some of the other abilities of that system might be quite weak. And so people might trust it even if it can't actually perform the tasks it's trying to do accurately.

Host: You're using terms that are interesting. And interesting is a kind of placeholder word for other words that I'm actually thinking. Like "sentient" and "understanding" regarding a machine. And I wonder how I should interpret that. What do people like you and your colleagues really believe about what you just addressed? Can a machine ever feel? Can it really understand? Can it become sentient?

Daniel McDuff: I think machines are fundamentally different to humans. Machines can recognize some expressions of emotion. They can respond to them. But I don't think that that constitutes feeling an emotion. Feeling an emotion requires experience. It requires a reward and a cost associated with different actions. It's much, much more complex than that. So I don't think machines will ever experience emotion in the way that we do, but they will have many of the, sort of, fundamental skills that we have.

(Music plays)

Host: What can you tell us about the emerging field of what I would call artificial emotional intelligence or emotional technology? You use an example of a bathroom mirror that has ambient intelligence and can tell whether I've slept well. Why do I need that?

Daniel McDuff: That's a good question. I think it's important that we design systems that are ultimately beneficial to people. And one of the roadblocks, especially in healthcare, is that there's so much rich data out there, but it's very hard to understand it, or it's cumbersome to monitor it. And so designing systems that make it seamless to collect and understand that type of data is really important. So, at MIT, when I was a graduate student, we built a mirror that had a camera embedded in it. It was actually hidden behind two-way glass, so all it looked like was just a regular mirror. But when you looked in the mirror, the camera was using some remote sensing technology we built to measure the heart rate of the person. And we can also measure things like heart rate variability, which is correlated with stress. And so the mirror could then display that information back to the user. So it's not just reflecting their outward appearance, but their inner physiological state as well. And I found that really compelling because, in many cases, we want to know that information, but we might not want to strap on a sensor or have to go out of our way to collect it, and if it can be digitally captured by the devices we already use, there's something quite compelling about that.
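To make the heart rate variability point concrete: once individual heartbeats have been detected, whether by a camera-based system like the mirror or by any other sensor, heart rate and its variability reduce to simple statistics over the intervals between beats. The sketch below is illustrative only; the hrv_summary helper and the beat times are made up, and this is not the code behind the mirror.

```python
# Illustrative only: summarize heart rate and heart rate variability (HRV) from a
# list of detected heartbeat times. Higher beat-to-beat variability is generally
# associated with lower physiological stress.
import numpy as np

def hrv_summary(beat_times_s):
    """beat_times_s: timestamps (in seconds) of detected heartbeats."""
    ibi = np.diff(beat_times_s) * 1000.0          # inter-beat intervals, in milliseconds
    heart_rate_bpm = 60000.0 / np.mean(ibi)       # average heart rate
    sdnn = np.std(ibi, ddof=1)                    # overall variability of the intervals
    rmssd = np.sqrt(np.mean(np.diff(ibi) ** 2))   # short-term, beat-to-beat variability
    return {"hr_bpm": heart_rate_bpm, "sdnn_ms": sdnn, "rmssd_ms": rmssd}

# Example: beats roughly every 0.8 seconds -> about 74 bpm, with modest variability.
print(hrv_summary([0.00, 0.80, 1.62, 2.41, 3.22, 4.05]))
```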
Host: Let's talk about the technical aspects of your work for a bit. Much of it's centered on computer vision technologies and involves webcams and algorithms that aim to understand emotional states. What's the field of computer vision founded on technically, and what new developments are we seeing?

Daniel McDuff: So computer vision is exploding. The past 10 years have been some of the most exciting in this domain, with the invention of what's commonly called deep learning. And so this is the ability to leverage huge amounts of data to train systems that are much more accurate than previous systems were. So, for instance, we have object recognition, text recognition, and scene understanding that's way more accurate than it used to be, because we have these systems that capture lots of the complexities of the data. And because there's so much data they can learn from, they get a really good representation. And understanding facial expressions has also benefited from the advances in this technology, as have a lot of other areas of affective computing, whether it's speech recognition or understanding vocal prosody and things like that. So, there are a lot of advances that have happened that basically improve the underlying sensing. And I don't think up until this point we've really had the volume of data about emotions to go to the next level, where we can really understand, okay, how do we build a system that actually knows what to do with these sensor inputs, with something that's as amorphous and hard to define as emotion is?

Host: So, the basis of what you're doing is on deep neural networks and machine learning models that you're then applying to the affective domain.

Daniel McDuff: Exactly, yes. So we use deep learning for almost all of the sensing modalities we use, whether it's vision-based or audio-based or language-based. And then that feeds into a system which is taking sort of intermediate-level information. For instance, does my facial expression appear positive or negative? Is my voice tone high-energy or low-energy? Is the language I'm using hostile or serene? And then, those intermediate states feed into a high-level understanding which is combined with context. So we need to know what's happening to interpret emotion. We can't just observe the person. We need to know the situation, the social context. And so that's kind of where we're moving: really combining these sensor observations with more contextual information.
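The shape of that pipeline can be sketched in a few lines. The toy example below is not Microsoft's system; the signal names, contexts, and weights are invented for illustration. It shows the general idea of per-modality intermediate scores being combined differently depending on the situation, for example an in-person conversation versus a phone call where there is no face to read.

```python
# A toy late-fusion step: per-modality models would produce these intermediate scores
# (facial valence, voice arousal, text valence); a higher-level step combines them,
# weighted by context. All names and weights here are hypothetical.
from dataclasses import dataclass

@dataclass
class IntermediateSignals:
    face_valence: float   # -1 (negative expression) .. +1 (positive expression)
    voice_arousal: float  #  0 (low energy)          .. +1 (high energy)
    text_valence: float   # -1 (hostile language)    .. +1 (serene language)

def fuse(signals: IntermediateSignals, context: str) -> dict:
    # Context-dependent weights: on a phone call there is no face to read.
    weights = {"in_person":  {"face": 0.6, "text": 0.4},
               "phone_call": {"face": 0.0, "text": 1.0}}[context]
    valence = weights["face"] * signals.face_valence + weights["text"] * signals.text_valence
    arousal = signals.voice_arousal
    return {"valence": round(valence, 2), "arousal": round(arousal, 2)}

print(fuse(IntermediateSignals(face_valence=0.4, voice_arousal=0.8, text_valence=-0.2),
           context="in_person"))   # -> {'valence': 0.16, 'arousal': 0.8}
```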
Host: So, I wasn't going to ask you this. It wasn't on my list. But how do you gather data on emotions? Do you have to bring people in and make them angry? I mean, it's a serious question in a funny way.

Daniel McDuff: In the past, that was how it was often done. But a lot of my work in the last few years has been focusing on in-situ, large-scale data collection. So we always ask people if they want to opt in. And if they do, then we enable them to use a system which is part of their everyday life. So this might be a system that runs on their computer or runs on their cell phone and collects this data over time. Often, we might prompt them throughout the day: how are you feeling? Or we might ask, is this feeling that we think you're feeling correct? In order to get some kind of ground truth. But ultimately, we want to be able to collect real-life data about people's emotional experiences, because we know if they come to a lab, it's not exactly the same as how it would be in the real world.
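For readers wondering what that kind of in-situ collection might look like in practice, here is a hypothetical sketch of an experience-sampling loop: a few randomly timed prompts during waking hours, with the self-reported answers stored as ground-truth labels. The function names and schedule are invented; this is not the actual study software.

```python
# Hypothetical experience-sampling sketch: prompt an opted-in participant a few times a
# day at random moments and record their self-reported feeling as ground-truth labels.
import random
import datetime as dt

def schedule_prompts(day, n_prompts=4, start_hour=9, end_hour=21):
    """Pick n random prompt times within the participant's waking hours."""
    window_s = (end_hour - start_hour) * 3600
    offsets = sorted(random.sample(range(window_s), n_prompts))
    return [dt.datetime.combine(day, dt.time(start_hour)) + dt.timedelta(seconds=s)
            for s in offsets]

def record_response(prompt_time, valence, arousal, labels):
    """valence/arousal on a 1-5 self-report scale, stored only for opted-in users."""
    labels.append({"time": prompt_time.isoformat(), "valence": valence, "arousal": arousal})

labels = []
for t in schedule_prompts(dt.date(2018, 3, 28)):
    print(f"{t:%H:%M} - How are you feeling right now?")      # the prompt the participant sees
    record_response(t, valence=4, arousal=2, labels=labels)   # example self-report
```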
Host: One of the applications of this emotional technology is the workplace. An MIT Sloan Management Review article claims that emotion-sensing technologies could help employees make better decisions, improve concentration, and alleviate stress. So, tell us how this works and give us some examples of what it looks like. And then maybe tell me why I would want my boss to monitor my eye movements, my facial expressions, and my skin conductance?

Daniel McDuff: So, one example we give in that article is about a trader in Japan who unfortunately swapped the number of shares they were selling and the price of the shares, and got those two numbers the wrong way around. That ended up being a huge financial loss. And in high-stress situations that can be really problematic. Another example would be air traffic control, a very high-stress job where people have to be performing at a high level for the whole duration of their shift. And so if we can design technology that is able to sense when people are becoming overloaded, too stressed to perform at the level that they need to, we could give them that feedback. So, for individuals, that could be very helpful for knowing when they need to take a break. I, myself, in a job, you know, on an average day, it would be great if my computer knew when I was in flow and stopped interrupting me with email notifications. Or if I needed to take a break, it could suggest things that would help me relax and make me more productive when I came back to my desk. And then I think it would also benefit teams and organizations; knowing the well-being of your company is a really important thing. And we're starting to see the development of a real science around organizations, particularly focused on the social components. Social capital is really important, and emotion plays a big role in that.

Host: Tell me what safeguards a designer or a developer might think about so that this technology doesn't become a "nanny cam" in the workplace?

Daniel McDuff: That's a really, really important question. And I think as we design this technology, it's important that we design social norms around how it's used. Ultimately, technology will advance. That's somewhat inevitable. But how we use technology and the social norms that we design around it are not inevitable. So to give an example, one of the practices we follow is always "opt in." So we always make sure that people choose to switch on sensors rather than having it imposed upon them. Another example is, as we mentioned before, allowing people to turn off sensors. And it's really important people have that. It increases their trust and comfort with the system a lot. Those are a couple of examples of the kind of social norms we can design around this technology. And I think there are many more that will develop as we advance the technology and think about use cases.

(Music plays)

Host: Let's talk about reality for a bit. There's actual reality, which I have a passing familiarity with, but also virtual reality, augmented reality, mixed reality. There are so many realities. Give us a baseline definition of each of those different realities, so we have a frame of reference for what I want to talk about next.

Daniel McDuff: Great. So virtual reality is a completely alternative environment. Most people will probably be familiar with virtual reality in terms of the headsets with a screen, where all the information that you see is displayed on that screen. Then augmented reality is usually when you can see the real world, but there's some augmentation of what you see. So there might be a transparent screen which is actually displaying certain objects which are superimposed on the real world. And then there's this idea of mixed reality, which is really blurring the boundaries between virtual and augmented reality. So you're leveraging much deeper understanding about the environment, as well as incorporating a lot more augmentation.

Host: So let's go along that thread for a second here. Because when you talk about augmenting human perception through mixed or virtual reality, you suggest that VR might be able to help people develop "superhuman" senses. What are the possibilities, and challenges even, of advancing human senses in this way?

Daniel McDuff: Yeah, so I mean, one of the things I find most fascinating about other areas of science, like neuroscience, is how adaptable we are, and particularly the brain is, at being able to learn new things based on sensory input. So, we have a panel at South by Southwest where we're discussing some of the ways that sensor inputs can influence people's perception. And one example that we've developed is a system that allows people to look at another individual and see physiological responses of that person. So it's data they wouldn't normally be able to see, but it's superimposed onto that other person so that they can actually see their heart beating. They can see changes in stress based on heart rate variability. And that's all sensed remotely.
But you're giving the individual a new sensory channel that they can leverage, something that they wouldn't normally have.

Host: So this is like x-ray vision.

Daniel McDuff: In a sense, yes. That's a good analogy.

Host: I mean, from the superhero realm, that's…

Daniel McDuff: Exactly, yeah.

Host: So the idea of superhuman senses would be physiological senses that you wouldn't normally be able to see, aside from somebody sweating or blushing, or their facial expressions. It's inside their bodies.

Daniel McDuff: Exactly, yeah. It's hidden information that wouldn't normally be accessible, but using new technology, like high-definition cameras and this augmented experience that we can create through the HoloLens headset, we can allow you to see that information in real time.

Host: So maybe now junior high kids can actually find out if someone's in love with them, just by putting on these glasses? And they don't have to ask their friend to go ask their other friend if he likes me.

Daniel McDuff: I've always wanted to build that demo and just see how badly it fails.

Host: That would actually be a really compelling application of the technology, just to help the junior high kids. So you're one of the creators of an application of this technology that you call CardioLens, and while it's still in the early stages of research, and it's not being used in any real-life situations right now, you're actually able to read my heart rate by looking at my face through a pair of augmented reality glasses. Tell me more about this. What are the possibilities of this research, down the road?

Daniel McDuff: Yeah, so I've been working on this area of remote, or non-contact, physiological measurement for a while. And this is the idea that a regular webcam, just the camera that might be on your cell phone or on your laptop, has the sensitivity to pick up very small changes in the color of your skin, or, to be more accurate, the light reflected from your skin, which are related to blood flow. So actually, by analyzing the video stream from that camera, we can pick up your pulse, we can pick up your respiration rate and your heart rate variability. And there's new work showing you can measure blood oxygenation and other things. And people are trying to get towards things like blood pressure. So just using a regular device, with no adaptation to the hardware, and some software, we can recover this information. So, what we did was to put the algorithm on the HoloLens, which has a camera that faces forward. And so when you look at someone, it detects their face. It segments the skin. It analyzes the color change and recovers the physiological information, and then displays that back in real time, superimposed onto their appearance.

Host: How accurate is it?

Daniel McDuff: So, the technology can be very accurate. We've done a lot of validation of this. We can measure heart rate to within 1 or 2 beats per minute, typically, on a regular video. We're using deep learning to address this problem, and we've got really, really good results on some of the hardest data sets that we've tried it on. When you take this out into the real world, where people are moving around and the lighting's changing and you can't control whether they're making facial expressions or speaking, it starts to become more challenging. That's the type of data that we're pushing towards addressing. So we want to start designing methods that are robust to all of those variations that we'd actually see in a real-life application.
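That face-to-pulse pipeline can be approximated with standard signal processing. The sketch below is a simplification of the general remote-photoplethysmography idea, not the CardioLens implementation: it assumes the frames have already been cropped to skin pixels, averages the green channel in each frame, and reads the pulse off the dominant frequency within the plausible heart-rate band.

```python
# A minimal remote pulse estimate (illustrative, not the CardioLens code): average the
# green channel over the skin region per frame, then find the dominant frequency of
# that signal within the physiologically plausible heart-rate band.
import numpy as np

def estimate_heart_rate(skin_frames, fps=30.0):
    """skin_frames: sequence of HxWx3 RGB frames already cropped to skin pixels."""
    signal = np.array([frame[..., 1].mean() for frame in skin_frames])  # mean green per frame
    signal -= signal.mean()                                   # remove the DC component
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 4.0)                    # 42-240 beats per minute
    pulse_hz = freqs[band][np.argmax(spectrum[band])]         # strongest in-band frequency
    return pulse_hz * 60.0                                    # beats per minute

# Example with synthetic data: a faint 1.2 Hz (72 bpm) flicker buried in noise.
t = np.arange(0, 10, 1 / 30.0)
frames = [np.full((8, 8, 3), 100.0) + 0.5 * np.sin(2 * np.pi * 1.2 * ti)
          + 0.2 * np.random.randn(8, 8, 3) for ti in t]
print(round(estimate_heart_rate(frames), 1))                  # ~72.0
```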
Host: But normally, you would measure someone's blood pressure or their pulse in a clinical setting. I mean, you wouldn't necessarily – it would be a tool maybe for the medical community first, or…

Daniel McDuff: Yeah.

Host: Or the junior high boy that needs to know, do you love me?

Daniel McDuff: That might be the biggest market. No, I think… one application I am particularly excited about is in medical settings, for instance, surgery. So, you could see if a particular part of the body has good or poor blood flow. And that could be important in transplant operations, where you're attaching a new part of the body, a new organ, and you need to know if blood is flowing to that particular part of the body. And with a heads-up display, a surgeon could potentially look at that region and see if there is a blood flow signal. But there are other applications, too. Another example would be being able to scan a scene and identify if there's someone who's alive, for instance, in a search and rescue application.

Host: Oh, interesting.

Daniel McDuff: And this also works with infrared cameras. So even if it's dark, we can still measure the signal. There are other things like baby monitors, or monitoring physiological information in hospital ICUs without having to have people wired up to lots of different sensors. We can just use a camera to do that.

Host: Every single show, I end up shaking my head. No one can see it happening, but it's like, really? This is happening? I can't believe it. It's amazing. Talk about the trade-offs between the promises these technologies make and some of the concerns, very real concerns, about privacy of the data.

Daniel McDuff: Yeah, as I mentioned before, I think it's really important that we design this technology appropriately. And I think that's where we'll see the biggest benefits. The benefits are when people recognize this is something that actually helps me in my everyday life, or helps in a specific application like healthcare. There are definitely big challenges to privacy, because a lot of what we need to do to deploy this technology is to be able to sense information longitudinally, on a large scale, because everyone experiences emotion differently. You can't just take 10 people, train a system on those 10 people, and have it generalize to the whole population. And so we do need to overcome that challenge of, you know, how do we make this technology such that people feel comfortable with it, they trust it, and they don't feel as though their privacy is violated or that it's too obtrusive. And so I think, in terms of design challenges, it's about designing ways for people to be aware that the technology is on, that it's there, what it's measuring, and what it's doing with that data.
And some of these problems are still unsolved.

Host: You said you prefer social norms over governmental regulations or legal remedies. So what's the balance between the responsibilities of scientists, engineers, and programmers here, versus big regulatory initiatives like GDPR in Europe, and other things that might be coming down the pike?

Daniel McDuff: I think both are important. But the reason I prefer focusing on social norms is because, as a designer, as an engineer, that's something I can actively influence every day in my job. So I can think about, okay, I'm going to design this sensor system that people are going to choose to use, that captures their emotions, and it's going to create an experience that adapts to how they're feeling. I can choose how to design that, and I can influence the social norms around that technology. So being a leader in the research space allows me to do that actively, regularly. And I don't think we can necessarily rely 100 percent on government or regulation to solve that piece of the puzzle. A good part of being at MSR is that we're very involved with the academic community. I'm involved with the Future of Computing Academy at the ACM. And our task group within that organization is to think about the ethical questions around AI. Not just in affective computing technology, but broadly with machine learning and AI technology that can make decisions about important things like, for instance, healthcare or justice. And I think social norms and governmental regulation both serve a purpose there. But one of the things I personally can actively work towards on a daily basis is thinking through, what do I ask people to give up in terms of data, what do they get back for that, and how is that data used? And that's something I'm really, really interested in.

(Music plays)

Host: Let's talk about you for a second. I'm curious how you got interested in the emotional side of technology and how you ended up at MSR. Who were your influences, your inspirations, your mentors?

Daniel McDuff: So I did my Master's at Cambridge University and was focused on machine learning. But I was very interested in how I could address more social problems with that technology, not just focus on predicting stock market prices or some of the numerical analyses that are often solved using machine-learning algorithms. I wanted to see how this technology could actually help people. And at the time, my advisor for my PhD, Rosalind Picard, who is one of the founders of this field, was working a lot on applications for people on the autism spectrum, for whom understanding emotions is a complex task and often a big challenge in social situations. And that was one of the reasons that I joined that lab: I really believed in the potential benefits of affective computing technology, not just to one portion of the population, but to everyone. I could see how it could benefit my life as well. So that's how I got into it. And you know, it's becoming more true now, but certainly 10 years ago, there was no technology you could really think of that responded to or understood human emotion.

Host: No. Even now.

Daniel McDuff: Even now, I mean, yeah.
We're getting there in research, but there aren't many real-life applications you could point to and say, oh, this is an example of a system that really understands nonverbal or emotional cues.

Host: Right. So what was your path from Cambridge and Rosalind Picard to here?

Daniel McDuff: So I went to the MIT Media Lab, where I did my PhD. And there I worked a lot on large-scale data analysis to do with understanding emotions in real-world contexts. And then I worked for a couple of years at a startup, joined MSR out of that, and now lead affective computing technology development within Microsoft Research here.

Host: That's really cool. So, as we wrap up, Daniel, what thoughts or advice would you leave with our listeners, many of whom are aspiring researchers who might have an interest in human-computer interaction or affective computing? What lines of research are interesting right now? What might augment, to use an industry term, the field?

Daniel McDuff: If I were to summarize the areas that I think are most important, the first would be multimodal understanding. So, in the past, a lot of the systems that have been built have focused on just one piece of information, like, for instance, facial expressions, or voice tone, or text. But to really understand emotions, you have to integrate all of that information together. Because if I just look at facial expressions, you know, if I were to show you a video of someone without the audio and without the information about what they were saying, it would be hard to interpret exactly how they felt. Or many people have probably experienced being on a phone call where they haven't been able to exactly understand how someone was feeling, because you've only got the voice tone and language to rely on. You don't have all of that visual information about their gestures and facial expressions and body posture. So I think multimodal understanding is really important. Another area that I'm particularly interested in is something we've touched on already, which is deploying this in the real world. So, how do we take these experiments that have typically been performed in labs, in research environments, where you bring 10 or 20 people in, you get them to experience the system, and you evaluate it? That's fine for controlled studies, but ultimately, if we're going to evaluate the real system and how people will actually respond to it in their everyday lives, we need to deploy it. And so that's something we're focused on: designing things that are so seamless that people can use them without them being a burden, so we can start to mine this data that occurs in everyday contexts.