{"id":573651,"date":"2019-03-20T07:58:36","date_gmt":"2019-03-20T14:58:36","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=573651"},"modified":"2020-04-23T15:02:40","modified_gmt":"2020-04-23T22:02:40","slug":"project-triton-and-the-physics-of-sound-with-dr-nikunj-raghuvanshi","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/project-triton-and-the-physics-of-sound-with-dr-nikunj-raghuvanshi\/","title":{"rendered":"Project Triton and the physics of sound with Dr. Nikunj Raghuvanshi"},"content":{"rendered":"<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-573663\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-1024x576.png\" alt=\"\" width=\"1024\" height=\"576\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-343x193.png 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788.png 1400w\" sizes=\"auto, 
(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<h3>Episode 68, March 20, 2019<\/h3>\n<p>If you\u2019ve ever played video games, you know that for the most part, they look a lot better than they sound. That\u2019s largely due to the fact that audible sound waves are much longer \u2013 and a lot more crafty \u2013 than visual light waves, and therefore, much more difficult to replicate in simulated environments. But <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/nikunjr\/\">Dr. Nikunj Raghuvanshi, a Senior Researcher<\/a> in <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/graphics\/\">the Interactive Media Group at Microsoft Research<\/a>, is working to change that by bringing the quality of game audio up to speed with the quality of game video. He wants you to hear how sound really travels \u2013 in rooms, around corners, behind walls, outdoors \u2013 and he\u2019s using computational physics to do it.<\/p>\n<p>Today, Dr. Raghuvanshi talks about the unique challenges of simulating realistic sound on a budget (both money and CPU), explains how classic ideas in concert hall acoustics need a fresh take for complex games like <em>Gears of War<\/em>, reveals the computational secret sauce you need to deliver the right sound at the right time, and tells us about <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-triton\/\">Project Triton<\/a>, an acoustic system that models how real sound waves behave in 3-D game environments to make us believe with our ears as well as our eyes.<\/p>\n<h3>Related:<\/h3>\n<ul type=\"disc\">\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/cognitive-services\/acoustics\/what-is-acoustics\">Project Acoustics<\/a>: What is Project Acoustics<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/podcast\">Microsoft Research Podcast<\/a>: View more podcasts on 
Microsoft.com<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/itunes.apple.com\/us\/podcast\/microsoft-research-a-podcast\/id1318021537?mt=2\">iTunes<\/a>: Subscribe and listen to new podcasts each week on iTunes<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/subscribebyemail.com\/www.blubrry.com\/feeds\/microsoftresearch.xml\">Email<\/a>: Subscribe and listen by email<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/subscribeonandroid.com\/www.blubrry.com\/feeds\/microsoftresearch.xml\">Android<\/a>: Subscribe and listen on Android<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/open.spotify.com\/show\/4ndjUXyL0hH1FXHgwIiTWU\">Spotify<\/a>: Listen on Spotify<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.blubrry.com\/feeds\/microsoftresearch.xml\">RSS feed<\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/note.microsoft.com\/ww-registration-microsoft-research-newsletter-s.html?wt.mc_id=S-webpage_podcast\">Microsoft Research Newsletter<\/a>: Sign up to receive the latest news from Microsoft Research<\/li>\n<\/ul>\n<hr \/>\n<h3>Final Transcript<\/h3>\n<p>Nikunj Raghuvanshi: In a game scene, you will have multiple rooms, you\u2019ll have caves, you\u2019ll have courtyards, you\u2019ll have all sorts of complex geometry and then people love to blow off roofs and poke holes into geometry all over the place. And within that, now sound is streaming all around the space and it\u2019s making its way around geometry. And the question becomes how do you compute even the direct sound? 
Even the initial sound\u2019s loudness and direction, which are important? How do you find those? Quickly? Because you are on the clock and you have like 60, 100 sources moving around, and you have to compute all of that very quickly.<\/p>\n<p><strong>Host: You\u2019re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I\u2019m your host, Gretchen Huizinga.<\/strong><\/p>\n<p><strong>Host: If you\u2019ve ever played video games, you know that for the most part, they look a lot better than they sound. That\u2019s largely due to the fact that audible sound waves are much longer \u2013 and a lot more crafty \u2013 than visual light waves, and therefore, much more difficult to replicate in simulated environments. But Dr. Nikunj Raghuvanshi, a Senior Researcher in the Interactive Media Group at Microsoft Research, is working to change that by bringing the quality of game audio up to speed with the quality of game video. He wants you to hear how sound really travels \u2013 in rooms, around corners, behind walls, outdoors \u2013 and he\u2019s using computational physics to do it.<\/strong><\/p>\n<p><strong>Today, Dr. Raghuvanshi talks about the unique challenges of simulating realistic sound on a budget (both money and CPU), explains how classic ideas in concert hall acoustics need a fresh take for complex games like <em>Gears of War<\/em>, reveals the computational secret sauce you need to deliver the right sound at the right time, and tells us about Project Triton, an acoustic system that models how real sound waves behave in 3-D game environments to make us believe with our ears as well as our eyes. 
That and much more on this episode of the Microsoft Research Podcast.<\/strong><\/p>\n<p><strong>Host: Nikunj Raghuvanshi, welcome to the podcast.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: I\u2019m glad to be here!<\/p>\n<p><strong>Host: You are a senior researcher in MSR\u2019s Interactive Media Group, and you situate your research at the intersection of computational acoustics and graphics. Specifically, you call it \u201cfast computational physics for interactive audio\/visual applications.\u201d<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yep, that\u2019s a mouthful, right?<\/p>\n<p><strong>Host: It is a mouthful. So, unpack that! How would you describe what you do and why you do it? What gets you up in the morning?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah, so my passion is physics. I really like the mixture of computers and physics. So, the way I got into this was, many, many years ago, I picked up this book on C++ and it was describing graphics and stuff. And I didn\u2019t understand half of it, and there was a color plate in there. It took me two days to realize that those are not photographs, they were generated by a machine, and I was like, somebody took a photo of a world that doesn\u2019t exist. So, that is what excites me. I was like, this is amazing. This is as close to magic as you can get. And then the idea was I used to build these little simulations and I was like the exciting thing is you just code up these laws of physics into a machine and you see all this behavior emerge out of it. And you didn\u2019t tell the world to do this or that. It\u2019s just basic Newtonian physics. So, that is computational physics. And when you try to do this for games, the challenge is you have to be super-fast. You have 1\/60th of a second to render the next frame to produce the next buffer of audio. Right? So, that\u2019s the fast portion. How do you take all these laws and compute the results fast enough that it can happen at 1\/60th of a second, repeatedly? 
So, that\u2019s where the computer science enters the physics part of it. So, that\u2019s the sort of mixture of things where I like to work in.<\/p>\n<p><strong>Host: You\u2019ve said that light and sound, or video and audio, work together to make gaming, augmented reality, virtual reality, believable. Why are the visual components so much more advanced than the audio? Is it because the audio is the poor relation in this equation, or is it that much harder to do?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: It is kind of both. Humans are visual dominant creatures, right? Because visuals are what is on our conscious mind and when you describe the world, our language is so visual, right? Even for sound, sometimes we use visual metaphors to describe things. So, that is part of it. And part of it is also that for sound, the physics is in many ways tougher because you have much longer wavelengths and you need to model wave diffraction, wave scattering and all these things to produce a believable simulation. And so, that is the physical aspect of it. And also, there\u2019s a perceptual aspect. Our brain has evolved in a world where both audio\/visual cues exist, and our brain is very clever. It goes for the physical aspects of both that give us separate information, unique information. So, visuals give you line-of-sight, high resolution, right? But audio is lower resolution directionally, but it goes around corners. It goes around rooms. That\u2019s why if you put on your headphones and just listen to music at the loud volume, you are a danger to everybody on the street because you have no awareness.<\/p>\n<p><strong>Host: Right.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: So, audio is the awareness part of it.<\/p>\n<p><strong>Host: That is fascinating because you\u2019re right. 
What you can see is what is in front of you, but you could hear things that aren\u2019t in front of you.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah.<\/p>\n<p><strong>Host: You can\u2019t see behind you, but you can hear behind you.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Absolutely, you can hear behind yourself and you can hear around stuff, around corners. You can hear stuff you don\u2019t see, and that\u2019s important for anticipating stuff.<\/p>\n<p><strong>Host: Right.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: People coming towards you and things like that.<\/p>\n<p><strong>Host: So, there\u2019s all kinds of people here that are working on 3D sound and head-related transfer functions and all that.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah, Ivan\u2019s group.<\/p>\n<p><strong>Host: Yeah! How is your work interacting with that?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: So, that work is about, if I tell you the spatial sound field around your head, how does it translate into a personal experience in your two ears? So, the HRTF modeling is about that aspect. My work with John Snyder is about, how does the sound propagate in the world, right?<\/p>\n<p><strong>Host: Interesting.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: So, if there is a sound down a hallway, what happens during the time it gets from there up to your head? That\u2019s our work.<\/p>\n<p><strong>Host: I want you to give us a snapshot of the current state-of-the-art in computational acoustics and there\u2019s apparently two main approaches in the field. What are they, and what\u2019s the case for each and where do you land in this spectrum?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: So, there\u2019s a lot of work in room acoustics where people are thinking about, okay, what makes a concert hall sound great? Can you simulate a concert hall before you build it, so you know how it\u2019s going to sound? 
And, based on the constraints on those areas, people have used a lot of ray tracing approaches which borrow from a lot of literature in graphics. And for graphics, ray tracing is the main technique, and it works really well, because the idea is you\u2019re using a short wavelength approximation. So, light wavelengths are submicron and if they hit something, they get blocked. But the analogy I like to use is sound is very different, the wavelengths are much bigger. So, you can hold your thumb out in front of you and blot out the sun, but you are going to have a hard time blocking out the sound of thunder with a thumb held out in front of your ear because the waves will just wrap around. And, that\u2019s what motivates our approach which is to actually go back to the physical laws and say, instead of doing the short wavelength approximation for sound, we revisit and say, maybe sound needs the more fundamental wave equation to be solved, to actually model these diffraction effects for us. The usual thinking is that, you know, in games, you are thinking about we want a certain set of perceptual cues. We want walls to occlude sound, we want a small room to reverberate less. We want a large hall to reverberate more. And the thought is, why are we solving this expensive partial differential equation again? Can\u2019t we just find some shortcut to jump straight to the answer instead of going through this long-winded route of physics? 
And our answer has been that you really have to do all the hard work because there\u2019s a ton of information that\u2019s folded in and what seems easy to us as humans isn\u2019t quite so easy for a computer and there\u2019s no neat trick to get you straight to the perceptual answer you care about.<\/p>\n<p>(music plays)<\/p>\n<p><strong>Host: Much of the work in audio and acoustic research is focused on indoor sound where the sound source is within the line of sight and the audience and the listener can see what they were listening to\u2026<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Um-hum.<\/p>\n<p><strong>Host: \u2026and you mentioned that the concert hall has a rich literature in this field. So, what\u2019s the gap in the literature when we move from the concert hall to the computer, specifically in virtual environments?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah, so games and virtual reality, the key demand they have is the scene is not one room, and with time it has become much more difficult. So, a concert hall is terrible if you can\u2019t see the people who are playing the sound, right? So, it allows for a certain set of assumptions that work extremely nicely. The direct sound, which is the initial sound, which is perceptually very critical, just goes in a straight line from source to listener. You know the distance so you can just use a simple formula and you know exactly how loud the initial sound is at the person. But in a game scene, you will have multiple rooms, you\u2019ll have caves, you\u2019ll have courtyards, you\u2019ll have all sorts of complex geometry and then people love to blow off roofs and poke holes into geometry all over the place. And within that, now sound is streaming all around the space and it\u2019s making its way around geometry. And the question becomes, how do you compute even the direct sound? Even the initial sound\u2019s loudness and direction, which are important? How do you find those? Quickly? 
Because you are on the clock and you have like 60, 100 sources moving around, and you have to compute all of that very quickly. So, that\u2019s the challenge.<\/p>\n<p><strong>Host: All right. So, let\u2019s talk about how you\u2019re addressing it. A recent paper that you\u2019ve published made some waves, sound waves probably. No pun intended\u2026 It\u2019s called Parametric Directional Coding for Pre-computed Sound Propagation. Another mouthful. But it\u2019s a great paper and the technology is so cool. Talk about this\u2026 this research that you\u2019re doing.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah. So, our main idea is, actually, to look at the literature in lighting again and see the kind of path they\u2019d followed to kind of deliver this computational challenge of how you do these extensive simulations and still hit that stringent CPU budget in real time. And one of the key ideas is you precompute. You cheat. You just look at the scene and just compute everything you need to compute beforehand, right? Instead of trying to do it on the fly during the game. So, it does introduce the limitation that the scene has to be static. But then you can do these very nice physical computations and you can ensure that the whole thing is robust, it is accurate, it doesn\u2019t suffer from all the sort of corner cases that approximations tend to suffer from, and you have your result. You basically have a giant look-up table. If somebody tells you that the source is over there and the listener is over here, tell me what the loudness of the sound would be. We just say okay, we have this giant table, we\u2019ll just go look it up for you. And that is the main way we bring the CPU usage into control. But it generates a knock-on challenge that now we have this huge table, there\u2019s this huge amount of data that we\u2019ve stored and it\u2019s 6-dimensional. The source can move in 3-dimensions and the listener can move in 3-dimensions. 
So, we have the giant table which is terabytes or even more of data.<\/p>\n<p><strong>Host: Yeah.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: And the game\u2019s typical budget is like 100 megabytes. So, the key challenge we\u2019re facing is, how do we fit everything in that? How do we take this data and extract out something salient that people listen to and use that? So, you start with full computation, you start as close to nature as possible and then we\u2019re saying okay, now what would a person hear out of this? Right? Now, let\u2019s do that activity of, instead of doing a shortcut, now let\u2019s think about okay, a person hears the direction the sound comes from. If there is a doorway, the sound should come from the doorway. So, we pick out these perceptual parameters that are salient for human perception and then we store those. That\u2019s the crucial way you kind of bring down this enormous data set to a sort of memory budget that\u2019s feasible.<\/p>\n<p><strong>Host: So, that\u2019s the paper.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Um-hum.<\/p>\n<p><strong>Host: And how has it played out in practice, or in project, as it were?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: So, a little bit of history on this is, we had a paper at SIGGRAPH 2010, me and John Snyder and some academic collaborators, and at that point, we were trying to think of just physical accuracy. So, we took the physical data and we were trying to stay as close to physical reality as possible and we were rendering that. And around 2012, we got to talking with <em>Gears of War<\/em>, the studio, and we were going through what the budgets will be, how things would be. And we were like we need\u2026 this needs to\u2026 this is gigabytes, it needs to go to megabytes\u2026<\/p>\n<p><strong>Host: Really?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: \u2026very quickly. And that\u2019s when we were like, okay, let\u2019s simplify. 
Like, what\u2019s the four like most basic things that you really want from an acoustic system? Like walls should occlude sound and things like that. So, we kind of re-winded and came to it from this perceptual viewpoint that I was just describing. Let\u2019s keep only what\u2019s necessary. And that\u2019s how we were able to ship this in 2016 in <em>Gears of War<\/em> 4 by just re-winding and doing this process.<\/p>\n<p><strong>Host: How is that playing into, you know\u2026 Project Triton is the big project that we\u2019re talking about. How would you describe what that\u2019s about and where it\u2019s going? Is it everything you\u2019ve just described or is there\u2026 other aspects to it?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah. Project Triton is this idea that you should precompute the wave physics, instead of starting with approximations. Approximate later. That\u2019s one idea of Project Triton. And the second is, if you want to make it feasible for real games and real virtual reality and augmented reality, switch to perceptual parameters. Extract that out of this physical simulation and then you have something feasible. And the path we are on now, which brings me back to the recent paper you mentioned\u2026<\/p>\n<p><strong>Host: Right.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: \u2026is, in <em>Gears of War<\/em>, we shipped some set of parameters. We were like, these make a big difference. But one thing we lacked was if the sound is, say, in a different room and you are separated by a doorway, you would hear the right loudness of the sound, but its direction would be wrong. Its direction would be straight through the wall, going from source to listener.<\/p>\n<p><strong>Host: Interesting.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: And that\u2019s an important spatial cue. It helps you orient yourself when sounds funnel through doorways.<\/p>\n<p><strong>Host: Right.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Right? 
And it\u2019s a cue that sound designers really look for and try to hand-tune to get good ambiances going. So, in the recent 2018 paper, that\u2019s what we fixed. We call this portaling. It\u2019s a made-up word for this effect of sounds going around doorways, but that\u2019s what we\u2019re modeling now.<\/p>\n<p><strong>Host: Is this new stuff? I mean, people have tackled these problems for a long time.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah.<\/p>\n<p><strong>Host: Are you people the first ones to come up with this, the portaling and\u2026?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: I mean, the basic ideas have been around. People know that, perceptually, this is important, and there are approaches to try to tackle this, but I\u2019d say, because we\u2019re using wave physics, this problem becomes much easier because you just have the waves diffract around the edge. With ray tracing you face the difficult problem that you have to trace out the rays \u201cintelligently\u201d somehow to hit an edge, which is like hitting a bullseye, right?<\/p>\n<p><strong>Host: Right.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: So, the ray can wrap around the edge. So, it becomes really difficult. Most practical ray tracing systems don\u2019t try to deal with this edge diffraction effect because of that. Although there are academic approaches to it, in practice it becomes difficult. But as I worked on this over the years, I\u2019ve kind of realized, these are the real advantages of this. Disadvantages are pretty clear: it\u2019s slow, right? So, you have to precompute. 
But we\u2019re realizing, over time, that going to physics has these advantages.<\/p>\n<p><strong>Host: Well, but the precompute part is innovative in terms of a thought process on how you would accomplish the speed-up\u2026<\/strong><\/p>\n<p>Nikunj Raghuvanshi: There have been papers on precomputed acoustics, academically before, but this realization that mixing precomputation and extracting these perceptual parameters? That is a recipe that makes a lot of practical sense. Because a third thing that I haven\u2019t mentioned yet is going to the perceptual domain, now the sound designer can make sense of the numbers coming out of this whole system. Because it\u2019s loudness. It\u2019s reverberation time, how long the sound is reverberating. And these numbers that are super-intuitive to sound designers, they already deal with them. So, now what you are telling them is, hey, you used to start with a blank world, which had nothing, right? Like the world before the act of creation, there\u2019s nothing. It\u2019s just empty space and you are trying to make things reverberate this way or that, now you don\u2019t need to do that. Now physics will execute first, on the actual scene with the actual materials, and then you can say I don\u2019t like what physics did here or there, let me tweak it, let me modify what the real result is and make it meet the artistic goals I have for my game.<\/p>\n<p>(music plays)<\/p>\n<p><strong>Host: We\u2019ve talked about indoor audio modeling, but let\u2019s talk about the outdoors for now and the computational challenges to making natural outdoor sounds, sound convincing.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah.<\/p>\n<p><strong>Host: How have people hacked it in the past and how does your work in ambient sound propagation move us forward here?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah, we\u2019ve hacked it in the past! Okay. 
This is something we realized on <em>Gears of War<\/em> because the parameters we use were borrowed, again, from the concert hall literature and, because they\u2019re parameters informed by concert halls, things sound like halls and rooms. Back in the days of Doom, this tech would have been great because it was all indoors and rooms, but in <em>Gears of War<\/em>, we have these open spaces and it doesn\u2019t sound quite right. Outdoors sounds like a huge hall and you know, how do we do wind ambiances and rain that\u2019s outdoors? And so, we came up with a solution for them at that time which we called \u201coutdoorness.\u201d It\u2019s, again, an invented word.<\/p>\n<p><strong>Host: Outdoorness.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Outdoorness.<\/p>\n<p><strong>Host: I\u2019m going to use that. I like it.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Because the idea it\u2019s trying to convey is, it\u2019s not a binary indoor\/outdoor. When you are crossing a doorway or a threshold, you expect a smooth transition. You expect that, I\u2019m not hearing rain inside, I\u2019m feeling nice and dry and comfortable and now I\u2019m walking into the rain\u2026<\/p>\n<p><strong>Host: Yeah.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: \u2026and you want the smooth transition on it. So, we built a sort of custom tech to do that outdoor transition. But it got us thinking about, what\u2019s the right way to do this? How do you produce the right sort of spatial impression of, there\u2019s rain outside, it\u2019s coming through a doorway, the doorway is to my left, and as you walk, it spreads all around you. You are standing in the middle of rain now and it\u2019s all around you. So, we wanted to create that experience. So, the ambient sound propagation work was an intern project and now we finished it up with our collaborators in Cornell. And that was about, how do you model extended sound sources? 
So, again, going back to concert halls, usually people have dealt with point-like sources which might have a directivity pattern. But rain is like a million little drops. If you try to model each and every drop, that\u2019s not going to get you anywhere. So, that\u2019s what the paper is about, how to treat it as one aggregate that somebody gave us? And we produce an aggregate sort of energy distribution of that thing along with its directional characteristics and just encode that.<\/p>\n<p><strong>Host: And just encode it.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: And just encode it.<\/p>\n<p><strong>Host: How is it working?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: It works nice. It sounds good. To my ears it sounds great.<\/p>\n<p><strong>Host: Well you know, and you\u2019re the picky one, I would imagine.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah. I\u2019m the picky one and also when you are doing iterations for a paper, you also completely lose objectivity at some point. So, you\u2019re always looking for others to get some feedback.<\/p>\n<p><strong>Host: Here, listen to this.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Well, reviewers give their feedback, so, yeah.<\/p>\n<p><strong>Host: Sure. Okay. Well, kind of riffing on that, there\u2019s another project going on that I\u2019d love for you to talk as much as you can about called <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/cognitive-services\/acoustics\/what-is-acoustics\">Project Acoustics<\/a> and kind of the future of where we\u2019re going with this. Talk about that.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: That\u2019s really exciting. 
So, up to now, Project Triton was an internal tech which we managed to propagate from research into an actual Microsoft product, internally.<\/p>\n<p><strong>Host: Um-hum.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.youtube.com\/watch?v=pIzwo-MxCC8\">Project Acoustics<\/a> is being led by Noel Cross\u2019s team in Azure Cognition. And what they\u2019re doing is turning it into a product that\u2019s externally usable. So, trying to democratize this technology so it can be used by any game audio team anywhere backed by Azure compute to do the precomputation.<\/p>\n<p><strong>Host: Which is key, the Azure compute.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah, because you know, it took us a long time, with <em>Gears of War<\/em> to figure out, okay, where is all this precompute going to happen?<\/p>\n<p><strong>Host: Right.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: We had to figure out the whole cluster story for themselves, how to get the machines, how to procure them, and there\u2019s a big headache of arranging compute for yourself. And so that\u2019s, logistically, a key problem that people face when they try to think of precomputed acoustics. The run-time side, Project Acoustics, we are going to have plug-ins for all the standard game audio engines and everything. So, that makes things simpler on that side. But a key blocker in my view was always this question of, where are you going to precompute? So, now the answer is simple. You get your Azure Batch account and you just send your stuff up there and it just computes.<\/p>\n<p><strong>Host: Send it to the cloud and the cloud will rain it back down on you.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yes. It will send down data.<\/p>\n<p><strong>Host: Who is your audience for Project Acoustics?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Project Acoustics, the audience is the whole game audio industry. 
And our real hope is that we\u2019ll see some uptake on it when we announce it at GDC in March, and we want people to use it, as many teams, small, big, medium, everybody, to start using this because we feel there\u2019s a positive feedback loop that can be set up where you have these new tools available, designers realize that they have these new tools available that have shipped in Triple A games, so they do work. And for them to give us feedback. If they use these tools, we hope that they can produce new audio experiences that are distinctly different so that then they can say to their tech director, or somebody, for the next game, we need more CPU budget. Because we\u2019ve shown you value. So, a big exercise was how to fit this within current budgets so people can produce these examples of novel possible experiences so they can argue for more. So, to increase the budget for audio and kind of bring it on par with graphics over time as you alluded to earlier.<\/p>\n<p><strong>Host: You know, if we get nothing across in this podcast, it\u2019s like, people, pay attention to good audio. Give it its props. Because it needs it. Let\u2019s talk briefly about some of the other applications for computational acoustics. Where else might it be awesome to have a layer of realism with audio computing?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: One of the applications that I find very exciting is for audio rendering for people who are blind. I had the opportunity to actually show the demo of our latest system to Daniel Kish, who, if you don\u2019t know, he\u2019s the human echo-locator. And he uses clicks from his mouth to actually locate geometry around him and he\u2019s always oriented. He\u2019s an amazing person. So that was a collaboration, actually, we had with a team in the Garage. They released a game called Ear Hockey and it was a nice collaboration, like there was a good exchange of ideas over there. 
That\u2019s nice because I feel that\u2019s a whole different application where it can have a potential social positive impact. The other one that\u2019s very interesting to me is that we lived in 2-D desktop screens for a while and now computing is moving into the physical world. That\u2019s the sort of exciting thing about mixed reality, is moving compute out into this world. And then the acoustics of the real world being folded into the sounds of virtual objects becomes extremely important. If something virtual is right behind the wall from you, you don\u2019t want to listen to it with full loudness. That would completely break the realism of something being situated in the real world. So, from that viewpoint, good light transport and good sound propagation are both required things for the future compute platform in the physical world. So that\u2019s a very exciting future direction to me.<\/p>\n<p>(music plays)<\/p>\n<p><strong>Host: It\u2019s about this time in the podcast I ask all my guests the infamous \u201cwhat keeps you up at night?\u201d question. And when you and I talked before, we went down kind of two tracks here, and I felt like we could do a whole podcast on it, but sadly we can\u2019t\u2026 But let\u2019s talk about what keeps you up at night. Ironically to tee it up here, it deals with both getting people to use your technology\u2026<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Um-hum.<\/p>\n<p><strong>Host: And keeping people from using your technology.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: No! I wanted everybody to use the technology. But I\u2019d say like five years ago, what used to keep me up at night is like, how are we going to ship this thing in <em>Gears of War<\/em>? 
Now what\u2019s keeping me up at night is how do we make Project Acoustics succeed and how do we, you know, expand the adoption of it and, in a small way, try to improve, move the game audio industry forward a bit and help artists do the artistic expression they need to do in games? So, that\u2019s what I\u2019m thinking right now, how can we move things forward in that direction? I frankly look at video games as an art form. And I\u2019ve gamed a lot in my time. To be honest, all of it wasn\u2019t art, I was enjoying myself a lot and I wasted some time playing games. But we all have our ways to unwind and waste time. But good games can be amazing. They can be much better than a Hollywood movie in terms of what you leave them with. And I just want to contribute in my small way to that. Giving artists the tools to maybe make the next great story, you know.<\/p>\n<p><strong>Host: All right. So, let\u2019s do talk a little bit, though, about this idea of you make a really good game\u2026<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Um-hum.<\/p>\n<p><strong>Host: Suddenly, you\u2019ve got a lot of people spending a lot of time. I won\u2019t say wasting. But we have to address the nature of gaming, and the fact that there are, you know\u2026 you\u2019re upstream of it. You are an artist, you are a technologist, you are a scientist\u2026<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Um-hum.<\/p>\n<p><strong>Host: And it\u2019s like I just want to make this cool stuff.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah.<\/p>\n<p><strong>Host: Downstream, it\u2019s people want people to use it a lot. So, how do you think about that and the responsibilities of a researcher in this arena?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah. You know, this reminds me of Kurt Vonnegut\u2019s book, Cat\u2019s Cradle? He kind of makes \u2013 what, there\u2019s a scientist who makes ice-nine and it freezes the whole planet or something. So, you see things about video games in the news and stuff. 
But I frankly feel that the kind of games I\u2019ve participated in making, these games are very social experiences. People meet on the games a lot. Like Sea of Thieves is all about, you get a bunch of friends together, you\u2019re sitting on the couch together, and you\u2019re just going crazy like on these pirate ships and trying to just have fun. So, they are not the sort of games where a person is being separated from society by the act of gaming and just is immersed in the screen and is just not participating in the world. They are kind of the opposite. So, games have all these aspects. And so, I personally feel pretty good about the games I\u2019ve contributed to. I can at least say that.<\/p>\n<p><strong>Host: So, I like to hear personal stories of the researchers that come on the podcast. So, tell us a little bit about yourself. When did you know you wanted to do science for a living and how did you go about making that happen?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Science for a living? I was the guy in 6th grade who\u2019d get up and say I want to be a scientist. So, that was then, but what got me really hooked was graphics, initially. Like I told you, I found the book which had these color plates and I was like, wow, that\u2019s awesome! So, I was at UNC Chapel Hill, graphics group, and I studied graphics for my graduate studies. And then, in my second or third year, my advisor, Ming Lin, she does a lot of research in physical simulations. How do we make water look nice in physical simulations? Lots of it is CGI. How do you model that? How do you model cloth? How do you model hair? So, there\u2019s all this physics for that. And so, I took a course with her and I was like, you know what? I want to do audio because you get a different sense, right? It\u2019s simulation, not for visuals, but you get to hear stuff. I\u2019m like okay, this is cool. This is different. So, I did a project with her and I published a paper on sound synthesis. 
So, like how rigid bodies, like objects rolling and bouncing around and sliding make sound, just from physical equations. And I found a cool technique and I was like okay, let me do acoustics with this. It\u2019s going to be fun. And I\u2019m going to publish another paper in a year. And here I am, still trying to crack that problem of how to do acoustics in spaces!<\/p>\n<p><strong>Host: Yeah, but what a place to be. And speaking of that, you have a really interesting story about how you ended up at Microsoft Research and brought your entire PhD code base with you.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah. It was an interesting time. So, when I was graduating, MSR was my number one choice because I was always thinking of this technology as, it would be great if games used this one day. This is the sort of thing that would have a good application in games. And then, around that time, I got hired to MSR and it was a multicore incubation back then, my group was looking at how do these multicore systems enable all sorts of cool new things? And one of the things my hiring manager was looking at was how can we do physically based sound synthesis and propagation. So, that\u2019s what my PhD was, so they licensed the whole code base and I built on that.<\/p>\n<p><strong>Host: You don\u2019t see that very often.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah, it was nice.<\/p>\n<p><strong>Host: That\u2019s awesome. Well, Nikunj, as we close, I always like to ask guests to give some words of wisdom or advice or encouragement, however it looks to you. What would you say to the next generation of researchers who might want to make sound sound better?<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yeah, it\u2019s an exciting area. It\u2019s super-exciting right now. Because even like just to start from more technical stuff, there are so many problems to solve with acoustic propagation. 
I\u2019d say we\u2019ve taken just the first step of feasibility, maybe a second one with Project Acoustics, but we\u2019re right at the beginning of this. And we\u2019re thinking there are so many missing things, like outdoors is one thing that we\u2019ve kind of fixed up a bit, but we\u2019re going towards what sorts of effects can you model in the future? Like directional sources is one we\u2019re looking at, but there are so many problems. I kind of think of it as the 1980s of graphics when people first figured out that you can make this work. You can make light propagation work. What are the things that you need to do to make it ever closer to reality? And we\u2019re still at it. So, I think we\u2019re at that phase with acoustics. We\u2019ve just figured out this is one way that you can actually ship in practical applications and we know there are deficiencies in its realism in many, many places. So, I think of it as a very rich area that students can jump in and start contributing.<\/p>\n<p><strong>Host: Nowhere to go but up.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Yes. Absolutely!<\/p>\n<p><strong>Host: Nikunj Raghuvanshi, thank you for coming in and talking with us today.<\/strong><\/p>\n<p>Nikunj Raghuvanshi: Thanks for having me.<\/p>\n<p>(music plays)<\/p>\n<p>To learn more about Dr. Nikunj Raghuvanshi and the science of sound simulation, visit <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/\">Microsoft.com\/research<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Episode 68, March 20, 2019 &#8211; Today, Dr. 
Raghuvanshi talks about the unique challenges of simulating realistic sound on a budget (both money and CPU), explains how classic ideas in concert hall acoustics need a fresh take for complex games like Gears of War, reveals the computational secret sauce you need to deliver the right sound at the right time, and tells us about Project Triton, an acoustic system that models how real sound waves behave in 3-D game environments to make us believe with our ears as well as our eyes.<\/p>\n","protected":false},"author":38022,"featured_media":573663,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"https:\/\/player.blubrry.com\/id\/42626424\/","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[240054],"tags":[],"research-area":[243062,13551],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-573651","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-msr-podcast","msr-research-area-audio-acoustics","msr-research-area-graphics-and-multimedia","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"https:\/\/player.blubrry.com\/id\/42626424\/","podcast_episode":"","msr_research_lab":[199565],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[546345],"related-events":[],"related-researchers":[],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788.png\" class=\"img-object-cover\" alt=\"Nikunj Raghuvanshi wearing glasses and smiling at the 
camera\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788.png 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/03\/Nikunj-Raghuvanshi-POD_Site_02_2019_1400x788-343x193.png 343w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"March 20, 2019","formattedExcerpt":"Episode 68, March 20, 2019 - Today, Dr. 
Raghuvanshi talks about the unique challenges of simulating realistic sound on a budget (both money and CPU), explains how classic ideas in concert hall acoustics need a fresh take for complex games like Gears of War, reveals&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/573651","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/38022"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=573651"}],"version-history":[{"count":12,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/573651\/revisions"}],"predecessor-version":[{"id":574695,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/573651\/revisions\/574695"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/573663"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=573651"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=573651"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=573651"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=573651"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=573651"},{"taxonomy":"msr-event-type","embeddable
":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=573651"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=573651"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=573651"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=573651"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=573651"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=573651"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}