{"id":664818,"date":"2020-06-17T03:00:32","date_gmt":"2020-06-17T10:00:32","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=664818"},"modified":"2020-06-18T07:27:27","modified_gmt":"2020-06-18T14:27:27","slug":"accessible-systems-for-sign-language-computation-with-dr-danielle-bragg","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/accessible-systems-for-sign-language-computation-with-dr-danielle-bragg\/","title":{"rendered":"Accessible systems for sign language computation with Dr. Danielle Bragg"},"content":{"rendered":"
Many computer science researchers set their sights on building general AI technologies that could impact hundreds of millions – or even billions – of people. But Dr. Danielle Bragg, a senior researcher at MSR's New England lab, has a slightly smaller and more specific population in mind: the roughly seventy million people worldwide who use sign languages as their primary means of communication.

Today, Dr. Bragg gives us an insightful overview of the field and talks about the unique challenges and opportunities of building systems that expand access to information in line with the needs and desires of the Deaf and signing community.

Transcript

Danielle Bragg (opening quote): As machine learning becomes more powerful, having data to train those models on becomes increasingly valuable. And, in working with minority populations, we're often working in data-scarce environments because the population is small and there might be other barriers to collecting data from those groups in order to build these powerful tools that actually can really benefit these minority communities.

Host: You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga.

Host: Many computer science researchers set their sights on building general AI technologies that could impact hundreds of millions – or even billions – of people. But Dr. Danielle Bragg, a senior researcher at MSR's New England lab, has a slightly smaller and more specific population in mind: the roughly seventy million people worldwide who use sign languages as their primary means of communication. Today, Dr. Bragg gives us an insightful overview of the field and talks about the unique challenges and opportunities of building systems that expand access to information in line with the needs and desires of the Deaf and signing community. That and much more on this episode of the Microsoft Research Podcast.

Host: Danielle Bragg, welcome to the podcast.

Danielle Bragg: Thank you. It's great to talk to you.

Host: I like to start by situating both my guests and their labs, and let's start with your lab. You're a senior researcher at MSR New England and you're hanging out in Cambridge, Massachusetts? Tell us about the work going on in the lab in general and why it's important. What gaps – or research gaps – do the folks in Cambridge fill?

Danielle Bragg: Yeah, we're located in Cambridge. The lab is a very interdisciplinary place to be. We have a lot of people from different fields – not just computer scientists, but also economists, people who work in theory, and social scientists – which makes it a really interesting place to work. We work on problems that are both technically challenging and have societal impact, and the collection of skills that we have really fits that mission: machine learning experts, economists, theory folks, and social scientists all working side by side, which makes it a really rich place to work.

Host: Yeah. Well, let's situate you now. Many of my guests locate themselves at an intersection.
Where do you live, Danielle? What are your particular research interests and passions, and what gets you up in the morning?

Danielle Bragg: Yeah, so my work lies within human-computer interaction, or HCI, but it also fits under accessibility and applied machine learning – so the intersection of those three, I would say.

Host: What is particularly interesting to you right now as you look at the broad picture of the research you're doing?

Danielle Bragg: My work primarily focuses on building systems that expand access to information, in particular for people with disabilities and, in the last few years, in particular for sign language users. So I'm a computer scientist and, as we discussed, I fall under the general umbrella of HCI, but also touch on other fields, and I would say I'm personally very motivated by problem solving and by working on problems that I feel have some positive impact in the world.

Host: Well, before I talked to you earlier, I have to admit, I had some big gaps in my understanding of what sign language actually is and how it works, and perhaps some of our listeners are the same. So, give us a bit of a primer on this very singular language. How and why does ASL pose unique challenges for technical applications that other languages don't, and how do you answer the question, since most Deaf people can see and read, why not just use English?

Danielle Bragg: Yeah, those are great questions and a great place to start. So ASL stands for American Sign Language, for those listening who haven't heard the acronym before, and ASL is a natural language. It has its own grammar and vocabulary, just like English or any other spoken language. There are actually many different sign languages used around the world, which some people may not realize, and American Sign Language is the primary language of the Deaf community in the United States specifically, as well as a few other areas around the world. There are a number of linguistic features that make up sign languages, just like there are linguistic features that make up spoken languages. For ASL, there is hand shape, location of the hand on the body, and movement. Those are the three primary types of features, but there are a whole host of other features that are also important – for example, non-manuals, which include facial expressions and other types of body gestures. There's fingerspelling, classifiers, depictions, where you're kind of acting out certain content. It's a really beautiful language, and there's a really rich culture centered around it, just like there's rich culture centered around other languages around the world.

Host: So interestingly, I wasn't even going to ask this, but as you bring it up, I'm thinking to myself, are there idioms within the language? Is there slang within the language? Are there things that are outside the normal sort of structural grammar of the language as it evolves with people and generations?

Danielle Bragg: Yeah, there definitely are. There are different dialects used by different sub-populations. There are also just really rich genres of literature.
There's Deaf poetry. There are certain types of stories that people like to tell, and because the language is visual, there's a lot of richness there that you don't really get with spoken languages. But I should also give a disclaimer: I'm not deaf myself and I'm still learning a lot about this space. I've taken some ASL classes and learned about Deaf culture and the Deaf community, but, you know, I don't have a lifetime of experience, so I'm always learning as well.

Host: Just as a point of interest, is there Chinese Sign Language? Is there Spanish Sign Language, French Sign Language? Or is it that granular?

Danielle Bragg: Yes. There's Greek Sign Language, British Sign Language, French Sign Language… there are many different sign languages across the world.

Host: Okay.

Danielle Bragg: However, American Sign Language is actually more closely related to French Sign Language than it is to British Sign Language, so the relationships between the sign languages don't always mirror, exactly, the relationships between…

Host: Spoken languages…

Danielle Bragg: …the spoken languages.

Host: Interesting.

Danielle Bragg: And that's because, you know, there's a different history and evolution, and the groups of people who were using those languages mixed in slightly different ways, but they are basically geographically situated, because people who physically live near one another talk to one another more.

Host: Right, right, right. I've got my mouth open just like, I didn't know that… I didn't know that either. We're off to a great start! Well, before we get into your technical work, I think it's really important to understand who you're doing it for and why you're doing it, and we sort of alluded to that already, but when we talked before, you mentioned two groups: Deaf signers and people who have hearing loss but don't sign. So, you're addressing two, sort of, populations there. More interesting to me, though, is the fact that you frame this whole thing in terms of culture. So, I'd like you to talk about your goal of improving technical accessibility for the two main groups and how that plays out, but maybe you could help us understand the cultural aspects first?

Danielle Bragg: So the Deaf community has a really rich culture, and ASL is a very important part of that culture. In this conversation we're focusing on ASL because we're here in the US and that's where most of my work is focused, but a lot of this applies to other sign languages as well. And within the Deaf community, ASL has a really sacred place, I would say.
It's a really beautiful language and it's kind of the binding glue for the community in many ways, and a lot of my work focuses on helping to preserve ASL and supporting people who want to use ASL. So, a lot of my work is about supporting sign language use and supporting people using it in their interactions with technology. Being deaf is a point of cultural pride for many people, so many people who are deaf don't view themselves as disabled. They view being deaf as a cultural identity. If you're deaf, you can still do anything that anyone else does. You can go to the store, you can drive a car, you can go to work, but the communication piece is where the barriers come into play, and communication is central to culture, right? So, people who share a language develop cultural connections with one another in a different way.

Host: Well, you've put the big picture of building accessible information systems in a data-driven frame. So, talk about this approach, writ large, and how it's informing the specific projects and papers you're working on, since data is central to the technological approaches that many of you are working on right now.

Danielle Bragg: Yeah, data is central to a lot of technologies that are being developed. As machine learning becomes more powerful, having data to train those models on becomes increasingly valuable. And, in working with minority populations, we're often working in data-scarce environments because the population is small and there might be other barriers to collecting data from those groups in order to build these powerful tools that actually can really benefit these minority communities. And so, in my work, I try to build data-driven solutions, and in doing that, I often try to actually collect data in a system that is also providing benefit to the community. So, we don't have to go to the community and say, oh, give us your data, we'll pay you, or provide some other kind of compensation. If we can actually build systems that provide benefit to the community while they're contributing, that can be a much more organic solution to this type of problem.

Host: Okay… if you're approaching this from a data-driven perspective, and data is scarce, what's the biggest problem you face in your research right now?

Danielle Bragg: Well, I would say one of the biggest challenges is dealing with this data scarcity, and figuring out how to collect data in these environments actually presents a host of really rich research problems to work on. You can be really creative in designing systems that incentivize people to participate and provide benefit while also collecting data to then train other models and provide other types of services.

Host: Well, let's go upstream for a second and talk about what kinds of models you want to provide that you would need this data for.
So, what kinds of, sort of, top-level applications or solutions are you aiming for?

Danielle Bragg: Yeah, so within the sign language space, the dream, in some sense, would be to provide end-to-end translation between, for example, English and American Sign Language, and that translation needs to be bi-directional, right? So, it's not enough to just recognize signs and translate that into English. We also need to let the Deaf person know what, you know, people speaking in English around them are saying. So, we need to translate from English to American Sign Language as well. And recently, there have been some advances in deep learning, and convolutional neural nets in particular, that seem promising in this space, but it's important to note that any technical solution would be dictated by the needs of the Deaf community and would not be a replacement for human interpreters.

(music plays)

Host: Let's talk about what you call sign language computation, which is sort of an umbrella term for all the research going on here. Give us an overview of the current state-of-the-art for sign language computation and then – and this is going to be a multi-part question, so I will keep bringing us back, making sure we cover everything – talk about the biggest challenges you face in five areas that you identify as datasets (which we've sort of already talked about); recognition and computer vision; modeling and NLP; avatars and graphics; and then UI/UX design. That's a lot to unpack. If we get lost, I'll bring us back, but let's start with the state-of-the-art of sign language computation.

Danielle Bragg: Sure. So that breakdown into those five groups is really helpful for thinking about this space. Those five areas are all needed for developing end-to-end, bi-directional translation. So, first, we'll talk about datasets. Existing sign language datasets are primarily in video format, and there are a number of different ways that people have tried to collect these videos. You can curate videos from professional interpreters. You can try to scrape different online resources, but these are all limited in some way – in particular, the diversity of the signers in the videos, how many Deaf fluent signers you get as opposed to students or professional interpreters, and just the sheer size of the dataset. So, to put that last problem in context, for speech corpora, we typically have datasets between five million and one billion words, while for sign languages, the largest datasets we have contain fewer than a hundred thousand signs, total.
So that's a very large difference in how much data we have, and if you think about the history of speech recognition – how long it took to get to where it is today, and how much difference having all that data has made – that might put into context how hard this is.

Host: Okay, so if we're talking about datasets being limited and you're looking for robust machine learning models to help get to a robust sign language computation application, how do the other things play in? You mentioned recognition and computer vision. Let's talk about that for a second.

Danielle Bragg: Yeah, so sign language recognition with computer vision is a pretty young field, dating back to the '80s, when people used hard-wired circuits and rule-based approaches – for example, fitted to gloves that had little sensors in them. Those types of systems are limited in how well they work. In addition to the technical constraints, gloves also have other problems. If you're using gloves for recognition, you're missing a lot of important grammatical information that is on the face, for example, and you're asking someone to carry around gloves and put them on all the time. They also don't provide the bi-directional translation that's really needed to have a conversation, right? If you're wearing gloves and signing, maybe some microphone can speak out what you're saying, but then if someone talks back to you, you have no idea what they're saying. So, it's a very incomplete solution. But for technical reasons, people started out with those types of approaches. More recently, advances in neural networks – for example, CNNs and hybrid models that pull together information from different types of models – have been promising, but we're still operating in this data-limited environment, so we don't actually know how well those models might perform given enough data.

Host: All right, so the recognition and computer vision state-of-the-art isn't very good state-of-the-art, is what you're saying…

Danielle Bragg: Yeah, basically.

Host: And so, the challenge for researchers there is, what could we do instead, or how could we augment or advance what we've done in these areas with new tools? New approaches?

Danielle Bragg: I mean, yeah, people are playing around with different types of models. People have also tried to be clever with pulling together multiple datasets, for example, or tuning parameters in certain ways, but ultimately, my intuition is that we really need more data. Once we have more data, we can figure out how to finesse the models, but we don't even know how far the models can take us right now because we don't have the data to fully try them out.

Host: All right, well, I want to get back to how you're going to go about getting data, because we had a really interesting conversation about that a couple days ago, but let's continue to unpack these five areas. The next one we talked about was modeling and NLP, natural language processing.
How does that play into this?

Danielle Bragg: Yeah, so modeling and NLP are very important for figuring out how to translate and how to do other interesting computations with sign language. These types of approaches have traditionally been designed for spoken and written languages, which introduces certain difficulties. For example, there are certain assumptions with written and spoken languages – in particular, that one sound happens at a time – but in sign languages, one movement doesn't always happen at a time. You can have multiple things going on at the same time, and some of these models don't allow for those types of complexities that a sign language might have. Another complexity is that the use of space can be contextual in sign languages. So sometimes, if you point to the right of you, you might be referring to yourself at home. At another point, while you're talking to someone, you could reestablish that area to mean yourself at the coffee shop. And so, we need to have contextual models that can recognize these types of nuances, and the models built for speech don't account for these types of complexities.

Host: Right.

Danielle Bragg: So, we may need new types of models.

Host: Okay.

Danielle Bragg: Another big problem in this space is a lack of annotation. So even if we have videos of people signing, we often don't have written annotations of what is actually being signed, and a lot of the NLP techniques really rely on annotations that computers can process in order to work.

Host: Okay. These are huge challenges. Well, let's talk about avatars and graphics as another challenge in this field.

Danielle Bragg: Yeah, so avatars and graphics are needed to render content in a sign language. So, we've talked about this bi-directional translation that would be great to facilitate, and in moving from English to ASL, for example, you need some kind of rendering of the signed content, and avatars and computer graphics provide a nice way to do that. The process of creating an avatar is actually really complex and, right at the moment, a human is needed to intervene at basically every step of the way, so we have a lot of work to do in this space as well. Typically, the process starts with some kind of annotated script that gets translated into a motion plan for the avatar. A number of parameters then need to be tuned – for example, speed within individual signed units or across signed units – and then finally, we need some animation software to actually render the avatar. I should also mention that avatars have had mixed reception among the Deaf community. Especially if they are not very realistic looking, they can be kind of disturbing to look at, so there are lots of challenges in this space.
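The stages Dr. Bragg describes – annotated script, motion plan with tunable timing, and a final rendering step – can be sketched as a simple data pipeline. Everything below (class names, fields, default timings) is hypothetical, chosen only to illustrate the flow; real avatar systems are far more involved.

```python
from dataclasses import dataclass, field

# Hypothetical annotated-script entry: one gloss (sign label) plus the
# linguistic features an animator would need. Illustrative only.
@dataclass
class AnnotatedSign:
    gloss: str                     # e.g. "HELLO"
    handshape: str                 # e.g. "B" (open palm)
    location: str                  # e.g. "forehead"
    movement: str                  # e.g. "outward arc"
    nonmanuals: list[str] = field(default_factory=list)  # facial expressions, etc.

@dataclass
class MotionSegment:
    sign: AnnotatedSign
    duration_s: float              # tuned per sign ("speed within signed units")
    transition_s: float            # tuned between signs ("across signed units")

def plan_motion(script: list[AnnotatedSign],
                base_duration_s: float = 0.6,
                transition_s: float = 0.2) -> list[MotionSegment]:
    """Translate an annotated script into a motion plan with tunable timing."""
    return [MotionSegment(sign, base_duration_s, transition_s) for sign in script]

def render(plan: list[MotionSegment]) -> None:
    """Stand-in for the animation software that actually renders the avatar."""
    for seg in plan:
        print(f"animate {seg.sign.gloss}: {seg.sign.handshape} at "
              f"{seg.sign.location}, {seg.sign.movement} ({seg.duration_s}s)")

render(plan_motion([AnnotatedSign("HELLO", "B", "forehead", "outward arc")]))
```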
Host: Are they sophisticated enough to even get to the uncanny valley, or are they just lame?

Danielle Bragg: Ahhh, I mean, it probably depends on the avatar…!

Host: I suppose. Well, either way, it sounds expensive and cumbersome for this to be a viable approach.

Danielle Bragg: Yeah, it is difficult. I mean, there are some companies and research groups that have tried to make avatars, and they typically spend a lot of money collecting very high-quality examples of signs that they can later string together in the avatar, but even with that, you need a human to come in and manage and clean up whatever is generated.

Host: Well, let's talk about UI/UX design and that interface between Deaf signers and computation. What are the challenges there?

Danielle Bragg: So, I think UI/UX design is another really rich space for exploration and development, in particular because sign language is a different modality from written and spoken languages. But again, a big challenge here is designing interfaces that will be useful despite our lack of data and despite the limitations that our current technologies have.

Host: Mmm-hmm.

Danielle Bragg: So, figuring out ways to provide a human-in-the-loop solution, or to provide results that are good enough and can then learn from users as they're using the system, or other creative ways to support users – that becomes a really rich space for design and exploration.

Host: Right. Right. So, there's a lot of opportunity for research in this area and probably a lot of room for other researchers to join the efforts…

Danielle Bragg: Yeah, definitely. I think it's also one of the most interdisciplinary spaces that I've come across, right? You need people who are experts in Deaf studies and linguistics and HCI and machine learning. You need all of these areas to come together to make something that's really going to be useful for the community.

Host: Tell me a little bit more about your ideas and approaches for actually gathering data. You've alluded to some of the difficulties in the existing datasets. So how might you broaden your data collection?

Danielle Bragg: Yeah, so that's a great question. I can give an example of one system that I've been working on that both provides benefit to the community and collects useful data at the same time. One project I've been working on – it was started when I was a PhD student at the University of Washington with my former advisor, Richard Ladner – is to build an ASL dictionary. So, if you come across a sign that you don't know and you want to look it up, that can be really challenging. Existing search engines and search interfaces are typically designed around English, but it's really hard to describe a sign in English, and we also just don't have videos indexed that way, right? Like, what would your query look like? Right hand moves right, left hand moves up, you know?

Host: Right.

Danielle Bragg: Two fingers extended. We just don't support those types of queries, and searching by gesture recognition also doesn't work very well, because we don't really have those capabilities working accurately yet. So, we designed a feature-based dictionary where you can select a set of features that describe the sign you're trying to look up – for example, different types of hand shapes or movements – and then we match that against a database of past queries for the signs in the database and sort the results based on similarity to those past queries, in order to give you a good result. And in this way, while you're using the dictionary to look up a sign, you're actually providing data that can be used to improve the models and improve results for people in the future.
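To make the matching step concrete, here is a minimal sketch of feature-based lookup. The feature names, the Jaccard similarity measure, and the toy database are all illustrative assumptions, not the actual dictionary's implementation.

```python
# Sketch of feature-based sign lookup: each query is a set of selected
# features, and candidate signs are ranked by average Jaccard similarity
# between the new query and past queries that led to that sign.

def jaccard(a: set, b: set) -> float:
    """Similarity between two feature sets: |intersection| / |union|."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Toy database: sign -> past user queries (each a set of selected features).
past_queries = {
    "THANK-YOU": [{"flat-hand", "chin", "outward"}, {"flat-hand", "outward"}],
    "MOTHER":    [{"open-hand", "chin", "tap"}, {"open-hand", "chin"}],
}

def lookup(query: set[str]) -> list[tuple[str, float]]:
    """Score each sign by mean similarity of the query to its past queries."""
    scores = {
        sign: sum(jaccard(query, q) for q in queries) / len(queries)
        for sign, queries in past_queries.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(lookup({"flat-hand", "chin", "outward"}))
# The confirmed query can then be appended to past_queries for the matched
# sign, so the dictionary improves as people use it.
```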
Host: Right.

Danielle Bragg: So, these types of systems, where users are providing data that will actually improve the system going forward, can be a really nice way to jump-start this problem of data scarcity.

Host: Right. And you talked earlier about existing datasets, which involve maybe some videos that have been taken from somebody giving a speech and having a Deaf signer in front or beside, and are those all public domain? Are you able to use those kinds of things that exist and just pull them in, or is there a problem there as well?

Danielle Bragg: Yeah, that's a great question too. So, some datasets are public domain and some are not. Collecting sign language data is very expensive, not only in terms of, you know, dollars spent, but also in terms of time and resources, and so groups who collect datasets may be disincentivized to share them.

Host: Right.

Danielle Bragg: That could be research groups who invested a lot in collecting a dataset, but it could also be companies who are trying to build translation software and trying to out-do their competitors, so…

Host: Right.

Danielle Bragg: …there are a lot of datasets that are not publicly available. We don't actually know exactly how big those datasets are because they're not public, but it seems like they're pretty small, based on the quality of existing translation and recognition systems.

(music plays)

Host: All right, well, let's move on to a recent paper that you published. In fact, it was in 2019 and it won the best paper award at ASSETS, and you addressed many of the things we've talked about, but the paper also addresses the problem of silos and how to bring separate portions of the sign language processing pipeline together. So, talk about the questions you asked in this paper and the resulting answers and calls to action. It was called Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective.

Danielle Bragg: Yeah, so we were trying to answer three main research questions. First, what is the current state-of-the-art of sign language technology and processing? Second, what are the biggest challenges facing the field?
And then third, what calls to action are there for people working in this area?

Host: Mmm-hmm.

Danielle Bragg: And, as you mentioned, this is a very interdisciplinary area and we need people working together across diverse disciplines, and so we organized a large, interdisciplinary workshop in February of 2019. We invited a variety of academics working in a variety of fields. We also had internal attendees who were employees at Microsoft, and in particular, we made sure to invite members of the Deaf community because their perspective is key, and they led a variety of panels and portions of the day. And as a group, we discussed the five main areas that we have already talked about…

Host: Right.

Danielle Bragg: …and kind of summarized, you know, what is the state-of-the-art, what are the challenges, and where do we go from here?

Host: Mmm-hmm.

Danielle Bragg: So that paper was presenting our results.

Host: All right, so drill in a little bit on this siloed approach and what some of those problems are as you work towards a robust application in this arena.

Danielle Bragg: So, I think we touched on this a little bit earlier when I was talking about some of the challenges in using NLP techniques for sign language computation. A lot of the NLP techniques are developed with spoken languages in mind, and so they don't really handle all of the complexities of sign languages. So that's an example of a situation where we really need linguists or Deaf culture experts working with natural language processing experts in order to create models that will actually apply to sign languages, right? If you only have NLP people who are hearing, who use English, building these models, you're going to have very English-centric models as a result that don't work well for sign languages, and, you know, the people building them probably don't realize that they don't apply.

Host: Right. Which gets to the question of why don't you just use English? Well, because it's a different language, right?

Danielle Bragg: Right, exactly. Yeah, American Sign Language is a completely different language from English. It's not "signed English," so if you know English, that doesn't mean you can understand ASL easily, and if you know ASL, that does not mean you can necessarily read English easily either. So, that's a point that, I think, not a lot of people recognize: that English, in a lot of cases, is a person's second language. They can grow up signing in the home and then learn English as a second language at school, and as, you know, anyone listening who has learned a second language knows, it's not as comfortable most of the time.

Host: Let's talk about your tool belt for a minute. You've situated yourself at the intersection of AI and HCI, leaning more towards HCI, and much of your research is building systems, but you still face some of the big challenges with "enough" data and "good enough" data, as we've talked about.
Talk about the research methodologies and technical tools you're using and how you're working to tackle the challenges that you face.

Danielle Bragg: Yeah, so as you mentioned, I do do a lot of systems building. I do a lot of website building, full-stack engineering, and there's a whole set of skills that go into that. As far as data collection goes, I've used a lot of crowdsourcing, whether that be on an existing platform like Mechanical Turk, or building a new platform to collect data in other ways. We also incorporate a lot of applied machine learning techniques in the dictionary, for example, that I was explaining. Our back end is powered by Latent Semantic Analysis, which basically does a big dimension reduction on the feature space to figure out which dimensions are actually meaningful in completing the search. I also do a lot of user studies, interacting with users in a variety of ways, and engage in a number of design practices that incorporate key stakeholders – in particular, securing research partners who are deaf, but also engaging in participatory design and other ways to engage with the community. I like a combination of qualitative and quantitative work, as, I guess, that's kind of a catch phrase these days, but…
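For readers who want the dimension-reduction step made concrete: Latent Semantic Analysis is typically implemented as a truncated singular value decomposition of a query-by-feature matrix. Here is a minimal sketch of that idea using scikit-learn, with made-up matrix sizes and random data; the dictionary's actual back end is not specified here.

```python
# Minimal LSA sketch: reduce a (past queries x features) matrix to a few
# latent dimensions with truncated SVD, then match a new query in that
# reduced space. Matrix contents and sizes are made up for illustration.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 50)).astype(float)  # 200 past queries, 50 binary features

svd = TruncatedSVD(n_components=10)   # keep the 10 most meaningful dimensions
X_reduced = svd.fit_transform(X)      # each past query as a 10-d latent vector

new_query = rng.integers(0, 2, size=(1, 50)).astype(float)
q_reduced = svd.transform(new_query)  # project the new query the same way

# Rank past queries (and hence the signs they point to) by latent similarity.
sims = cosine_similarity(q_reduced, X_reduced)[0]
print(np.argsort(sims)[::-1][:5])     # indices of the 5 closest past queries
```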
Host: Right, right, right. Let's project a bit and speculate how the work you're doing for the Deaf community might have a positive, if unintended, impact on the broader population. Some people call this the "curb-cut effect," where something that was supposed to help somebody ended up helping everybody, or populations it wasn't expected to help. You know, the curb cut was for wheelchairs… turned out to be great for strollers and cyclists and people rolling suitcases and everything else. So, do you have any thoughts on other application arenas that face similar challenges to sign language computation? One thing that comes to mind is dance annotation. I have a background in that, and it's full-body expression as well.

Danielle Bragg: It's funny that you mention dance, because there are a lot of parallels there. In particular, sign languages actually don't have a widely accepted written form, and that causes a lot of the barriers to using our text-based interfaces in a sign language, and a lot of the same problems apply to dancers, right? If you're a dancer or choreographer and you want to write down the dance that you're coming up with, or the dance that you're dancing, that can be really hard, and as a result, there's a woman who came up with a system called DanceWriting, and that system has actually been adapted to create a written form for sign languages called SignWriting. So there definitely are a lot of similarities between, you know, dance and signing, and I would say, more generally, any gesture-based human interaction has a good amount of overlap with sign language research. So, gesture recognition in particular has a lot of similarities to sign recognition. I would say that gesture recognition is actually a simpler problem in many ways, because there are no grammatical structures to understand, and the context doesn't change the meaning of a gesture the way it does a sign, in many cases.

Host: Right. So, gestures might be for a person on a runway who's bringing the plane in or something, or what you would do with cyclists and what those gestures mean, and they're pretty simple and straightforward…

Danielle Bragg: Yeah, exactly. Or you could think about interacting with a computer through a simple set of gestures or…

Host: Hmmm.

Danielle Bragg: …for an Xbox. I know there have also been research projects to try to support people learning how to play a particular sport or do yoga more effectively by detecting gestures that the person is making and helping to correct them. Or how you learn a musical instrument, for example – the types of gestures that you make, make a big difference. So, I think there's a lot of overlap with other areas where human movement or gesture is involved.

Host: Danielle, we've talked about what gets you up in the morning, but now I have to ask what keeps you up at night? You could call this the "what could possibly go wrong" question. Do you have any concerns about the work you're doing and, if so, how are you addressing them up-front rather than post-deployment?

Danielle Bragg: In all the projects that I do related to sign language, I really do my best to include perspectives from people who are deaf and give Deaf people a platform to be heard and to participate and expand their careers, but that is something that I consciously think about and sometimes worry about. I personally am still learning about Deaf culture and the Deaf experience. I don't have a lifetime of experience in this space. I've taken some ASL classes, but I'm not fluent. I'm also not deaf and I don't have the Deaf lived experience, so it's particularly important to include those perspectives in the work that I'm doing, and I have a number of really wonderful collaborators at Gallaudet University, at Boston University, at Rochester Institute of Technology, and a number of other places. So that's what I'm doing to try to help with this concern.

Host: Right. Well, what about data collection and privacy?

Danielle Bragg: That's a great question as well. I do worry about that. In particular, sign language data is a very personal form of data, because the person's face is in it, their body is in it, the background – you know, their home or their workplace or wherever they're signing – is in it. So, there are a lot of privacy concerns involved in this space. I've done some preliminary work exploring how we might be able to impose certain types of filters on videos of people signing – for example, blurring out their face or replacing their face with an avatar face. Of course, if the movement is still there and you know the person very well, you might still be able to recognize them just from the types of movements that they're making, but I think there are things we can do to at least improve privacy, and it seems like a very interesting, rich space to work in.
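As one concrete possibility, a face-blurring filter of the kind described could be built on OpenCV's stock face detector. This is a generic sketch, not the preliminary system Dr. Bragg mentions; the input filename is hypothetical, and real signing video would need a more robust detector than a frontal-face Haar cascade.

```python
# Sketch of a privacy filter: detect faces frame-by-frame and Gaussian-blur
# them. Uses OpenCV's bundled Haar cascade; a generic illustration only.
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def blur_faces(frame):
    """Blur detected face regions in the frame (in place) and return it."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
        frame[y:y+h, x:x+w] = cv2.GaussianBlur(frame[y:y+h, x:x+w], (51, 51), 0)
    return frame

cap = cv2.VideoCapture("signing_video.mp4")  # hypothetical input file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("blurred", blur_faces(frame))
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
```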
Host: Well, it's story time. What got young Danielle Bragg interested in computer science, and what path did she take to follow her dreams and end up working at Microsoft Research New England?

Danielle Bragg: So, in my undergrad, I studied applied math. Growing up, math was always my favorite subject, and I still enjoy mathematically oriented work. Towards the end of my undergrad, I didn't know exactly what I wanted to do, but I wanted to do something practical, so I decided to go to grad school for computer science. It seemed like a practical decision. But in grad school, I was really searching for projects that had some human impact and that hopefully were making a positive impact in the world, and that's where I really got interested in accessibility. So, I met my former PhD advisor, Richard Ladner, at the University of Washington, and he introduced me to the field of accessibility. He got me taking ASL classes and working on some problems in this space that I'm still working on today.

Host: So, did you just fall into a job at Microsoft Research or were you an intern? Is that the typical pathway to the job, or how did that happen?

Danielle Bragg: I did intern at Microsoft. I've interned at Microsoft three times, actually. Once in the Bing Search group, and then two times as a research intern with Adam Kalai in the New England lab, and then I did a postdoc at the New England lab for two years, and now I am a full-time researcher in the lab, so I can't… I can't go anywhere else! I'm forever a New England researcher.

Host: Awesome. What's something we don't know or might not suspect about you? Maybe a character trait, a life event, a hobby/side quest, and how has it impacted your life and career?

Danielle Bragg: So, I've spent a lot of time training in classical music performance, actually. I played the bassoon pretty seriously through college and considered becoming a professional musician at that point. I studied with someone in the Boston Symphony, I went to summer music festivals, which is a thing that pre-professional musicians do in the summers, and I still have a lot of friends and acquaintances in orchestras and playing chamber music. And I would say music really added a lot of richness to my life. In addition to my love of music, I think my professional training actually had a lot in common with my research training. Training to be a professional musician takes a lot of practice and dedication, and it's more of an apprentice model, so you usually study closely with one teacher at a time and they really teach you, you know, how to play, how to make reeds if your instrument requires reed making, and actually being trained to do research is quite similar in a lot of ways, right? You have your PhD advisor who you work closely with, and you learn from doing research alongside them.
So, I didn't plan it, originally, but I think that, you know, being trained as a classical musician probably actually helped me a lot with training to do research.

Host: I love that. You know, there's such a huge connection between music and math, by the way, that so many researchers I've talked to have had that musical interest as well, but not in the classical, bassoon-playing category. So, you're unique in that.

Danielle Bragg: Yeah, bassoon is a, a different one.

Host: I grew up… my mom had a record of Peter and the Wolf, and all the different animals were represented by the different instruments, and I remember the bassoon, but I can't remember the animal it was associated with. I'll look it up after we're done.

Danielle Bragg: I think it's the grandfather, but I could be wrong.

Host: Well, as we close, and I'm sad to close, as we close…

Danielle Bragg: Me too.

Host: This has been so much fun! I've taken to asking what the world might look like if you're wildly successful, and some people frame this in terms of solving problems that would impact millions or billions of people, but I think sometimes the goal is less grandiose and the impact might be more meaningful to a smaller population. So, at the end of your career, what do you hope to have accomplished in your field, and how would you like life to be different because of your research?

Danielle Bragg: Well, it might sound a little cheesy or cliché, but I really hope to leave the world a little bit better than it was when I started out, and in my career, I hope I'll have helped people get access to information that they may not have had access to beforehand. I think education is so key to so many things. You know, not only degrees that you get from schools, but your personal development, different types of skill development, or just general understanding of the world. And I think if you don't have access to information, that's really, really a problem, right? At least if you have access to the information, you can decide whether you want to consume it, you can decide what you want to do with it, and you have the possibility of learning or advancing yourself, but if you don't even have access, then, you know, what can you do? So, a lot of my work is focused on increasing access for people who use languages that are not often served or supported, or who have difficulty accessing information in different ways.

Host: Danielle Bragg, this has been really great. I have learned so much from you, and I'm so inspired by the work you're doing. Thank you so much for coming on the podcast today.

Danielle Bragg: Yeah, thank you.

(music plays)

To learn more about Dr. Danielle Bragg, and the latest in accessibility research efforts, visit Microsoft.com/research.

And for the record, it WAS the grandfather!