July 13, 2016 - July 15, 2016

Faculty Summit 2016

Location: Redmond, WA, USA

Wednesday, July 13

  • Each year, Microsoft Research sponsors a semester-long class at leading design schools. Students are asked to form interdisciplinary teams of two to four students to design a user experience prototype that solves a real-world problem. From these groups, a representative team from each school presents its work to Microsoft.

    Design Expo began as a partnership between Microsoft and Apple to integrate technology into the curriculum of design schools worldwide. The goal is to build long-term relationships with the design schools and to build community across schools.

    To date, around 50 design schools have participated, including RISD, Stanford, USC, UCLA, Bezalel (Israel), IDC (Israel), IIT Mumbai (India), NID (India), Tsinghua (China), Hong Kong Polytechnic, Delft, UMEA, Dundee (Scotland), TU Eindhoven, Iberoamericana (Mexico), Northumbria, and many more.

  • Speakers: Joshua Bloom, University of California-Berkeley; Andrew McCallum, University of Massachusetts; Kathleen McKeown, Columbia University; Rob Mauceri

    With advances in data production, storage capabilities, communications technologies, computational power, and supporting computational infrastructure, data science is now recognized as a critical growth area with impact across many sectors, including science, government, finance, health care, manufacturing, advertising, and retail. This growth has created a shortage of highly trained data scientists. Because data science technologies are being leveraged to drive crucial decision making, it is of paramount importance to educate professionals who can apply appropriate rigor when they draw inferences from data. They need a broad set of skills that cut across multiple disciplines, from statistics to computer science, as well as strong critical reasoning in the context of specific business and scientific needs. On this panel we discussed the following:

      • the exploding demand from employers for data scientists
      • how educational programs must change to bridge the gap between supply and demand
      • best practices for establishing cross-departmental programs working to meet the challenge
      • metrics of success
    • Speakers: Jeff Cox, Microsoft; Ian White, University of Cambridge; Ming Wu, University of California-Berkeley

      Optical communication is already ubiquitous in cloud infrastructure. It is poised to play an even bigger role as we demand higher capacity and lower latency, which traditional architectures may not be able to deliver. Speakers in this session discussed the needs of the next generation of cloud networks and promising optical technologies that can be leveraged to meet those needs.

    • Speakers: Joe Finney, Lancaster University; Ben Shapiro, University of Colorado

      The BBC micro:bit is a wearable and programmable device that visibly features a 5×5 LED display, accelerometer, compass, buttons, I/O pins, Micro USB plug, Bluetooth Low Energy antenna, ARM Cortex-M0 processor, and battery plug. The first wave of micro:bits landed in the UK this spring, with every Year 7 student in the UK receiving one, for free. Microsoft Research has been working on the hardware and software technology behind the BBC micro:bit. We also have been working with academics and others in Microsoft to explore how to make use of the BBC micro:bit in CS and STEM education in the United States. We discussed the BBC micro:bit and what it means for teachers and students in the US.

    • Speakers: Dan Bohus, Microsoft Research; John Langford, Microsoft Research; Rangan Majumder, Microsoft; Eric Xing, Carnegie Mellon University

      Much research is being done today on improving existing systems with intelligence. This session focused instead on the questions to be addressed when designing systems to create and enable AI. The talks presented different points of view on how to design such systems, ranging from examples of research directions in academia and industry, to practical considerations when creating physically situated interactions, to engineering constraints faced by developers.

    • Speakers: Nicolo Fusi, Microsoft Research; Dana Pe’er, Columbia University; Eran Segal, Weizmann Institute of Science

      Molecular biology, healthcare and medicine have been slowly morphing into large-scale, data-driven sciences dependent on machine learning, natural language processing, applied statistics, privacy and security, compression and efficient search. For example, drug development timelines can be dramatically reduced by modelling the effect of already-approved drugs on large-scale measurements of gene expression in diseased cells; the progression of cancer and the evolution of stem cells can be tracked using manifold embedding techniques, leading to better understanding and more effective treatments; latent variable models deployed on large-scale DNA sequencing of the microbial communities pervasive in our bodies and environment are transforming our understanding of health and disease; and revolutionary new techniques for gene editing are being made more effective by machine learning predictive models. In this session, we highlighted a few examples that are helping to transform the world and shone a light on where this fast-moving area is headed.

    • Speakers: Peter Bodik, Microsoft Research; Ryan Calo, University of Washington; Juan Carlos Niebles, Stanford University; Josh Smith, University of Washington

      Cameras are becoming increasingly ubiquitous. They are deployed for a wide variety of commercial and surveillance purposes by private enterprises and governments. Collectively analyzing videos produced by cameras (e.g., city-wide, enterprise-wide, or in a datacenter) is a grand research challenge with great commercial importance. Large-scale video analytics, on both real-time and stored videos, represents an exciting frontier for big data systems and networks. What should large-scale video analytics systems look like? What are the challenges and opportunities for vision algorithms at scale? What are the privacy implications of video analysis? This was a lively session meant to brainstorm and learn about video analytics.

    • Speakers: Roozbeh Jafari, Texas A&M University; Jessica Lundin, Microsoft; Joseph Paradiso, Massachusetts Institute of Technology

      Today’s wearable devices are rapidly shrinking in size. The next frontier of personalized computing is the interaction of these wearables with other proximal devices. Future interconnected wearables will exchange information with other ambient devices, perform aggregated analytics and make richer decisions. A web of wearable devices around the body constitutes a body area network (BAN), which is a subclass of the internet of things (IoT). Such networks form an enabling technology for novel applications such as precision healthcare, sports training, lifestyle monitoring and individualized security. Talks in this session spanned energy management, systems infrastructure and data-processing platforms for interconnected wearables.

    • Speakers: Bryan Parno, Microsoft Research; Zachary Tatlock, University of Washington; Xi Wang, University of Washington; Nickolai Zeldovich, Massachusetts Institute of Technology

      The world’s software infrastructure is built on top of low-level systems software, such as operating systems, file systems, databases, embedded systems, secure networking software, and distributed systems. Unfortunately, bugs in these low-level systems can lead to lost data, broken security, and catastrophic failures of devices and services. Recent advances in formal verification technologies have raised hopes that critical pieces of our infrastructure may be proven correct and secure. This session explored how this goal may be accomplished, what tools and techniques will be most useful, and what challenges remain.

    • Speakers: Chris Benner, University of California-Santa Cruz; Meredith Ringel Morris, Microsoft Research; Winifred Poster, Washington University-St. Louis

      Some argue that AI will soon take over most human jobs. Yet, despite advances in automation, IoT and on-demand services are fueling novel worlds of work and productivity. The panel brought together a diverse group of scholars to discuss: What technical, social and political challenges and opportunities do AI and crowdsourcing pose for the future of work? How will we build productive and equitable work environments as economies shift from a need for full-time, 9 AM to 5 PM formal employment to the dynamics of a contract-driven always-on, on-demand digital economy? Attendees took away a more nuanced sense of AI and crowdsourcing as disruptive technologies poised to redefine not just workflows but the future of work itself.

    • Speakers: Emiliano De Cristofaro, University College London; XiaoQian Jiang, University of California-San Diego; Kim Laine, Microsoft Research

      Over the last 10 years, the cost of sequencing the human genome has come down to around $1,000 per person. Human genomic data is a gold mine of information, potentially unlocking the secrets to human health and longevity. As a society, we face ethical and privacy questions related to how to handle human genomic data. Should it be aggregated and made available for medical research? What are the risks to individuals’ privacy? This panel highlighted some new cryptographic solutions for securely handling computation on genomic data, including homomorphic encryption and multi-party computation. We have developed demos of SEAL, the MSR Homomorphic Encryption library, which focus on private genomic predictions and health-risk predictions.

    • Panelists: David Culler, Friesen Professor of Computer Science, University of California-Berkeley; Susan B. Davidson, Chair, Board of Directors, Computing Research Association (CRA); John Launchbury, Director, Information Innovation Office, Defense Advanced Research Projects Agency (DARPA); Norman A. Whitaker, Managing Director of Special Projects, Microsoft Research

      As scientists and engineers, we have a deep desire to work on something that will have a big (positive) global impact. The question is: are we doing this? Are enough of us engaged in research and inventing technologies that will change the world? The purpose of this panel was to have a meaningful discussion about what it takes to create a culture where bold bets are the norm. The panelists dug deep into what they have learned over many years of pursuing and directing large-scale, big-idea research projects. They combined this with what they see happening in our field broadly to try to predict which scientific endeavors, research directions and technologies are likely to make it big. Our hope is that these discussions inspired us to bet big while providing some non-obvious pearls of wisdom on how to manage risks and how to make our projects huge successes.

Thursday, July 14

    • Speakers: Mohammad Alizadeh, Massachusetts Institute of Technology; Vishal Misra, Columbia University; Keith Winstein, Stanford University

      As the Internet evolves, so must transport protocols and congestion control. Today’s dominant transport protocol, TCP, was designed primarily for wide-area, wired networks and for throughput-sensitive traffic. As cloud-based services, mobile internet access, and the internet of things start sourcing or sinking the majority of internet traffic, new transport protocols and new ways of thinking about congestion control are needed. The workshop brought together leading researchers to ponder many of these issues, including, but not limited to:

        • As data center networks continue to evolve and use ever-more exotic technologies (e.g. free space optics), do we need changes to transport protocols?
        • What is the right congestion control protocol for RDMA networks?
        • Can wide area networks be made lossless? Do we need them to be?
        • Do we need better transport protocols for mobile networks?
        • Today, the majority of Internet traffic is generated by, or sent to, a handful of major players (Google, Netflix, Facebook etc.). Is TCP the right solution in this scenario?
        • Many of today’s applications are delay sensitive (e.g. collaborative office document editing). Do we need new transport protocols for such applications?
        • Edge computing (also known as fog computing) is being widely deployed, and will likely play a major role in the internet-of-things ecosystem. What transport protocol innovation is needed in this space?

      The format of the workshop included short talks by leading academic researchers, followed by a joint panel.

    • Speakers: Mark Billinghurst, University of South Australia; Ramani Duraiswami, University of Maryland; Hannes Gamper, Microsoft Research

      Head-mounted displays for virtual and augmented reality are a hot topic for research and product development. An integral part of these devices is the spatial audio rendering system. Unlike vision, where humans have a field of view of approximately 100°, human hearing covers all directions in all three dimensions. This means that the spatial audio system is expected to provide realistic rendering of sound objects in full 3D to complement the stereo rendering of the visual objects. This session discussed the problems and solutions around spatial audio systems in devices for virtual and augmented reality.

    • Speakers: Jan Gray, Gray Research LLC; David Soloveichik, University of Texas; Dave Wecker, Microsoft

      We have seen the birth of many exotic architectures in recent years, from a quantum computer that promises to achieve exponential speed-ups over conventional computers, to DNA computation that performs disease diagnostics and therapy, to Field Programmable Gate Arrays (FPGAs) that provide a flexible toolkit for implementing architectures such as Microsoft’s Catapult fabric for large-scale datacenters. Each of these exotic technologies enables novel solutions to challenging problems and requires equally novel methods to program and design it. We highlighted the advances in their applications and the challenges behind developing their toolchains and programming environments.

    • Speakers: Oren Etzioni, Allen Institute for Artificial Intelligence; Hoifung Poon, Microsoft Research; Chris M. Re, Stanford University

      Machine reading automates knowledge extraction from text. Traditional machine learning methods are hindered by the scarcity of annotated examples, which motivates the development of modern approaches that leverage “free lunches” such as existing ontologies and databases for indirect supervision. This has enabled the scope of machine reading to expand significantly from its traditional newswire focus. The ensuing impact on science and society is just beginning to manifest. In this session, we gave an overview of modern machine reading approaches and highlighted some example applications such as cancer precision medicine, fighting human trafficking, and education.

    • Speakers: Jeff Bigham, Carnegie Mellon University; Michael Bernstein, Stanford University; Jaime Teevan, Microsoft Research

      Online, networked societies have embarked on a massive shift to take work online. Increasingly, computers are not just helping people get work done, but also helping them figure out what to do, with whom, and when. This shift is being supported by the algorithmic decomposition, structuring, and allocation of tasks. The transformation of information work into micro-work creates an opportunity for people to accomplish small but meaningful tasks that contribute towards larger goals in short bursts of time from their mobile devices. Additionally, micro-work enables individuals and automated processes to efficiently and easily work together to complete tasks that currently seem impossible to automate.

    • Speakers: Mayur Naik, Georgia Institute of Technology; Lenin Ravindranath Sivalingam, Microsoft Research; Junfeng Yang, Columbia University

      Mobile and cloud app ecosystems are growing at a tremendous pace. Today, there are hundreds of thousands of developers building cloud apps and cloud-backed mobile apps. Unlike traditional software, these apps typically run in uncontrolled “wild” environments, with a wide range of user interactions, hardware platforms, network connectivity, and fault conditions. Coping with the ensuing performance and reliability issues is difficult enough for sophisticated developers and well-funded organizations, but for small teams with fewer resources at hand, the problem is acute. In this session, we looked at modern developer tools to address these problems.

    • Speakers: Jignesh Patel, University of Wisconsin-Madison; Matei Zaharia, Massachusetts Institute of Technology

      This session consisted of three technical talks that drew on lessons learned from Big Data Platforms over the last decade and outlined ideas that could drive innovations for the next generation of Big Data Platforms.

    • Speakers: Ernie Brickell, International Association for Cryptologic Research; Scott Charney, Microsoft; Susan Landau, Worcester Polytechnic Institute

    • Speakers: Joseph Gonzalez, University of California-Berkeley; John Langford, Microsoft Research; Patrice Simard, Microsoft Research

      Most machine learning papers are about some algorithm for finding suitable parameters for a function to accomplish some task, but applying machine learning in practice calls for much more: it calls for systems that generate their data, consume it, and use it effectively to accomplish some task. What do such systems look like? And how useful can they be? Each of the speakers has developed a system for complete learning. The “is it complete or not” bit matters enormously for the application of ML in practice, because every missing component must be filled in by someone sufficiently experienced to do the correct thing.