July 14-15, 2014

Faculty Summit 2014

Location: Redmond, WA, USA

Monday, July 14

Presenters: Rahul Mahajan and Arjmand Samuel, Microsoft Research

An increasing number of research areas rely on collecting data from sensors and devices deployed in homes and beyond. Researchers typically deploy such devices, collect data, analyze it, and draw inferences from it. To collect sufficient data and have confidence in the research findings, it is desirable to collect data from a large number of locations. However, doing so requires major investment in engineering expertise and technology infrastructure, neither of which is readily available to the academic community. Microsoft Research’s Lab of Things aims to provide such an infrastructure to facilitate at-scale, in-situ research in a number of research areas. In this demo, the Lab of Things team and academic collaborators demonstrated the research platform, along with some of the current deployments.
Presenters: Guobin Wu, Microsoft Research Asia; Chunmiao Zheng, Peking University

Understanding ecological and hydrologic processes and their interactions in large watersheds is of critical importance to a society in need of sustainable fresh water supplies. This project supports comprehensive data processing and numerical modeling in the Heike River Basin using Microsoft Azure, the cloud platform, and further develops cloud computing as a cost-effective solution for large-scale integrated eco-hydrologic modeling.
Presenters: Hyunju Lee, Gwangju Institute of Science and Technology; Miran Lee, Microsoft Research Asia

Genes usually contribute to the development of diseases through biological events such as gene expression, regulation, phosphorylation, localization, and protein catabolism. Our disease-gene search engine, DigSee, returns sentences from MEDLINE abstracts with identified triple relations, showing ‘which genes’ are involved in the development of ‘which cancer’ through ‘which biological events’. Since the current version of DigSee supports only cancer, our goal is to incorporate disease types other than cancer into the system. This new system will allow researchers working on various types of diseases to search for the genes related to a disease and the biological events through which they are related.
Presenters: Jie Liu, Bodhi Priyantha, and Mohammed Shoaib, Microsoft Research

NFC-Ring is an always-available user input device in the form of a wearable ring. It has built-in low-power gesture recognition capabilities that interpret various gestures performed on arbitrary surfaces and transmit those gestures wirelessly. The NFC-Ring recharges its limited-capacity internal battery through on-demand NFC energy scavenging. It uses a low-power NFC tag emulator to scavenge energy when the user holds a screen-unlocked mobile phone in her hand.
Presenters: Isabelle Guyon, ChaLearn; Percy Liang, Stanford University; Christophe Poulain and Evelyne Viegas, Microsoft Research

CodaLab is an open-source, web-based platform that enables people to share code and data in order to advance the state of the art in research fields where machine learning is used. CodaLab focuses on:
- Reducing the amount of time that researchers spend on preprocessing datasets, writing evaluation or visualization scripts, and getting other people’s code to run
- Reducing duplicated efforts across different groups
- Enabling the creation of competitions to help focus the community on areas needing benchmarking or better methods
CodaLab helps solve these problems by creating an online community where people share worksheets or participate in competitions. CodaLab Worksheets lower the barrier to documenting and publishing detailed experiments, streamlining the research and learning process. CodaLab nurtures an environment of scientific rigor by enabling reproducibility and transparency, and it opens new avenues for collaboration between researchers, developers, and data scientists.
Presenters: Michael Isard, Frank McSherry, and Derek Murray, Microsoft Research

Naiad is a .NET-based platform for high-throughput, low-latency data analysis. These properties have made it suitable not just for traditional “big data” processing, but also for stream processing on real-time data, complex graph analyses, and machine-learning tasks. Moreover, Naiad is built with extensibility in mind, providing analysts with simple interfaces, but enabling them to integrate custom business logic when required. Using Naiad on Azure enables an analyst to develop an application locally before deploying it seamlessly to the cloud. Several tools have been built atop Naiad, to use Azure to provide interactive analyses over massive data sets.
Presenters: Andrei Aron, Rob Deline, and Danyel Fisher, Microsoft Research

Tempe is a living research notebook for analyzing large datasets, from either offline or online data sources. The goal of Tempe is to provide an interactive, informative user experience for a data scientist’s entire workflow, including cleaning up and transforming raw data, defining new data features, training classifiers, and producing visualizations. Tempe uses the Trill data processing engine to provide progressive query results for offline data and temporal data results for online data.
Presenters: Hannes Gamper, David Johnston, Ivan Tashev, and Mark Thomas, Microsoft Research

This research project features two technologies:
- Rendered personalized head-related transfer functions (HRTFs), synthesized using anthropometric data tailored to an individual user’s audio input
- Creation of an immersive audio experience using headphones and person/head tracking through rendered 3-D audio
The project generates personalized HRTFs by scanning a person using a Kinect for Windows device, then using a headset to identify a predefined area. It enables the user to interact with a virtual set of physical objects (such as an AM radio, a manikin, a phone, or a television) that start to play music, speak, and ring. The user can move freely, rotate her head, and approach each individual sound source within a virtual experience.
Presenters: Larry Heck, Microsoft Research; Mouni Reddy, Microsoft

Building the world’s most advanced digital assistant using state-of-the-art machine learning and keeping a deep focus on the user experience.

Tuesday, July 15

Presenters: Mona Soliman Habib, Syed Fahad Allam Shah, Subhojit Som, and Xinwei Xue, Microsoft

Microsoft Azure Machine Learning is a service on Windows Azure that a developer, data scientist, or BI analyst can use to easily build a predictive model using machine learning over data, and then deploy and manage that model as a cloud service. ML Studio offers functionality to support the end-to-end workflow for constructing a predictive model: from ready access to common data sources in the cloud, through data exploration, feature selection and creation, building training and testing sets, and machine learning over data and experimentation, to final model evaluation and deployment.
Presenters: Judith Bishop, Microsoft Research; Nigel Horspool, University of Victoria; Daniel Perelman, Microsoft Research; Nikolai Tillmann, Microsoft

Code Hunt is a browser-based game for anyone who is interested in coding. We built Code Hunt to take advantage of the fact that any task can be more effective and sustainable when it’s fun. Coding competitions usually give specifications for problems and then check solutions automatically using a test suite. Code Hunt is different. Instead of presenting a problem, Code Hunt presents an empty slate to the player and a set of constantly changing test cases. It thus teaches coding as a by-product of solving a problem that is presented as pattern matching inputs and outputs. The fun is in finding the pattern. We have run very large competitions with thousands of students and found that Code Hunt differentiates the top students from the others. We’ll demonstrate the game, and give statistics, as well as offer opportunities for research.
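
The round structure described above (a hidden solution, an empty slate, and test cases that reveal the pattern) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Code Hunt engine: the puzzle `secret`, the helper `failing_cases`, the two guesses, and the fixed input list are all hypothetical, and the real game generates its test cases itself rather than reading them from a list.

```python
# Hypothetical sketch of one Code Hunt-style round; not the actual game engine.

def secret(x: int) -> int:
    """Hidden reference solution that the player never sees (assumed puzzle)."""
    return 2 * x + 1

def failing_cases(candidate, inputs):
    """Return (input, expected, actual) for each input where the candidate
    disagrees with the hidden solution."""
    mismatches = []
    for x in inputs:
        expected, actual = secret(x), candidate(x)
        if expected != actual:
            mismatches.append((x, expected, actual))
    return mismatches

def first_guess(x: int) -> int:
    # The "empty slate": the player starts by returning the input unchanged.
    return x

def second_guess(x: int) -> int:
    # After studying the mismatches, the player spots the pattern 2x + 1.
    return 2 * x + 1

inputs = [0, 1, 2, 5]  # assumed sample inputs, fixed here for simplicity
print(failing_cases(first_guess, inputs))   # mismatches guide the next attempt
print(failing_cases(second_guess, inputs))  # an empty list means the round is won
```

The player only ever sees the mismatch triples, so working out the hidden function from them is the pattern-matching exercise the description refers to.
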
Presenters: Syed Shoaib Ali and Deepti Desai, Microsoft Research India; Mary L. Gray, Microsoft Research New England; Sara Kingsley, Microsoft Research NYC; Kate Miltner and Gregory Minton, Microsoft Research New England; Rajesh Patel, Microsoft Bing Redmond; Siddharth Suri, Microsoft Research NYC

This demo draws on ethnographic and quantitative data to produce a richly contextualized data visualization of crowdsourcing’s global workflows. We analyze data from several sources: survey responses from crowdworkers across three popular crowdsourcing platforms; ethnographic interviews and participant observation among crowdworkers; U.S. and India census data; and backend data from Microsoft’s Universal Human Relevance System (UHRS) and MobileWorks, a Bay Area start-up. We present visual, interactive map overlays generated from the data sets to illustrate how crowdworker demographics, such as income, education, and employment, compare with those of the general population. We also present analyses of platform data that gauge workflows, ranging from time-on-task to systems quality control. Combining computational and qualitative approaches, we ask: who are crowdworkers, and how might seeing who they are help us build more responsive, expansive, and ethical platforms?
Presenter: Dave Wecker, Microsoft Research

Languages, compilers, and computer-aided design tools will be essential for scalable quantum computing, which promises an exponential leap in our ability to execute complex tasks. LIQUi|> is a modular software architecture designed to control quantum hardware. It enables easy programming, compilation, and simulation of quantum algorithms and circuits, and is independent of a specific quantum architecture. LIQUi|> contains an embedded, domain-specific language designed for programming quantum algorithms, with F# as the host language. It also allows the extraction of a circuit data structure that can be used for optimization, rendering, or translation. The circuit can also be exported to external hardware and software environments. Two different simulation environments are available to the user, which allow a trade-off between number of qubits and class of operations. LIQUi|> has been implemented on a wide range of runtimes as back-ends with a single user front-end. We describe the significant components of the design architecture and how to express any given quantum algorithm.
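
To make the idea of a circuit data structure that can be both rendered and simulated more concrete, here is a minimal sketch. It is written in Python rather than F#, and it does not use or imitate the LIQUi|> API: the tuple-based circuit format and the names `BELL_CIRCUIT`, `apply_h`, `apply_cnot`, `simulate`, and `render` are assumptions made purely for illustration.

```python
import math

# Hypothetical circuit representation: a list of (gate_name, *qubit_indices).
BELL_CIRCUIT = [("H", 0), ("CNOT", 0, 1)]

def apply_h(state, q):
    """Apply a Hadamard gate to qubit q of a state vector (list of amplitudes)."""
    s = 1 / math.sqrt(2)
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        sign = -1 if (i >> q) & 1 else 1   # H|1> carries a minus sign on |1>
        new[i & ~(1 << q)] += s * amp      # contribution to the basis state with q = 0
        new[i | (1 << q)] += s * sign * amp  # contribution to the basis state with q = 1
    return new

def apply_cnot(state, control, target):
    """Flip the target qubit in every basis state whose control qubit is 1."""
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        j = i ^ (1 << target) if (i >> control) & 1 else i
        new[j] += amp
    return new

def simulate(circuit, n_qubits):
    """Run the circuit on |0...0> and return the final state vector."""
    state = [0j] * (2 ** n_qubits)
    state[0] = 1 + 0j
    for gate, *qubits in circuit:
        if gate == "H":
            state = apply_h(state, *qubits)
        elif gate == "CNOT":
            state = apply_cnot(state, *qubits)
        else:
            raise ValueError(f"unsupported gate: {gate}")
    return state

def render(circuit):
    """Produce a one-line text rendering of the same data structure."""
    return " ; ".join(f"{g}({', '.join(map(str, qs))})" for g, *qs in circuit)

print(render(BELL_CIRCUIT))
print([round(abs(a) ** 2, 3) for a in simulate(BELL_CIRCUIT, 2)])
```

Because the circuit is plain data, the same list feeds both the renderer and the simulator. A full state-vector simulation like this one needs memory that doubles with every added qubit, which is one reason a toolkit might offer more than one simulator with different limits.
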
Presenters: Arvind Bala, Microsoft; Anoop Gupta, Microsoft Research; Isaac Harris, Microsoft

Availability of high-quality education is widely acknowledged as the pathway to success in modern society. The past few years have seen tremendous interest in the use of MOOCs, SPOCs, and flipped classrooms or blended learning to provide more scalable and affordable models for student learning. However, it is still hard to author interactive online lessons, so only a small fraction of faculty create or use them. This session will introduce Office Mix, a brand new offering from Microsoft that dramatically simplifies the creation of such online lessons, including their publishing and sharing, and associated analytics. Office Mix builds upon the deep familiarity of faculty and students with PowerPoint, letting them create such lessons from the slide decks they already have in their arsenal. We will also discuss use cases beyond online learning, such as sharing and communicating academic research.

The session will also cover two other efforts from Microsoft Research. Sumit Basu will show Powergrading, a powerful method for increasing the efficiency of grading students’ answers to online short-answer questions. Rakesh Agarwal will discuss technologies for inferring a knowledge graph from current education material, enriching the graph with rich content in multiple formats mined from the Web as well as through crowd-sourcing, and then overlaying it with the social graph of teachers and students to enable dynamic formation of study teams with the goal of maximizing overall learning.
Presenter: Kati London, Microsoft Research

HereHere NYC is a research project that generates cartoons to express how neighborhoods are doing based on public data. The project summarizes how your neighborhood, or other New York City neighborhoods of interest, are doing via a weekly cartoon, neighborhood-specific Twitter feeds, and playful neighborhood comparisons. The goals are to:
- Create compelling stories with data to engage larger communities
- Invent light rituals for connecting to the hyperlocal
- Use characterization as a tool to drive data engagement
HereHere uses Project Sentient Data, an early-stage project to explore how interactions can be improved by understanding ecosystems of data in terms of characterization, personalities, and relationships. Sentient Data provides a server and a representational-state-transfer API that enables developers to assign personalities and translate data sets into their relative emotion states.
Presenters: Tom Ball and Michal Moskal, Microsoft Research

We are experiencing a technology shift: powerful and easy-to-use mobile devices like smartphones and tablets are becoming more prevalent than traditional PCs and laptops. Mobile devices are going to be the first and possibly the only computing devices that virtually all people will own and carry with them at all times. In this session, we will show how anyone can develop software directly on their mobile devices. We have created TouchDevelop, a modern software development environment that embraces the new reality of cloud-connected mobile devices. TouchDevelop comes with a typed, structured programming language that is built around the idea of using a touchscreen as the input device to author code. Access to the cloud, flexible user interfaces, and access to sensors such as the accelerometer and GPS are easily available. In our experience, TouchDevelop is well suited for education, as mobile devices engage students, and the programming environment focuses on core programming tasks supported by interactive tutorials. TouchDevelop is available as a web app on Windows tablets, iOS, Android, Windows PCs and Macs, and as a native app on Windows Phone.
Presenters: Escola Superior De Desenho Industrial–Rio de Janeiro, Brazil; New York University–New York, NY, United States; Interdisciplinary Centre–Herzliya, Israel; Carnegie Mellon University–Pittsburgh, PA, United States; University of Washington–Seattle, WA, United States; Royal Danish Academy of Fine Arts–Copenhagen, Denmark; Copenhagen Institute of Interaction Design–Copenhagen, Denmark; University of London, Goldsmiths–London, United Kingdom; Art Center College of Design–Pasadena, CA, United States

In our daily lives we encounter sensors all the time, such as when a motion sensor turns a light on in a dark place, or when a carbon monoxide detector tells us that the air is becoming hazardous. Sensors extend our abilities to see, hear, and feel far beyond what we ourselves can take in, from arrays of telescopes sensing the edges of the universe to nano-scale biological sensors amplifying our own sense of smell.

In a world with a billion sensors, how will we make sense of it all?

Monday, July 14

Lab of Things—A Research Platform for the Internet of Things

Numerical Modeling of Eco-Hydrological Processes in the Heike River Basin

Disease Gene Search Engine (DigSee): Text Mining for Identifying Disease-Gene-Biological Events Relationships

NFC Ring

New Perspectives on Machine Learning and Reproducible Science with CodaLab

Naiad on Azure: Rich, Interactive Cloud Analytics

Tempe: Quick Answers from Large Data

3-D Audio for Telepresence and Virtual Reality

Cortana

Tuesday, July 15

Microsoft Azure Machine Learning Service

Code Hunt: What if Coding Were a Game?

Meet the Crowd: Visualizing the People Who Make Crowdsourcing Possible

LIQUi|>: A Software Design Architecture and Domain-Specific Language for Quantum Computing

From Exceptional to Everyone: Microsoft’s Efforts to Democratize Blended Learning

HereHere NYC and Project Sentient Data

Touch Develop: Create Rich Mobile Cloud Apps on Your Device

Design Expo