PhD Scholarship Programme in EMEA

地区： Europe

A Smart Care System for Healthcare using Contextual Reinforcement

Supervisor: Mihaela van der Schaar, University of Cambridge, UK
MSR Supervisor: Danielle Belgrave

Summary: Clinicians are routinely faced with the practical challenge of integrating longitudinal, multi-modal data collected from a variety of sources (including wearables) for a given patient and then making decisions (monitoring and intervention choices) on the basis of what is learned from this data. This work is applicable to numerous areas in healthcare, including chronic diseases, mental health, elderly care etc. As the range of available information and interventions increases, the difficulty of selecting appropriate information collection and interventions for a particular patient grows as well. As a result, the rate of misdiagnosis and mistreatment remains high. More and more data is collected for each patient and more and more data is available from past patients. The challenge is to use this data to personalize diagnosis and interventions. Current diagnosis and treatment which continues to rely on Clinical Practice Guidelines (CPGs), are geared towards the “lowest common denominator” and are targeted toward a “representative” patient rather than toward the unique characteristics and circumstances of the current patient.

Deep Learning for Graph-structured Data

Supervisor: Jose Miguel Hernandez-Lobato, University of Cambridge, UK
MSR Supervisor: Alexander Gaunt

Summary: We describe work to investigate the limitations of graph-based deep learning models and provide extensions that overcome these limitations. Graphs are a powerful representation of relations in interacting systems, and the ubiquity of graph-structured data in a variety of natural science and engineering domains has generated great excitement in deep learning models that capture graph structure. Typically, these approaches implement a message passing algorithm where by each node aggregates information from its incident edges to represent its environment. This aggregation is lossy and the message passing schedule (e.g. number of rounds of message exchanged) is not tailored to the complexity or embedded structure (e.g. hierarchical subgraphs) of each processed graph. The proposed work will target these limitations, yielding more performant models and providing opportunity to better interpret the model predictions using the exposed embedded structures or highly activated subgraphs.

Designing for Human Partnership with AI in Everyday Life: The Future of Work as a Case Study

Supervisor: Elisa Giaccardi, Delft University of Technology, The Netherlands
MSR Supervisor: Richard Banks

Summary: The proposed PhD research will study social, informal work practices – both distributed and colocated – as a case study for emergent forms of Human-AI partnership. In the future, people at work will be surrounded by AI systems, both hardware and software based. This PhD would look at how these systems should adapt to the complexities of the working environment, particularly as we shift from traditional forms of office work, to new, more informal ways of working that might include gig work and co-working spaces. This will be investigated by balancing and integrating traditional ethnographic methods and Wizard of Oz prototyping into a future-oriented approach that uses hypothetical and simulated AI systems as a means to explore Human-AI partnerships in context (see Methods of research). By means of ethnographic accounts of the future, the research will contribute an innovative design method for the design of AI systems, meant to help systems developers and designers to have empirically grounded discussions about critical factors, social norms and ethical rules, and to make decisions on what AI systems to develop.

Exploring motif-based design patterns for biological computation

Supervisor: Thomas Gorochowski, University of Bristol, UK
MSR Supervisor: Boyan Yordanov

Summary: In software engineering, design patterns provide reusable solutions to frequently encountered problems that are independent of the programming language used for implementation. In this project, we explore whether biology also exploits design patterns in the regulatory programs controlling life. Focusing on the role of small regulatory motifs that are known to be enriched in living systems, and which cluster in specific ways, we will employ formal methods to study the cellular functions that motifs support and their robustness to being used in different ways. This information will form the foundation of a computational tool where motifs and their clustering are used as design patterns to synthesise new regulatory programs. To test their generality, several programs implementing broad functionalities (e.g. logic, oscillations and pulse generation) will be verified in living cells with synthetic regulatory components not used by nature. Insight from this work will fuel new approaches for the design of our own biological programs, while also providing a deeper understanding of the way biology harnesses the computational substrate of life itself.

Integrating machine learning in to the IDE

Supervisor: Andrew Rice, University of Cambridge, UK
MSR Supervisor: Miltiadis Allamanis

Summary: Modern integrated development environments (IDE) offer a whole suite of tools and support to improve programmer productivity. Recent successes in the application of machine learning approaches to software development suggest that a revolution in this area is on the way. Machine learning techniques have been shown to capture subtleties of programming style in a way that has never been done with traditional analysis techniques. Recent successful results include the automated suggestion of variable names, comment generation, code summarisation, and defect detection. However, none of these techniques could be practically deployed in an IDE today. This research seeks to tackle one major challenge in this area: that of suitably designing and training a model for wide-scale use. Our vision is to deploy globally-trained models to an IDE that can then be augmented or specialised as needed. We propose three packages of work for the PhD student: 1) reimplementation and replication of key existing research results to form the basis for our experiments; 2) an investigation into the sensitivity of these techniques to their training data and the use of program-analysis techniques for understanding dataset quality; 3) development of techniques for specialising a model (or its suggestions) for style-guide differences, language versions, or library choices.

Privacy in Distributed and Collaborative Learning

Supervisor: Emiliano De Cristofaro, University College London, UK
MSR Supervisor: Olya Ohrimenko

Summary: Distributed and collaborative machine learning are increasingly being deployed in the wild. Because the training data either never leaves the participants’ machines, or is never exposed to untrusted parties, these techniques are a good fit for scenarios where data is sensitive, and the participants want to construct a joint model without disclosing their datasets. However, since model updates/outputs are still indirectly based on the training data, we need to measure whether, and how much, we can prevent the leakage of unintended information about the participants’ training data; and, if so, what can be done to mitigate such leakage. Concretely, we plan to look for leaky inferences in real-world settings, investigate defenses, including training regularizers, such as Dropout or Weight Normalization, user-level differential privacy, adversarial learning techniques, as well as relying on black-box trusted servers. In particular, with respect to the latter, and overall centralized machine learning, we plan to study its trade-offs compared to distributed learning based on communication/computation overhead incurred on end-users and privacy/security guarantees it offers.

Programming biological systems by reverse-engineering reaction-diffusion model

Supervisor: Attila Csikasz-Nagy, Kings College London, UK
MSR Supervisor: Neil Dalchau

Summary: As genetic engineering becomes more precise, the need for high-throughput automated biological scientiﬁc discovery increases. The conceptual gap between computer science and biology is drawing to a close: biological processes can be mapped to algorithms and we are beginning to leverage the asynchronous logic of biochemical reactions to design soft matter computing devices. The overarching aim of this project is two-fold: to develop automated and analytical tools for the design of biochemical reaction networks in synthetic and systems biology settings; to apply these tools to the control of patterns for spatial biochemical computation.

Reinforcement Learning for Enabling Next Generation Human-Machine Partnerships

Supervisor: Adish Singla, Max Planck Institute for Software Systems, Germany
MSR Supervisor: Sam Devlin

Summary: The ultimate goal of AI systems is to support people in achieving their goals more efficiently. While recent advances in AI have led to a remarkable performance of machines in challenging tasks, e.g., in image/speech perception and playing games like Go, these feats have largely been limited to well-specified tasks with known dynamics and predictable outcomes. These limitations can be addressed by designing AI systems that emerge from the complementary abilities of humans and machines by enabling close partnerships between them. For instance, in autonomous driving, this partnership could manifest in the form of an AI auto-pilot handing over control to the human driver in safety-critical situations. To enable this partnership, this project will focus on developing novel reinforcement learning (RL) approaches that effectively and efficiently learn with-and-from people in complex real-world environments. More specifically, we tackle the following fundamental research questions: (i) Given potential differences in perception and behavioral biases between machine and human, how can we design robust multi-agent RL algorithms? (ii) Given that a human could adapt its behavior in the presence of an AI agent, how can we design reactive RL algorithms for enabling long-term human-AI collaborations? (iii) How can we empower a human to steer the behavior of an AI agent, for instance by teaching interactions, and how to make this teaching process more effective? By answering these questions, the project’s mission is to enable next generation human-machine partnerships for the benefit of people and society.
AI for Teams: The Future of Assisted Collaborative Work

Supervisor: Alex Taylor, City, University of London, UK
MSR Supervisor: Sean Rintel

Summary: The proposed PhD research will study the uptake of AI in team-working tools, with a particular emphasis on shaping the future design of this new genre of collaborative work. A qualitative methodology, using predominantly an ethnographic orientation, will be applied to investigate contemporary modes of collaborative work and specifically how AI-based tools and services are being incorporated into work practices and knowledge-based activities. As with long-established workplace studies, the detailed and in-depth results of this orientation and analysis will be used to develop design thinking in this domain. To best exploit this, the PhD research will run alongside and intersect with concurrent work at Microsoft Research and in the Human Experience and Design group focused on teams and the integration of AI.

Compartmentalized RNA Computing using Protein-based Microcompartments

Supervisor: Tom de Greef, Eindhoven University of Technology, Netherlands
MSR Supervisor: Andrew Phillips

Summary: DNA strand-displacement has been widely used for the design of molecular circuits, motors, and sensors. Typically, DNA-based molecular computations based on strand-displacement are executed under non-compartmentalized conditions where all DNA gates are, in principle, able to interact with each other. Recently, we developed an experimental platform that allows compartmentalization of DNA gates into protein-based microcompartments. In addition, we have shown that addition of single-stranded DNAs to the outside of the microcompartments is able to trigger a DNA-based strand displacement reaction from a localized DNA gate, which in turn is able to trigger a second strand-displacement in a different microcompartment. Here we propose to further expand the scope of our experimental platform by using microRNA (miRNA) as chemical inputs. In addition, we will investigate the performance of our compartmentalized RNA and DNA circuits in biological relevant media such as bovine serum. Successful completion of the project will pave the way for distributed DNA computing using both RNA and DNA inputs in biological relevant conditions, with potential applications in high-precision medical diagnostics.

Enabling Repair in Goal-Directed Natural Language Personal Assistants

Supervisor: Joel Fischer, University of Nottingham, UK
MSR Supervisor: Nate Kushman

Summary: Recent advances in speech recognition have brought closer the long-standing vision of the ubiquitous intelligent personal assistant. This has become particularly relevant in the context of smartphones and in-home devices where natural language interaction has the potential to significantly improve the user experience. While already extremely useful, these agents fall far short of the future we have imagined. In general, the speech recognition can successfully transcribe the uttered words, but quite often the agent still does not infer the desired intent correctly. This is particularly evident in the case of ‘trouble and repair’, where the agent’s response does not correspond to the user’s expectation and requires ‘repair’ through subsequent repetition and rephrasing (Porcheron et al., 2017a). Our own prior research shows that this is not an exception, but the norm; in our study of conversational interaction with the Alexa agent, more than half of 800+ requests we recorded ‘in the wild’ were unsuccessful initially and occasioned the user to follow up with further requests (Porcheron et al., forthcoming). Repair may often involve some form of ‘guesswork’ as to what went wrong, as the agent does not generally account for what may have gone wrong (e.g., explain, describe, or offer an interpretation of some form). Arguably this amounts to an example of what has been criticised as ‘black box’ AI, which has recently led to significant funding calls for research proposals for “explainable AI”. In this PhD we propose the exploration of methods to facilitate repair in goal-directed natural language dialog between a human and a AI assistant, contributing towards the higher level aim of making AI more transparent, or intelligible, to user(s). This PhD aims to investigate three general different forms of repair: 1. The AI agent’s response does not match the user’s intent, the human would like to course correct towards the desired outcome. To facilitate repair the PhD will explore the potential benefits of making the AI agent’s ‘understanding’ available to the user, and methods to direct the user towards outcomes which are both feasible for the agent and at least more closely aligned with the user’s intent (i.e., a mutual configuration). 2. The user has made an utterance for which the machine cannot confidently infer the intent. To facilitate repair, the PhD will explore methods to seek clarification from the user. 3. The human has made a mistake and needs to correct themselves. The PhD will explore avenues for the user to initiate self-repair.

Intrinsically Motivated Exploration for Lifelong Deep Reinforcement Learning of Multiple Tasks

Supervisor: Pierre-Yves Oudeyer, Inria, France
MSR Supervisor: Katja Hoffmann

Summary: This project aims to develop autonomous lifelong machine learning techniques that enable virtual or physical intelligent agents to acquire large repertoires of tasks in open uncertain environments. This is key for developing intelligent agents that need to continuously explore and adapt interaction skills to new or changing tasks, environments, people to interact with, and preferences of others. The approach will leverage recent advances in curiosity-driven developmental learning (also called intrinsically motivated learning) to drive exploration in a Deep reinforcement learning framework. They will be evaluated on benchmarks involving the Microsoft Malmo/Minecraft video game platform, for autonomous learning of intelligent non-player characters controllers that can adapt creatively to changing environments.

Machine Learning for Program Analysis and Defect Prediction

Supervisor: Philipp Rümmer, Uppsala University, Sweden
MSR Supervisor: Marc Brockschmidt

Summary: This research proposes harnessing recent dramatic advances in machine learning for automatic program analysis and defect prediction. The project will break new ground in the way machine learning models process programs, advancing from a mostly syntactic treatment of source code to methods that take semantics and existing program analysis techniques into account. This will enable flexible and sophisticated high-level guidance: of developers and test engineers towards likely cases of program defects, and of program verification tools towards complex program annotations and invariants.

Mechanistic Model Misspecification

Supervisor: Richard Wilkinson, University of Sheffield, UK
MSR Supervisor: Ted Meeds

Summary: Mechanistic or simulation-based models are used in scientific research to understand complex natural phenomena. A mechanistic model can take the form of ordinary/partial/stochastic differential equations (O/P/SDEs) and can be rigid in form but have the benefit to the scientist of having interpretable and testable parameter settings. In part due to the inflexibility of the model forms, misspecification of the model can lead to computationally expensive inference procedures, and more importantly, misleading conclusions, whereby the parameter estimations are confidently incorrect. There is increasing evidence that the inference framework called approximate Bayesian computation (ABC) is more robust to model misspecification than other simulation-based parameter estimation techniques. We propose to study the mathematical and statistical properties of this robustness, and explore improvements over current model misspecification approaches. The research will be pragmatic, embedding the theory with practical examples (where domain knowledge is understood, and hence misspecification can be detected), including using semi-mechanistic models that are used in the public health domain.

Optical Fabric for Zero-‐latency memory disaggregation at Pb/s cluster-‐ scale network

Supervisor: Georgios Zervas, University College London, UK
MSR Supervisor: Hitesh Ballani

Summary: Data centres have been historically based on a server-centric approach with fixed amounts of processor and directly attached memory resources within the boundary of a mainboard tray. The mismatch between fixed proportionalities and diverse set of workloads can lead to substantially under-utilized resources (some cases even below 40%) that account for 85% of the total data centre cost. The project aims to explore memory disaggregation at Rack and Cluster level and identify the scalability limits both in terms of number of end-points, network capacity per CPU/Memory, physical distance and the associated penalties to processing power, sustained memory bandwidth etc. It will decouple CPU from memory and use an all-optical interconnect network using scalable on-board transceivers and optical switches to minimize and offer deterministic round-trip latency, maximize throughput and bandwidth density at the lowest cost and power consumption.

Optics for the Cloud: Cost-effective assembly for Hybrid Integrated Photonic Switches

Supervisor: Kevin Williams, TU Eindhoven, Netherlands
MSR Supervisor: Hitesh Ballani

Summary: Optical switching using photonic integrated circuits holds the promise of low-latency, data transparent routing at the packet time scale. However, the assembly of integrated photonic switches involves the attachment of tens of optical fibers with deep-sub-micron-precision and this is now considered to be one of the greatest barriers to deploying switch technology. In this work, we propose to research assembly methods which relax the precision requirements. This is inherently scalable in terms of capacity and connectivity and provides a route to System-on-Chip integration with electronics.

Pixelated Spatial Light Modulator for Phase and Polarisation Modulation of Light (PPSLM)

Supervisor: Daping Chu, University of Cambridge, UK
MSR Supervisor: Andreas Georgiou

Summary: Phase holograms have been used to split the laser beam into many spots thus increasing the write speed of the optical data storage system on glass. However, the throughput and hence the costs are affected by the lack of a physical light engine which can introduce both phase modulation (to split the beam) and polarization modulation (to encode data) at the same time. The objective of this work is to develop the technology to allow polarization and phase modulation of light using a single element. With a single device that can both, slit the beam and introduce polarization modulation, it is possible to accelerate significantly the speed of data write, reduce the amount of energy required and reduce cost, for an ultimate high-density fast-access and low-cost data storage for the cloud.

Securely and Efficiently Supporting Managed Language Runtimes in Confidential Computing

Supervisor: Peter Pietzuch, Imperial College London, UK
MSR Supervisor: Manuel Costa

Summary: In domains such as healthcare and finance in which regulatory and competitive reasons prohibit complete trust in cloud providers, organisations are left with limited solutions for protecting against cloud security threats. A new direction is to use hardware security features in moderns CPUs in order to protect computation in otherwise untrusted cloud environments. Trusted execution environments (TEEs) such as Intel’s recently launched Software Guard Extensions (SGX) introduce the concept of secure enclaves. Yet, before TEEs will be widely used to protect real-world data-intensive cloud applications, we observe two open challenges: (1) how can we protect against vulnerabilities inside the TEE itself?; and (2) how can we support high-level programming languages inside TEEs effectively? We plan to address these challenges by exploring two complementary research directions: (a) to protect against security vulnerabilities in complex TEE implementations, we will develop new compiler-based techniques for implementing protection boundaries inside of secure enclaves. By compartmentalising TEE implementations, we can contain the impact of security vulnerabilities within a secure enclave following a privilege separation policy; (b) to support complex language runtimes inside TEEs, we will employ functionality from library OSs and investigate the security and performance implications. In particular, we will explore how common functions performed by managed runtimes, such as just-in-time (JIT) compilation and garbage collection can be supported effectively and securely.

Statistical Learning and Adaptive Observation in Clinical Prediction: Methodology and Applications

Supervisor: Glen Philip Martin, University of Manchester, UK
MSR Supervisor: Danielle Belgrave

Summary: Limited resources restrict how frequently healthcare professionals can observe patients and/or monitor information collated in information systems. For example, within hospital acute care admissions, vital signs are collected at 4-hourly intervals to monitor escalation/ de-escalation of treatment. While there is scope to adapt an observation process to target high-frequency observation in patients at high-risk of adverse outcome, there is currently no systematic and automated mechanism in which to inform this process. Therefore, this PhD project aims to integrate clinical prediction models (CPMs) with adaptive observation by developing methodology in this space, which will be motivated by two real-world exemplars (acute hospital care and cancer early detection). Key objectives of this PhD will include: (i) adapting/developing probabilistic modelling/ machine learning frameworks in the context of adaptive observation, (ii) investigate how to maintain predictive accuracy at the extremes of a predictive distribution, and (iii) examine the clinical and statistical utility of incorporating adaptive observation within prediction models. Potential modelling strategies could include graphical models, mixed effects models and Gaussian processes. Each modelling strategy will be compared and tested through extensive simulation studies, and applied to the two clinical exemplars and associated data streams. This project has potential to improve the targeting of resources in near real-time within clinical workflows.

Reinforcement Learning for Enabling Next Generation Human-Machine Partnerships

Supervisor: Adish Singla, Max Planck Institute for Software Systems, Germany
MSR Supervisor: Sam Devlin

Summary: The ultimate goal of AI systems is to support people in achieving their goals more efficiently. While recent advances in AI have led to a remarkable performance of machines in challenging tasks, e.g., in image/speech perception and playing games like Go, these feats have largely been limited to well-specified tasks with known dynamics and predictable outcomes. These limitations can be addressed by designing AI systems that emerge from the complementary abilities of humans and machines by enabling close partnerships between them. For instance, in autonomous driving, this partnership could manifest in the form of an AI auto-pilot handing over control to the human driver in safety-critical situations. To enable this partnership, this project will focus on developing novel reinforcement learning (RL) approaches that effectively and efficiently learn with-and-from people in complex real-world environments. More specifically, we tackle the following fundamental research questions: (i) Given potential differences in perception and behavioral biases between machine and human, how can we design robust multi-agent RL algorithms? (ii) Given that a human could adapt its behavior in the presence of an AI agent, how can we design reactive RL algorithms for enabling long-term human-AI collaborations? (iii) How can we empower a human to steer the behavior of an AI agent, for instance by teaching interactions, and how to make this teaching process more effective? By answering these questions, the project’s mission is to enable next generation human-machine partnerships for the benefit of people and society.

Visual Fast Mapping

Supervisor: Richard Turner, University of Cambridge, UK
MSR Supervisor: Aditya Nori

Summary: There has been considerable progress since 2011 on the classification of objects, textures and scenes in images. This is in substantial measure thanks to developments in deep learning and their widespread adoption and development in machine vision. It is notable that machine performance appears to differ markedly from human performance at object classification tasks in requiring big data for training. Humans however can learn to recognise new categories from data sets with remarkably few labels. This doctoral project explores possibilities for machine learning that may substantially increase the training efficiency of machine classification of images. The ideas will be tested on data sets of medical images and scenes for autonomous systems. The research is expected to impact both medical imaging and the way people interact with and train AI systems.

Joint Initiative with Informatics with University of Edinburgh

High-Level Synthesis of Neural Networks for FPGAs with LIFT

Supervisor: Christophe Dubach
MSR Supervisor: Dimitrios Vytiniotis

Summary: Machine-learning applications are becoming pervasive throughout our entire society. They are already used extensively in areas such as machine translation and business data analytic and are set to revolutionise our world with applications such as self-driving cars. This has become possible thanks to the massive amount of data available for training coupled with the development of powerful parallel hardware. However, writing efficient parallel implementation for these algorithms remains a challenge for the non-experts. The presence of parallel accelerators such as GPUs (Graphic Processing Units) or FPGAs (Field-Programmable Gate Arrays) means that software has to be specifically written for these devices. Programmers have to use different programming models and often need to fine-tune their code for the special characteristics of the targeted hardware. This expensive and time-consuming process needs to be repeated every time new hardware emerge or even when the software stack is updated. To enable machine-learning expert to unlock the potential of future systems, we need to focus on new software programming model that abstract away most of the hardware details. In this project, we propose to build upon our existing LIFT project, an Open Source language and compiler initially developed in my group. LIFT combines a high-level functional data parallel language with a system of rewrite rules which encodes algorithmic and hardware-specific optimisation choices. An applications written in LIFT is able to take advantage of parallel accelerators available in the systems, transparently from the user. This proposal is about augmenting LIFT with the ability to express and optimise machine-learning algorithms and exploit effectively FPGA hardware.

Joint Initiative with University College London

Computer Vision on the Edge

Supervisor: Gabriel Brostow
MSR Supervisor: Matthew Johnson

Summary: Computer vision algorithms are now accurate enough to be useful, yet they usually require too much infrastructure for deployment in the real world. Typically, a trained CNN requires substantially more memory and power than is available on an edge-device. Mobile phones are the obvious frontier for new energy- and memory-efficient computer vision models, but the Internet of Things age is also hampered for these same reasons; embedded devices and ground-breaking machine learning algorithms are being developed by disjoint groups of people. This project will produce the break-throughs for making advanced resource-aware algorithms that can scale to match the available resources and desired behaviors.
Learning Computing with Torino: a physical programming language inclusive of children with visual disabilities

Student: Alex Hadwen-Bennett
Supervisor: Sue Sentence, King’s College London, UK
MSR Supervisor: Cecily Morrison

Summary: Torino is a physical programming language for teaching programming concepts and computational thinking to children age 7-11 regardless of level of vision. To create code, the children connect physical instruction beads and tune their parameters to create music or stories. Computational thinking skills are encouraged through specific challenges that require the children to break down problems, work with constrained resources, or debug programs. Intended to be inclusive of children with visual disabilities in mainstream classrooms, particular emphasis is placed on the tactile nature of the beads and the collaborative interactions enabled between blind children and their sighted peers. This research will study the implementation of the Torino device in mainstream classrooms for visually disabled children. The research will address the question: “How does the Torino environment and associated pedagogical approaches support the development of early programming skills by visually disabled children?” The hypothesis is that for children with a visual disability the learning of elementary programming skills and computational thinking skills is enhanced using a physical language as a first medium, in comparison to text-based approaches. The research will involve the adaption of know pedagogical techniques to Torino followed by an empirical study in 5-10 primary schools. An interpretative research paradigm will be used involving both qualitative and quantitative research and data analysis. The PhD will make a contribution to the area of programming for children with visual disabilities and the use of tangible environments to achieve this.

Decoding the Network Logic Governing Resetting of Pluripotency

Student: Arthur Radley
Supervisor: Austin Smith, University of Cambridge, UK
MSR Supervisor: Sara-Jane Dunn

Summary: This proposed studentship project concerns the process of cellular reprogramming: converting a somatic cell type to an embryonic stem cell-like state. These so-called induced pluripotent stem cells (iPSCs) are capable of making all cell types of the adult body and they can be generated from essentially any cell type, from any individual. Because cells with such capacity arise only transiently at the very earliest stages of development, the efficient reprogramming and maintenance of iPSCs ex vivo is fundamental to realise personalised regenerative medicine. Despite reprogramming techniques being demonstrated a decade ago, the process remains inefficient and poorly understood. Here, we propose an interdisciplinary project that seeks to uncover the molecular network transformations that mediate cell fate progressions from an established identity to the pluripotent stem cell state. Knowledge generated from this research will benefit effective and efficient cellular reprogramming for drug discovery and cell replacement therapy, while the approach can also be applied to other cell-fate decisions. Advancing understanding of the logic of cellular decision-making will help us to program and design living systems. This will bring significant benefit to many sectors of science, technology and medicine.

OutSider: Assessing and Mitigating Side-channel Leaks on Commodity Platforms

Student: Benjamin James Gras
Supervisor: Herbert Bos Vrije Universiteit Amsterdam, Holland
MSR Supervisor: Manuel Costa

Summary: Side channels are the elephant in otherwise well-protected rooms, as even the most sophisticated and recent defenses typically exclude side channels from their threat model. However, new research is showing that side-channel attacks are increasingly practical. In this research project, we will conduct an in-depth investigation of side-channel attacks over the memory subsystem in three research tasks. The proposed tasks allow us to (i) assess the worst-case impact of known side channels and develop both (ii) detection and (iii) protection techniques. In the first research task, we will investigate new side-channel attacks and assess their potential for violating confidentiality on commodity platforms. While several side channels on commodity software/hardware are known, no complete assessment has been carried out. Our approachwill highlight the spectrum of side-channel attacks on current platforms. In the second and third research tasks, we will develop practical detection and protection techniques against side-channel attacks to protect the confidentiality of critical user data and system data. Our detection techniques are based on observing resource usage and inferring on-going attacks, while our protection techniques seek to proactively remove any side channel. We will balance detection and protection techniques to stop what we can and to detect any remaining attacks.

Shareable Dynamic Media in Hybrid Meetings

Student: Banu Saatci
Supervisor: Clemens Klokmose, Aarhus University, Denmark
MSR Supervisor: Kenton O’Hara

Summary: Sharing documents in hybrid meetings between participants often require additional coordination or mental gymnastics. For example, sharing a document with co-workers, between applications or devices, and keeping its state synchronized across them becomes a tremendously challenging task. In this project, we will investigate the potential of using shareable dynamic media as realized through Webstrates as the foundation for supporting hybrid meetings. Resonating with the Social Devices project at MSRC, we further plan to unlock capabilities of personal (laptop, tablet, and smartphone) and shared (Microsoft Surface Hub) devices available during a hybrid meeting to provide fluid transitions between individual work (I-work) and collaborative work (we-work). Results of this project will inform future designs of business tools used in hybrid meetings such as Skype and Skype for Business, Microsoft Office, Microsoft SharePoint, and Microsoft OneNote.

Training and Tuning Deep Neural Networks: Faster, Stronger, Better

Student: Chen Liu
Supervisor: Volkan Cevher, EPFL, Switzerland
MSR Supervisor: Ryota Tomioka

Summary: Deep learning has recently received significant attention. However, training and tuning hyperparameters in deep learning models is notoriously difficult. We propose to attack the training problem through a novel algorithmic framework, and propose careful implementation in order to achieve state-of-the-art. We propose new performance criterion, and present new algorithm, for achieving fast and robust hyperparameter tuning simultaneously.

Modelling Infective Exacerbations in Cystic Fibrosis

Student: Damian Sutcliffe
Supervisor: Andres Floto, University of Cambridge, UK
MSR Supervisor: John Winn

Summary: Cystic Fibrosis (CF) is an inherited life-limiting multi-systemic disorder affecting chloride and bicarbonate ion transport across epithelial layers. In the lung, CF is characterized by defective muco-ciliary clearance, as a result of reduced airway surface liquid, which leads to chronic respiratory infection, progressive airway inflammation and bronchiectatic lung damage. Although CF can affect a number of other organs (such as the intestines, liver and pancreas), mortality results primarily from progression of lung disease and respiratory insufficiency. As such, acute pulmonary exacerbations (APEs), characterised by increasing breathless, cough, sputum production and sometimes chest pain, remain one of the most significant clinical events for patients with CF, and are the single most important cause of CF morbidity. Since little is understood about the pathophysiological processes that trigger APEs, we have undertaken a UK multicenter study of adults with CF (SMARTCARE). We combined daily remote physiological monitoring and sputum analysis of 200 individuals with CF for 6 months. The aim of this project would be to develop generative graphical models of the processes leading to APEs with a view to using the model to i) gain insight into the pathophysiology of APEs, which might subsequently be testable clinically or experimentally, and ii) develop algorithms that might predict the onset of APEs, which could potentially be used clinically to trigger early antibiotic therapy.

Human-Centred Machine Learning for Adaptive Agents with Vision

Student: Esben Andersen Sørig
Supervisor: Rebecca Fiebrink, Goldsmiths University of London, UK
MSR Supervisor: Cecily Morrison

Summary: This PhD is part of a project focused on developing agent technologies that allows blind and partially sighted people to tune their sense-making of surrounding people. The research will develop novel human-centred machine learning approaches that enable people who are blind and partially sighted to personalise their own agent. There will be a particular focus on end-user creation and personalisation of machine learning systems (ie. human-in-the-loop/interactive machine learning) and the development of appropriate spatial audio interfaces for the agent.

SMVRF: Secure Messaging Verifiably Realized in F*

Student: Konrad Kohbrok
Supervisor: Chris Brzuska, Technische Universität Hamburg-Harburg (TUHH), Germany
MSR Supervisor: Markulf Kohlweiss

Summary: This project provides foundations to the security of secure messaging protocols and advances the scope and understanding of F* as a verification tool for cryptographic protocols. Secure messaging is a relatively young research area of cryptography and the IETF will soon set out to discuss a standard for secure messaging requiring researchers to evaluate and analyze protocol proposals. We contribute to this project via verified protocol implementations in F*. As the implementation needs to cover all details of the protocol, just like for miTLS, the verification process uncovers security-relevant protocol aspects that are often elided in simplified models. We aim to channel these conceptual insights back to the cryptographic community by formalizing our new understanding of secure messaging in classical cryptographic security models. As an immediate consequence, the wider community can build on our models in their simultaneous analyzes of the numerous messaging protocol proposals that will be discussed by the IETF. Indeed, more than one protocol proposal will be on the table, and we will also closely coordinate and collaborate with researchers that use different analysis techniques than ours. While other techniques provide a more coarse analysis, they tend to be fast (as long as the protocol model is not too large) and are thus particularly useful in the early stages of the IETF standard discussion when a cryptographic core protocol needs to be chosen amongst all proposals. In turn, once the core protocol is fixed, the remaining protocol design needs to be checked for subtle attacks that rely on implementation details (and, whenever relevant, those details need to be covered by the standard), and our verified F*-implementation will achieve this task—conveniently, we thereby also provide a solid reference implementation once the draft has converged into a standard. The F*-implementation and verification approach to the security of cryptographic protocols has been developed at INRIA Paris and MSR Cambridge within the miTLS project. SMVRF advances the scope and understanding of F* as a verification tool by using it for messaging protocols and, importantly, by adding cryptographic ideal functionalities to HACl, a low-level cryptographic library written in F*. As the ideal functionalities can be plugged into arbitrary cryptographic protocols in a modular way, we thereby enable the wider community to use the F*-implementation and verification approach to analyze arbitrary cryptographic protocols beyond TLS and secure messaging—and to provide sound reference implementations.

Reinforcement Learning for Adaptive User Interaction

Student: Luisa Zintgraf
Supervisor: Shimon Whiteson, University of Oxford, UK
MSR Supervisor: Katja Hofmann

Summary: The project has the mission of developing technology that allows blind and visually impaired people to tune their understanding of their surroundings through being able to query the social, physical and textual environment. You would join a project team that brings together researchers with backgrounds in machine learning, artificial intelligence, and human computer interaction. Starting from a rich understanding of human-agent interaction and advances in machine learning and artificial intelligence, you will devise novel solutions that allow artificial agents to adapt to and empower their users. The research will focus on developing novel reinforcement learning approaches that effectively and efficiently learn in complex real-world applications. For example, this can include questions such as how to incorporate prior knowledge with reinforcement learning approaches in order to learn with minimal on-task / online training; how to learn from noisy or biased user demonstrations or user feedback; and how to effectively adapt or transfer knowledge to a novel task.

STARCH: SmarT ARchitectures for Data Center Switching

Student: Nadeen Gebara
Supervisor: Wayne Luk, Imperial College London, UK
MSR Supervisor: Paolo Costa

Summary: This proposal explores the feasibility of deploying programmable FPGA-based switches in data center networks with the goal of making them smarter: improving network performance while reducing costs, and accelerating critical applications such as distributed machine learning and data analytics. The effectiveness of the proposed approach, called STARCH, will be evaluated analytically and experimentally. Prototype tools will be devised to automate the development and experimentation of STARCH designs.

Power Efficient Rack-Scale Fabrics

Student: Omer Shimon Sella
Supervisor: Noa Zilberman, University of Cambridge, UK
MSR Supervisor: Ant Rowstron

Summary: Rack-Scale Computing is an emerging field in systems which explores the tight integrated design of server, networking and storage. This then becomes the basic building block in enterprise data centers and cloud infrastructure. Rack-scale computing provides superior performance compared with rack enclosures fitted with stand-alone servers, providing scalable high performance networked systems. This project has two goals, the first is to investigate power efficient rack-scale fabrics. Rack-scale computers are bounded by the power budget of their enclosure and their performance will always be limited by their power consumption. From studying the power efficiency and bottlenecks of existing fabric topologies, both from the data center and HPC world, we will develop new power efficient rack-scale fabric architectures. These fabric architectures will be evaluated using a simulator, and then implemented and evaluated in actual systems using real workloads. There is currently no widely adopted simulator in this emerging rack-scale computing field. As this will be needed to evaluate the power efficient rack-scale fabric architectures, the second goal is to develop the de facto standard simulator for designing and evaluating rack-scale computing. We plan to take the current Microsoft Research Rack-Scale Simulator, which was used to evaluate Pelican, XFabric and other rack-scale designs, and to work closely with Microsoft Research researchers to generalize and extend it. We plan to use our experiences building open source communities to drive adoption across the emerging rs computing community.

Towards Ethical Development of Symbiotic Human-Machine Systems; creating ethical frameworks and solutions

Student: Sarah Joy Bennett
Supervisor: Ewa Luger, University of Edinburgh, UK
MSR Supervisor: Alex Taylor

Summary: Digital devices and services have become an embedded feature of everyday life. Such computational systems, which are already wholly or partly ‘invisible’ from the perspective of the user, routinely collect ever-broadening types of human data in order to provide intuitive, intelligent and responsive services. Machine Intelligence (MI) is the engine upon which our core systems rely and it has been argued that the opacity of system operation makes user interactions difficult, and can result in poor user experience. However, as such systems become increasingly autonomous, so the idea of clearly defined human agency is problematized, and issues of ethics are raised – key to this is the extent to which such intelligent systems are intelligible to the user. There are currently no coherent ethical frameworks or guidance for MI systems. This PhD would engage in conceptual development, empirical research and technical studies to develop an ethical framework, interactional heuristics and interface designs for MI systems. It would focus specifically upon project Tokyo as an instantiation of an advanced example of such a platform.

Programmable Single-Cell Biocomputers with Scalable Signal Processing Capacity

Student: Yiyu Xiang
Supervisor: Baojun Wang, University of Edinburgh, UK
MSR Supervisor: Neil Dalchau

Summary: The project aims to engineer an expanded library of versatile orthogonal (non cross-reacting) genetic building blocks to enable programming advanced biological signal processing capabilities in live bacterial cells. We will focus on repurposing a new scalable tool, split inteins, to engineer libraries of modular and orthogonal genetic logic gates and switches. We will also build multi-layered genetic programs from these blocks, guided by modelling, to implement complex signal processing and computing functions in a single cell. The successful outcome will significantly scale up our capacity for building large complex circuits to program high-level behaviours in cells such as diagnosing diseases and overproducing high value chemicals, and will be of enormous benefit to researchers in the biocomputing and bioengineering communities.
Bit-level Accurate Reasoning and Interpolation

Student: Adrián Rebola Pardo | TU Wien, Austria
Supervisor: Georg Weissenbacher
Microsoft Research Supervisor: Christoph Wintersteiger

Summary: Many contemporary verification tools make simplifying and simplistic assumptions about data types such as bit-vectors and floating point numbers. Such assumptions may result in imprecise abstractions that thwart successful verification or fail to account for corner cases responsible for intricate bugs. The goal of the proposed project is to apply as well as improve bit-level accurate decision procedures (SAT and SMT) to enable interpolation-based model checking as well as the detection and analysis of complicated bugs for software that uses bit-vectors and floating point arithmetic.

Rack-Scale Interconnect Fabrics for Disaggregated Memory

Student: Antonios Katsarakis | University of Edinburgh, UK
Supervisor: Boris Grot
Microsoft Research Supervisor: Paolo Costa

Summary: Two technologies are set to reshape tomorrow’s large-scale data-processing systems. The first of these is rack-scale computing, which relies on integrated fabrics to interconnect thousands of processing nodes at a low latency and high bandwidth. The other is memory disaggregation, which physically separates processing and memory resources onto separate blades inside a rack, thus maximizing cost/performance through flexible resource allocation. To realize the marriage of these technologies, this project will investigate high-performance and complexity-effective rack-scale interconnect stacks, particularly focusing on network protocols, topology, router microarchitecture and SoC integration.

Dense Visual Tracking for Active Manipulation

Student: Christian Rauch | University of Edinburgh, UK
Supervisor: Maurice Fallon
Microsoft Research Supervisor: Jamie Shotton

Summary: Autonomous robotic hardware had made great progress recently. However robotic manipulation has some persistent limitations. Grasping and use of objects is typically very crude and utilizes little or no perceptual feedback. Robots are limited to carrying out manipulation in structured situations with visually distinct objects without requiring the precise pose of the object of interest. A better approach would be to utilize visual tracking to improve manipulation capability. In particular, we are interested in situations where the precise orientation of the object is important. Perception in these situations is particularly challenging because it requires the understanding of many factors: the motion and kinematics of the robot, calibration of both sensors and the robot itself. In this project we will develop visual tracking algorithms which utilize dense visual image simulation and multi-modal optimization to achieve precise tracking of a robot’s end-effectors and manipulands.

Reducing the Annotation Tax of Programming Language Types Using Machine Learning and Big Code

Student: Eirini Vlassi Pandi | University of Edinburgh, UK
Supervisor: Charles Sutton
Microsoft Research Supervisor: Andy Gordon

Summary: Decades of research in programming languages has resulted in rich formal languages that can be used to assert facts about programs and to infer whether those facts hold. Explicitly annotating a program with semantic properties supports program verification and bug finding, but it comes with a cost, which we call the “annotation tax.” Type annotations are not free, but rather cost developer time to develop, debug, and maintain. We propose that it is possible to reduce the annotation tax by applying new ideas from machine learning for “big code”. In this PhD project, we work within the framework of refinement types and propose that the student build a probabilistic machine learning model that can (a) suggest precise type annotations to a human developer, and (b) suggest fixes to programs that have type errors. We aim to make it far easier for human developers to apply complex and sophisticated type systems in practice, reducing the annotation tax.

Learning to Infer in Graphical Models and Probabilistic Programmes

Student: Emile Mathieu | University of Oxford, UK
Supervisor: Yee Whye The
Microsoft Research Supervisor: Danny Tarlow

Summary: Efficient general purpose inference algorithms play a central role in the dream of creating ultra-flexible probabilistic programming and graphical modelling tools for machine learning. Adaptive strategies for inference allows for the automated adaptation of inference algorithms to specific models and data at hand, and can significantly improve both the accuracy and efficiency of inference algorithms. We propose to develop and study novel methods for adaptive inference, for both Monte Carlo and variational contexts, by viewing the process of adaptive inference as online learning, and bringing to bear the wide array of successful learning algorithms and frameworks on the problem.

Computational Design of Nonlinear Functions using Nucleic Acids

Student: Iuliia Zarubiieva | University of Warwick, UK
Supervisor: Vishwesh Kulkarni
Microsoft Research Supervisor: Andrew Phillips

Summary: In this project, we propose the first known results on how nucleic acids can be used to implement a wide range of polynomials, rational functions, and programmable Hill-type nonlinearities. This work builds on our prior collaboration with Microsoft Research and will facilitate a relevant toolbox for the Visual DSD software.

Lightweight Concurrency Modelling

Student: Ioannis Stefanakos | University of York, UK
Supervisor: Mike Dodds
Microsoft Research Supervisor: Matthew Parkinson

Summary: Concurrent systems remain extremely challenging to develop. For this PhD, we propose developing lightweight modelling tools for concurrent algorithms, based on ideas from recent concurrent logics. The aim will be to support rapid prototyping and counter-example finding during the development process, while incorporating the expressiveness and modularity of modern logics.

Smart Molecular Biosensors: Design Principles and Foundational Technologies

Student: James Coxon | University of Cambridge, UK
Supervisor: James Ajioka
Microsoft Research Supervisor: Boyan Yordanov

Summary: Inspired by the potential applications of biological computation for improved biosensor function, this project will establish strategies and design principles for the construction of smart biosensors, using biological components to perform information processing in addition to sensing. The main goals of this project are (1) use a model driven approach to systematically investigate the advantages and limitations of combining sensing and signal processing functions within the same biochemical system in the context of biosensors, (2) to design, experimentally implement, and characterize biochemical components as sensing and computational primitives to enable rapid prototyping of smart biosensors, and (3) to build and experimentally test proof-of-concept smart biosensors for analytes relevant to environmental sensing applications.

Towards Total Immersion: Accurate Reconstruction of Lights, Materials and 3D Geometry from RGB, Depth and Motion

Student: Lars Mescheder | Max Planck Institute for Intelligent Systems, Tübingen
Supervisor: Andreas Geiger Germany
Microsoft Research Supervisor: Sebastian Nowozin

Summary: In the past decade computer vision technology and novel depth sensing devices have enabled a new quality of natural interaction with computing devices. A key factor enabling this progress has been a more accurate understanding of the user and his environment through RGB+depth cameras, allowing accurate position tracking and approximate recovery of the scene geometry in real-time. However, current systems are limited in that the recovered scene representation is coarse and approximate, for example, material properties and light sources are not recovered. Our goal is to develop efficient probabilistic models of light, materials, and geometry in order to provide high-fidelity reconstructions of the user environment from RGB and RGB-D video. The research proposal below outlines the key challenges and the proposed approach in more detail.

Modelling the Survival and Proliferation of Cancer Cells in Metastasis

Student: Laure Talarmain | University of Cambridge, UK
Supervisor: Benjamin Hall
Microsoft Research Supervisor: Jasmin Fisher

Summary: The progression of a cancer to metastasis is the final stage of the disease, and is presently incurable. Cells derived from different cancers show distinct metastatic propensities which may affect their responsiveness to drug treatment, but the mechanisms for this remain poorly characterised. Here I propose to develop models of populations of metastatic cells, to identify the key factors which differentiate cancer cell types. From the models of cell behaviour in a population we will further go on to develop new abstractions of the population dynamics using ideas developed from the use of fluid approximations in dynamic modelling. Ultimately, we will use these tools to both describe the observed differences between cancer cell types, and with collaborators in AstraZeneca test cell line responsiveness to drug challenges. This work will lead to the generation of a new conceptual platform for understanding this critical stage of the disease.

Medical Image Analysis in the Cloud: Application to Early Stage Cancer Detection

Student: Nick Pawlowski | Imperial College London, UK
Supervisor: Ben Glocker
Microsoft Research Supervisor: Antonio Criminisi

Summary: The aim of this project is to develop automated image analysis tools to support radiological reading during early-stage cancer detection. The objective is to significantly increase sensitivity and specificity while drastically reducing the required reading time. At the core of the research is an automatic abnormality detection system which is based on statistical, generative population models derived from an unprecedented large-scale normative imaging database (UK Biobank). Detection of abnormalities is achieved by comparing clinical image data to the model statistics, and any variation outside the norm is highlighted automatically and reported to the user for further visual inspection. Such reading tools have the potential to significantly improve the cost-effectiveness and diagnostic workflow in whole-body cancer detection. Implementing the tools in the cloud enables a seamless integration into clinical sites, and wide dissemination of the research outputs.

Estimating the Credibility of Health Information on the Web

Student: Peter Hayes | University College London, UK
Supervisor: Ingemar Cox
Microsoft Research Supervisor: To Be Appointed

Summary: A significant problem with current web-based search algorithms is their inability to infer the credibility of information available on the Web. It is our hypothesis that a new measure is needed to assist in the ranking of documents, and that this measure should be based on notions of credibility. Credibility is, amongst other factors, based on what is being said (content), who is saying it (trust), and who is publishing it (authority). Significant research has focused on individual factors, but less on the combination. Even less work has considered the subjective nature of credibility, i.e. what is credible to one person may not be to another. Here we propose to use various machine learning tools, particularly deep learning, to both estimate credibility and generator more credible texts based on user profiles.

Conversational Search: Mathematical Modelling and Applications

Student: Rafal Muszynski | University College London, UK
Supervisor: Jun Wang
Microsoft Research Supervisor: Filip Radlinski

Summary: Many complex information retrieval tasks involve conversations and explorations. These range from simple cases where the same user fails to obtain suitable results in an earlier interaction and keeps refining queries, to complex exploratory search scenarios where no single item or single query suffices and users explore the information space sequentially. We see in many cases like personal assistant (e.g, Cortana, Siri), dialogues might be needed in order to provide personalised content and services. This proposal aims to develop theoretical groundwork for such a multi-interaction retrieval and recommendation setting, then measure the effectiveness of resulting algorithms in a practical multi-interaction system. This will deliver a principled and effective Conversational Search (CS) framework.

Continuous Listening Services: Leveraging Sociological Understandings of Talk for Continuous Agent Input

Student: Razan Jabber | University of Stockholm, Sweden
Supervisor: Barry Brown
Microsoft Research Supervisor: Kenton O’Hara

Summary: New developments in speech, image recognition, and smart agents, have created opportunities for technology to pre-emptively assist users. Our own work explored how always-on audio recordings of ‘everyday life’ could be fed into recognition systems to both prime systems, and to trigger pre-emptive system actions. This thesis project will combine sociological understandings of talk and action with capturing constant near always-on streams of individuals’ everyday activity. Recordings of everyday action and talk will be used to experiment with agent technologies that exploit the pragmatics of interaction to help systems potentially pre-empt user actions; assist conversation; prevent confusions; change environments and inform and adjust systems long term to users’ preferences. This project aims to produce computationally tractable understandings of ordinary action that can be used to enable new system activity, built from a broad multi-media corpus of recordings of everyday action.

Data-Efficient Reinforcement Learning from Image Pixels

Student: Steindor Saemundsson | Imperial College London, UK
Supervisor: Marc Deisenroth
Microsoft Research Supervisor: Katja Hofmann

Summary: The key aspects of an autonomous (robotic) system are autonomous thinking and decision making using sensor measurements only, intelligent exploration and learning from mistakes. In this project, we devise a reinforcement learning framework that does not have direct access to the system’s state (e.g., a robot’s joint configuration and the environment), but only to pixel information from a camera observing the scene. Our aim is to devise a learning framework that allows a robot to solve tasks in continuous state and action spaces in a data-efficient way: Data efficiency is crucial in practical robotic applications where data collection costs time and money. We approach this challenge by using Gaussian processes as a model for (a) probabilistic representation learning, and (b) learning of probabilistic predictive models that faithfully capture model uncertainty. We expect the learning algorithms to learn an order of magnitude faster than the current state of the art.

A Highly Scalable Optical Switch Fabric for Data Centre Networks

Student: Thomas Gerard | University College London, UK
Supervisor: Benn Thomsen
Microsoft Research Supervisor: Hitesh Ballani

Summary: This project aims to develop the physical layer technologies required to implement a cost effective, scalable, flat (non-hierarchical), high bandwidth optical switching fabric for rack-scale computers (interconnecting system-on-chip micro-servers) or data centres (interconnecting top-of-rack switches) with switching latency around 2 orders of magnitude faster than other proposed solutions. We propose an over-subscribed optical broadcast and select switch architecture that employs both fast wavelength switching (WS) for route selection and Time Division Multiple Access (TDMA) for packet based switching. The project will design and implement the physical layer communications links for the data and control planes to produce a switch fabric that is suitable for integration with data centre servers for evaluation of the switch performance in real applications.

Meaningful Metadata: The Things I Wish I Knew

Student: Zhen Ge | University of Dundee, UK
Supervisor: Wendy Moncur
Microsoft Research Supervisor: Sian Lindley

Summary: The “grammar of actions” for files is currently extremely limited. Read. Save. Update. Copy. Delete. Developments in metadata have enriched the material stored about a file, but the question identified by Harper et al, “what else can we do with files?” remains. This research will consider the corpus of digital materials – personal, social, organisational and environmental – generated as life evolves in relation to significant life transitions. It will explore how this corpus and associated metadata can be exploited in novel ways, both for functional purposes creating long-term utility, and to create new experiences that enable creative, evocative, and contextual explorations of the corpus. A key part of the research will be to identify what additional metadata is needed to facilitate this exploitation. Central will be the ability for users to answer functional questions and to develop social narratives using this corpus.
Global Fitness Maximising Approaches to Evaluate the Trade-Offs Involved in the Evergreen and Deciduous Conundrum

Student: Anne Uilhoorn | Leiden University Amsterdam, Netherlands
Supervisor: Peter van Bodegom,
Microsoft Research supervisor: Matthew Smith

Summary: Information about the planet undoubtedly has the potential to become the world’s most valuable commodity in the coming decades. Traits-based approaches are increasingly incorporated in Dynamic Global Vegetation Models (DGVMs) to fundamentally improve their representation of the globe. Microsoft’s Computational Science Lab is at the forefront of this development. However, the challenge of building a prognostic model to understand how evergreen-deciduous strategies affect plant fitness and the global distribution of these leaf habits has not yet been addressed. We will apply eco-evolutionary principles to develop a mechanistic foundation with respect to this conundrum, evaluating key trait trade-offs in plant carbon and nutrient management and hydraulics. The impacts of trait constraints will be tested in the modular fully data-constrained DGVM of Microsoft Research to provide generic ecological insights on emergent trait values for different leaf habits, now and in a future climate.

Self-Optimising Internet Services

Student: Chad Verbowski | University of Edinburgh, UK
Supervisor: Hugh Leather
Microsoft Research supervisor: Flavio Junqueira

Summary: Modern Internet services can span as many as 1,000,000 servers, requiring multiple geographically distributed instances to serve customers around the world. The financial cost of building and operating each of these various services can be more than US$1 billion annually. In addition, it is critical that the performance of these services is highly optimised across all servers and datacenter locations. Experiments slowing user responses by as little as 100ms have caused a measurable decrease in user engagement. Given the large scale and impact of these systems, optimising them presents a significant resource saving opportunity and a chance to improve overall service quality. This proposal presents a novel idea for making significant improvements to the capacity utilisation of these servers thus potentially saving many millions of dollars annually, and in addition will improve the end-user perceived availability of these services and reduce the latency they experience when using them.

System-Level Support for Persistent Memory

Student: Christos Perivolaropoulos | University of Edinburgh, UK
Supervisor: Stratis Viglas
Microsoft Research supervisor: Aleksandar Dragojevic

Summary: This proposal argues for the introduction, definition, and implementation of a system-level API for accessing next generation persistent memory arrays. Our key idea is that developers use an expressive API to describe the I/O flow of their applications. With the API in place, the system itself can dynamically adapt to the workload and either optimise the I/O flow of individual applications, or optimise the I/O workflow for the entire system. This optimisation is achieved through lightweight adaptation, dynamic cost-benefit analysis, refinement of optimisation objectives, and potentially just-in-time code generation.

Model Worlds—Interacting with Machines which Read the Future

Student: David Benque | Royal College of Art, UK
Supervisor: James Auger
Microsoft Research supervisor: Alex Taylor

Summary: Machine learning and predictive modelling systems are developing rapidly and promise to have an increasing impact on society and everyday life. This project proposes to examine this impact from a design perspective. Using practice based research methods such as visualisation and speculation, this project aims to explore how we relate to machines that appear autonomous and make predictions about the future. Engaging with specialists in the field, an informed technical basis will be built up from which to extrapolate scenarios and proposals for new ways to interact with predictive models. These will take a tangible form such as fictional products or services, mock-ups and interactive experiments–their primary aim will be to provoke discussions with experts, scientists and end-users about how we perceive this technology and where we might want it to go in the near future.

Fast, Flexible and Functional Software

Student: David Mersinjak | University of Cambridge
Supervisor: Anil Madhavapeddy
Microsoft Research supervisor: Cedric Fournet

Summary: This proposal focusses on the challenge of building a secure TLS stack using the operating system and modularity techniques developed as part of the Mirage OS unikernel project, and combine it with the rigorous formal model developed as part of the miTLS stack.

Designing the Design of the Internet of Things

Student: Iulia Ionescu | Royal College of Art, UK
Supervisor: Ashley Hall
Microsoft Research supervisor: Richard Banks

Summary: As designers have shifted activity from objects to networks a raft of new design challenges have arisen in the move from products to experiences via cloud computing. Amongst these are a series of interconnected questions clustered around the concept of the computer file as a unitary object, the point at which designers engage with technology and how designers can understand when they expose users’ data and privacy. In particular, these questions revolve around how designers can appreciate their agency, methods, and impact while developing designs in the context of the Internet of Things. The aim of this research is to design and build a series of interactive experiments that engage with significant design questions in the Internet of Things that identify the variety of creative design methods that could be used to explore the research questions identified above. A practice based design research approach will be used where theory and practice are stitched together in reciprocal cycles.

Designing for Spatial Sharing—Architectural Thinking as an Approach to Technology Design

Student: Jens Emil Grønbæk | Aarhus University, Denmark
Supervisor: Marianne Graves Petersen
Microsoft Research supervisor: Kenton O’Hara

Summary: The fields of architecture and design have developed a rich understanding of different forms and spaces, such as seen in the work by Christopher Alexander [Alexander et al, 1977] and Thiis-Evens [1989]. Despite the extensive knowledge about form and space, there is a limited understanding of how technology can influence the qualities of space and the organization of people. On the other hand, within the field of computer science and interaction design, the spatial literacy is rather limited. Regardless of the fact that the way we design technologies to a large extend influences the properties of our spaces, impacts the way we orient ourselves towards other people, and our opportunities for social engagement with others in a co-located space. This project brings together competences from computer science, design, and architecture in order to develop a spatial literacy within technology design, as an approach to thinking about the design of future technologies.

Constructing robust synthetic gene feedback systems using a combined experimental and Bayesian approach

Student: Luca Rose | University College London, UK
Supervisor: Chris Barnes
Microsoft Research supervisor: Andrew Phillips

Summary: The aim of synthetic biology is to apply engineering principles to the design and construction of novel biological systems. One application area is the engineering of commensal bacteria within the gastrointestinal tract – the gut microbiome – to create novel therapeutics and diagnostic tools which has the potential for treatment of a wide range of disorders, as well as tackling important issues such as antimicrobial resistance. For advanced therapeutics to be realised, we must be able to design and build systems that are robust to a wide range of internal and environmental conditions. This can be achieved through the addition of feedback mechanisms in addition to the basic architecture. While this is common knowledge in control engineering, building in additional feedbacks in synthetic biology is not commonplace. We will construct versions of existing gene circuits with additional feedback and demonstrate their increased robustness directly using time course imaging and microfluidics.

Preference Elicitation and Mechanism Design for Complex Dynamic Systems

Student: Ludwig Dierks | University of Zurich, Switzerland
Supervisor: Sven Seuken
Microsoft Research supervisor: Peter Key

Summary: In this project proposal, we consider dynamic resource allocation problems in complex multi-agent systems. In particular, we focus on situations where agents’ preferences are high-dimensional and thus difficult to elicit, and where agents are self-interested, i.e., where they have conflicting interests. The two domains we consider are 1) cloud computing, and 2) electric vehicle (EV) charging. In both domains, agents’ preferences are complex, which calls for sophisticated “preference elicitation” techniques. At the same time, agents are competing for resources, which calls for “mechanism design” techniques to align agents’ incentives. However, to date, no prior work has combined these two techniques. The topic of this project will be to develop a general methodology to combine preference elicitation with mechanism design, and then to apply this in the cloud computing and EV charging domains.

Controlled Malleability in Cryptography

Student: Mary Maller | University College London, UK
Supervisor: Sarah Meiklejohn
Microsoft Research supervisor: Markulf Kohlweiss

Summary: In the past, malleability in cryptography—that is, the ability to transform a cryptographic representation of some underlying message or secret into a representation of a related secret—has long been viewed as an undesired opportunity for attackers to modify messages; e.g., in the classic example, an attacker Alice might modify a ciphertext encrypting the message “send 100 GBP to Alice” to encrypt “send 1000 GBP to Alice.” Recently, however, with the increasing flexibility that users require online—e.g., the ability to outsource both computation and storage and share private information on social networks—malleability has increasingly been viewed as an important feature in cryptographic design. The objective of this project is to gain a richer understanding of controlled, or restricted, malleability and the role it can play in supporting existing cryptographic applications and engendering new ones.

Formal Decompositions of Strongly Coupled Systems

Student: Patric Fulop | University of Edinburgh, UK
Supervisor: Vincent Danos
Microsoft Research supervisor: Boyan Yordanoff

Summary: Central to this project is the Karr whole-cell model (2012) which describes in details all processes within a simple cell. Our goal is to build a software platform with conceptually clear organising principles which will allow one to build the better and faster cell-models of the next generation. Such models can be large and may rely on different paradigms such as differential equations, discrete event systems, etc., and often include stochastic elements. Simulation will be a predominant tool of investigation and there are tremendous problems of consistency and correctness of the simulations, computational efficiency, and flexibility of the model manipulation. To address these problems and reach our goal, particular attention will be given to two fundamental issues: the development of a flexible formalism based on formal approaches of concurrent systems to design the module communication structure, and, the development of correct and efficient strategies for code execution.

Capture of Real-World Non-Rigid Scenes from a Single RGB Camera

Student: Qi Liu Yin | University College London, UK
Supervisor: Lourdes Agapito
Microsoft Research supervisor: Andrew Fitzgibbon

Summary: Turning low-cost sensors, such as commodity RGB or depth cameras, or even smartphones, into 3D scanners that can acquire the dense geometry of static scenes in real is now a reality that is revolutionising areas such as 3D design and 3D printing by making it easy to capture our environment instantly in 3D. While some impressive systems have extended 3D capture to non-rigid shapes, these are usually restricted to the use of multi-camera or stereo setups or RGB+depth sensors. Instead, the avenue of live, real-time capture of non-rigid objects with a single RGB-only camera remains less explored. The ambition of this PhD project is to extend live, easy, and flexible capture of 3D geometry of non-rigid and articulated objects to the case of a single, low-cost, RGB-only camera. Solving this problem will instantly make it possible to reconstruct videos from other sources such as film archives or computerised video libraries (i.e., YouTube).

Image Quality Transfer

Student: Ryutaro Tanno | University College London, UK
Supervisor: Daniel Alexander
Microsoft Research supervisor: Antonio Criminisi

Summary: This project will develop a nascent image reconstruction technology called image quality transfer into a working software tool cutting across several applications. The technique learns a model of the low-level structure of an image ensemble from high quality data sets that are expensive to obtain and uses the model to enhance reconstruction from sparse data that is more practical to acquire. The project develops and general formulation for this idea and demonstrates and evaluates it within applications to video, depth map, and magnetic resonance image enhancement.

History-Aware Testing

Student: Sascha Just | Saarland University, Germany
Supervisor: Andreas Zeller
Microsoft Research supervisor: Brendan Murphy

Summary: We want to improve software testing by automatically exploiting the project history. We prioritize testing towards locations and features that have shown to be most critical in the past, and use past exploits to search for novel security issues. We focus on high-impact tests—that is, inputs that are as similar as possible to past inputs, yet trigger behaviours that are as different as possible from past behaviours. All techniques are fully automatic, from mining past history via test selection to final ranking by change across history.

Sequential Structured Prediction

Student: Sergey Prokudin | Max Planck Institute for Intelligent Systems, Germany
Supervisor: Peter Gehler
Microsoft Research supervisor: Sebastian Nowozin

Summary: Structured prediction refers to machine learning models that predict multiple interrelated and dependent quantities. Many applications in a wide range of domains can naturally be understood in this way. A wide variety of expressive and powerful models have been proposed, mostly tailored to specific applications. A shared common problem amongst many structured prediction models is that of intractable inference. This results in inefficient computation and in turn reduces the practical significance. We propose to study structured prediction models in the context of sequential decision making. Each decision is allowed to depend on a rich context that includes previous decisions as well as context-dependent observations. This approach puts emphasis on tractable inference and we believe that progress in this field will result to practical impact in multiple applications.

Real World Data with Dependent Types: Integrity and Interoperation

Student: Stuart Gale | University of Strathclyde, UK
Supervisor: Conor McBride
Microsoft Research supervisor: Don Syme

Summary: Data integrity, manipulation and analysis are key concerns in modern software. We are often work with a corpus of files–spreadsheets, databases, scripts–which represent and act on data in some domain. I seek to improve the integrity and efficiency with which we can work by 1. developing a language for the conceptual structure of data models: what kinds of things exist in a given context, what we expect to know about them, and when our data are consistent-contextualization and consistency are readily expressible by dependent types; 2. extending the language to data views for a given model, specifying the content of a spreadsheet or database, whether it is a data source or an output; 3. exploiting these models and views to deliver richer tools for data editing, auditing, integration and analysis, whether by internal spreadsheet calculation or database query, or by interfacing with programming languages; 4. exploring automated discovery of views and models from extant data.
Advancing Random Forests and Other Ensembles

Student: Alban Desmaison | University of Oxford, UK
Supervisor: Nando De Freitas
Microsoft Research supervisor: Antonio Criminisi

Summary: This work will advance the theory and practice of one of the most popular classification algorithms in industry: random forests. These algorithms are behind many innovations including Kinect and the popular news app Zite (which powers CNN trends).

Leveraging Data Reuse for Efficient Ranker Evaluation in Information Retrieval

Student: Anna Sepliarskaia | University of Amsterdam, Netherlands
Supervisor: Maarten De Rijke
Microsoft Research supervisor: Filip Radlinski

Summary: Interleaved comparison methods enable the assessment of rankers using implicit feedback from actual users. The use of interleaved comparisons gives rise to a key challenge in ranker evaluation: how to efficiently determine the best ranker from a set using only pairwise comparisons. This challenge can be formalized as a K-armed dueling bandits problem. We aim to develop new K-armed dueling bandits algorithms that reduce the number of evaluations required. The idea is to create methods that use the results of each interleaved comparison to update estimates of the relative quality of all rankers under evaluation, not just of the two rankers that were interleaved. Doing so is possible if each interleaved comparison is conducted probabilistically, so that unbiased and consistent estimates of each ranker’s quality can be obtained. This allows us to extend interleaved comparison methods to multileaved comparisons, in which the ranking shown to the user is constructed from multiple rankers.

Rethinking Resource Allocation in Data Centres: Optimization, Incentives, and Beyond

Student: Asaf Valadarsk | Hebrew University of Jerusalem, Israel
Supervisor: Michael Schapira
Microsoft Research supervisor: Hitesh Ballani, Ian Kay

Summary: Cloud services, ranging from search to email and social networks, are driving the creation of larger and larger data centres. We advocate a holistic, application-centric approach to resource allocation in data centres: jointly optimizing the allocation of computational resources (for example, CPU, memory) and network resources (for example, bandwidth) by leveraging predictability in data centres. We argue that this approach holds great promise both for rendering data centres more profitable (via more efficient use of the data centres resources) and for improving data centre-provided services and perceived fairness from the user’s perspective. We aim to draw ideas from (CS and economic) theory devise practical schemes for resource allocation in data centres with provable guarantees.

Large Scale Diverse Learning for Structured Output Prediction

Student: Diane Bouchacourt | Ecole Centrale Paris, INRIA Saclay, France
Supervisor: Pawan Kumar
Microsoft Research supervisor: Sebastian Nowozin

Summary: Structured output prediction plays a central role in many domains of artificial intelligence. In order to greatly enhance its applicability, we plan to develop novel learning paradigms that provide accurate models from diverse data. The risk with diverse learning is that it often results in large non-convex problems, which makes the learner computationally inefficient and prone to bad local minimum solutions. In order to tackle the aforementioned risk, we will work in three directions. First, we will develop novel, mathematically sound techniques based on self-paced learning (SPL). Second, we will incorporate active learning with SPL such that a user can be employed to provide annotations for challenging samples. Third, we will develop large-scale parallel algorithms for solving the resulting optimization problem using dynamic dual decomposition. The infrastructure developed as part of this work will be tested on the challenging problem of semantic segmentation.

GeoGraph: Efficient Geographically Distributed Graph Infrastructure

Student: Enrique Fynn | University of Lugano, Switzerland
Supervisor: Fernando Pedone
Microsoft Research supervisor: Flavio Junqueira

Summary: Many current online services build on graph data structures (for example, social networks, collaborative applications, recommendation systems). Services in this category share a number of common characteristics: they typically serve a large user base, possibly geographically distributed; the underlying graph structure has particular properties; users can tolerate certain anomalies but expect some guarantees from the system (for example, preserving causal dependencies among requests); most requests are either small graph updates or relatively large queries. The goal of this project is two-fold: First, we aim to propose consistency criteria well-adapted to graph-dependent online services. In this sense, consistency must account for the typical operations performed on graphs by online services. Second, we intend to design, implement, and experimentally assess an infrastructure that implements this isolation level (or levels).

Statistical Models and Methods for Privacy Technologies

Student: Fatemeh Dokht | KU Leuven, Belgium
Supervisor: Claudia Diaz
Microsoft Research supervisor: Markulf Kohlweiss

Summary: Social interactions are increasingly conducted online. This introduces privacy risks as the analysis of user communication patterns might reveal sensitive private information. To address this problem, a number of distributed architectures for privacy-friendly communications have been proposed. While content protection can be achieved through the use of cryptographic mechanisms, protecting these systems with respect to traffic analysis attacks remains largely an open problem. In this project, we will develop behavioural models that express how users interact with each other, and methods for extracting statistics on networked data in a privacy friendly way. We will also develop advanced inference techniques for evaluating the security properties of distributed communications networks with respect to traffic analysis attacks. Based on the results of this analysis we will propose traffic analysis resistant designs that take into account user communication behaviour.

Ad-hoc Cross-Device Interactions Facilitating Small-Group Collaborative Explorations and Curation of Historic Documents

Student: Frederik Brudy | University College London, UK
Supervisor: Nicolai Marquardt
Microsoft Research supervisor: Abigail Sellen

Summary: Our proposed project explores the design of ad-hoc cross-device interactions facilitating collaborative small-group activities. The case study for contextualising our ideas is curating digital content for an ongoing cultural history project in Cambridge. This enables us to link in with a local community project that is based on real needs and provides new opportunities for technology interventions. The process of curating documents in new ways can be a demanding task, requiring to collect, review, and combine a large collection of raw material. Our goal is to design novel interaction techniques across multiple tablet computers that let people use their portable devices in concert, for new ways for sharing, extending, transferring, or overlaying content across these devices. Our project will investigate current curating practices, develop a technical framework for cross-tablet interactions, and design interaction techniques that fit into the current practices of the community project.

Sketching Algorithms for Massive Graphs and Matrices

Student: Jacques Dark | University of Warwick, UK
Supervisor: Graham Cormode
Microsoft Research supervisor: Milan Vojnovic

Summary: Increasingly, we are faced with larger and larger volumes of data from which to extract insights and intelligence. An important case surrounds data that can be represented as a graph or (adjacency) matrix. A promising approach is to look for ways to “sketch” such structures: to build a representation that is much more compact than the input, but which allows some function of interest on the original data to be approximated accurately using the sketch. Such sketches are well known and widely used for data that can be represented as a vector (such as to identify the most frequent elements, or to count the number of distinct items). The goal of this scholarship project is to develop new algorithms for sketching of massive graphs and matrices, and to demonstrate their usefulness via theoretical analysis and empirical evaluation.

TypeScript: The Next Generation

Student: John Williams | University of Edinburgh, UK
Supervisor: Philip Wadler
Microsoft Research supervisor: Andy Gordon

Summary: Our short-term goal is to build a tool, TypeScript: The Next Generation (TNG), that generates wrapper code from TypeScript “import” declarations to detect and pinpoints type errors. A wrapper will accept any JavaScript value as input, and either raise an error or return a value guaranteed to satisfy the invariant associated with the corresponding type. Our hypothesis is that TypeScript TNG will aid debugging and increase reliability of TypeScript and JavaScript code. The long-term goal is to extend the foundations of the blame calculus to support a wide-spectrum type system, ranging from dynamic types (as in JavaScript or Racket) through Hindley-Milner types (as in F# or Haskell) to dependent types (as in F* or Coq). Our hypothesis is that a wide-spectrum type system will increase the utility of dependent types, by allowing dynamic checks to be used as a fall back when static validation is problematic.

Why Do People Communicate?

Student: Katja Alice Behrens | Oxford Brookes University
Supervisor: Constantine Sandis
Microsoft Research supervisor: Richard Harper

Summary: This project proposes a philosophical anthropology of communication accompanied by an exploration of whether, and if so how, such an understanding can offer insights for innovation in the area of communications engineering and HCI. It does so by appealing to a picture of human nature which understands communicative minds in terms of shared intentional behaviour, rather than the other way round.

SMT for Nonlinear Constraints with Application to Computational Biology

Student: Kristjan Liiva | University of Edinburgh, UK
Supervisor: Paul Jackson
Microsoft Research supervisor: Christoph Wintersteiger

Summary: Given their immense success in powering modern program verification, Satisfiability Modulo Theories (SMT) solvers have begun to gain the attention of practitioners in other areas of science and engineering. Common to many new disciplines applying SMT (among them computational biology) is a pressing need for reasoning with nonlinear constraints over continuous variables. Unfortunately, this is an area where SMT solvers are still very weak. The proposed research shall address this shortcoming by developing nonlinear reasoning techniques which radically enhance the utility of SMT solvers for computational biology.

Formal Language Support for Ecological Modelling

Student: Ludovica Luisa Vissat | University of Edinburgh, UK
Supervisor: Jane Hillston
Microsoft Research supervisor: Matthew Smith

Summary: Advances in computational power, and our ability to collect data on an unprecedented scale, have led to a fertile period for computational modelling. In this project, we consider ecological modelling, seeking to explore the benefits that may be gained through the use of formal modelling languages, with automated translation into mathematical and computational representations. We are encouraged by previous success in this direction in systems biology, but there are significant new challenges in this domain. We will focus on representation of space and how the spatial relationship between entities can affect patterns of interaction and consequently, ecological processes. Developing over a number of phases, motivated by increasingly rich ecological scenarios, the outcome will be a spatial population process algebra; mappings from this modelling language to appropriate computational modelling engines; and complementary static analysis techniques.

Vision as Inverse Graphics

Student: Lukasz Romaszko | University of Edinburgh, UK
Supervisor: Christopher Williams
Microsoft Research supervisor: Pushmeet Kohli

Summary: The goal of scene understanding is to infer the objects present in a scene, their positions and poses, the illuminant, and so forth These are the *latent variables* which must be inferred in order to understand the scene. We will develop a stochastic scene generator, and render these scenes to produce images; we will then train recognition networks to infer the relevant latent variables, and refine the predictions by probabilistic programming on the generative model. The great advantage of using a scene generator and renderer is that large quantities of image data with the associated latent variables can be obtained for training the recognition networks.

The Generality and Mechanism of Bet-Hedging in Bacteria

Student: Om Patange | Sainsbury Laboratory, UK
Supervisor: James Locke
Microsoft Research supervisor: Andrew Phillips

Summary: Gene circuits exhibit fluctuations (‘noise’) in the level of key regulatory components. Increasingly, noise appears to play functional roles. For example, noise could enable a subpopulation of cells to enter a transient antibiotic-resistant state, enhancing their survival. By ensuring that cells do not all exist in the same transcriptional state, the colony can ‘bet hedge’ against future environmental changes. We will use single-cell time-lapse microscopy to examine the generality of bet-hedging in B. subtilis. After screening for pathways that show variable gene expression, we will use a combination of synthetic biology and computational modelling to discover the mechanisms that allow cells to probabilistically enter these states. The work will lead to an understanding of how and why alternative transcriptional states are generated.

Reasoning about Side Channels in Cryptographic Protocols

Student: Pablo Martin | IMDEA Software Institute, Spain
Supervisor: Boris Kopf
Microsoft Research supervisor: Cedric Fournet

Summary: Side-channel attacks break cryptosystems by exploiting signals that are unwittingly emitted by their implementations. Many defence mechanisms rely on the context in which a cryptographic primitive is used, that is, the protocol. In the course of this project, we will devise techniques that enable reasoning about side-channel leakage in cryptographic protocols. The promise of our approach is to achieve high degrees of security and performance at the same time. To this end, we will tackle two open challenges: first, how to do compositional reasoning about leakage and its aggregation; second, how to embed low-level binary analysis into this compositional context.

Computation During Development: Characterising the Molecular Programs that Underlie Pluripotency and Differentiation in Embryoni

Student: Pedro da Silva | University of Cambridge, UK
Supervisor: Brian Hendrich
Microsoft Research supervisor: Stephen Emmott

Summary: Embryonic stem cells are a unique type of cell derived from early mammalian embryos, which possess the ability to self-renew and to differentiate into all somatic lineages—a characteristic known as pluripotency. This potential makes them an attractive prospect for regenerative medicine, as well as a useful model for understanding mammalian development. A challenge remains to uncover the processes that underlie cell fate determination and how to translate this information into a predictive explanation of behaviour. Such challenges necessitate new approaches, and in particular, novel computational methods to uncover the information processing that is performed throughout cellular decision-making. With this PhD project we propose to combine both state-of-the-art experimental and computational methods to yield new insight into a fundamental biological question, and to impact significantly how research is conducted within the field of stem cell biology.

Private Computation as a Service

Student: Raphael Toledo | University College London, UK
Supervisor: George Danezis
Microsoft Research supervisor: Cedric Fournet

Summary: The objective of this project is to produce tools and services that can be used by software engineers to build and deploy systems using private computation techniques without any need for them to understand the underlying cryptology, mathematics or theory. This represents a serious research problem on multiple fronts: Special languages or subsets of high-level languages are necessary to express computations to be performed on private data as simple programs. Those have to be automatically transformed into the form necessary to be run privately; Secondly, the architecture required to support private computations will be different from traditional cloud storage and compute nodes, and has to be designed with security in mind; Finally, there is a need to provide systems’ support for a number of private computation techniques that require both high level code and / or private data to securely migrate across machines and trust boundaries.

Computational Algorithms as Biological Regulatory Networks

Student: Rosa Hernansaiz | King’s College London, UK
Supervisor: Attila Csikasz-Nagy
Microsoft Research supervisor: Luca Cardelli

Summary: Evolution selected for biological network designs that are capable of fast and proper responses to inputs. Computing is also designed to handle tasks in a fast and precise way. The efficiency of biological systems is often copied when designing biologically inspired tools (evolutionary algorithms, machine learning, and so forth). The supervisor of this project, in collaboration with Luca Cardelli, investigated the similarities between a cell cycle regulatory switch and the Approximate Majority (AM) algorithm of distributed computing. The AM algorithm computes the majority of two finite populations and the cell cycle switch ensures that cells divide only after DNA replication is finished. The functions of the two switches differ but their dynamical behaviour is similar. The project goes further to investigate how biological regulatory systems can be converted to computational algorithms and how algorithms can provide us hints about non-clearly understood biological systems.

Understanding the Moving Quadruped: Computer Vision to Advance Science, Medicine, and Veterinary Care

Student: Stephan Garbin | University College London, UK
Supervisor: Gabriel Brostow
Microsoft Research supervisor: Jamie Shotton

Summary: Driven by specific scientific questions, we will build new algorithms and devices to track the body and feet of mice, horses, and other quadrupeds that fall between these two extremes. On the biological side, we aim to investigate how body morphology and the ultimate constraints of stability, energetic cost, and dexterity shape animal gait. On the clinical side, this studentship will make essential contributions to applying the developed techniques across species with the potential for large welfare and economic benefits. This project offers a rare opportunity to bring together top experts in vision and biology just when their respective strengths are ready to be really challenged by the broader scientific community.

Understanding Flows of Personal Information in a Connected World

Student: Stephan Kollmann | University of Cambridge, UK
Supervisor: Alastair Beresford
Microsoft Research supervisor: Natasa Milic-Frayling

Summary: Applications today consume, synchronise and share personal information between a multitude of computers, often owned and controlled by different entities. Therefore personal information, once created, travels across a myriad of devices and machines, often geographically distributed across continents. Given this, the central research question we wish to address in this project is: when carrying out a task, how does the user know what personal data is involved, how it is shared, and what the privacy trade-offs are in competing products? Our approach combines together technical analysis, to understand where personal information flows today, and lab experiments, to understand what humans believe is going on. Our aim is to improve our understanding of the flow of personal information in modern distributed applications, and to develop methods of sharing our understanding with privacy experts, developers and application users.

3D World: Creation, Abstraction, and Application of Massive Crowd-sourced Collections of Heterogeneous 3D Models

Student: Tuanfeng Yang Wang | University College London, UK
Supervisor: Niloy Mitra
Microsoft Research supervisors: Shahram Izadi, Pushmeet Kohli

Summary: Fast, portable, and cheap sensors for 3D acquisition (for example, Microsoft Kinect, other RGBD scanners) are becoming commonplace in the technology marketplace. This will invariably lead to the creation of massive repositories of unstructured acquisitions (for example, depth scans of indoor environments) collected by users such as Kinect@home, similar to what we witnessed for images (for example, Flickr, Picasa, and so forth).
A Simple System for Morphogenetic Engineering

Student: Anton Igorevich Kan | University of Cambridge
Supervisor: Dr. Jim Hasseloff | University of Cambridge
Microsoft Research supervisor: Andrew Phillips

Summary: Biological self-organisation is a complex and emergent process, whereby many interacting cells robustly grow and differentiate into a whole organism. This project aims to engineer a simple system capable of programmable morphogenesis. By using a simplified system, we have a high level of understanding and control over the cells themselves and their interactions. This project will focus on altering the morphology of surface growing Escherichia coli colonies, since they are relatively well understood and easy to manipulate. In order to change the morphology of the colony in vivo, the genetic makeup of the bacteria will be manipulated in order to regulate cellular shape, adhesion and growth rates. The altered biophysical interactions between cells will in turn alter colony morphology, and such colonies shall be imaged with high-resolution microscopy and characterised through the use of image analysis to quantify the interactions within colonies. Computational models of the bacterial colonies, that take into account the physical and genetic aspects of cells, are used to further understand the details of the mechanisms involved that are not amenable to experimentation. Through the modelling of the development of particular morphologies, the models can also serve as tools to design biological architecture.

Relational Reasoning for Programs using Higher-Order Store

Student: Aleš Bizjak | Aarhus Universitet, Denmark
Supervisor: Prof Lars Birkedal | Aarhus Universitet
Microsoft Research supervisors: Nick Benton and Andrew Kennedy

Summary: The research project is concerned with developing models and reasoning techniques for programming languages with a combination of sophisticated features, such as higher-order functions, impredicative polymorphism, and general references. Models of such languages allow us to precisely state and prove properties of programs and provide criteria for judging, and methods for proving, correctness of language implementations, i.e. compilers and interpreters.

Investigation of a Radically New Energy Technology Based Upon Programmable Artificial Photosynthesis

Student: Alexander Fokas | University of Cambridge, UK
Supervisor: Alex Chin | University of Cambridge
Microsoft Research supervisor: Stephen Emmott

Summary: There is an urgent need to find alternative (non-carbon-based) sources of energy to both meet a threefold increase in energy demand this century, and to mitigate climate change effects. As a consequence, the concept of Artificial Photosynthesis is rapidly gaining momentum; the prevailing research focus being on the development of solid-state materials able to split water and extract the resulting hydrogen as energy. We propose a radically new, conceptually, form of Artificial Photosynthesis based on photosynthesis as programmable computation, implementable through Living Software on a ‘metabolic material’, to harvest light and convert it to directly usable energy. The project is highly innovative, and is the first ever of its kind, focused at the intersection of twenty-first century computing (programming biology) and a key twenty-first century societal challenge—solving the energy crisis.

Future File Systems: Mechanized Specification, Validation, Implementation and Verification of File Systems

Student: Andrea Giugliano | University of Leicester, UK
Supervisor: Tom Ridge | University of Leicester
Microsoft Research supervisor: Andrew Kennedy

Summary: We aim to produce a formal specification of file system behaviour, and a verified file system implementation. The specification will be validated against existing real-world implementations, to ensure that it is reasonable. The implementation will be verified correct according to the specification, and the proof mechanized in the HOL4 theorem prover. This would result in a file system implementation with exceptionally strong reliability guarantees. The specification and implementation have many applications. For example, the specification could be used as a foundation for the further verification of applications that use a file system, such as databases and persistent message queues. The implementation could be used as a key component in systems with high reliability requirements, such as NASA space exploration missions. A verified file system would also be a key component of a fully-fledged verified operating system, such as those currently being developed at Microsoft and elsewhere.

Compositional Verification of Scalable Joins by Protocol-Based Refinement

Student: David Swasey | Max-Planck Institute (Software Systems), Germany
Supervisor: Derek Dreyer | Max Planck Institute for Software Systems (MPI-SWS)
Microsoft Research supervisor: Nick Benton

Summary: Fournet and Gonthier’s “join calculus” supports a powerful combination of message-passing concurrency and declarative atomicity, thus enabling the convenient encoding of higher-level synchronization abstractions. Turon and Russo have recently developed an efficient implementation of the C# library for joins that scales well to multicore architectures. In this project, we aim to verify a realistic software stack for scalable joins, including both Turon and Russo’s implementation and a representative suite of joins-based client code that depends on it. Our hypothesis is that a feasible and effective way to achieve this goal is by using contextual refinement to isolate the verification of the implementation from the verification of the client code. Toward this end, we propose to use recent advances in the technology of “Kripke logical relations” in order to formalize the “protocols” that govern the use of private state in both implementation and client code.

Solving the Problem of Cascading Costs: Better Approximate Bayesian Inference for Data Pipelines

Student: Georgios Papamakarios | University of Edinburgh, UK
Supervisor: Iain Murray
Microsoft Research supervisor: John Winn

Summary: Bayesian statistical methods give state-of-the-art performance in many machine learning applications. However, full joint Bayesian inference is rarely used in large-scale applications with several stages of data processing. Early stages of a pipeline are often seen as “pre-processing”, processes outside of a statistical model, and model criticism is usually also a separate manual stage. We aim to provide tools that allow more steps of processing to be routinely part of probabilistic analyses. We hypothesise that new Monte Carlo fitting methods will give practical tools to achieve this aim, resulting in more accurate and trustworthy results for a variety of data-processing tasks involving pipelines. Successful proof-of-concepts will be developed into general tools, where possible as extensions of Microsoft Research’s Infer.NET project.

Intuitive and Efficient Design of Enclosures for .NET Gadgeteer

Student: Kapil Dev | Lancaster University, UK
Supervisor: Manfred Lau | University of Lancaster
Microsoft Research supervisor: Nicolas Villar

Summary: The Gadgeteer framework provides a platform for designing and fabricating electronic devices with a set of small electronic components. The current approach for modelling an enclosure (or physical case) for such a device is with Solidworks, and this process typically takes hours. We propose to develop a set of easy-to-use 3D modelling and digital fabrication tools that novices can use to design an enclosure in the order of minutes. We propose to explore various methods to build these tools, including: a 3D hand gestures and augmented reality interface, a 3D component-centric interface, and grammar-based representations and algorithms. We will take a user-oriented approach to evaluate our interactive tools.

Weakness as a Virtue

Student: Kareem Khazem | University College London, UK
Supervisor: Jade Alglave
Microsoft Research supervisor: Byron Cook

Summary: The number of interleavings that a concurrent program can have is typically identified as the root cause of the difficulty of automatic analysis of concurrent software. Weak memory, as implemented by modern multiprocessors such as Intel x86, IBM Power and ARM, is generally believed to make this problem even harder. On the contrary, I believe that weakness can be a virtue. The hypothesis of this proposal is that by embracing rather than fleeing from weak memory, we can obtain efficient verification techniques. We will experiment with this hypothesis by rejecting total orders when modelling the executions of concurrent programs. Following the partial order semantics tradition, we will model executions with partial orders. I believe that these models have a great potential for practical verification, which has not been fully realised yet.

Learning to Index

Student: Marco Fraccaro | Technical University of Denmark
Supervisor: Ole Winther | Technical University of Denmark
Microsoft Research supervisor: Ulrich Paquet

Summary: Algorithmic scalability is important for “Big Data” Machine Learning. However, an oft-neglected axis is that of the scalability of decision-making. For matrix factorization-based recommender systems, it is a linear in the number of objects. Under a strict time budget with millions of objects, sublinear retrieval algorithms are required for true scalability. Fast tree structures exist for O(log n) retrieval when the triangle inequality holds between objects, or when the objects are inherently indexable. However, current algorithms are not principally designed to operate within a sublinear retrieval criterion. Models can be specified such that it is forced to learn a good index over the data: This proposal investigates how a “matrix factorization” recommender can learn constrained parameters that are amenable to O(log n) indexing. This setting of “learning to index” is of cardinal importance to large-scale online retrieval, which is hampered by the lack of indexable solutions.

Bayesian Probabilistic Programming for Security

Student: Martin Szymczak | University of Edinburgh
Supervisor: David Aspinall
Microsoft Research supervisor: Andy Gordon

Summary: Our hypothesis is that a probabilistic programming language will enable us to formulate computer security problems as Bayesian inference problems and to solve them using state-of-the-art inference algorithms. We want to investigate this with a PhD project using the language Fun invented at Microsoft Research. The project will code existing and new security analyses based on a range of Bayesian methods, such as those used from evaluation/attack of systems, estimating trustworthiness of network participants, measuring information hiding, anomaly detection, and forensics.

Experimental and Computational Studies on the Human Antigen-Specific T Cell Repertoire

Student: Mattia Cinelli | University College London
Supervisor: Benny Chain
Microsoft Research supervisor: Neil Dalchau

Summary: The adaptive immune response is defined by clonal expansion of individual lymphocytes in response to antigen. However, the number of different clones responding to any given antigen, and the relationship between clonal frequency and antigen specificity and affinity remain unknown. In this project, we use high throughput sequencing to determine the repertoire of T cells responding to specific antigens. We will develop machine learning algorithms which will be used to classify these T cell receptors, and distinguish between antigen specific and antigen non-specific receptors. The project will also work with Microsoft Research to integrate the frequency data obtained with deterministic models of T cell activation, thus contributing to the overall aim of building a multi-scale model of the evolution of an antigen specific immune response.

3D Reconstruction of Live Scenes Using Multiple Heterogeneous Mobile Depth Cameras

Student: Moos Hueting | University College London
Supervisor: Jan Kautz
Microsoft Research supervisor: Shahram Izadi

Summary: Imagine an event like a concert, a play or even a formal ball, that is taking place at a distant location. Given its location, you cannot attend in person but you would still like to view and explore the event as if you were attending. Thus, you should be able to roam around freely and explore the event live from any viewpoint. Recent advances in camera technology, in particular depth-sensing (Kinect) and stereo-capable cameras (e.g., some mobile phones) from which depth can be derived, can aid in this context. We propose to use a heterogeneous network of Kinect cameras and stereo-capable mobile devices, where the Kinect cameras would be statically mounted and the 3D mobile devices would be worn / carried by attendees of the event. This proposed setup allows us to collect rich colour and depth data from many viewpoints. Given this input, we believe it possible to reconstruct enough information to facilitate inspecting the scene from any viewpoint at any time instance.

3D Smart Memory for Scale-Out Servers

Student: Nooshin Sadat Mirzadeh | École Polytechnique Fédérale de Lausanne (ÉPFL)
Supervisor: Babak Falsafi | EPFL
Microsoft Research supervisor: Dushyanth Narayanan

Summary: With the end of Dennard scaling on the horizon and the emergence of Big Data and the growing demands for processing, communication and storing massive data, 3D-stacked IC promises to be viable approach to scalability in servers. While 3D-stacked RAM is already available as a product, there a number of fundamental challenges related to power and thermal density, precluding the use of stacking of server grade logic dies on top of DRAM. Moreover, while the recently proposed scale-out processor architectures based on efficient mobile cores dramatically improve performance density and throughput in servers (e.g., by 10x), they do not alleviate the power density and thermal problems in stacking as compared to conventional server dies because they operate at a commensurate power budget. In this proposal, we will investigate logic die organizations using extremely low-power cores that would readily be stackable on existing DRAM stacks with available thermal and power density budgets.

All-Pay Auctions in the Real World

Student: Omer Lev | Hebrew University
Supervisor: Jeffrey Rosenschein | Hebrew University
Microsoft Research supervisor: Yoram Bachrach

Summary: In the past decade, the field of algorithmic game theory (AGT) has grown considerably, both with respect to advances in research, as well as a significant increase in its real-world application. The use of AGT includes the utilization of game-theoretic concepts (such as maximizing auction revenue and making voting less easily manipulated) in various Internet scenarios. One of the major areas that has developed significantly is the involvement of a large number of individuals in working, as freelancers, on specific problems or tasks, i.e., crowdsourcing. Interest in crowdsourcing has rekindled research into all-pay auctions, which can help in modeling these interactions. We propose exploring various novel aspects of all-pay auctions, such as the possibility of mergers and collusions, and the use of all-pay auctions in real-world mechanisms.

Performance Portability for Large-Scale Heterogeneous Systems

Student: Paul-Jules Micolet | University of Edinburgh
Supervisor: Christophe Dubach
Microsoft Research supervisor: Ant Rowstron

Summary: Computing systems have become increasingly complex and hard to program. This is especially true for data centres since these systems have become more heterogeneous with the recent availability of GPUs. As a result, achieving high performance for such large-scale systems is an extremely challenging task. This problem is further exacerbated with each new hardware generation, which means software written and tuned for today’s systems will need to be adapted frequently to keep pace with ever changing hardware. This proposal is an attempt to tackle the inherent complexity of computing systems by: (1) abstracting away the details of these systems using a hierarchical hardware description, (2) providing a higher-level programming model that does not expose any specific hardware features, and (3) automatically tuning the software to the hardware using statistical techniques. The key idea is that software should be written only once and automatically tune itself for the underlying system.

First-Order Satisfiability Modulo Theories

Student: Peter Backeman | Uppsala University
Supervisor: Philipp Ruemmer | Uppsala University
Microsoft Research supervisor: Christophe Wintersteiger

Summary: SAT and SMT solvers form the backbone of many of today’s verification systems, responsible for discharging verification conditions that encode correctness properties of hardware or software designs. SMT solvers are essentially ground decision procedures, in the sense that quantifier treatment is implemented by means of instantiation on top of a procedure for quantifier-free reasoning. For verification and other applications, efficient handling of quantifiers has been identified as one of the main challenges in the development of solvers. This proposal is about lifting the SMT paradigm to the first-order level, by including quantifiers as first-class citizens in solvers.

Exploiting Mobile Sensing and Geo-social information in Mobile Recommendation Systems

Student: Petko Georgiev | University of Cambridge
Supervisor: Cecilia Mascolo | University of Cambridge
Microsoft Research supervisor: Natasa Milic-Frayling

Summary: In this proposal we extend location based recommendation by bridging the gap between mobile sensing, location- based and on-line systems. We aim to address the problem of collecting user contextual information in real time from sensor equipped smartphones and integrate that with social and geographical information, also gathered through the interaction of phones with the mobile web, in order to offer optimal content on-line and geographic suggestions to users. The technical challenges of providing high quality recommendations are:
1. The ability to accurately detect the user context in presence of energy limitations for phone sensing
2. The integration of the most effective spatio-temporal data mining features in an environment of sparse, very large datasets
3. The urban sensing of events in the city by observing large streams of online data for the detection of ongoing events
4. The offering of online recommendations using real time back end computation of current user input
Understanding the Dynamics of Embryonic Stem Cells Differentiation: A Combined Experimental and Modeling

Student: Stanley Strawbridge | University of Cambridge
Supervisor: Graziano Martello | University of Cambridge
Microsoft Research supervisor: Hillel Kugler

Summary: Pluripotency is a key feature of embryonic stem (ES) cells, and is defined as the ability to form all cell lineages present in the adult body (Smith, 2006). Pluripotent ES cells are thus potentially a great biomedical resource and, at the same time, a unique and intellectually fascinating cell type. For these reasons ES cells have been extensively investigated in the past three decades, leading to a comprehensive understanding of the signals maintaining pluripotency and of the transcription factors that constitute the gene regulatory network (GRN) of ES cells. Surprisingly, very little is known about how ES cells exit the pluripotent state and start to differentiate. Our aim is to understand the molecular events associated with this transition and its dynamics.

Probabilistic Databases of Multimodal and Universal Schema

Student: Tim Rocktaschel | University College London
Supervisor: Sebastian Riedel
Microsoft Research supervisor: Thore Graepel

Summary: Databases can readily return answers explicitly stored in their tables, but it is difficult to query the implicit information captured in the database. To overcome this problem, we propose to develop probabilistic databases with relations corresponding to heterogeneous and multimodal content (number of Starbucks per km2), and relations corresponding to natural language surface patterns (“is gentrified”) appearing in text. The databases will automatically learn correlations between the multimodal and the surface pattern relations, and hence induce (some notion of) the meaning of surface patterns, grounded both with respect to other surface patterns, and to the databases’ multi-modal content. Querying implicit content then amounts to using the surface pattern relations in a query. Our approach to realize such databases will be based on latent feature models and matrix completion methods, and will require innovation in the intersection of IE, Semantics, Machine Learning and Databases.

Natural User Interfaces for the Developing World

Student: Thomas Reitmaier | University of Cape Town
Supervisor: Prof Edwin Blake
Microsoft Research supervisor: Richard Harper

Summary: The goal of this research is to investigate the design and development of interfaces that are ‘natural’ for those living in the developing world. Natural user interfaces (NUI) hold out the promise of revolutionising the experience and use of computers. To date, the bulk of research in this area has explored natural interfaces for the office and home of the twenty-first century in the developed world. Much less research attention has been given to creating such interfaces for the developing world. This is because innovation in this setting requires a much broader approach to create something that is both practical to design and build, and which deals with the particular constraints of the developing world.

Provenance for Configuration Language Security

Student: Weili Fu | University of Edinburgh
Supervisor: James Cheney
Microsoft Research supervisor: Dimitrios Vytiniotis

Summary: Declarative, high-level configuration languages (e.g., LCFG, Puppet, Chef) are widely used in industry to configure large system installations. Configurations are often composed from distributed source files managed by many different users within different system and organisational boundaries. Users may make changes whose consequences are not easy to understand, and such systems also currently lack mature security access controls; the few currently available techniques have idiosyncratic behaviour and offer no formal guarantees. In the worst case, misconfiguration can lead to costly system failures; because of the complexity of the configuration build, it is difficult to recover from failures, trace the source of the error or identify the responsible party. In this project, we will explore the application of provenance techniques (originally developed in the context of databases) to establishing well-founded and effective techniques for security and audit for configuration languages.

Expert Visual Classification with Thousands of Categories

Student: Yani Ioannou | University of Bath
Supervisor: Matthew Brown
Microsoft Research supervisor: Antonio Criminisi

Summary: Most work in visual classification is aimed at emulating, rather than augmenting, human recognition performance. This is partially attributable to the current paradigm in object recognition, which involves non-expert human labelling of a large number of training images, followed by supervised classification. This approach works well for small numbers of visually sparse classes, but fails to replicate the abilities of experts in more technical domains, such as medical image understanding. Furthermore, state-of-the-art techniques fail to scale effectively to large numbers of visual categories, which are commonplace in medical or biological taxonomies. In this project we will develop general techniques for scalable, expert visual classifiers that augment the capabilities of a typical person, aiming to furnish them with the visual insight of a botanist, architect or doctor.
LumiConSense

Student: Alexander Koppelhuber | Johannes Kepler University, Austria
Supervisor: Oliver Bimber
Microsoft Research supervisors: Shahram Izadi, Otmar Hilliges

Summary: We present a first attempt towards a fully transparent, flexible, scalable, and disposable image sensor. Our approach is based on thin-film luminescent concentrator waveguides. These are polymer films doped with fluorescent dyes that absorb light of a specific wavelength, re-emit it at a longer wavelength, and transport it by total internal reflection towards the edges of the film. They are normally used for increasing the efficiency of solar cells. By cutting the edges with a specific pattern, we force the light-transport within the film into a two-dimensional light-field. This enables the reconstruction of images that are focused on the film. Following initial simulations, we want to gain a deeper understanding in the physics and mathematics of our imaging approach, which possibly leads to practical software and hardware prototypes that enable the implementation of novel interaction and sensing applications.

Machine Learning Methods for Formal Dynamical Systems: a Systems Biology Case Study

Student: Anastasis Georgoulas | University of Edinburgh, UK
Supervisors: Jane Hillston, Guido Sanguinetti
Microsoft Research supervisors: Luca Cardelli, Andrew Phillips

Summary: Techniques from computer science are having a profound effect on computational science. In the context of systems biology and representation of intracellular processes, there have previously been two powerful but distinct approaches. One, based on formal model description techniques, has developed languages and associated analysis techniques to capture the (global) dynamic behaviour of biochemical processes. The other relies on more conventional differential/difference equation representation of systems but uses advanced machine learning techniques to incorporate observations and uncertainty into representations of (local) behaviour which can be verified experimentally. The objective of this project is to investigate the amalgamation of these techniques to design a formal modelling language that incorporates observations and uncertainty, and inference algorithms to allow the use of this additional information to improve the interrogation of behaviour using model checking algorithms.

Virtualization and High-Productivity for Many-Cores

Student: Andrey Rodchenko | University of Manchester, UK
Supervisor: Mikel Lujan
Microsoft Research supervisor: Tim Harris

Summary: The goal of this project is to investigate how to implement and understand the trade-offs of managed runtime environments for future computer systems, where each chip can integrate thousands of processing cores.

Automated Design of Revenue-Maximizing Ad Auctions

Student: Argyrios Deligkas | University of Liverpool, UK
Supervisor: Mingyu Guo
Microsoft Research supervisors: Yoram Bachrach, Peter Key

Summary: Similar to combinatorial auctions, an Ad auction is a set of rules that allocate different items (impressions of different types) to different bidders (advertisers). What differentiates Ad auction design from (classic) combinatorial auction design is that, as pointed out [in Emek et al. 2011], there exists information asymmetry in Ad auctions: the website knows the types of the impressions, but the advertisers do not have this information. It is possible to exploit this information asymmetry to achieve higher revenue, for example, by hiding the gender information of a male impression, and sell it as a unisex impression. The proposed project will focus on automated design of revenue-maximizing Ad auctions that exploit information asymmetry. We hope to answer two questions:
1. How can the website determine what information to hide?
2. Instead of hiding, can website sell information for profit?
Passive personality assessment: a psychometric and machine learning approach

Student: Arman Idani | University of Cambridge, UK
Supervisor: Professor John Rust
Microsoft Research supervisor: Pushmeet Kohli

Summary: The goal of this project is to explore the potential of personality assessment methods that are based on records of individual behaviour and preferences recorded in the online environment. Such inferred personality could be used to customize the user’s experience in the context of web-searching, shopping, and the behaviour of their computers and systems. We aim to:
- Measure the personality of groups and individuals “passively” and non-intrusively
- Enhance personality theory in the light of the results of this study
- Build models linking behaviour and preferences to personality
Inferring From Integrity Constraints Through JavaScript Analysis

Student: Ben Spencer | University of Oxford, UK
Supervisor: Michael Benedikt
Microsoft Research supervisor: Matthew Parkinson

Summary: In this project we will deal with improving the process of getting data from the “hidden web”. We will perform static analysis of the JavaScript code that is attached to web forms to discover their access restrictions and integrity constraints. These in turn can be used to optimize search tools that explore hidden web content—by avoiding useless accesses that violate the constraints or restrictions. The proposal will extend the state-of-the-art in both web information management and static analysis.

Holistic Evaluation in LINQ

Student: Fabian Nagel | University of Edinburgh, UK
Supervisor: Stratis Viglas
Microsoft Research supervisor: Gavin Bierman

Summary: This proposal argues for the use of just-in-time compilation of LINQ queries to native code for in-memory databases. In particular, the key idea is to apply recent techniques from SQL code generation in the context of LINQ and the .NET runtime. This will result in having yet another way to evaluate LINQ queries, but one that has the potential of better exploiting the hardware and software capabilities of the underlying platform. This work will improve LINQ by (a) making the implementation of LINQ for .NET objects a lot more efficient by using type-specific high-performing, and hardware-optimised algorithms as opposed to the inefficient generic algorithms based on suboptimal nested iterations that are currently used; and (b) providing LINQ with performance that surpasses that of established relational database technology for in-memory collections.

Content-Based Relevance Estimation on the Web

Student: Fiana Raiber | Technion, Israel
Supervisor: Oren Kurland
Microsoft Research supervisors: Filip Radlinski, Milad Shokouhi

Summary: Search over the web is a difficult challenge due to the noisy and adversarial nature of documents’ content, among other reasons. Web-search retrieval approaches address this challenge by utilizing non-content-based relevance indicators (for example, those based on hyperlink and click-based information) and detecting documents that are considered, in a query-independent fashion, of very low quality (for example, spam). We plan to devise novel content-based relevance estimation approaches that address the noisy and adversarial nature of the web. As a document is deemed relevant to the information need expressed by a query if its content satisfies the need, improved content-based relevance estimation can potentially help to significantly improve overall relevance estimation.

Developing Novel Computational Methods to Describe and Predict Human Behaviour in Earth System Models

Student: Jelte Mense | University of Edinburgh, UK
Supervisor: Paul Palmer
Microsoft Research supervisor: Drew Purves

Summary: We describe a PhD project that will fundamentally improve understanding of how humans will respond to changing climate and associated environmental factors. We present two interrelated projects:
1. We will develop a model of the relationship between the changing climate and conflict, including demographic transitions, and how it is affected by the outbreak and spread of disease
2. We will develop a model of climate-related migration, borrowing ideas from behavioural ecology, to look at how racial tension and bounded rationality might affect how communities eventually migrate.
For both projects, an emphasis will be on these predictive models reproducing observed socio-economic metrics, largely provided by the United Nations, so that we develop confidence before we apply them to future climate scenarios.

Machine Learning Markets

Student: Jinli Hu | University of Edinburgh, UK
Supervisor: Amos Storkey
Microsoft Research supervisors: Peter Key, Thore Graepel

Summary: We propose to develop market systems for solving large scale machine learning (ML) problems through Machine Learning Markets (Storkey 2011). We will investigate different mechanism design procedures, and processes for selling derived information. This will involve generating techniques for developing and building combinatorial prediction markets, and developing improved mechanisms for markets and auctions via the use of machine learning techniques. We will look at applications in the area of learning the conditional probabilities associated with other human markets or the actions of human agents.

Systematic Assessment of Uncertainty in Couples Carbon-Nitrogen Cycle Models and their Climate Feedbacks

Student: Johannes Meyerholt | Max-Planck Institute for Biogeochemistry, Germany
Supervisor: Sönke Zaehle
Microsoft Research supervisor: Matthew Smith

Summary: The new generation of land carbon-nitrogen-cycle models show that nitrogen feedbacks attenuate the responses of the carbon cycle to perturbations, thereby affecting long-term projections of climate change. The magnitude of this effect is very different between the models, leading to considerable uncertainty in projected rates of climate change. This project seeks to better understand and quantify this uncertainty by systematically assessing alternative model components in a common framework. Key observations of global carbon-nitrogen cycling will be used to evaluate competing process formulations. The thoroughly examined set of model components, linked in a common global modelling framework, will be used to make ensemble projections of the effects of future global change on terrestrial feedbacks to the climate system, systematically assessing uncertainty in these projections stemming from uncertainty in both parametric and process-formulation of global carbon-nitrogen cycle modeling.

Laura Parshotam (opens in new tab)

University College London, UK Research title: Dynamic Modelling of HIV Recognition by the Immune System Supervisor: Peter Coveney, Microsoft Research supervisor: Neil Dalchau Summary: The immune system is one of the most complex biological sub-systems within a single individual, involving the organised interaction of a vast array of molecular species both intra- and extra-cellularly. A fundamental component of the immune system is the ability of cytotoxic T lymphocytes (CTLs) to recognise and then destroy virus-infected or cancerous cells thus preventing disease progression. The overall aim of this proposal is to develop an accurate theoretical framework that combines models of the HIV virus lifecycle and peptide processing and presentation, in order to predict the dynamics of the CTL response in the host.

Miltiadis Allamanis (opens in new tab)

University of Edinburgh, UK Research title: Statistical Language Processing for Programming Language Text Supervisor: Charles Sutton, Microsoft Research supervisors: Andy Gordon, Thore Graepel Summary: Complex software systems involve many components and many external libraries, which create a large demand on programmer time and attention. In this project, we envision a new class of development tool, called data-driven development tools, to ease this burden. The idea is to start with a massive corpus of code from other projects (for example, of 1 billion lines of code) and apply tools from machine learning and natural language processing (NLP) to find syntactic patterns that programmers use often. We will do this by using an idea from NLP called a statistical language model, which is simply a probability distribution over strings, for example, over all possible C# files. The main goal of the studentship will be to build a statistical language model for programming language text. Doing this will require new technology on the NLP side as well. This would enable many applications, such as a syntax-based IntelliSense, which could recommend entire code snippets to a developer.

Steven Woodhouse

University of Cambridge, UK Research title: Development of an Executable Model Encapsulating Blood Cell Development from Pluripotent Embryonic Stem Cells Supervisor: Berthold Gottgens, Microsoft Research supervisor: Jasmin Fisher Summary: Blood cell development has long stood as a paradigm for stem cell and cancer research. This proposal is designed to explore the potential of computational modelling to advance our understanding of normal and malignant blood cells, based on the following overall hypothesis: Executable models of complex biological systems such as blood development can encapsulate current experimental knowledge as well as provide a powerful platform for hypothesis generation to investigate the biology of both normal and malignant blood cells. Expected outcomes of this research include:
1. An executable model for blood development, to form the basis for similar models for other organs.
2. New hypotheses about the consequences of leukaemia-associated mutations, with the possibility to model counter-balancing interventions as candidate therapies.
3. Further development of software tools for modelling and analysis of biological systems in the Fisher lab at Microsoft Research in Cambridge.
Thomas Dykes

Northumbria University, UK Research title: Supporting a “Sense of Home” in Care Homes: an Exploration of Digital Design with People Living with Dementia Supervisor: Jayne Wallace, Microsoft Research supervisors: Tim Regan, Siân Lindley Summary: Dementia and the context of life for people living with dementia has become an increasingly important topic in human-computer interaction (HCI) and design over recent years. Dementia has a profound effect globally and on each individual living with the disease. This PhD project seeks to explore creative ways to support a sense of home for people living with dementia in care homes through the creation of innovative digital artefacts. A creative methodology will be developed that has the person with dementia at the heart of the research and design process. This project will add a novel angle to current work in the fields of HCI and design by centering on the reconstruction of home, on residents’ agentive and creative power, and about considerations of the care home as a new phase of life.

Tomasz Kuchta

Imperial College London, UK Research title: Incremental and Adaptive Symbolic Execution Supervisor: Cristian Cadar, Microsoft Research supervisor: Miguel Castro Summary: Symbolic execution is a software testing technique that has gained attention in recent years, due to its ability to systematically explore paths through a program and find deep errors on these paths. Symbolic execution has been successfully applied to a variety of software, but still faces important scalability challenges, such as navigating through the huge execution space of real applications, and handling expensive constraint solving queries. In this project, we plan to address some of these challenges by focusing symbolic execution on code changes (such as program patches) by devising incremental and adaptive symbolic execution techniques that can reuse the result of previous analyses as well as dynamically react to changes in the complexity of the analysis. Applications of these techniques could include high-coverage testing of code changes, reasoning about differences between a patched and an unpatched version of a program, and synthesizing various types of code fragments.

Vu Khac Ky

École Polytechnique, France Research title: Efficient Approximations for Fast Simulations: Application to Building Designs Supervisor: Leo Liberti, Microsoft Research supervisor: Youssef Hamadi Summary: Smart buildings integrate architecture, construction, technology, and energy systems; they make use of building automation, safety and telecommunication devices, and they are managed automatically or semi-automatically on the basis of local information provided by a sensor network. The functioning of such a complex system necessarily depends on several tunable parameters, with respect to which the whole system can optimized as concerns several objectives (cost, energy efficiency, ambience comfort, and so on). For any given parameter value, system performance can only be evaluated by a computationally costly simulation procedure. The object of this PhD thesis is to devise new methodologies for optimizing smart building systems under such computational constraints.
Aleš Bizjak

IT University of Copenhagen, Denmark Research title: Relational Reasoning for Programs Using Higher-Order Store Supervisor: Lars Birkedal, Microsoft Research supervisors: Nick Benton, Andrew Kennedy Summary: The aim of this project is to research and develop relational reasoning techniques for program analyses specified via refined type systems. The reasoning techniques will be used to prove the correctness of program transformations based on the static analyses. The project will focus on programs written in programming languages with features found in modern advanced programming languages, in particular higher-order store and generics.

Alexey Bakhirkin

University of Leicester Research title: Bottom-up shape analysis with 3-valued logic Supervisor: Nir Piterman, Microsoft Research supervisors: Josh Berdine Summary: Shape analysis studies how programs that make use of dynamic memory can be analysed and how their correctness can be proved. As a starting point, we take the approach of Sagiv et al., which is based on abstract interpretation and use of 3-valued Kleene logic. They described and implemented an intraprocedural analysis answering the following question: “Given a small program fragment and its possible inputs (possible values of the variables and contents of the heap at the initial program point), what heaps can arise at other program points?” We aim to extend the approach with the ability to perform bottom-up analysis answering another question: “Given a small program fragment, what inputs make it execute without failure (terminate successfully or run forever)?” In solving this problem, we can reuse some parts of the previous work, but we also need to develop a number of algorithms that were not previously described.

Anja Thieme

Newcastle University, United Kingdom Research title: Support of Mental Well-Being Through Technology Supervisor: Patrick Olivier, Microsoft Research supervisor: Siân Lindley Summary: Cognitive behavioural therapy (CBT) is widely used by therapists to treat depression and anxiety and has been the subject of considerable empirical study, validation and clinical application. The goal of this PhD project is to design and evaluated Interactive Therapeutic Artifacts (ITAs) to support such therapeutic processes for depression. In following an experience-centered approach and combining findings of the fields of psychology, design and digital technology, it is hoped to increase individuals’ engagement in a therapeutic treatment, and thereby its efficacy and success. To empower users to interact with the treatment and to further enrich individuals’ social and emotional lives, form and function of the ITAs will be carefully considered to create and frame a very personal context for individual’s self-management, self-reflection, interaction with important others and their interpretation and appropriation of the artifacts within their daily routines.

Artem Khyzha

IMDEA Software Institute Research title: Modular verification for mainstream operating systems Supervisor: Alexey Gotsman, Microsoft Research supervisor: Josh Berdine Summary: An operating system (OS) kernel is the most critical piece of software running on a computer. Mainstream OS kernels implement complicated abstractions and make extensive use of multicore parallelism; as a consequence, their code is hard to get right. Formal verification can improve their reliability, and first attempts to verify small OS kernels have been encouraging. However, OS verification still remains an extremely laborious process. To make it more effective, we need verification methods tailored to this application domain that are more modular and automatic than existing ones. The aim of the proposed research is to develop novel logics and automatic tools that can reason modularly about OS kernel components, concentrating on problematic features of mainstream kernels. If successful, the proposed research will build the foundations for developing reliable operating systems. In time, its results may feed into industrial tools for OS development and verification.

Carlo Spaccasassi (opens in new tab)

Trinity College Dublin, Ireland Research title: Language Support for Communicating Transactions Supervisor: Matthew Hennessy, Microsoft Research supervisors: Andy Gordon, Nick Benton Summary: Communicating transactions, a novel programming language construct obtained by dropping the isolation requirement from traditional transactions, can be used to model the combination of automatic rollback recovery and coordinated check pointing in distributed systems. We have developed a behavioural theory for communicating transactions in an abstract setting; the aim of the project is to develop the construct in a practical setting. Specifically, the project will extend concurrent Haskell with this construct, investigate how it can be efficiently implemented, identify useful programming idioms, study how communicating transactions can be combined effectively with software transactional memory, and develop verification techniques.

Chandrika Cycil

Dorothy Hodgkin Postgraduate Award, Brunel University, United Kingdom Research title: AutoMedia: Family Life, Interaction and Media Use in the Car Supervisor: Mark Perry, Microsoft Research supervisors: Abigail Sellen, Alex Taylor Summary: The project asks how family and home life is constituted in the car, with a particular emphasis on exploring the role and use of media. The car is a fertile environment for developing digital technologies to support driving but presents a novel site for designing human-computer interactions. Road journeys can be both stressful and enjoyable experiences, while developments in in-car media offer the potential to alleviate some of these stresses and enhance the enjoyment of travel. The role of the car in transportation and its social and physical configuration makes this setting particularly challenging as a design space. To design appropriate media, we need to understand how families inhabit their cars and the routines of social interaction that occur within them. The research investigates the car as a setting for media use in family life to develop insights for the future design of family-oriented, car-based media that are empirically grounded in examples of use.

Daniel Trejo-Banos

University of Edinburgh, United Kingdom Research title: Machine Learning in Systems Biology: Inference and Structure Learning of Plants’ Circadian Clocks Supervisor: Guido Sanguinetti, Microsoft Research supervisor: Christopher Bishop Summary: Computational systems biology is one of the fastest growing areas of research in computer science. Two of the most exciting areas have been the use and development of learning algorithms to extract information from data, and the application of formal methods to rationalise the behaviour of biological systems. The aim of this project is to develop machine learning tools which can handle inference in the continuous-time dynamical systems underlying the theoretical computer science approach to systems biology, and apply these novel methodologies to the modelling and understanding of biological clocks in plants (in collaboration with the Millar lab at Edinburgh). Besides providing novel biological and methodological insights, it is envisaged that this project will contribute to lay the foundations for cross-fertilisation between machine learning and formal methods research in systems biology, capitalising on the exceptional strength in both fields both at Edinburgh and at Microsoft Research.

Elena Tretyak

Max-Planck Institute for Computer Science, Germany Research title: Physically Motivated Scene Understanding Supervisor: Peter Gehler, Microsoft Research supervisor: Carsten Rother

Frits Dannenberg

University of Oxford, United Kingdom Research title: Automatic Verification Techniques for DNA Computing Supervisor: Marta Kwiatkowska, Microsoft Research supervisor: Andrew Phillips Summary: DNA computing is a new and fast-growing field that aims to engineer artificial computing devices using bio-molecular materials such as DNA. Potential applications for this technology include autonomous molecular processes that can diagnose and respond to diseases within living cells. It is an inherently interdisciplinary area that brings together, amongst others, molecular biology and computer science. This project proposes to develop formal verification techniques for DNA computing that can rigorously check the correctness of designs and identify flaws before they are implemented. Furthermore, quantitative verification techniques will be developed, that can analyse a wide range of quantitative properties, such as the likelihood of a flaw occurring in a DNA circuit design or the time required for a computation to execute. The goal is to develop scalable and efficient new verification techniques, built into software tools that will help automate DNA computing design processes.

Guy Golan Gueta

Tel Aviv University, Israel Research title: Enforcing Atomicity for Data Structure Manipulations Supervisor: Mooly Sagiv, Microsoft Research supervisor: Byron Cook Summary: Sophisticated data structure implementations play a key role in the performance of many software systems. Concurrency significantly increases the challenge of developing such implementations. An interesting question is how to automatically enforce atomicity and linearizabilty, while also guaranteeing other desirable properties such as: disjoint parallelism, deadlock freedom, scalability, and good practical performance. We propose to ease the task of designing concurrent data structures tailored to given applications by combining user-abstractions of the intended system behavior together with efficient compile-time and run-time techniques which enforce these abstractions on existing and future systems.

Istvan Haller

Vrije Universiteit Amsterdam, The Netherlands Research title: Security for Legacy Binaries Supervisor: Herbert Bos, Microsoft Research supervisor: Manuel Costa Summary: Security in legacy binary systems is only as strong as the weakest link. The weakest link is often some obscure, older library or program that is vulnerable to memory corruption attacks. In this project, we work towards the protection of legacy binaries against memory corruption attacks, while assuming no knowledge of the binary’s source code, no symbol tables, and no help from the software vendor. Moreover, we aim specifically at stripped C binaries for x86-based architectures. We retrofit security in such binaries by first extracting the program’s data structures (by means of dynamic analysis), and then rewriting the binary to guard all accesses to these structures (so that they do stray beyond the buffer boundaries). To do so, we will conduct research in data coverage (to exercise all data structures), reversing of compiler optimizations, and finally rewriting binaries to harden them against buffer overflows.

Ján Margeta (opens in new tab)

INRIA, France Research title: Automatic Indexation of Time-Series 4D Cardiac MR Images Supervisor: Nicholas Ayache, Microsoft Research supervisor: Antonio Criminisi Summary: This project is about searching through large databases of medical images efficiently. Specifically, given a database of 4D cardiac MR images of patients stored with an expert diagnosis, we wish to select automatically the most similar cases to the cardiac images of a new patient. Similarity must be computed as a function of image content alone so as to negate the need for expensive manual textual tagging.

Jenny Vuong

University of Reading, United Kingdom Research title: Testing 3D and View-Based Models of Human Navigation Supervisor: Andrew Glennerster, Microsoft Research supervisors: Andrew Fitzgibbon, John Winn Summary: Accurate modelling of human performance in navigation and other spatial tasks is of great interest in the field of human visual perception, and successful models are likely to influence future work in related fields such as machine vision and SLAM. This project will explore computational models for two competing hypotheses—view-based representation and 3D reconstruction—to determine which provides the best account of human performance on a navigation task. Experimental participants will carry out small-scale “homing” tasks in a state-of-the-art immersive virtual reality environment. Errors recorded under a range of conditions will be compared to predictions of (i) a view-based model in which observers try to match visual parameters in their initial and final views and (ii) a 3D reconstruction model in which observers try to match their location in a 3D reference frame. This will be the first detailed, direct test of view-based versus 3D reconstruction models in human vision.

Karthik Nilakant

University of Cambridge, United Kingdom Research title: D³N: Data Driven Declarative Networking in Dynamic Mobile Networks Supervisor: Eiko Yoneki, Microsoft Research supervisor: Christos Gkantsidis Summary: Demand on programming in dynamic mobile networks as versatile services is increasing. The intersection between networking and programming languages is an emerging active research topic in this context, and it is crucial for us to explore new programming paradigms in the networking space that allow the use of such future emerging networks. The proposed project aims at addressing these issues by introducing a declarative networking programming paradigm, i.e. Data Driven Declarative Networking (D³N), where communication resources are managed together with network connectivity. It includes provision of an expressive language for building applications for distributed computation. D³N will be implemented in a functional language that provides an intermediate abstraction between implementation detail and reasoning about logic flow. D³N implementation targets mainstream functional languages, including verification and compiler construction targeting to various platforms in mobile devices.

Ken Wang (opens in new tab)

University of Oxford, United Kingdom Research title: Spatially-Resolved Representation Of Sarcomeric Membrane Structures and Function for incorporation into Computational Cell Model Supervisor: David Gavaghan, Microsoft Research supervisors: Hillel Kugler, Neil Dalchau Summary: Computational modelling of the heart is perhaps the most mature area of systems biology. The representation of physiological and pathological behaviour of the heart critically depends on the way in which the behaviour of individual cardiac cells is modelled. Cardiac cell models are still inadequate to address many key scientific questions, in particular where related to translation of molecular level insight to the prediction of organ behaviour in health and disease. In this project we will build on ground-breaking research in the laboratories of Dr’s P. Kohl, A. Hoenger and R. Winslow, to provide geometrically-detailed nanoscopic data on heart cell structure and function. This will be used to build a full 3D spatially-resolved model of the cardiac sarcomere which will be implemented in the in-house developed Chaste package, one of the fastest open source solvers available for the partial differential equations modelling cardiac behavior.

Kumar Sharad

Dorothy Hodgkin Postgraduate Award, University of Cambridge, United Kingdom Research title: Unifying Isolation Technologies to Protect Personal Information Supervisor: Steven Murdoch, Microsoft Research supervisor: George Danezis Summary: Users of the Internet are subject to increasing levels of surveillance, harming privacy and deterring the use of the Internet for sensitive activities. There is a growing desire for tools to protect individuals, but progress in this are is hampered by fundamental gaps in knowledge. The core concept behind protecting the privacy of Internet users is “unlinkability,” i.e., making it difficult to link actions to the identity of the person who carried it out. Where some action is sufficiently distinctive such as to uniquely identify who carried it out (a pseduo-identifier) they must also be unlinkable to other actions. Systems which enforce this unlinkability are known as isolation technologies, but there is no one system which can provide adequate protection by itself. Moreover, the composition of such tools may be insecure even if individual ones are secure. This project will link such tools such that their composition is secure, though capability-based access control.

Mahmoud Awad

Queen’s University Belfast, United Kingdom Research title: AGE – Adapted Games for the Elderly Supervisor: Cathy Craig, Microsoft Research supervisors: Richard Harper, Otmar Hilliges Summary: A healthy mind in a healthy body is key to successful ageing. Can computer games actually help keep us healthy? This study will look at how working with the end user (i.e. older adults) can inform the design of new computer based games that will train the body (through the movements required to play the game) and the mind (having different cognitive tasks that require bodily movement to select the correct answer). By including older adults in the consultation and inclusive design process we will see how the latest motion based gaming technology can be used to create adapted computer based games. By profiling the action capabilities of the older user, the actions required to play the game can automatically be adjusted. This will ensure the best possible user experience for older adults with the added advantage of health benefits from exercising both the body and mind. It is hoped that the principles of game design developed from this project, can be transferred to other domains.

Martin Kiefel

Max Planck Institute for Intelligent Systems, Germany Research title: Intrinisic Image Layers for Image Editing Supervisors: Bernard Schölkopf, Peter Gehler, Microsoft Research supervisor: Carsten Rother

Michael Hornacek

Vienna University of Technology, Austria Research title: 3D Scene Completion Supervisor: Margrit Gelautz, Microsoft Research Supervisor: Carsten Rother Summary: Creating novel views from a sparse set of input images has been a long-standing goal in computer vision. Applications are vast (e.g. games, Photosynth, 3D image manipulation), especially in the light of latest hardware developments (3D TV, stereo cameras). We revisit this fundamental problem from a new perspective. A major challenge for new view synthesis is to reconstruct both the depth and the colour of those pixels which have not been observed in any of the input images. We term this problem 3D scene completion. Our hypothesis is that by extracting and modelling physically aspects of the scene, such as geometry, light and camera, it is possible to improve on competing methods which operate purely in the image space. The key insight is that the methods and models used to solve the completion task are highly dependent on the physical aspect. We further believe that a joint estimation of various scene aspects is possible and superior to solving the problems independently.

Miguel Lurgi (opens in new tab)

Instituto Ciencias del Mar, Barcelona, Spain Research title: Contrasting Assembly and Disassembly of Ecological Networks in a Changing World Supervisor: José Montoya Teran, Microsoft Research supervisors: Drew Purves, Lucas Joppa Summary: The study of species interaction networks provide important insights into how ecosystems are built up (assembly) and whether and how they may collapse (disassembly) under current global change. The relationship between assembly and disassembly processes and trajectories is little understood, with many studies assuming that the effects of species loss or gain are symmetric and interchangeable. Assembly and disassembly studies have not considered the role of evolutionary processes in shaping current ecological networks and in explaining network robustness to species loss respectively. Processes of niche specialization and differentiation producing adaptive radiations have been neglected. Our aim is twofold: first, to develop individual-based evolutionary network assembly and disassembly models to predict network patterns and the effects of biodiversity loss. And second, to contrast whether, when, and how, the assembly and disassembly of ecological networks show different trajectories.

Oliver Bates

Lancaster University, United Kingdom Research title: Home Activity Sensing for Energy Monitoring and Home Automation Supervisor: Mike Hazas, Microsoft Research supervisor: James Scott

Olle Fredriksson (opens in new tab)

University of Birmingham, United Kingdom Research title: Structural Foundations for Heterogeneous Computation Supervisor: Dan Ghica, Microsoft Research supervisor: Nick Benton Summary: We will develop technologies for the efficient mapping of high level programming languages to heterogeneous computing platforms. We are focused on support for functional aspects of languages, which present conceptual and technical challenges in this unconventional setting, using innovative semantical models. Theoretical work will be practically validated using a proof-of-concept heterogeneous compiler. The main aims of the research are theoretical (a location-aware semantics of programming languages), technological (type-enforced protocols for inter-processor communication) and applicative (an experimental hardware compiler enhanced with heterogeneous features).

Rebecca Spriggs

University of Cambridge, United Kingdom Research title: Inferring Forest Structure and Disturbance Dynamics by Combining a Canopy Model with LiDAR Remote Sensing Data Supervisor: David Coomes, Microsoft Research supervisors: Matthew Smith, Drew Purves Summary: Large amounts of tree and forest data are needed to properly include stand- and landscape-level processes within models of the terrestrial carbon cycle. Light detection and ranging (LiDAR), a recent remote sensing technology that returns a high-resolution description of the canopy surface across forested areas, can help meet this need by scaling between plot-based measurements of individual trees, and the structure and dynamics of broad landscapes. We propose to combine a recent model of canopy structure with LiDAR data to (1) map the number, size, and distribution of trees across a temperate forest landscape with greater accuracy than previously possible; and (2) use this new method to infer the size and intensity of recent disturbances on this landscape. By mechanistically linking remote sensing data to individual trees, this project will help enable improved modelling of carbon stocks and fluxes that realistically represents the processes of tree growth, mortality, and disturbance.

Robert Norton (opens in new tab)

University of Cambridge, United Kingdom Research title: Exploring Hardware Support for Multithreading and Message Passing Supervisor: Simon Moore, Microsoft Research supervisor: (to be appointed) Summary: Hardware mechanisms to handle events and schedule an arbitrary number of threads will be investigated. These will replace low-level tasks traditionally performed by the operating system. We hypothesise that such hardware mechanisms will result in applications running at higher performance whilst using less power. By making threading inexpensive (unlike current commercial processors), we should open up new strategies for implementing parallel applications. The expected outcome is a system capable of emulating at least a 1,000-core system, including mechanisms to profile its behaviour. Planned collaboration with Microsoft Research will facilitate the analysis of future application requirements, operating system techniques, and approaches to profiling parallel systems.

Sandro Bauer (opens in new tab)

University of Cambridge, United Kingdom Research title: Knowledge Discovery and Extraction from Large-Scale Entity-Relationship Networks Supervisor: Stephen Clark, Microsoft Research supervisor: Thore Graepel Summary: Current search engines are poor at answering queries regarding named entities. Web users are naturally interested in relationships between people, people and locations, people and times, and so on. For example, a user wanting to discover information about Einstein has to type the name into a search engine, and is then given a link to the relevant Wikipedia page, from which the information has to be manually extracted. Worse still, if the user wishes to know the relationship between Einstein and Eddington, the search engine can only return single pages mentioning both names, which may or may not be informative. The aim of this project is to enable knowledge extraction and discovery from entity-relationship networks. The focus of the project will be on a) building large-scale entity relationship networks using state-of-the-art language processing tools; b) developing efficient algorithms for querying these networks; and c) developing suitable methods for evaluating the acquired knowledge.

Simo Hosio

University of Oulu, Finland Research title: Social Networking Services for Public Spaces Supervisor: Jukka Riekki, Microsoft Research supervisor: Richard Harper Summary: We aim to augment public physical spaces with co-located users’ selected online social media content. We believe that this will increase awareness and interaction among people present in public physical spaces, and positively affect to the user experience of the space itself. This work will generate new knowledge on presenting personal content in casual public spaces, such as cafes or malls, and produce mechanisms for situated control of personal content exposure settings in these spaces, i.e., situated privacy settings for online social media content. We will explore both public screens (displays, projectors, etc.) and mobile interaction devices for displaying and interacting with the content and the people. We can realistically evaluate research prototypes, as we have already a wide infrastructure, including interactive public displays, sensor networks, and communications middleware, available in an authentic urban city setting, used by thousands of real end users.

Simon Lyons

University of Edinburgh, United Kingdom Research title: Learning and Inference in Highly Structured Continuous-Time Stochastic Systems Supervisor: Amos Storkey, Microsoft Research supervisor: Christopher Bishop Summary: Many real-world systems are well described by continuous-time stochastic differential equations. For known distributions, there exist techniques for integrating the equations forward through time and hence evaluating (a probability distribution over) future predictions. A more challenging problem, but one of huge practical importance, is to reverse this process and to infer the nature of the unknown probability distributions given a set of observed time sequences. More generally still, the structure of the equations (for example, the presence or absence of particular interactions) is also unknown, and again we would like to infer such structure from observations. Typical analysis techniques used for inference and learning in many applications of this form are either discrete time methods or use deterministic ordinary differential systems, despite significant deficits of both these approaches. The primary aim of this project is to develop new practical, generally applicable techniques for continuous-time stochastic inference for both discrete-state and continuous-state systems; these methods will then be demonstrated in particular important application areas and will be provided as C++ libraries that will be promoted for wider dissemination of the methods. The result will be to develop improved methods for inference and learning in continuous time stochastic systems, and to allow continuous time stochastic methods to be used in many real problems with greater ease.

Thi Vân-Anh Nguyen

GREYC, France Research title: Multi-Stage Constraint Programming Supervisor: Arnaud Lallouet, Microsoft Research supervisor: Youssef Hamadi Summary: Multi-agent decision under uncertainty as encountered in sustainable development applications has raised new challenges in optimization. Not only is needed to find optimal value for a static problem but also for situations in which several agents have their own goals and interact by sharing some resources. We propose to address this latter category of problem by introducing a multi-agent multi-level optimization language based on quantified and stochastic programming. Three aspects will be addressed: modelling, solving, and applications.

Uwe Schmidt (opens in new tab)

Technische Universität Darmstadt, Germany Research title: Learning Expressive Conditional Random Fields for Image and Scene Modeling Supervisor: Stefan Roth, Microsoft Research supervisor: Carsten Rother

Wei Pan

Dorothy Hodgkin Postgraduate Award, Imperial College London, United Kingdom Research title: Automatic Robust Output Maximisation of Arbitrary Synthetic Biological Circuits In-Vivo Supervisor: Guy-Bart Stan, Microsoft Research supervisors: Andrew Phillips, Neil Dalchau Summary: Synthetic biology is a newly emerging field with huge potential for a paradigm shift in the way that biology is conducted and how it is used. Beyond manipulating cells to understand existing biological functions, synthetic biology endows cells with new functions (e.g., production of biofuels or medicine), by inserting biological “parts” into a host cell. However, by performing non-endogenous functions, the fitness of the host cell may diminish, limiting its capacity to perform the new functions. Rather than attempting to set an optimal activity a priori, we propose that automatic regulation of the circuit may be achieved via feedback control. This has the advantage of providing robustness to perturbations, such as variations imposed by gene expression and the host cells environment. The design will combine the GEC software and H-infinity control analysis. Automatic robust optimal control would improve synthetic circuit yields, a key requirement for profitable biotechnologies.

Xiaokun Wu (opens in new tab)

Max-Planck-Institut für Informatik, Germany Research title: Scene Understanding by Global Structure Inference Supervisor: Thorsten Thormählen, Microsoft Research supervisors: Andrew Blake, Pushmeet Kohli Summary: With the emergence of modern fast acquisition devices, we can easily acquire scanning data in point format by much simplified automatic operations. However, due to cost-effect limitations, and constrained environmental conditions, the data quality is often quite low, with presence of noise and in-completion. It is a great challenge to understand the data and interpret the scene with traditional local geometric analysis, since those methods highly rely on continuous or even high order differential computations, which are unstable or completely incomputable under current input conditions. We argue that global structures inferred from parts’ correspondence are the key to solve this problem. Observing that for limited manufacturing process and aesthetic considerations, man-made objects often present high regularities in both shape composition and configuration. We propose algorithms which consider both local geometric properties, and global relations to solve a broad aspects of scene understanding problem. Applications include geometric reconstruction, content creation, and object arrangement.
Andrej Mikulik (opens in new tab)

Czech Technical University, Czech Republic Research title: Large scale image search for objects and categories Supervisor: Dr Jiří Matas, Microsoft Research supervisor: Andrew Fitzgibbon Summary: This PhD project will investigate large scale image search techniques, their improvements for specific object retrieval, broadening their applicability and extending those approaches to object class search.

Connie Golsteijn (opens in new tab)

University of Surrey, United Kingdom

Research title: Materialising media

Supervisors: Prof. David Frohlich, University of Surrey; Dr. Elise van den Hoven, Technical University of Eindhoven, Microsoft Research supervisor: Abigail Sellen

Summary: Digital media are currently revolutionising the way we capture and share personal experiences through photos, sounds, video and other types of personal content. This change has been most pronounced in the area of domestic photography where digital cameras, camera-phones and internet services have come to replace analogue cameras and photographic prints as the primary means of capturing and sharing snapshots. This has led to the exponential growth of digital photo collections, alongside existing print archives. Similar changes are happening in consumer video and music, leading to a bifurcation of old and new technology and their associated media.

Danielle Belgrave

Dorothy Hodgkin Postgraduate Award, University of Manchester, United Kingdom Research title: Probabilistic causal models for asthma and allergies developing in childhood Supervisor: Prof. Iain Buchan, Microsoft Research supervisor: Christopher Bishop Summary: This trans-disciplinary PhD will focus on the exploitation of Bayesian machine learning methods, based on probabilistic graphical models, in the quest to understand the determinants of asthma and allergies from childhood, including the interactions between genetic and carefully-measured environmental factors. Structured Bayesian models will be built and solved using the Infer.NET library, and will be evaluated alongside conventional bio-statistical methods, such as multi-level models. The ultimate goal of the project is to elucidate realistically-complex causal networks of genetic and environmental factors responsible for asthma and allergies that develop in childhood. An allied PhD project, funded by the NIHR, will take a graphical model approach to studying the factors that determine the outcomes of treating type 2 diabetes. This will use the same type of genetic data and Infer.NET.

David Silver

Technion, Israel Research title: A systems biology approach to detect adverse patient-drug interactions Supervisor: Dr. Itai Yanai, Microsoft Research supervisor: Hillel Kugler Summary: Progress in genomic research has enabled the identification of interactions among the genome and the environment; however, the medical sciences still await the translation to a general method for predicting the adverse side-effects of therapeutics. In part, the delay is due to difficulties involved in attaining a dataset with the required depth and breadth to allow the formulation of a rigorous and general method. Here we propose to address this problem in the experimentally tractable nematode C. elegans where an adequate dataset can be constructed, the underlying principles discovered, and the tools then translated to sparse human datasets. Our method is premised on the notions of gene expression as an efficient marker for disease as well as its overall modularity in co-expressed subsets, and involves the statistical classification and unsupervised learning of large-scale datasets. We hypothesize that an interaction may be inferred from the expression levels of just a small set of genes and conditions (drugs), based upon knowledge of the underlying correlations of expression among the entire set of genes. We propose that this is an efficient approach towards discovering the principles of patient/drug interactions which may contribute to a physician’s ability to predict with reasonable confidence the possible adverse effects of a range of possible therapies.

Gian Marco Palamara

University of Zurich, Switzerland Research title: Computational modelling of the collapse of ecological communities Supervisor: Dr. Owen Petchey, Microsoft Research supervisor: Matthew Smith Summary: We propose a research project about how and why species’ traits, inter-specific dependencies, and chance, influence the trajectory of ecosystem failure that results from species extinctions. As well as providing a step forward in the understanding of the drivers of ecosystem failures and the ability to provide more realistic extinction scenarios, we imagine findings will apply to other types of systems in other scientific fields. In particular, systems in which the failure of one component can affect the failure risk of other components, a situation that may have caused some of the most significant disasters in recent history.

Ismail Kuru

Koç University, Turkey Research title: A static proof system and tool for programs running on relaxed memory models Supervisor: Serdar Tasiran, Microsoft Research supervisor: Shaz Qadeer Summary: We propose a static proof system and an associated mechanical proof checker for programs running on relaxed (weak) memory models. We target the verification of highly-concurrent, highly-optimized code such as virtual machines, language runtimes, transactional memory implementations, and libraries providing concurrency primitives and concurrent data structures. All applications that use such infrastructure software rely on its correctness. We believe that bug-finding tools are insufficient for ensuring this critical correctness. The use of more heavyweight verification approaches is justified for concurrent infrastructure software, and mechanically-checked proofs are necessary. We have previously developed QED, an atomicity, abstraction- and reduction-based proof system and tool, and had success proving the correctness of intricate concurrent code using QED. Unlike application software, the highly-concurrent low-level code on which this proposal focuses does not first ensure and then rely on sequential consistency for correctness. This has led us to propose a generalization of the QED proof system and tool in order to take into account relaxed memory models.

Lara Houston

Lancaster University, United Kingdom Research title: Inventive infrastructures – An exploration of mobile phone ‘repair’ cultures in Uganda Supervisors: Prof. Lucy Suchman, Dr. Adrian Mackenzie, Microsoft Research supervisor: Alex Taylor Summary: This research aims to address two general areas. The first focuses specifically on the practices of use of mobile phones in Uganda, and aims to contribute to an emergent body of work around mobile telephony and ICT in developing regions (the latter somewhat troublingly referred to as M4D and ICTD). The second area of focus will investigate local practices of mobile phone repair and maintenance, and will consider what if any lessons can be learned for ICT in Kampala and more generally.

Larissa Pschetz (opens in new tab)

Dorothy Hodgkin Postgraduate Award, University of Dundee, United Kingdom Research title: Are we nearly there yet? A proposal to explore digital navigation. Supervisors: Dr Jon Rogers, University of Dundee; Dr Chris Speed, Edinburgh College of Art, Microsoft Research supervisor: Richard Banks Summary: The way we navigate has changed, and so too has the way in which we understand the context and environments: the roads, streets, and highways across which we travel. Digital technology is starting to alter the way we plan to get about, how we navigate getting about and the way we get back from getting about. From personal, to private, to public, to group travel—we are starting to do this differently and this may be the start of a cultural shift in the business of navigation. Understanding the social implications for these technologies is vital if we are to design better products to help people navigate the landscapes of the present and the future. We aim to investigate the future of navigation that focuses on the domestic routines of people in their homes, their neighbourhoods, in their relationships, and in their landscapes.

Michal Ficek

Czech Technical University, Czech Republic Research title: Understanding and modelling network migration Supervisor: Dr Lukas Kencl, Microsoft Research supervisor: Milan Vojnovic Summary: Contemporary wireless communication networks present many alternatives for network users to obtain access to the Internet and mobile telephony connectivity. The association of a user mobile terminal to a particular access network may change over time depending on his or her country of location (cross-border roaming), availability of a network signal, pricing plan, provider preference, time of day, location, device capability or desired application of use (voice, data, etc). In this work we propose to conduct research in designing, building and applying tools to trace, analyse, measure and model such network migration behaviour of user terminals and determining the general characteristics of network migration in relation to the space and time attributes. Furthermore, we propose to formulate general guidelines for designing various network services or functions such as content distribution, virtual operators, energy-efficient performance or roaming-customer retention, based on the above findings.

Nicolas Mobilia (opens in new tab)

CNRS, France Research title: A meta environment for biological network modelling Supervisors: Dr Éric Fanchon, CNRS; Prof. Jacques Demongeot, Université Joseph Fourier, Microsoft Research supervisor: Youssef Hamadi Summary: Formal modelling is becoming an essential part of the work of today’s biologists. Our global goal is to develop software tools based on constraint technologies to assist biologists in the process of building models of complex interaction networks. The benefits of such system-level models are: deepening our understanding of life (at the molecular and cellular levels in our case), design of combinatorial therapies taking into account the network structure and exploiting fragility points. The usual process of knowledge acquisition can be schematically viewed as a succession of data production phases and modelling phases. Knowledge being partial, families of models should be considered rather than single instantiated models. A family of models embodies part of the state of knowledge at a given time, and often incorporates hypotheses in addition to features supported by experimental data. A family of model thus defines a synthetic reasoning frame. We advocate here a rational approach to network modelling based on constraints.

Niek Bouman

Eindhoven University of Technology, The Netherlands Research title: Distributed spectrum sharing for wireless networks: Optimal performance, fairness and design Supervisors: Prof. Sem Borst; Dr. Johan Van Leeuwaarden, Microsoft Research supervisors: Peter Key; Alexandre Proutiere Summary: The proposed research focuses on distributed spectrum sharing algorithms for emerging large-scale wireless meshes and cognitive-radio networks. In contrast to today’s cellular architectures, these networks typically lack any centralized control entity for allocating resources and explicitly coordinating transmissions. Instead, these networks vitally depend on the individual nodes to operate autonomously and efficiently share the medium in a distributed fashion. This requires nodes to schedule their individual transmissions and decide on the use of shared resources based on knowledge that is locally available or only involves limited exchange of information. The paradigm of such distributed control has been successfully adopted for end-to-end congestion control in wired communication networks. Wireless networks, however, have fundamentally different characteristics and entail even bigger challenges, particularly due to unpredictable channel conditions and complex interference issues. In cognitive-radio environments, a further key requirement is that the opportunistic access by secondary (unlicensed) devices of unused portions of the spectrum (white spaces) must not interfere with the transmissions of primary users (incumbents). This raises a strong need for agile spectrum sensing techniques and intelligent probing algorithms in order to access the spectrum in an efficient and non-intrusive fashion.

Paul Kelly

University of Oxford, United Kingdom Research title: Assessing the potential for SenseCam to fight the current global public health crisis of increasing obesity and physical inactivity Supervisor: Dr. Charlie Foster, Microsoft Research supervisors: Steve Hodges; Emma Berry Summary: This proposal outlines the potential role of SenseCam device in public health research and understanding. Based on our small pilot study we have identified and tested the potential for using SenseCam to validate the existing benchmark national measure of active travel, used on the National Travel Survey (NTS); indeed our research partners in this application NatCen, who conduct the National Travel Survey, have agreed to field test SenseCam in principle, subject to the development of appropriate protocols. We are confident that SenseCam contains a unique blend of methods from criterion and objective assessment of physical activity. The instrument can directly observe and record, through the camera mechanism, the duration and type of physical activity that is being performed, it’s place and context and provide a criterion method to assess the reliability and validity of a self report methods, for example a diary or questionnaire. SenseCam is an improved, more accurate way to measure physical activity behaviour, and the information it gives us will allow for better understanding of the determinants of behaviour and thus the design of better interventions to promote physical activity.

Peter Wortmann

University of Leeds, United Kingdom Research title: Visualising performance for multi-core Haskell Supervisor: Dr. David Duke, Microsoft Research supervisor: Simon Peyton-Jones Summary: The cost and complexity of programming multi-core processors is encouraging industrial interest in pure functional programming. But high-level abstractions make it difficult to isolate and resolve performance problems. Visualization of run-time behaviour could help, but existing methods were designed for lower-level technologies. This project will investigate how to provide Haskell programmers with salient source and run-time information that underpins a reasoned approach to performance tuning.

Pravin Shinde (opens in new tab)

ETH Zurich, Switzerland Research title: Scalable networking for heterogeneous multi-core systems Supervisors: Prof. Timothy Roscoe, Prof. Gustavo Alonso, Microsoft Research supervisor: Paul Barham Summary: This project will look at whether it is beneficial to rethink the architecture of the OS network stack in the light of new hardware, both network interfaces and overall system architecture. The key idea is to take a global system view, in particular with regard to dynamically placing elements of the networking stack appropriately in the end system. We will revisit the traditional idea of modelling the stack as a dataflow graph of protocol elements, but with significant differences.

Varun Bhaskar Kothamachu

Dorothy Hodgkin Postgraduate Award, University of Exeter, United Kingdom Research title: Computational capabilities and underlying mechanisms in biological signalling networks Supervisor: Dr. Orkun Soyer, Microsoft Research supervisor: Luca Cardelli Summary: There are still major challenges to overcome before we can claim to understand key “design” properties underlying signalling networks. Firstly, it is not clear how features identified so far will pre-dispose certain response dynamics in a larger network where they are embedded. For example, theory suggests that same signalling elements can easily result in different response dynamics when connected in different ways. Secondly, the conclusions of most modelling studies are found to depend on the details of the model structure. In particular, molecular events that are usually disregarded in models as being negligible can significantly influence response dynamics. Finally, and most importantly, we lack detailed understanding of how evolutionary processes can move signalling systems in the “space” of possible response dynamics. In other words, we do not fully understand how mutations can generate novel response dynamics from existing ones. The main aim of the proposed research is to address these challenges by systematically and exhaustively analyzing topological and biochemical features in tractable signalling systems.

Volodymyr Kuznetsov

ÉPFL, Switzerland Research title: Selective symbolic execution Supervisor: Prof. George Candea, Microsoft Research supervisor: Manuel Costa Summary: Our goal is to develop an automated testing technique that can scale to millions of lines of code that interact with their environment. We build upon symbolic execution, a technique originally introduced in the 1970s that has recently gained popularity in automated software testing. Behaviours (such as bugs) discovered with symbolic execution can be easily reproduced using information collected during the corresponding symbolic execution, making this approach a powerful and cost-effective tool for developers and testers.
Abhijit Karnik (opens in new tab)

University of Bristol, United Kingdom Research title: Using optical devices to enhance user interactions on an interactive surface Supervisor: Dr Sriram Subramanian, Microsoft Research supervisor: Alex Butler, Shahram Izadi Summary: Today’s touch-screen technology potentially allows multiple users to simultaneously interact with one another and with digital content by using their whole hand to engage in the interaction but it’s limited in the sense that all users have to share the same visible content. However, most of today’s touch-based systems only support single views for their interaction. In other words, these systems do not allow multiple users to view information that is customized to their view on the same interactive surfaces. There is limited systematic study of combining multi-touch with multi-visibility. There have been a few one-off point-designs (proof of concept systems to show that it’s possible to build such systems) but no systematic investigation into the benefits and limitations of combining multi-touch with multi-visibility has been performed. Here we propose to systematically investigate the design of interactive surfaces that use optical devices (like lenticular lenses and polarizers) to support multi-touch and multi-visibility.

Adam Gundry

University of Strathclyde, United Kingdom Research title: Haskell Types with Numeric Constraints Supervisor: Dr Conor Thomas McBride, Microsoft Research supervisor: Simon Peyton Jones Summary: This PhD project seeks to investigate the practical and theoretical impact of extending Haskell’s type system with numeric expressions (representing sizes, ranges, or costs, for example) and constraints capturing richer safety properties than are currently managed by static typing. There are three strands to the project: (1) to investigate type inference with numeric constraints, (2) to investigate new programming structures, patterns, and techniques which exploit numeric indexing, and (3) to study the performance benefits derivable from richer guarantees. There are considerable opportunities for a bright student to bring significant benefits to developers using Haskell, a language with increasing industrial traction – not least at Microsoft, where its flagship compiler is maintained, and where it plays a key role in a variety of cutting-edge projects. Moreover, Haskell is an established staging post for technology on its way to deployment in mainstream languages, e.g. C#.

Alice Boit (opens in new tab)

University Potsdam, Germany Research title: Visualising the mechanisms influencing ecosystem stability in quantitative food webs Supervisor: Prof. Dr. Ursula Gaedke, Microsoft Research supervisor: Lucas Joppa Summary: The stability of ecosystems has been a central research topic in ecology since the 1950’s and has been found to depend on a multitude of biotic and abiotic factors. The proposed PhD project will continue this research by helping to answer, ‘How do properties of populations influence ecosystem stability?’ In particular, this project aims to account for differences in the importance of individual feeding interactions and to assess the influence of prey edibility and predator food selectivity on ecosystem stability from a population and community level perspective. Combining theory and empirical data to simulate and visualize food webs, current software tools will be extended with new functionality to clarify and visually communicate how food web structure is influenced by its dynamics and vice versa. Thus, new simulations and three-dimensional food web visualisations developed in the project will help scientists better explore patterns of structural and dynamical properties and help predict ecosystem dynamics and stability.

Andrea Flack

University of Oxford, United Kingdom Research title: Collective decision-making in avian navigation Supervisor: Dr Dora Biro, Microsoft Research supervisor: Robin Freeman Summary: Animals that live in social groups must make joint decisions about many aspects of their daily lives, and the mechanisms that mediate such collective decision-making have generated a great deal of theoretical interest in recent years. However, empirical evidence has been almost entirely lacking, and the extent to which mathematical models of group decision-making, conflict resolution, and social learning translate into the real world has remained unresolved. The proposed project combines an experimental approach—the tracking of homing pigeons using miniature GPS technology—with a mathematical and computational modelling framework to examine the mechanisms and consequences of collective motion in co-navigating birds. Homing pigeons provide a unique and ideal study system where individuals’ possession of information can be controlled, social interactions regulated, and the flow of information quantified. A variety of modelling techniques will be tested for validity in explaining the empirical findings, and the generalities for collective decision-making in a wide variety of contexts will be considered, with the ultimate aim of expanding our knowledge of how collectives are shaped by the individuals of which they are composed.

Antje Beyer

University of Cambridge, United Kingdom Research title: Computational insights into the effect of genetic variations on C. elegans vulval development Supervisor: Dr Gos Micklem, Microsoft Research supervisor: Jasmin Fisher Summary: Computational modelling of biological systems is becoming increasingly important in efforts to better understand complex biological behaviours. ‘Executable Biology’ is a pioneering approach focused on the design of executable computer programs that mimic biological phenomena. We have previously established an executable model of C. elegans vulval development that includes key components of the RAS/MAPK and LIN-12/Notch pathways as well as the crosstalk between the two pathways, which is essential for the process of vulval precursor cell fate determination. The aim of this PhD proposal is first to extend the current model to include key components of the Wnt signalling pathway that act in parallel to RAS/MAPK signalling as well as transcription factors (e.g., sur-2, lin-39, lin-1 and lin-31) that act downstream to the MAPK pathway. Once this model has been built, genomic, proteomic and cellular data obtained by the Hengartner, Poulin, and Jansen labs will be integrated into the revised model. Since the genomic analysis will also generate quantitative data in the form of expression levels or phenotype penetrance (obtained by the Hajnal and Hengartner labs) we also aim to develop a hybrid model incorporating some quantitative data such as expression levels for selected key components in the studied signalling pathways. In addition, as many of the known regulators of vulva development also control RAS/MAPK signalling during germ cell development, we intend to expand our vulva development model to include germ cell development.

Antonia Masucci (opens in new tab)

Supélec, France Research title: Advanced mathematical tools for the design of cognitive radios Supervisor: Prof. Merouane Debbah, Microsoft Research supervisor: Peter Key, Bozidar Radunovic Summary: It has already become a common understanding that current mobile communication systems do not make full use of the available spectrum, either due to sparse user access or to the system’s inherent deficiencies, as shown by recent reports from the Federal Communications Commission (FCC). It is envisioned that future systems will be able to opportunistically exploit those spectrum ‘left-overs’, by means of knowledge of the environment and cognition capability, in order to adapt their radio parameters accordingly. Such a technology has been proposed by Joseph Mitola in 2000 and is called cognitive radio. Due to the fact that recent advances on micro-electronics and computer systems are pointing to a -not so far- era when such radios will be feasible, it is of utmost importance to develop adequate mathematical tools for designing and optimizing cognitive radios which can extract and understand the wireless network on their own. The goal of this PhD is to propose new mathematical tools based on free de-convolution and random matrix theory techniques (and for which the PhD advisor has made some important contributions in many other areas) to extract information from the network.

Daniel Gallardo (opens in new tab)

Universitat Pompeu Fabra, Spain Research title: Exploring expanded control possibilities within SecondLight Supervisor: Dr. Sergi Jordà, Microsoft Research supervisor: Shahram Izadi Summary: In recent years we have seen a proliferation of tangible and tabletop interfaces, several of them already becoming real products. And yet, much before the implications and the potential of these new types of interfaces get fully explored and exploited, much less well understood, newer technologies keep bringing amazing new-fangled possibilities. This is the case of SecondLight, a technology developed by Microsoft that combines all the potentiality of surface computing using tangibles and extends it “beyond the display”, allowing to interact also from a distance. We propose to investigate the new interaction possibilities this outstanding and almost magical technology may bring. Following the work developed by our research team within the reactable project, all this research will be oriented towards the idea of expanded control sharing. That is, instead of focusing on a data-centric view of interaction, we will mostly consider representational forms as resources for action. Instead of relying on the transmission and sharing of data, we will be looking for solutions that emphasise user control, creativity, and social action with interactive tools.

Frédéric Besse

University College London, UK Research title: Automatic enhancement of digital photographs Supervisor: Dr Jan Kautz, University College London, Microsoft Research supervisor: Andrew Blake Summary: The main objective of this proposal is the automatic enhancement of digital photographs. In particular, we want to improve a snapshot’s “look” by learning and imposing statistics from good photographs, as well as enhance the snapshot by cropping it according to basic composition rules that are also inferred from good photographs (and their salient features). The main challenges are the choice of statistics and how they can be enforced or applied. Our hypothesis is that data-driven, automatic enhancement of snapshots is viable and we will validate the results with user studies.

Ioannis Psorakis (opens in new tab)

University of Oxford, United Kingdom Research title: The form and function of dynamic sociality in a wild bird population Supervisor: Prof. Ben Sheldon, Microsoft Research supervisor: Robin Freeman Summary: This proposal aims to use an exceptional data set from a large number of marked individual birds, tracked automatically over several winters, to explore the formation and dynamics of social groups of foraging individuals. The aim will be, first, to derive computationally intensive methods for identifying groups; second to understand the stability of those groups, and the way that the characteristics of individuals making up the groups influence their behaviour and composition, and third, to test hypotheses relating to the fitness consequences of sociality. The study will be embedded in the context of a long-term population study which provides richly detailed information about individual life-histories, overlain on environmental and genetic data.

Isabel Rosa

Imperial College London, United Kingdom Research title: A data-constrained predictive model of tropical deforestation and resultant carbon emissions Supervisor: Dr Robert Ewers, Microsoft Research supervisor: Drew Purves Summary: Deforestation is a major source of global biodiversity loss and anthropogenic carbon emissions, but our ability to forecast the magnitude or geographical distribution of future deforestation is very limited at present. Making use of satellite-derived data sets measuring deforestation, in combination with global data on population density, climate and other factors, this project will (a) develop and parameterize a spatially explicit model of tropical deforestation; (b) combine this with data on carbon storage in tropical forests in order to estimate historical carbon loss from deforestation and to predict future losses.

John Hewson

University of Edinburgh, United Kingdom Research title: Constraint-based specifications for system configuration Supervisor: Mr. Paul Anderson, Microsoft Research supervisor: Rebecca Isaacs, Andy Gordon, Eno Thereska Summary: This project aims to develop the use of constraint-based specifications for practical system configurations. Such specifications support composition which is particularly suitable for devolved management. They are also important for specifying autonomic systems in a declarative way since the under-specification provides the flexibility for autonomic adjustment. Reducing the complexity of the constraint problems, and handling ‘soft’ constraints are some of the important challenges to producing practical tools. The research direction will be guided by real problems in practical system administration, and it is expected to lead to tools or techniques that would be of use in this area.

Karen Simonyan

University of Oxford, United Kingdom Research title: A medical image search engine Supervisor: Prof. Andrew Zisserman, Microsoft Research supervisor: Antonio Criminisi Summary: We propose to develop a search engine for medical images that can return examples similar to a query image, e.g. with a particular anomaly, on the fly. We will harness modern and efficient methods of segmentation to provide the image descriptors, and visual words/a ‘Video Google’ architecture for the on-the-fly retrieval. The aim is to provide doctors with a retrieval tool to aid in their diagnosis by retrieving similar cases, their treatment and outcome. We will also investigate methods for data mining in collections of images, and improving segmentations given sets of aligned images or an image sequence.

Lars Schäfers

University of Paderborn, Germany Research title: GOmputer: The GO machine Supervisor: Prof. Dr. Marco Platzner, Microsoft Research supervisor: Thore Graepel, Satnam Singh Summary: The GOmputer project aims at the investigation of novel algorithmic approaches for playing GO and the development of a parallelized and hardware-accelerated GO machine prototype. In the algorithmic part of this project we will leverage recent Monte-Carlo approaches for playing GO and focus on a novel bundling technique, the combination of Monte-Carlo with alpha/beta game tree search, and the investigation of how to deal with patterns. In the computing systems part of this project, we will address suitable parallel programming models and advanced techniques for FPGA-based hardware acceleration. On the longer term, this project should lay the foundation for the development of the world’s strongest GO machine.

Lawrence Hudson

Imperial College London, United Kingdom Research title: Unifying food web structure and dynamics with metabolic theory: a general modular computational approach Supervisor: Dr Daniel Reuman, Microsoft Research supervisor: Lucas Joppa Summary: Theories of the dynamics of interacting species in food webs, though ubiquitous, are not sufficiently well connected with data to be useful for predictions. Much high quality static data on food webs and community structure is available, including recent data on allometry between species population densities and average body masses within a food web. The recent data shows regularities and systematic variation at a level of resolution suitable for precise tests of dynamical models. A flexible software package will be developed for constructing and parameterising dynamical models and for comparing their predictions with the new data to illuminate model mechanisms and constrain parameters. Best-fitting models will be used to make a variety of predictions about whole-food-web-level impacts of climate change, extinctions, and other disturbances.

Long Guo

Université d’Artois, France Research title: Multicore-based satisfiability Supervisor: Prof. Lakhdar Sais, Microsoft Research supervisor: Youssef Hamadi Summary: The SAT problem (decide if a Boolean formula, typically in conjunctive normal form, admits a valuation which makes it true?) is one of the fundamental problems in complexity theory, and probably one of the most studied, theoretically and practically. Modern SAT solvers can now handle propositional satisfiability problems with hundreds of thousands of variables or more. However, the whole picture is not so good since the SAT-solving community does only marginal performance gains (see results from the last SAT competitions). As a result, the progresses on industrial benchmarks are now stalling since it becomes harder and harder to improve the performances of any Chaff-like solver. Today, many SAT problems remain challenging to all the available SAT solvers, and consequently new approaches are clearly needed. In this context and with the light of the next generation of computer architecture, the design of multicore satisfiability solvers is clearly a fundamental issue. This is clearly a hot topic and research on this area is in a preliminary stage. The objective of this proposal is to provide new theoretical and practical advances for Multicore Satisfiability solving.

Martin Suda

Max Planck Institute for Software Systems Research title: Automated reasoning for dynamic authorization policy analysis Supervisor: Prof. Christoph Weidenbach, Microsoft Research supervisor: Moritz Becker Summary: Analysing dynamic policies is challenging because it requires considering unbounded sequences of state changes. Recent work has proposed to leverage model checking tools, AI planners, and automated theorem provers, to analyse reachability and invariance properties of dynamic policies. However, the existing techniques are not entirely satisfactory for several reasons. The project will tackle some of these problems by using, extending or modifying an automated theorem prover such as Spass, or by developing and implementing new automated reasoning techniques.

Mindy Syfert

University of Cambridge, United Kingdom Research title: Novel computational approaches to modelling biodiversity – applications for setting conservation priorities Supervisor: Dr David Coomes, Microsoft Research supervisor: Matthew Smith Summary: The overall objectives of this project are to assess the accuracy of current methodologies for predicting the global occurrence of plant species based on presence data, and to investigate the development of potentially better alternative methodologies. It will develop new methods for establishing the climatic and edaphic constraints of species from presence-only data. It will compare separate techniques for the identification and proposal of protected areas. It will also test the coverage of the global protected area network for plants and identify additional priority areas for plant conservation.

Nadarajen Veerapen (opens in new tab)

Université d’Angers, France Research title: Autonomous neighbourhood management for combinatorial problems solving Supervisor: Prof. Frédéric Saubion, Microsoft Research supervisor: Lucas Bordeaux Summary: Neighborhood functions are crucial components when using local search algorithms to solve combinatorial optimization problems. Unfortunately, the design and the management of these functions are often problem dependant and require a great expertise and knowledge to obtain good results. Therefore, in order to provide more autonomous solving facilities, we propose to use genetic programming to generate suitable neighborhood functions together with suitable control features.

Nikée Groot

University of Sheffield, United Kingdom Research title: The other half of the equation: Global variation in tree mortality Supervisor: Prof. Emanuel Gloor, Microsoft Research supervisor: Drew Purves Summary: Future changes in the global forest carbon cycle could have major impacts on future climate. Whereas we now have a relatively advanced ability to model the input of carbon into forests, via tree growth, almost nothing is known about global variation in and / or the climate dependency of the outputs, which are dominated by tree mortality. By collating a unique, global forest inventory data set, and analysing this data using computational statistics, this project will, for the first time, uncover the global-scale climate dependency of tree mortality, and produce a model of this dependency for use in predictive models of the forest carbon cycle.

Sadia Ahmed

Imperial College London, United Kingdom Research title: A data-constrained predictive model of tropical deforestation and resultant carbon emissions Supervisor: Dr Robert Ewers, Microsoft Research supervisor: Drew Purves Summary: Deforestation is a major source of global biodiversity loss and anthropogenic carbon emissions, but our ability to forecast the magnitude or geographical distribution of future deforestation is very limited at present. Making use of satellite-derived data sets measuring deforestation, in combination with global data on population density, climate and other factors, this project will (a) develop and parameterize a spatially explicit model of tropical deforestation; (b) combine this with data on carbon storage in tropical forests, in order to estimate historical carbon loss from deforestation, and to predict future losses.

Sergey Kosov

Max Planck Institute for Software Systems, Germany Research title: From stereo to 3D faces Supervisor: Dr Thorsten Thormählen, Microsoft Research supervisor: Pushmeet Kohli Summary: The goal of this PhD project is to generate 3D reconstructions of faces using the two video streams provided by a stereo camera system. This involves automated detection, tracking, and recognition of faces and the extraction of certain properties. Compared to algorithms that use only a monocular camera, it is expected that the additional information provided by a stereo camera facilitates a more robust and accurate 3D reconstruction. A collection of real-time algorithms will be implemented that provide a basis for interactive applications, like, the augmentation of webcam videos, robots with stereo cameras for 3D vision, or the usage of a stereo camera as a new input device for games.

Su-Yang Yu

Newcastle University, United Kingdom Research title: Cheating mitigation in online games Supervisor: Dr Jeff Yan, Microsoft Research supervisor: Michael Roe, Ralf Herbrich Summary: While online game is a lucrative multibillion business, cheating in these games has become a serious issue for both game makers and players. In this proposal, I propose to tackle representative cheats in first-person shooters (FPS) and real-time strategy (RTS) games by developing novel security techniques, and also propose to explore the analysis of application programming interfaces (APIs) of Massively Multiplayer Online Games as a new means for identifying vulnerabilities in these systems. This project will contribute to both computer security and games research. Expected outcomes are not only of academic interests, but also of direct relevance to Microsoft’s gaming business.

Theofanis Karaletsos (opens in new tab)

Max Planck Institute for Biological Cybernetics, Germany Research title: Machine learning for genome-wide association studies and phenotype modelling Supervisor: Dr Karsten Borgwardt, Microsoft Research supervisor: John Winn Summary: The proposed thematic area for the MSR/MPI PhD candidate are machine learning challenges that arise in the context of genome-wide association studies (GWA). This covers general method development as well as the application of developed methods to real-world datasets. An overview of GWAs and related open problems in machine learning is given below.

Thomas Simpson

University of Cambridge, United Kingdom Research title: Trust on the Internet Supervisor: Dr Alexander Duncan Oliver, Microsoft Research supervisor: Richard Harper Summary: This project “seeks to understand how technology can support and enrich human values in the everyday world”, under the Socio-Digital Systems mandate, using philosophical methods. Trust and trustworthiness are of vital practical importance, without which human society cannot function. What are they, and how do they work? Further, the smooth functioning of the internet is critically dependent on well-placed trust. What are the specific challenges and opportunities that the internet presents? It is supposed that the values that are required for society to be marked by an atmosphere of trust are ‘thicker’ than liberal conceptions of what is required for us to be free from interference from others. The construction of a future where technology facilitates and enhances the reliable operation of our trust habits is the invention of a future where these values are strengthened, rather than eroded. Thus there is a close iterative relationship between technology and values, which this project seeks to explore.

Timothy Rudge

University of Cambridge, United Kingdom Research title: Language for synthetic biology Supervisor: Dr James Haseloff, Microsoft Research supervisor: Andrew Phillips Summary: The overall objective of the project will be to develop a high-level programming Language for Synthetic Biology (LSB) that will allow formal representation of synthetic designs, simulation and possible solutions. The student will apply Synthetic Biology tools to a bacterial system to generate self-organizing “Turing” patterns that can be formally modelled, generating essential information for the development of LSB. The development of an improved theoretical tool like LSB requires use of a practical biological system for testing and refining the language. Here the student will use an existing bacterial genetic system, the Gram-positive bacterium Bacillus subtilis. The student will benefit from collaboration with an existing programme in the Ajioka and Haseloff laboratories at the University of Cambridge, and participate in the design and construction of a novel library of standard biological parts (known as G+Bricks). These DNA parts will include promoters, terminators, ribosome binding sites and protein coding sequences for use in Bacillus. With these parts, the student will be able to test biological circuits for cell-cell communication. Diffusible signals can work as positive and negative regulators to determine the state of individual cells, establishing local conditions within a population, and form self-organised spatio-temporal patterns, such as those described by Turing. The theoretical development and practical testing of LSB will be tied to the construction of simple self-organising circuits in Bacillus subtilis. These patterning devices will have potential wide application as intelligent switches in genetic systems.

Vijay D’Silva (opens in new tab)

Oxford University, United Kingdom Research title: Generalisation operators for abstraction-refinement Supervisor: Dr Daniel Kröning, Microsoft Research supervisor: Josh Berdine Summary: Abstraction-refinement with predicates forms the basis of state-of-the art software verification tools. Unfortunately, many existing abstraction-refinement techniques diverge even on extremely simple examples, for which finite abstractions are known to exist. The aim of this proposal is to study and identify the source of divergence in abstraction-refinement methods and combat them using generalisation operators. Such operators are used to identify and generalise a pattern in a sequence of abstractions. Unlike abstraction, the basic principles behind the design and analysis of generalisation operators are not well understood. We propose designing generalisation operators using learning algorithms and interpolation-based methods. On the theoretical side, we will study the completeness and termination properties of abstraction-refinement using such operators. On the practical side, we will develop and evaluate a tool for software verification based on abstraction-refinement with generalisation operators.

Ye Yuan

University of Cambridge, United Kingdom Research title: Robust network reconstruction with applications to biology Supervisor: Dr Jorge Goncalves, Microsoft Research supervisor: Hillel Kugler Summary: This project focus on the development of mathematical and software tools to reconstruct causal network structures from data and with application to regulatory T (Treg) cells. These results will allow biologists to unveil causal network structures between measured species. With time series data, this network gives computational predictability through simulations where hypothesis can be tested without the need to perform new experiments. With steady-state data, in addition to the Boolean network, it reveals the gains between measured species and whether they activate or inhibit each other. The discovery of the network helps understand how the system functions. This is fundamental if we wish to control it with synthetic biology to produce a desired system or for drug development. Knowing the actual causal network structure allows to change or remove a specific pathway without affecting others. Given noise present in the data and nonlinearities in the system, we will find the most likely candidate networks that fit the data by measuring the smallest distance, in some norm, from the data to all Boolean network structures. On the application side, we will reveal the network structure among a carefully selected set of genes of Treg cells in collaboration with Professor Balling from the Helmholtz Center for Infection Research in Germany, where all the experiments will be performed. The obtained network structure will help us understand the complicated dynamic functional mechanism of Treg cells and further help produce therapeutic strategies of many of autoimmune diseases in which Treg cells play a vital role.
Calum Brown

University of St Andrews, United Kingdom Research title: Spatiotemporal analysis of complex ecological communities Supervisor: Dr. Janine Baerbel Illian, Microsoft Research supervisor: Drew Purves Summary: Ecologists have presented a variety of models to explain community structure, relative abundance, and species coexistence in biodiverse communities. This project tests the hypothesis regarding whether the neutral model and niche differentiation leave different footprints in multi-species spatial patterns of competing species. Stochastic, agent-based models are developed to generate multi-species spatial patterns under contrasting assumptions of neutrality and niche differentiation and statistical measures from spatial point process theory are derived to characterize the main features of multi-species spatial patterns. These statistical measures are applied to the multi-species spatial patterns generated by the agent-based models, to see how the footprints of different model assumptions can be distinguished. It will then apply the statistical measures to spatial patterns of trees in tropical rain forests to examine the extent to which the multi-species patterns are consistent with different assumptions about neutrality and niche differentiation.

Catalin Hritcu (opens in new tab)

Saarland University, Germany Research title: The Secure Implementation of Cryptographic Protocols Supervisor: Prof. Michael Backes, Microsoft Research supervisor: Andy Gordon

David Hopkins (opens in new tab)

University of Oxford, United Kingdom Research title: Verifying properties of the ML family of programming languages Supervisor: Prof. Luke Ong, Microsoft Research supervisor: Andrew Kennedy Summary: We aim to develop verification techniques for higher-order (call-by-value) procedural languages with reference type, as exemplified by Ocaml, F#, and other members of the ML family of programming languages. Specifically we set ourselves the following tasks:
- Derive an algorithm for deciding observational equivalence (of an appropriate fragment of Reduced ML) by reduction to the equivalence of visibly pushdown automata, and build a prototype implementation of it.
- Develop a notion of reachability test for higher-order procedural languages, identify language fragments for which it is decidable, and determine its complexity.
- Construct algorithms that symbolically compute over-approximations of an appropriate collecting semantics for higher-order procedural languages, with a view to the data and control flow analysis of these programs.
Davide Cacchiarelli (opens in new tab)

University of Rome ‘La Sapienza’, Italy Research title: High complexity cell networks involving microRNAs: A systems biology approach Supervisor: Prof. Irene Bozzoni, Microsoft Research supervisor: Jasmin Fisher Summary: microRNAs are a large class of small non-coding RNAs, which regulate gene expression at the post-transcriptional level through base-pairing with the 3’-untranslated region (3’UTR) of target messenger RNAs (mRNA). miRNAs are a considerable part of the transcriptional output of the genomes of plants and animals, involved in the establishment of complex circuitries acting in many different phases of development and differentiation. It is currently estimated that miRNAs account for ~1 percentof predicted genes in higher eukaryotic genomes and up to 30 percentof the genes might be regulated by miRNAs. We are witnessing a shift in our understanding of the complexity in gene regulatory networks caused by the discovery of miRNAs. This PhD project joins the efforts of two research groups, an RNA molecular biology group and a bioinformatics group. The goal of the collaboration consists of analyzing several aspects of miRNA genomics, expression and activity, from the computational biology predictions, to the biological validation of inferences. Existing and new data will be integrated in a computational study aimed at simulating and drawing new cellular networks mediated by non coding RNAs. A successful approach to unravel the phenotypic role of individual miRNAs consists of detailed experimental studies applied to various model systems. Importantly, the problem can also be approached by computational methods, as we propose in this project. In summary, the target of this PhD project is threefold:
- According to the experimental findings of our molecular biology laboratory and by means of a computational approach, we intend to identify not yet known rules underlying RNA-protein interactions.
- We propose to build specific clusterization methods to manage the great amount of data obtained from global gene expression assays performed in our experimental lab and retrieved from public databases (both miRNA and mRNA expression data).
- The collected data will be used to identify new cellular regulatory circuitries involving miRNAs and mRNA.
Elias Athanasopoulos (opens in new tab)

Foundation for Research and Technology–Hellas (FORTH), Greece Research title: On the misuses of real world large scale distributed systems Supervisor: Prof. Evangelos Markatos, Microsoft Research supervisor: Thomas Karagiannis Summary: Distributed systems dominate in real world applications. Apart from the e-mail system or the World Wide Web, which are considered as large-scale distributed systems, other systems exist providing different applications to the end user. Systems for file sharing, video casting, social networking, and Voice over Internet Protocol (VoIP) are some examples of online communities that are composed by thousands of independent computer nodes. A significant number of these systems operates by using the peer-to-peer (P2P) paradigm. These systems are built over the cooperation of different computer entities and no central mechanism is used to instrument their operation. Such systems can be misused in order to provide different operations than the ones for which they were designed. More precisely, these online communities can be used as attack platforms against any computer that is connected to the Internet. In this project, we seek to develop a generic framework, that is capable of identifying these vital properties that make a distributed system vulnerable to possible misuse.

Emily Lines

University of Cambridge, United Kingdom Research title: Predicting long-term forest dynamics at regional scales: constraining models with data Supervisor: Dr. David Coomes, Microsoft Research supervisor: Drew Purves Summary: The response of forest ecosystems to climate change is a major source of uncertainty in predictions of future climate. The project will use novel Bayesian methods, in conjunction with very large ecological data sets, to estimate a complete set of forest simulation model parameters for each of the major tree species over a large Mediterranean region (Spain). Model simulations will then be used to test various hypotheses about the climatic control of vegetation, and to predict the potential response of Spanish forests to future climate change.

Eno Töppe (opens in new tab)

TU Munich, Germany Research title: Vision for graphics – High level image manipulation using uncertain preimages Supervisor: Prof. Daniel Cremers, Microsoft Research supervisor: Carsten Rother Summary: Manipulation and editing of photographs has moved from Hollywood and professional photography studios into the home. We want to turn our photographs from mere factual records into evocative mementos—even into works of art. However, despite the great leaps in usability that digital image editing allows, most existing software (whether commercial or in research papers) is focused on low-level editing (erasing, cut and paste, and so forth). The challenge addressed by this research proposal is to allow for editing on a higher level of image interpretation. Although such systems have been suggested many times, they fall at the first hurdle: We cannot build reliable algorithms to recover high-level information—such as shape, lighting, object identity, or even camera settings (aperture, focus, and so forth)—from real-world images. The reasons for this are not simply that existing algorithms must solve a complex optimization problem, but that the images themselves are ambiguous: many 3D scenes (or preimages) may give rise to the same image. For example, under different lighting, a shiny cylinder may look identical to a nearly matt prism. We propose to incorporate this uncertainty into common editing tasks explicitly, marginalizing over the probability distribution on preimages to accomplish the following:
- Generate realistic edits despite uncertain scene interpretations.
- Guide the user in choosing among equivalent preimages, but only to the extent necessary to complete each editing task.
Enuo He

Dorothy Hodgkin Postgraduate Award, University of Oxford, United Kingdom Research title: Stochastic modelling of the yeast cell cycle Supervisors: Prof. Béla Novák and Dr. Andrew Dalby, Microsoft Research supervisor: Andrew Phillips Summary: In the cell division cycle, cells replicate their DNA, segregate the duplicated sister chromatids and divide into two daughter cells. The unidirectional repetitive of cell cycle is one of the essential features to maintain proliferation of life. The irreversibility of cell cycle transitions is controlled by a complex network containing cyclin-dependent protein kinases (Cdks) and their regulators. Budding yeast due to its special genetic characters has been intensively studied both experimentally and in sillico. In particular, a large number of deterministic models have been developed for its cell cycle regulations based on ordinary differential equations (ODEs). Recently, a trend of using stochastic techniques has been spotted in this field. The main advantage of stochastic methods is that it is able to capture the noise and fluctuations that are ubiquitous in the cells. The purpose of this work is twofold: first, we study and formulate the existing deterministic Novak-Tyson model in terms of elementary chemical reactions, which can be properly simulated using stochastic algorithms (e.g., SSA, chemical Langevin equations, etc.). In the second part, we will focus on dynamic features and the effects of noise in our model by comparing different stochastic simulation outcomes. We plan to enrich our stochastic model with explicit representations of more completed cell cycle networks and try to explain more observations in specific experimental conditions by employing different stochastic techniques.

François Dupressoir (opens in new tab)

The Open University, United Kingdom Research title: Verifying implementations of security protocols in C Supervisor: Dr. Jan Jürjens, Microsoft Research supervisor: Andy Gordon Summary: One of the successes of formal methods in computer security is the development of tools to analyze abstract specifications of cryptographic protocols against high-level security requirements (such as confidentiality, privacy, and authenticity properties). Despite recent advances in analyzing the implementation code of security protocols [JY05,GP05,BFG06,BFG06a,Gor06,Jur06,Jur07], there are no tools shown to analyze pre-existing protocol implementations in C against high-level security requirements, although tools to check lower-level properties, such as memory safety, do exist. Inventing tools to verify high-level properties of the code—as opposed to abstract specifications—of security protocols is important: in part, because implementation code is often the first and only place where the full details of a protocol are formalized, and in part because even if there is a detailed formal specification, the implementation may introduce additional vulnerabilities. The goal is to start with pre-existing implementations of cryptographic protocols and to verify them against security properties by using formally sound yet practically usable analysis techniques and tools. An important objective is to ensure that these analysis methods are practical by keeping the overhead of their use as small as possible. First, they take input artifacts that are already available in current industrial software development (the program source code). Second, the tools should be reasonably easy to use and have a strong emphasis on automation. Manual effort (such as inserting annotations into the code) should be minimised. The goal of the security analysis is to find code vulnerabilities that give rise to attacks within the Needham-Schroeder threat model (such as man-in-the-middle, replay, or impersonation attacks). The project does not primarily aim to detect buffer-overrun vulnerabilities, as there many existing techniques (including the CCured tool) for detecting or mitigating such problems. Our main aims are to develop a formal security verification methodology to verify implementations of cryptographic protocols in C against security properties, such as confidentiality and authenticity, and to apply the methodology to the implementation of a widely-used security protocol (such as the Internet security protocol standards SSL/TLS, SSH, or Kerberos).

Iain Whiteside (opens in new tab)

University of Edinburgh, United Kingdom Research title: Proof engineering: refactoring proof Supervisor: David Aspinall, Microsoft Research supervisor: Georges Gonthier

Jagadeesh Gorla (opens in new tab)

Dorothy Hodgkin Postgraduate Award, University College London, United Kingdom Research title: Unified relevance models for information retrieval Supervisor: Dr. Jun Wang, Microsoft Research supervisor: Stephen Robertson Summary: In information retrieval, relevance is a very important concept and has been heavily studied. There are two different views on how to assign a probability of relevance of a document to a user need, namely document-oriented and query-oriented views. The classic probability model of information retrieval takes the query-oriented view, while the language modeling approach to information retrieval builds upon the document-oriented view. However, neither view represents the problem of information retrieval completely. This proposed PhD research programme aims at breaking the partial views that exist in current retrieval models, and formally develop a new retrieval paradigm that explores more of the dependency in the data and unifies evidence from different information sources to improve retrieval performance. To achieve our goal, we will do the following:
- Explore the dependency between users who have similar information needs
- Combine evidence from different information sources
- Apply Bayesian probabilistic approaches to model the unified relevance
James Auger

Royal College of Art, United Kingdom Research title: Human/Robot Cohabitation: an interaction-focused approach to designing robots for the home Supervisor: Prof. Anthony Dunne, Microsoft Research supervisor: Alex Taylor Summary: Over the coming years, robots are destined to play a significant part in our daily lives. However, how will we interact with them? What kind of new interdependencies and relationships might emerge in relation to different levels of robot intelligence and capability? Given a choice, what would we like to happen and how would we like our robots to exist in our homes? This project will explore and develop new interactions between people and robots and use them as a starting point for robot design.

Jurgen Van Gael

University of Cambridge, United Kingdom Research title: Machine learning models for large-scale systems and networks Supervisor: Prof. Zoubin Ghahramani, Microsoft Research supervisor: Ralf Herbrich Summary: In the past 15 years, some of the most beautiful and useful machine learning tools have been motivated by problems in biology, natural language processing, computer vision, and other fields. In all these areas, statistical approaches have been crucial for dealing with uncertainty. In this application, we propose to explore new directions in statistical machine learning motivated by challenging problems in the field of computer systems and networking.

Laura Dietz

Max Planck Institute for Software Systems, Germany Research title: Probabilistic Topic Models to Support a Scientific Community Supervisor: Prof. Dr. Tobias Scheffer, Microsoft Research supervisor: Ralf Herbrich

Mariano Díaz (opens in new tab)

Dorothy Hodgkin Postgraduate Award, Imperial College London, United Kingdom Research title: Modelling integrated signalling networks in stomatal guard cells Supervisor: Dr. Radhika Desikan, Microsoft Research supervisor: Jasmin Fisher Summary: Stomata are microscopic pores on the surface of leaves that control water and gas exchange between the plant and the environment. It is estimated that around 65 percent of the Earth’s fresh water passes through these stomatal pores in plants. With global climate change and the lack of availability of water in several parts of the world, an increased understanding of how plants regulate stomatal closure potentially has a significant impact on plant yield and productivity. Stomatal closure occurs in response to a number of input signals, including drought stress, pathogen attack, hormone challenge, humidity, and high carbon dioxide. Guard cells surrounding stomata are highly specialised cells that control the opening and closing of the stomatal pore. A number of signalling pathways that occur in guard cells in response to each of the stimuli mediate the outcome, stomatal closure. Although much research efforts are focused on studying the effects of a single stimulus (for example, the hormone abscisic acid, made in response to drought stress) on stomatal apertures, there is little known about how different stimuli interact to regulate stomatal apertures. It is the purpose of this project to understand, via modelling, how combined signals lead to a cellular response different to that derived from a single stimulus. Ultimately, it will lead to an increased understanding of plant behaviour in response to multiple environmental stresses and the design of better experiments that help reduce water loss from plants. Using available data, it is proposed that construction of simple models to understand signal integration in guard cells will help predict subtle cellular changes in response to the environment. The project will aim to build a model: to help identify which of the key “nodes” already identified are essential for signal integration; to understand how disruption of one signalling pathway can lead to a maximum change in output; and ultimately, to be able to predict a change in the output with variations in input signals.

Masoud Koleini

University of Birmingham, United Kingdom Research title: Verification of statebased access control Supervisor: Dr. Mark Ryan, Microsoft Research supervisor: Moritz Becker Summary: We aim to provide a method that allows fine-grained decentralised access control systems to be verified against abstract high-level properties of a kind that can be set and understood by managers. This kind of verification will give assurance of the correctness of the low-level ACS. We will build a modelling language that supports decentralised systems, delegation, roles, and statebased permissions; and a method for verifying systems described in that language against abstract policies. If time permits, we will also explore whether fine-grained access control systems can be synthesised from the high-level properties.

Mehdi Hosseini

Dorothy Hodgkin Postgraduate Award, University College London, United Kingdom Research title: Understanding resource-constrained information retrieval Supervisor: Prof. Ingemar Cox, Microsoft Research supervisor: Natasa Milic-Frayling Summary: Search engines need to index a huge amount of data (billions of Web pages) and must deal with very high query loads (hundreds of thousands of requests per hour). This places serious strains on the underlying computer systems. This proposal seeks to develop a theoretical framework in which to understand the tradeoffs between performance and cost, where performance is measured with respect to the quality of the retrieved results and cost is a function of hardware costs and economic costs associated with degraded performance. The theoretical developments will be supported by empirical studies to verify the theoretical models and guide the development of improved models.

Mihhail Aizatulin (opens in new tab)

The Open University, United Kingdom Research title: Verifying implementations of security protocols in C Supervisor: Dr. Jan Jürjens, Microsoft Research supervisor: Andy Gordon Summary: One of the successes of formal methods in computer security is the development of tools to analyze abstract specifications of cryptographic protocols against high-level security requirements (such as confidentiality, privacy, and authenticity properties). Despite recent advances in analyzing the implementation code of security protocols [JY05,GP05,BFG06,BFG06a,Gor06,Jur06,Jur07], there are no tools shown to analyze pre-existing protocol implementations in C against high-level security requirements, although tools to check lower-level properties, such as memory safety, do exist. Inventing tools to verify high-level properties of the code—as opposed to abstract specifications—of security protocols is important: in part, because implementation code is often the first and only place where the full details of a protocol are formalized, and in part because even if there is a detailed formal specification, the implementation may introduce additional vulnerabilities. The goal is to start with pre-existing implementations of cryptographic protocols and to verify them against security properties by using formally sound yet practically usable analysis techniques and tools. An important objective is to ensure that these analysis methods are practical by keeping the overhead of their use as small as possible. First, they take input artifacts that are already available in current industrial software development (the program source code). Second, the tools should be reasonably easy to use and have a strong emphasis on automation. Manual effort (such as inserting annotations into the code) should be minimised. The goal of the security analysis is to find code vulnerabilities that give rise to attacks within the Needham-Schroeder threat model (such as man-in-the-middle, replay, or impersonation attacks). The project does not primarily aim to detect buffer-overrun vulnerabilities, as there many existing techniques (including the CCured tool) for detecting or mitigating such problems. Our main aims are to develop a formal security verification methodology to verify implementations of cryptographic protocols in C against security properties, such as confidentiality and authenticity, and to apply the methodology to the implementation of a widely-used security protocol (such as the Internet security protocol standards SSL/TLS, SSH, or Kerberos).

Mohammed Al-Loulah

Dorothy Hodgkin Postgraduate Award, Lancaster University, United Kingdom Research title: Embedded broadband ultrasonic positioning for mobile computing and sensor networks Supervisor: Dr. Mike Hazas, Microsoft Research supervisor: James Scott Summary: Recent research trends have seen localisation systems move away from a heavy reliance on pre-installed infrastructure, and towards being increasingly embedded and deployable in an ad hoc fashion. However, such embedded implementations lack the robust performance and scalability of their infrastructural counterparts. This proposal argues that sensing over broadband ultrasonic channels is a promising avenue to practical embedded positioning systems. Broadband signalling can provide measurement robustness, vast scalability, and extraction of richer information from positioning signals. Piezopolymer ultrasonic transducers and real-time FPGA embedded signal processing will be combined to create sensor nodes, which will be used to explore and experimentally characterise design aspects such as signal structure, Doppler analysis, array manifolds, and beamforming techniques.

Olle Blomberg

University of Edinburgh, United Kingdom Research title: Cognition embodiment and interaction design Supervisor: Prof. Andy Clark, Microsoft Research supervisor: Richard Harper Summary: The cognitive role of body and environment is now a central concern in all of the cognitive sciences. Today, cognitive science is “embodied,” “situated,” and “distributed.” We can refer to this conglomerate of research as Embodied Cognitive Science (ECS). A similar flight has taken place in design-oriented research on how people interact with information technologies. While Human-Computer Interaction (HCI) took off by studying the individual disembodied mind interacting with a desktop computer, the newer field of Computer Supported Cooperative Work (CSCW) is permeated with ethnographic studies of how teams interact with artifacts in organisational environments. Interaction is now “tangible” and computing “ubiquitous,” not just on the desktop. We can refer to this strand of research as Interaction Research (IR). While some researchers have been involved in both ECS and IR, the two flights are not as closely related as one might think. The perspectives that have gained ground in IR mainly come from the social—rather than the cognitive—sciences (for example, ethno-methodology, conversation analysis, and actor-network theory). In ECS, on the other hand, the main thrust—at least originally—came from biologically-inspired works in robotics and artificial life, rather than from naturalistic studies of human beings. Recently, however, the relation between cognition and embodied (social) interaction appears to have become a central issue in both IR and ECS. The aim of the project is to investigate whether, and in which ways, theoretical concepts and empirical findings from ECS and IR can cross-fertilise each other. The aim is not to settle for philosophical clarification but also to propose new concepts and concrete analytical tools that can inform empirical research (in, for example, CSCW). Since IR has developed with largely unexamined assumptions about the nature and role of cognition in interaction, while cognitive science has developed with equally unexamined assumptions about the nature and role of interaction in cognition, both ECS and IR have much to gain from this type of investigation.

Rayna Dimitrova

Saarland University, Germany Research title: Automatic Abstraction for Complex Partial Designs Supervisor: Prof. Bernd Finkbeiner, Microsoft Research supervisor: Byron Cook

Shady Elbassuoni

Max Planck Institute for Software Systems, Germany Research title: Language Models for Structured Information Retrieval Supervisor: Prof. Gerhard Weikum, Microsoft Research supervisor: Stephen Robertson Summary: The success of knowledge-sharing communities like Wikipedia and the advances in automatic information extraction from textual and Web sources have made it possible to build large “knowledge repositories” such as DBpedia, Freebase, and YAGO. These collections contain structured data that can be viewed as graphs of entities and relationships (ER graphs) or, alternatively, triples in the Semantic-Web data model RDF. Similarly, advanced user intentions can be expressed via structured queries that allow users to explicitly express entities, relationships, and concepts. In such context, queries are typically expressed in the W3C-endorsed SPARQL language or by similarly designed graph-pattern search. However, exact-match query semantics often fall short of satisfying the users’ needs by returning too many or too few results. Therefore, IR-style ranking models are crucially needed.

Simon Schubert

EPFL, Switzerland Research title: A framework for overlay bandwidth awareness and allocation Supervisor: Prof. Dejan Kostic, Microsoft Research supervisor: Ant Rowstron Summary: The Internet landscape has changed as today’s Internet end-hosts are powerful and well-connected machines. The existing peer-to-peer (P2P) file-sharing applications leverage these capabilities and already consume more than 60 percent of all Internet traffic. With the percentage of broadband connections steadily growing, users are increasingly demanding easy access to entertainment. A new generation of P2P services is started to be deployed on the Internet, characterized as being real-time and high bandwidth, for example real-time video streaming and video-on-demand (VoD) services. These services are more demanding than file download because once the viewing starts, it ideally should not be interrupted. Further, users expect a low latency between starting the application and viewing the media. The hypothesis of this proposal is that the main problem in allocating bandwidth in the existing systems is that there exists an inherent limit to using local decisions in a large-scale system without having any access to global state. In the single overlay case, the problem can manifest itself as degraded, unsatisfactory performance for all nodes. In the case of multiple overlays, the problem materializes as an over-allocation of bandwidth to a lower-priority overlay (for example, in-overlay replication) that can dramatically reduce the performance delivered to what the application deems to be a high-priority overlay (for example, video streaming). In this context, we propose to design and implement a framework for multi-overlay bandwidth awareness and allocation. The key insight behind this proposal is that some amount of global information should be collected in an efficient manner and made available to the system participants. One of our main goals is, therefore, to propose various bandwidth allocation policies and to design, implement, and evaluate mechanisms for enforcing these policies—even in cases of multiple overlays that are competing for the same network bandwidth.

Yana Mileva (opens in new tab)

Saarland University, Germany Research title: Learning from changes Supervisor: Prof. Andreas Zeller, Microsoft Research supervisor: Brendan Murphy

Zartasha Mustansar

Dorothy Hodgkin Postgraduate Award, University of Manchester, United Kingdom Research title: Extinct but still byte-ing: Using computational biology to bring predatory dinosaurs back to life Supervisor: Dr. Lee Margetts, Microsoft Research supervisor: Hillel Kugler Summary: When talking about dinosaurs in casual conversation, one is usually asked, “How do they know what dinosaurs really looked like?” Today, we are privileged to be able to investigate such questions by using advanced computational biology. The proposed programme of research seeks to use state-of-the-art computational techniques to reverse engineer the walking cycle of a predatory dinosaur. Software for image-based modelling, parallel finite element analysis, and evolutionary robotics will be coupled and deployed over a computational grid comprising many thousands of processors. This resource will facilitate a range of unique meta-experiments, enabling new scientific research to be undertaken that would otherwise not be possible. Not only will these experiments provide insight into the evolution of bipedalism or answer questions such as “How fast could T-Rex run?”, they will also lead to the development of computational techniques that may be applied in classical engineering sectors such as the automotive and aerospace industries.
Axel Rack

Freie Universität Berlin, Germany Research title: New mathematical methods for identification of proteins in mass-spectrometry bases protein profiles Supervisor: Prof. Dr. Christof Schütte, Microsoft Research supervisor: Vassily Lyutsarev Summary: Mass spectrometry (MS) based techniques have emerged as a standard for large-scale protein analysis. The ongoing progress in terms of more sensitive machines and improved algorithms led to a constant expansion of its fields of application. Recently, MS was introduced into clinical proteomics with the prospect of early disease detection using proteomic pattern matching. MS based characterization and identification of peptides in blood appears to be one of the arising key technologies for biomarker discovery, understanding of biological mechanisms, and consequently, it might offer new approaches in drug development. However, the correct identification (ID) of peptides and proteins based on MS signals is challenging due to the complexity of the data, which is further amplified by protein modifications or mutations, cleavage errors, homologous proteins, and sample contamination. Therefore, the capabilities of standard approaches to efficiently perform an exhaustive ID are limited and only a small number of known modifications can be considered. The vastly increased amount of data generated with our sensitive MS processing pipeline will allow for a greatly improved rate of identified proteins / peptides if appropriate ID algorithms are available. We think that the application of sophisticated combinatorial optimization techniques will provide the required means to efficiently perform an exhaustive ID. This means to be able to include large numbers of modifications as well as additional, biologically motivated supportive information, such as modification frequency statistics or cleavage statistics. The project has the following main aims:
- Development of a sensitive ID algorithm that is able to efficiently cope with big datasets that integrates reasonably high numbers of protein modifications and other biological side constraints.
- Integration of this algorithm into the existing MS data processing pipeline to provide a comprehensive framework for MS based analyses.
- Development of a Web-service based front-end to make the MS data processing pipeline readily available to a broad range of scientists via the Internet.
Ben Calderhead

University of Glasgow, United Kingdom Research title: Bayesian inference in systems biology: Modelling organ specificity of circadian control in plants Supervisor: Prof. Mark Girolami, Microsoft Research supervisor: Drew Purves Summary: Two biological processes in which biochemical oscillatory behaviour is observed are the cell-cycle and circadian rhythms. Much progress has been made in the development of mechanistic models to support the study of the underlying biochemical processes controlling such rhythmic behaviours. Ongoing collaboration between the M Girolami and the laboratory of Prof. Hugh Nimmo and Prof. Gareth Jenkins (Both from the Plant Molecular Science Group, Division of Biochemistry and Molecular Biology, Institute of Biomedical and Life Sciences, University of Glasgow) motivates a highly novel investigation to infer mechanistic models, employing inference methods over parameters and structure, which will seek to describe the underlying mechanism(s) responsible for tissue specificity in the circadian control of plant metabolism. The Nimmo Laboratory have made the startling experimental observation that certain genes within Soybean, and more recently in Arabidopsis Thaliana, whose expression are under control of the circadian clock, possess the property that the circadian control is tissue specific. There is currently no detailed definition or adequate model describing the underlying biochemical mechanism(s) of this phenomenon and there is an immediate need to investigate this remarkable property. Indeed current mechanistic models of circadian control within plants do not account for possible communication between different tissues (e.g. leaves and roots) nor do they explain stimulation of circadian control genes (such as CCA1/LHY) in roots being stimulated by sugars rather than by light as in shoots. The available experimental data, in the form of time-course microarray gene expression which the Nimmo Laboratory have produced (and will be producing during the course of this PhD), presents an outstanding opportunity to systematically, in a hypothesis driven manner, explore the space of plausible oscillatory control models which provide mechanistic descriptions of the phenomenon observed.

Bogdan Ciubotaru

Dublin City University, Ireland Research title: Quality-oriented handover scheme for adaptive multimedia streaming in heterogeneous wireless networks environment Supervisor: Dr. Gabriel-Miro Muntean Summary: This project aims at proposing a client-server mechanism employed for quality-oriented multimedia streaming services that monitors current delivery conditions, analyses network performance, cost and end-user perceived quality parameters and takes decisions in order to maximize user experience with the service. The decision involves switching from one network to another and adjusting multimedia content parameters such as resolution and/or frame rate in order to maximize the end-user perceived quality and to minimize the cost the user has to pay for the streamed multimedia content. The novel idea is that instead of waiting for the quality to decrease due to either signal power decrease as the mobile multimedia user reaches network’s maximum range or due to the increase in loss due to network’s load, the proposed scheme actively finds a better solution.

Eugenio Giordano

Universitá degli Studi di Bologna, Italy Research title: Vehicular networks: Safety and beyond. Challenges for the next generation of mobile networks Supervisor: Prof. Marco Chiani, Microsoft Research supervisor: Ant Rowstron Summary: Recent advances in wireless communications and electronics have enabled the development of low-cost, low-power, multifunctional sensor nodes. The simultaneous use of several sensor nodes, which include components for sensing, data processing, and communication, leverage the idea of sensor networks. Indeed, a sensor network is composed of a large number of sensor nodes that are densely deployed either inside the observed phenomenon or very close to it. This scheme allows gathering distributed information about any parameter of interest. Nodes participating in a sensor network could be both, static and mobile, even highly mobile as in the case of vehicular sensor networks (VSNs). VSNs represent a particular case of sensor networks in which vehicles also embody sensor nodes. Therefore, vehicles evolve from simple transportation means to moving nodes able to gather data from the surrounding environment through their embedded sensors and distribute them to other nodes via wireless communications. This augmented capabilities, coupled with the large number, spread diffusion, and wide mobility range of vehicles, make them perfectly suited for monitoring factors of public interest such as air pollution level or temperature. Nevertheless, several traditional and new vehicular contest-related challenges have to be addressed before making this technology available for the pubic: mobility model tools, routing and topology control algorithms, and application layer protocols. In this context, few contributions have yet been proposed to specifically support sensor networking through vehicles. Furthermore VSNs, compared to traditional sensor networks, present both simplifications ad new challenges. The simplifications can be mainly identified in the absence of power related issues; the new challenges are related to the high vehicular mobility, high density of nodes, and absence of supporting infrastructure. The field of study of VSNs includes issues related to Vehicular Ad Hoc Networks (VANET), wireless sensor networks and delay tolerant networks (DTNs).

Florian Zuleger (opens in new tab)

Technical University of Munich, Germany Research title: Automatic derivation of loop bounds for worst case execution time analysis Supervisor: Prof. Dr. Helmut Veith, Microsoft Research supervisor: Byron Cook Summary: In many industries including robotics, consumer electronics, avionics, automotive, and manufacturing, the system components must interact according to a stringent real-time schedule. It is therefore crucial for system engineers to have a good understanding of the software execution time. Although there has been significant success in analyzing the program flow and the effects of advanced hardware features such as pipelining and cache behaviour, state-of-the-art tools work automatically only for simply structured programs. In particular, existing tools for estimating worst case execution time (WCET) have difficulties to analyze loop bounds for all but very simply structured loops. The goal of the proposed dissertation is to leverage the methods developed for Microsoft’s TERMINATOR tool for WCET analysis, i.e., for automatic derivation of loop bounds. The main challenge in the proposed work stems from the fact that state-of-the-art termination analysis methods employ quite sophisticated non-constructive mathematical methods, and do in most cases not allow to derive a loop bound directly. As argued above, a practically useful tool for the derivation of loop bounds will be valuable for embedded systems, but also for response time evaluation in a large number of software products ranging from desktop software all the way to device drivers.

Grégory Théoduloz

École Polytechnique de Lausanne, Switzerland Research title: Combining software verification and testing Microsoft Research supervisor: Byron Cook Summary: The topic of the proposed research falls within the area of software quality tools. In particular, we propose to draw on techniques from program analysis, software model checking, and automatic test-case generation, in order to develop a program verification theory and tool that combines the efficiency of testing-based approaches for finding bugs with the rigor of abstraction-based approaches for finding proofs of correctness.

Jennifer Pearson (opens in new tab)

University of Swansea, United Kingdom Research title: Digital story systems Supervisor: Prof. Dr. Harold Thimbleby, Microsoft Research supervisor: Richard Harper Summary: ‘Digital stories’ were invented in Berkeley in 1994 and represent a manageable way to create compulsive audio/visual narratives. Little research has been done on them. What is the best format? What tools should be used to construct them? How can databases of stories be managed? However, it is clear that they could be used effectively for many purposes, from UCD to accident debriefing. They could be constructed on mobile phones (so there is an enormous market) except there are no tools to do so, and no user-centred development of the tools. There are three strands to this research: develop tools (to author and to manage) digital stories; to do usability studies to evaluate and improve them; to do generic studies to refine the formats of digital stories.

Jorge Peña

King’s College London, United Kingdom Research title: Novel computing methods for detailed pan-tropical water resource futures simulation Supervisor: Dr. Mark Mulligan, Microsoft Research supervisors: Drew Purves, Kentaro Toyama Summary: Tropical water resources are under an unprecedented level of demand for agricultural, industrial, domestic, hydropower and navigation functions whilst also being heavily affected by continuing land use change in the tropical mountains and lowlands as well as global climate change. Tropical forests and páramos provide much of the high quality water for drinking water and hydroelectric power purposes supplying large cities throughout the tropics. Yet these areas are undergoing, significant land use change with associated detriment to both water quantity and quality. The FIESTA water resources simulation model, developed by Dr. Mulligan at King’s College London, which is the most comprehensive spatial simulation model for tropical hydrology, has been used throughout Latin America to quantify water resources provided by mountain regions to populated areas in the countries as well as impacts from land use and climate change towards the assessment of the utility of implementing financial mechanisms called payments for environmental services schemes (PES) to support long lasting conservation initiatives in this areas. The project PhD will combine computing science and environmental science to further enhance the models and systems for a pan-tropical application of understanding and protecting water resources. Building on the work of my previous PhD student (joint with KCL computer science, Waleed Alsabhan, PhD completed) in developing a real time online GIS-based catchment hydrological model and incorporating our recent databases including a global database of dam locations, we will tackle these problems within the PhD in order to build, apply and make available detailed, tropics hydrological (100 m resolution) assessment data. To achieve this, significant computational and algorithms will be developed and the modelling system will be converted for use in 64-bit and parallelization computing environment. The resulting information will be made available intuitively to decision makers by using recent MS developments in online mapping (virtual earth and virtual earth 3D).

Khilan Gudka (opens in new tab)

Imperial College London, United Kingdom Research title: Optimisations for the performance of programs using atomic blocks Supervisor: Prof. Susan Eisenbach, Microsoft Research supervisor: Tim Harris Summary: The main aim of this research is to look at ways of improving the performance of atomic blocks to make them applicable for real-world software. This will be in the form of static as well as dynamic optimisations that lead to increased concurrency and better throughput with more emphasis given to multi-core/multi-CPU systems, the growing widespread availability of which is the reason for why concurrency is now of extreme importance. Most work carried out on lock inferencing has ignored the scenario of where there are multiple processing units. Furthermore, it seems evident from existing work that neither transactional memory nor lock inferencing is ‘the’ solution, given that they each have their strong and weak points and are thus more applicable for certain applications than others. Hence, an additional area to look at for increasing the performance of atomic blocks is the combination of the two approaches. This would also be looked at as part of the research. Given the current state-of-the-art, developers that require performance from their software will most probably avoid using atomic blocks. The next step is to look at improving in this area.

Nick Taylor (opens in new tab)

University of Lancaster, United Kingdom Research title: Supporting village community through connected situated displays Supervisor: Dr. Keith Cheverst, Microsoft Research supervisor: Shahram Izadi Summary: The central aim of the proposed research is to understand the way in which the physical placement and design of networked displays in a rural village can influence and facilitate notions of community within the setting. It is apparent from our own research and related literature that it is essential to understand the social and physical richness of a given setting. Consequently, the planned research will draw from a range of approaches including ethnographic studies, use of cultural and technology probes, focus groups and design workshops. The proposed setting is the village of Wray in Lancashire—a small, remote village in the north of Lancashire with a population of less than 500. The planned research methodology is iterative: observe, design and deploy, observe… Where these stages are closely coupled and all hold key (technical and practical) challenges. The planned research should produce contributions in the areas of techniques for i) understanding requirements for supporting community through situated display based technologies, ii) design techniques including insights into PD techniques in this domain and the suitability of techniques such as the use of cultural and technology probes, and iii) greater understanding of the difficulties of deploying situated displays in a rural village (albeit one with enhanced connectivity).

Olga Morawczynski

University of Edinburgh, United Kingdom Research title: An ethnography of m-payments in rural Kenya: The case of m-Pesa Supervisor: Dr. James Smith, Microsoft Research supervisors: Alex Taylor, Jonathan Donner, Nimmi Rangaswamy Summary: There has recently been some attention given to how mobile phone applications, such as m-payments can be used to engender local improvements. The general consensus is that these applications provide new opportunities for low-income individuals to engage in financial transactions and to increase their personal income. In this project the PhD student will engage in an ethnographic analysis of m-Pesa—an m-payment application that has recently been launched in Kenya. The student will focus on how micro-entrepreneurs in this context appropriate this application, and how such appropriation can engender local level improvements. The outcomes of this research are as follows: the dissemination of the research findings amongst Microsoft researchers in the form of reports and seminars, and the publication of research in peer-reviewed journals.

Olivier Teboul (opens in new tab)

ANRT CIFRE École Centrale de Paris, France Research title: Large scale 3D modelling: Confluence of computer vision and graphics and architecture Supervisor: Prof. Nikos Paragios, Microsoft Research supervisor: Zhengyou Zhang Summary: Three-dimensional content is a novel modality used in numerous domains like navigation, post production, andcinematography, architectural modelling and urban planning. These domains have benefited from the enormous progress has been made on 3D reconstruction from images. Such a problem consists of building geometric models of the observed environment. State of the art methods can deliver excellent results in a small scale but suffer from being local and cannot be considered in a large scale reconstruction process since the assumption of recovering images from multiple views for an important number of buildings is rather unrealistic. On the other hand several efforts have been made in the graphics community towards content creation with city engines. Such models are purely graphics-based and given a set of rules (grammars) as well as dictionary of architectures (buildings) can produce virtual cities. Such engines could become far more realistic through the use of actual city models as well as knowledge of building architectures. Developing 3D models/rules/grammars that are ‘image’-based and coupling these models with actual observations is the greatest challenge of urban modelling. Solving the large-scale geometric modelling problem from minimal content could create novel means of world representation as well as novel markets and applications. At the same time, like gesture analysis and facial modelling it is a great research challenge that requires an interdisciplinary effort. The main aim of our approach is to deduct very accurate models from images that are constrained from the nature of the application and then use models to solve challenging, ill-posed problems in imaging, vision and graphics.

Paul Dunphy

University of Newcastle, United Kingdom Research title: Design and analysis of usable security mechanisms Supervisor: Dr. Jeff Yan, Microsoft Research supervisor: Michael Roe Summary: Despite the potential benefits of graphical passwords their design space and usability remains largely unexplored. Indeed, both theoretical and empirical studies have revealed unexpected weaknesses in some schemes. We propose to investigate a number of interleaved usability and security issues that are crucial to the real world uptake of graphical passwords. These issues include: 1) methods that can enhance both memorability and security of graphical passwords; 2) applications of proactive checking in graphical passwords, and; 3) mechanisms to counter shoulder surfing. To a lesser extent we hope to develop more ecologically valid methods to evaluate the usability of authentication mechanisms. Typically when studying the usability of passwords there is no cost to the user if they under-perform. In the real world this cost manifests itself as denial of access to some desired service, for example, Internet banking. Added to his is the fact that participants are not protecting a resource of any value and so have no genuine motivation, only the reward offered by the experiment organisers. With that in mind it is hoped to develop ways to motivate participants more realistically. We hope to generate new ideas to apply to new schemes. Ultimately this will involve creation of one or more demonstrably usable and secure graphical password schemes. Our hypothesis is that graphical techniques can be just as secure, if not better than ubiquitous text passwords, in usability, memorability and security.

Philipp Hennig (opens in new tab)

University of Cambridge, United Kingdom Research title: Probabilistic modelling for computer Go Supervisor: Prof. David J.C. Mackay, Microsoft Research supervisor: Thore Graepel Summary: We propose to use probabilistic modelling to tackle the grand AI challenge of computer Go. Following recent breakthroughs in stochastic game tree search known as Monte Carlo Go, we suggest modelling the evaluations in a game tree in terms of graphical probabilistic models and view the game tree search as an inference problem. We anticipate this view to enable a better understanding of the structure of the game tree and to aid design of a strong computer go program.

Sebastian Faust (opens in new tab)

Katholieke Universiteit Leuven, Belgium Research title: Provable security at implementation-level Supervisor: Prof. Dr. Bart Preneel, Microsoft Research supervisor: Cédric Fournet Summary: Traditional provable security regards cryptographic algorithms as black boxes. An adversary may have access to inputs and outputs, but the computation within the box stays secret. Unfortunately, this model often does not match reality where an adversary can attack the algorithm’s implementation with more powerful attacks. An important example in this context, are side-channel attacks, which provide an adversary with a partial view on the inner working of hardware. The goal of this project is to develop theoretical models in which formally provable security guarantees can be made concerning the implementation of cryptographic algorithms and protocols.

Simon Youssef

Ludwig-Maximilians University, Munich Research title: Stochastic time series modelling of single cell assay data Supervisor: Prof. Joachim Rädler, Microsoft Research supervisors: Andrew Phillips, Luca Cardelli Summary: Single cell assays have been established recently, as a means to obtain high quality time-series data. At the same time process calculi have been inferred with stochastic simulation techniques and adapted to the needs of the systems biology community. This thesis aims at integrating both, in silico and in vitro data by providing a standard measure for comparison. This toolkit will illuminate key points of the interplay between model and experiment using all available readouts from an experimental system. Furthermore computational and experimental methods will be developed that improve the high-throughput gathering and processing of single cell data. Our method will be exemplified by applying it to the apoptosis pathway and the gene delivery process.

Tahir Mansoori (opens in new tab)

University of Oxford, United Kingdom Research title: Development of a biological imaging workbench to support systems level simulation in physiology Supervisor: Prof. David Gavaghan Summary: Our goal is to develop a suite of biological imaging tools to create an imaging workbench for the laboratory and computational scientist. Large scale, anatomically detailed simulation of biological and physiological systems is playing an increasingly vital role in our attempt to understand both normal and patho-physiology, via an iterative interplay between laboratory experiment and predictive in silico simulation. Underpinning all such simulations is the requirement for techniques to extract the necessary geometry, and to parameterise the underpinning mathematical models, ideally from non-invasive imaging techniques. The workbench, although being developed in this first instance the very mature research area of heart modelling, will find increasing application right across the life sciences. Many of the necessary tools will developed by extending existing tools in medical imaging, and this project is therefore likely to benefit from the existing collaboration between the Microsoft Research Laboratory in Cambridge and the Medical Vision Laboratory and Oxford e-Research Centre in Oxford in this area.

Usman Ali

Supélec-CNRS-University of Paris, France Research title: WIBOX – A robust video receiver allowing WIMAX video broadcasting and indoor WIFI retransmission Supervisor: Dr. Pierre Duhamel, Microsoft Research supervisor: Bozidar Radunovic Summary: The proposal intends to solve most problems encountered for receiving video through a WIMAX link (even with a very weak signal) and retransmitting it indoor. The intent is to be able to build the equivalent of a ‘triple play box’ for wireless home delivery of internet services (including video streaming) through WIMAX.

Varun Gulshan (opens in new tab)

University of Oxford, United Kingdom Research title: Inferring organization of image data using visual words Supervisor: Prof. Andrew Zisserman, Microsoft Research supervisor: Andrew Blake Summary: We propose to investigate new principles for the unsupervised organization of collections of images, with application to photograph collections, image databases, and image information on the web. Existing systems for clustering or associating images tend to use global measures (for example, histograms) which are sensitive only to gross, average properties of images, and are unable to focus on highly salient structure of a more localized nature. We plan to address this limitation by investigating the combination of two recent developments in analysis of images. The first is visual words—sparse visual features selected statistically for saliency, which have already proved powerful in video- and image-based retrieval systems. The second is hidden variable modelling that can be used to separate areas of images on the fly, so that areas of particular interest—for example, foreground—can be selected for special treatment. These principles will be demonstrated in an experimental system for ‘power annotation’ for interactive users.
Ana Costa e Silva

University of Edinburgh, United Kingdom Research title: Extraction of Tabular Information from Unstructured Documents in a Peer to Peer System Supervisor: Dr. David Robertson, Microsoft Research Supervisor: Prof. Christopher Bishop Summary: The principal aim of this project is to study how tabular data from traditional (loosely structured) sources may be extracted and integrated using methods of multi-agent coordination (specifically the LCC language and associated methods being developed in the OpenKnowledge project); ontology representation and (where applicable) machine learning. Similar existing methods are either so general that they cannot guarantee the accuracy of the extracted information without extensive user intervention, or so context-dependent that they do not operate outside their very specific domains. They are also focused on (inevitably somewhat unreliable) extraction from specific local sources which means that the integration issues when one has to combine information obtained from different sources in different contexts (and perhaps by different means) is not addressed. Using LCC we shall study the broader picture required for integration and address the issues of uncertainty and ontology mapping that emerge. We speculate that by limiting our attention to a specific form of data structuring (tabular) we will be able to perform better integration without becoming domain specific.

António José dos Reis Morgado

University College Dublin, Ireland Research title: Models and algorithms for the maximum quartet consistency problem Supervisor: Dr. Joao Marques-Silva, Microsoft Research Supervisor: Dr. Lucas Bordeaux Summary: Given a set of taxa S and a complete set of quartet topologies Q over S, the problem of determining a phylogeny that satisfies the maximum number of topologies is called the Maximum Quartet Consistency (MQC) problem. The MQC problem is NP-hard. In the recent past, MQC has been solved both heuristically and exactly. Exact solutions for MQC include Constraint Programming and Answer Set Programming. The proposed research work evaluates new modelling solutions for MQC, including Boolean Satisfiability (SAT), Pseudo-Boolean Optimization (PBO), Maximum Satisfiability (MaxSAT) and Satisfiability Modulo Theories (SMT). In addition to developing new modelling solutions for MQC, the proposed research work will also consider specific algorithmic optimizations for the modelling frameworks used, motivated by the experience of modelling and solving computational problems in Bioinformatics with a wide range of Boolean-based problem-solving frameworks.

Arie Middelkoop (opens in new tab)

Utrecht University, Netherlands Research title: Coping with complexity in compiler construction Supervisor: Dr. Atze Dijkstra Summary: As part of the development of a series of compilers (jointly called the EHC) for an extended subset of the programming language Haskell we have designed the Ruler language. From Ruler programs we can both generate the type rules involved (in LaTeX) and the attribute grammars that describe the corresponding implementation. In this way we keep documentation and implementation synchronized. One of the nice aspects of the Ruler language is that we can present the final product in a sequence of steps. In the generated output differences are clearly indicated, which makes the study of these there type rules much easier. Having such a unified description opens a whole new route for compiler construction, which we propose to follow. By extending Ruler to a full programming language—for example,by incorporating global program analyses, and the possibilities for specifying transformations of Ruler programs explicitly—we will be able to capture many common compiler idioms in a uniform way; a typical example of such an idiom can be found in the way Hindley-Milner type inference uses environments and type variables to solve the set of constraints specified by the type rules. Once we have generated attribute grammars we can of course apply all the traditional scheduling algorithms that enable us to compute the attributes in way that is both time and space efficient. Furthermore we expect our compilers to become much easier to adapt: we may switch between different constraint solving strategies, error-correction strategies, or instrument the compiler with extra computations that will enable efficient incremental evaluation. Finally we may use Ruler programs as a starting point for proving properties about the described type system.

Aryeh Rowe

Royal Holloway, University of London, United Kingdom Research title: Unified scalable access control Supervisor: Dr. Jason Crampton, Microsoft Research Supervisor: Dr. Andy Gordon Summary: Access control in open distributed computer systems is a challenging problem, which manifests itself at very different scales. At one end of the scale we need to control access by mobile code to a host machine, while at the other we need to provide different levels of access for millions of users of web services. Research in the last ten years has developed a range of solutions to this problem. Each of these solutions is devised for the problem at one particular scale. We believe that a unified model that provides a generic access control solution at all scales is required. Such a model will inevitably be informed by existing approaches, such as matching attributes of the entity requiring access to the roles defined within the environment containing the protected resource. However, we also believe that is necessary to develop a new model for authorization, based on formalisms that have been specifically developed for distributed computation such as the pi calculus or I/O automata.

Christian Steinruecken

University of Cambridge, United Kingdom Research title: Learning to recognise hierarchies of objects and scenes Supervisor: Prof. David MacKay, Microsoft Research Supervisor: Dr. John Winn Summary: Accurate recognition of objects in images is critical to a very large range of applications, from the self-drive car to aids for the blind. In order to recognise very large numbers of object classes, it is necessary to organise these classes in an appropriate hierarchy or hierarchies, to exploit similarities and relationships between the classes. The proposed research will investigate various organisational hierarchies for object classes and scenes, and demonstrate their advantages both for accurate recognition at appropriate levels of specificity, and for rapid learning of new classes. The project will culminate with the development of a large scale object recogniser that exploits the hierarchical class structure.

Christoph Rhemann

Vienna University of Technology, Austria Research title: New approaches to video matting: Eye-tracking and parallelization Supervisor: Dr. Margrit Gelautz Summary: In this PhD project, we want to perform cutting-edge research in video matting by developing new and efficient algorithms to quickly extract high-quality video objects from natural image sequences. We will develop a new level of advanced user interaction techniques based on eye-tracking technology and parallelization. This will allow us to design interactive matting approaches with iterative improvement of the matte by switching between user interaction and computing intermediate results at high processing rates.

Daniel Cederman (opens in new tab)

Chalmers University of Technology, Sweden Research title: Non-blocking synchronization for multi-core processors Supervisor: Prof. Philippas Tsigas Summary: The aim of this thesis is to investigate the impact of properties of multi-core processors in the design and performance of non-blocking data synchronization data structures. One specific property that these systems have is the fact that processors are placed on the chip and therefore have direct on-chip communications paths without the need to go out to external buses. Multithreading has mainly been something for server software and for heavy computational software designed to run on supercomputers. Multithreading for desktop applications though has not been as common. When the plethora of desktop applications have to adapt to multiprocessing and multithreading it is more important than ever to provide simple interfaces to these algorithms. Therefore this is an important goal for the thesis to achieve.

Guillem Rull Fort (opens in new tab)

Technical University of Catalonia (UPC), Spain Research title: Validation of mappings between data models Supervisor: Dr. Ernest Teniente Summary: The main goal of this research is to propose a method for testing whether a given mapping between models satisfies some desirable properties. The waywe expect to achieve it is through an extension of the CQC Method which we successfully applied to database schema validation.

Iris Miliaraki (opens in new tab)

National And Kapodistrian University of Athens, Greece Research title: Continuous RQL query processing on top of distributed hash tables Supervisor: Prof. Manolis Koubarakis, Microsoft Research Supervisor: Dr. Ant Rowstron Summary: We propose to study the efficient evaluation of continuous RQL queries on top of distributed hash tables. RQL is a declarative query language for RDF/RDFS databases with the ability to express both data and schema queries in a uniform manner. It is used in two of the most prominent RDF stores, Sesame and RSSDB. Distributed hash tables are overlay networks that allow nodes holding data items to self-organize and offer data lookup functionality in a provably efficient, scalable, fault-tolerant and adaptive way. The problem of efficient evaluation of continuous RQL queries on top of distributed hash tables is essentially an open problem at the moment. We plan to extend previous work done in our group in order to deal with conjunctive queries in RQL. Then, we will study how to integrate RDFS reasoning in our algorithms so that schema queries in RQL are answered efficiently. Our next step will be to consider special cases of conjunctive queries (e.g., path queries) that will probably be amenable to more efficient query evaluation strategies. Finally, we will consider nested RQL queries and multiple query optimization issues. Our experimental work will be done by simulation or by using large scale distributed infrastructures such as PlanetLab. We will seek to demonstrate what trade-offs are possible between various performance metrics that need to be optimized in this setting e.g., number of hops, network latency, network bandwidth and load distribution under various data/query workloads.

Lynne Hamill (opens in new tab)

University of Surrey, United Kingdom Research title: The relationship between personal communication technologies and travel Supervisor: Prof. Nigel Gilbert, Microsoft Research Supervisor: Prof. Richard Harper Summary: Developments in communications technologies over the last few decades are such that it would be expected that the demand for travel would fall. Yet the demand for both communication and travel has soared. The key question to be addressed in this research is: in what circumstances do new communication technologies increase the demand for travel and in what circumstances do they reduce it? Communicating costs time and money, and the distribution of both types of costs varies with the nature of the channel between sender and receiver. Different channels provide different benefits but the size of the communication network is probably the most important (Metcalfe’s Law). Drawing on sociology, economics and social psychology, historical and contemporary case studies will be used as the basis for developing social simulation models using dynamically interacting rule-based agents that are capable of accounting for the observed macro relationships. Data on travel and the use of communication technologies will be extracted from publicly available nationally representative sources such as the Family Spending, General Household and National Travel surveys. Limited original qualitative data collection may be undertaken by for example using diaries and interviews.

Marije Geldof

Royal Holloway, University of London, United Kingdom Research title: The role of ICT in empowering people with low-literacy levels in Africa Supervisor: Prof. Tim Unwin Summary: The aim of this research is to identify optimal ways in which Information and Communication Technologies (ICTs) can contribute to empowering the lives of people with low-literacy levels in Africa. The particular contribution to this field will be through the research at the interface between interaction design and literacy. The first stage of the research has been a review of existing knowledge on the concept of literacy and interaction design. The next step will be field research in low-literate communities in Africa to identify the needs of the people that can improve their daily lives. Focus groups will be used to identify the necessary skills that are important in the daily life of the community. These will provide the basis for a ‘virtual card’ set that will be used to help community members categorize the skills and activities. This task will result in a ‘map’ representing the preferences and abilities of participants, which will be the starting point for more in-depth interviews. A meta-analysis of the results of the field research will subsequently be used to develop a model on how to enhance and enable the identified needs through the use of new technologies.

Martin Rohrmeier

University of Cambridge, United Kingdom Research title: Implicit learning of musical structure: experimental and computational approaches Supervisor: Dr. Ian Cross Summary: The proposed research aims to clarify our understanding of syntactic aspects of music by means of an integrated programme of empirical and computational studies. Investigation of the implicit learning of musical systems should provide insight into how humans actually acquire and apply knowledge about complex systems such as music and language, which rules they grasp (or not), and how they represent such knowledge. The research will focus primarily on melody and harmony as central domains in systematic and cognitive musicology, focusing initially on the formalisation of melodic structure. This will involve statistical approaches and will incorporate experiments on implicit learning of melodic structure as well as data-mining oriented corpus analysis. Subsequently, modelling work on harmony (in J. S. Bach’s chorales) that Mr Rohrmeier has already undertaken will be extended so as to formalise stylistic harmonic structure by means either of lexical-functional grammar or optimality theory. Experiments on harmony learning and tonality induction will be undertaken, and the results explored in the context of those of large-scale corpus analysis. Results from these two domains are expected to be applicable in automated musical style classification. In combination with experiments on subjects’ identification of relevant, information-carrying parameters, results may help improve current approaches.

Michael Johnson (opens in new tab)

University of Limerick, Ireland Research title: New machine learning paradigms for robots operating in a dynamic team based environment Supervisor: Dr. Martin Hayes Summary: This project will address the general problem of cooperative learning for a team of independent robots attempting to satisfy a common performance objective. The focus of the work will be on characterising the effect that a low power ad hoc network will have on the performance of a collective task. In particular the project will investigate how loss of sensor information or sensor malfunction impacts on robot performance in the pursuit of the objective(s) at hand. An example scenario would be robotic terrain mapping applications, security and/or biomed. In particular the project will ask how can energy consumption be optimised when certain (numerically well defined) doubts exist on the accuracy of the sensor feedback data.

Michael Pedersen

University of Edinburgh, United Kingdom Research title: High level languages for systems biology Supervisor: Prof. Gordon Plotkin Summary: Biologists use informal pictorial representations to express metabolic, signalling and regulatory pathways. However standardization is necessary for systematic model building and Goryanin et al are co-developing a standard notation, SBGN, to that end. However, the resulting diagrams are monolithic with no version support. Computer science experience show that linguistic approaches permit modular descriptions, and allow version support. Plotkin has developed a ‘Calculus of Chemical Systems’ to naturally express chemical reactions between (complexes of) modified proteins in nested compartments. It is translatable to diagrams (Petri nets) and has both qualitative and quantitative dynamics (stochastic and ODE). The student will design a yet higher-level language with primitives corresponding to those of SBGN, and expressiveness being demonstrated via a natural modular description of the interferon signalling pathways in macrophages. Translators between programs and diagrams will be written, allowing both pictorial representation of modular structure, and screen input of system components. Compilation to Petri nets will permit both qualitative and quantitative simulations. Biological specification languages may also be investigated. Modal logics as previously used are too low level and a more high level succinct language is needed natural for biologists, compiled to a low level language, and so linked to a standard model-checking package.

Michael Smith (opens in new tab)

University of Edinburgh, United Kingdom Research title: Automated stochastic abstraction of programs Supervisor: Dr. Jane Hillston Summary: We will investigate techniques for automatically deriving a stochastic model in PEPA from the code of a distributed system. The ultimate aim is to use existing highly successful methods for performance analysis of such models to analyze such systems directly, rather than only analyzing independently-constructed models.

Michael Verhoek

University of Oxford, United Kingdom Research title: Graph cuts applied to 3D+time ultrasound image spatio-temporal segmentation of the heart Supervisor: Prof. Alison Noble, Microsoft Research Supervisor: Prof. Andrew Blake Summary: We propose to investigate graph-cut based methods as the basis for spatio-temporal segmentation of 3D+time ultrasound sequences of the heart. This research will further the understanding of how graph-cut methods can be applied to solve spatio-temporal problems as well as potentially offer a new way to automatically derive quantitative 3D information about the health of the heart in a robust and efficient way. The latter would pave the way for cardiologists to make more informed decisions about disease assessment and treatment monitoring on the millions of people world-wide who have cardiac conditions.

Milan Raicevic (opens in new tab)

University of Durham, United Kingdom Research title: The end of the cosmic dark ages Supervisor: Prof. Carlos Frenk Summary: We propose a programme of supercomputer simulations of the evolution of the early universe. These will target the end of the ‘cosmic dark ages’ that followed the release of the heat of the primordial fireball 400,000 years after the Big Bang. The dark ages lasted about 500,000,000 years and were brought to an end by the formation of the first stars and quasars. Their radiation ionized the primordial gases making the universe transparent again to UV radiation. Little is known about the process of reionization. We do not know what exactly caused it, how long did it take or how the subsequent formation of galaxies was affected by it. To understand reionization it is essential to calculate how photons or particles of light propagate in the primordial soup of hydrogen and helium. Solving this radiative transfer problem can only be done by direct computer simulation and even then the problem is hugely challenging. We propose to develop novel techniques to tackle this problem and to implement them in a cosmological simulation code that follows the coupled evolution of dark matter and gas in the early universe. The simulations will be carried out in the Cosmology Machine at Durham, one of the largest supercomputers for academic research in Europe. The goal is to quantify how the universe emerged from the dark ages and to make predictions for observable properties that will be tested with the new generation of ground- and space-based observatories that are currently being built.

Sara Vicente

University College London, United Kingdom Research title: Optimisation techniques for computer vision applications Supervisor: Dr. Vladimir Kolmogorov, Microsoft Research Supervisor: Dr. Carsten Rother Summary: Markov Random Field (MRF) model has become an indispensable tool in computer vision and graphics. This model captures the fact that images are spatially coherent. In the past most research has been focused on simple MRF models, for example, Potts, quite likely due to the lack of efficient MRF optimisation techniques for more complex models. At the same time, complex MRFs are much more powerful; their importance for vision has been demonstrated in more recent articles. The goal of this PhD work is twofold: (i) develop novel and efficient optimisation algorithms for complex MRF models; (ii) extend MRF approaches to new and challenging computer vision and graphics applications, such as object recognition and alpha matting. It is important to note that these two tasks are typically coupled since a new application often requires an objective function which cannot be handled efficiently by existing methods.

Thomas Wies (opens in new tab)

University of Freiburg, Germany Research title: Symbolic data structure analysis Supervisor: Prof. Andreas Podelski, Microsoft Research Supervisor: Dr. Byron Cook Summary: The research topic is symbolic data structure analysis. The major theme of this topic is to combine program analysis techniques with symbolic reasoning (that is, with theorem proving) to analyze programs with unbounded data structures. Until now, software model checkers are inherently limited when it comes to the analysis of programs with heap-allocated data structures such as lists, trees, and dynamically allocated arrays. Therefore existing software model-checkers are not able to verify non-trivial properties of such programs (for example, that the output of a list sorting routine is a sorted list). Symbolic data structure analysis combines existing data structure analysis techniques (such as shape analysis) with symbolic reasoning (for example, with the theorem prover for tree structures MONA). This symbolic approach has potential advantages over existing (model-based) data structure analysis. First, symbolic data structure analysis generalizes techniques such as predicate abstraction which are used in software model checking. Therefore it smoothly fits into this approach and makes it easier to integrate this analysis into existing software model checkers. Second, it allows combined reasoning over multiple data types, for example, numbers and lists (needed for properties such as sorted-ness). In existing approaches such reasoning is very limited and requires substantial manual interaction. Next, the success of software model checking has shown that symbolic approaches tend to scale better in practice than non-symbolic approaches. In fact, the existing tools for data structure analysis are only applicable to small programs. Finally, this new approach has the potential of leading to the development of counter example-guided abstraction refinement. This promises an unmatched degree of automation with regard to existing approaches.

Umar Mohammed

University College London, United Kingdom Research title: Interactive video characters Supervisor: Dr. Simon Prince, Microsoft Research Supervisor: Dr. Andrew Fitzgibbon Summary: Creating and animating realistic human characters for interactive computer graphics is time-consuming and requires considerable expertise. A typical approach would create a simplified mesh describing the geometry and texture the surface with appropriate photographic images. Each polygon is associated with one or more bones and these are animated using motion capture data. The resulting characters can be flexibly controlled in real time. However, despite this complex procedure, they often look unrealistic and lack expression. In contrast, video data of real people is easy to capture, and produces very high quality footage. Unfortunately, there is no method for flexibly animating video footage of real people, so video footage cannot be used in interactive applications such as games. Our goal is to develop methods to blend together short captured video sequences of an actor to create a continuous video representation suitable both for creating pre-rendered sequences and for interactive applications such as games. In short, we aim to build a patch-based generative model of the space-time video footage of a single actor. We treat blending from between 2D/3D video-subsequences as a constrained texture synthesis problem within this model. We aim to produce a practical suite of tools for animators.

Vincenzo Gulisano (opens in new tab)

Universidad Politécnica de Madrid, Spain Research title: Large scale data streaming Supervisor: Prof. Ricardo Jiménez Peris Summary: There are many data streaming applications that will have to cope with massive data streams in which large clusters will be required to be able to cope with the input data stream. We foresee two classes of applications: 1) applications with many continuous queries that can scale by distributing the queries among a pool of sites; 2) Applications with one or a few queries but very massive data that will require parallelizing query operators among a large number of sites to be able to cope with the massive input streaming data. In order to realize a data streaming system able to satisfy the requirements of these two classes of applications for very large clusters (in other words, more than100 sites) it is a real challenge from a scientific point of view. We believe that the advances in Storage Area Networks (SANs) will enable a new kind of distributed support for data streaming. On one hand, SANs provide high-bandwidth low latency communication that enables end-to-end inter-host memory access in the realm of microseconds. This can be exploited for fast coordination among sites and agile load balancing. On the other hand, these networks are characterized by having Network Interface Cards (NICs) with their own processor with a reasonable capacity and able to access the host memory through DMA. Our previous work enables us to foresee that the envisioned highly scalable data streaming platform can be realized.
Abigail Durrant (opens in new tab)

University of Surrey, United Kingdom Research title: Designing photographic experiences to support autobiographical memory Supervisor: Prof. David Frohlich, Microsoft Research supervisor: Dr. Abigail Sellen Summary: Surprisingly little is known about the role of photographs in remembering life events. This project will examine this issue in relation to current theory in photography and memory, and use the findings to design new ways of capturing and consuming photographs.

Alessandro Duminuco

Institut Eurécom, France Research title: A peer-to-peer based file backup system Supervisor: Prof. Ernst Biersack, Microsoft Research supervisor: Pablo Rodriguez Rodriguez Summary: Peer-to-Peer systems have the interesting property of self-scaling, which means that the amount of resources grows with the number of participants. While there exist already a large number P2P systems for file sharing, very little work has been done in the area of using P2P systems for file backup. Typically, file back-up is done in a purely centralized manner. Such an organization requires a large amount of resources (disks, tape robot) and also some human intervention. On the other hand there is an increasing number of PCs each equipped with a local disk with a capacity of tens of Giga Bytes. The goal of the thesis is to investigate how the local disks of a large number of PCs can be organized in such a manner as to allow a highly reliable file back-up system. The thesis will first study existing approaches for distributed file backup and then design and implement its own backup system. It is envisioned to distribute each file that is backed up over multiple machines and to use error correcting codes for loss recovery. In order to evaluate the different design choices, a system model (machine availability, etc) needs to be defined and a performance evaluation will be carried out.

Alexander Spengler

Université Pierre et Marie Curie (Paris 6), France Research title: Machine learning with structured data. Application to XML document transformation Supervisor: Prof. Patrick Gallinari, Co-supervisor: Prof. Bernhard Schölkopf, Max Planck Institute for Biological Cybernetics Summary: Many domains and applications are concerned with complex data composed of elementary components which are linked according to some structural or logical organisation. In the text domain, for instance, the diffusion of structured data formats like XML and HTML has considerably changed the fields of information retrieval and information extraction. Other application domains concerned with structured data include biology, image processing, multimedia (video), natural language processing, social networks and many more. The aim of this thesis is to investigate the potential of different families of statistical learning methods for handling structured data and to derive and analyse new methods for different data mining tasks (including generic tasks like classification and clustering of structured data as well as problems where transformations between structured representations need to be learned). The primary field of application for the experiments will be that of semi-structured documents (XML textual documents for example). Semi-structured documents are defined by both their content and their logical structure. In order to effectively mine these documents either type of information needs to be incorporated. Compared to other domains in which the data is made up of only one of the two types, this is a much more complex task.

Andrew Weeks

University of York, United Kingdom Research title: Artificial chemical reaction networks for meta-heuristic search Supervisor: Prof. Susan Stepney Summary: The research aim is to develop an artificial chemical reaction network framework that can be used as a robust constructive computational search meta-heuristic, and apply it to complicated constructive search problems, such as proof discovery. The strategy is to investigate the relationship between complex chemical reaction pathways and meta-heuristic search, in particular (i) the role of auto-catalytic networks and hyper-cycles in amplifying good solutions, and self organising the search process; (ii) the workings of catalysis, to develop endogenous ‘embodied’ techniques for searching for complex artefact construction pathways; (iii) the role of equilibrium and non-equilibrium chemical processes in the construction approach, in particular, computational analogues of the appropriate free energy and enthalpy concepts; (iv) the ‘back reaction’ concept, as an indirect way of discovering and constructing novel ‘reactants’.

Andy Maule

University College London, United Kingdom Research title: Impact analysis of relations database schema changes on dynamically composed queries Supervisor: Prof. Wolfgang Emmerich Summary: When database schemas require change, it is typical to predict the effects of the change, first to gauge if the change is worth the expense, and second, to determine what must be reconciled once the change has taken place. Current techniques to predict the effects of schema changes upon applications that use the database can be expensive and error-prone, making the change process expensive and difficult. Our thesis is that an automated approach for predicting these effects, known as an impact analysis, can create a more informed schema change process, allowing stakeholders to obtain beneficial information, at lower costs than currently used industrial practice. This is an interesting research problem because modern data-access practices make it difficult to create an automated analysis that can identify the dependencies between applications and the database schema. This project will develop a novel analysis that overcomes these difficulties.

Aziem Chawdhary

Queen Mary University of London, United Kingdom Research title: Automatic program analysis and verification with separation logic Supervisor: Prof. Peter O’Hearn, Microsoft Research supervisor: Dr. Byron Cook Summary: There have recently been major advances in the mechanical verification of software, perhaps most strikingly exemplified by Microsoft’s SLAM model checker. However, current software model checkers have only rudimentary treatments of heap-intensive properties and of concurrency. Separation logic is a recent theoretical development which provides a fresh approach to reasoning about the heap and about concurrency. The purpose of this project is to investigate uses of separation logic in automatic verification. The current plan is to start, not from a generalist position, but by building a tool oriented to specific code targets.

Brendan Sheehan

University College Dublin, Ireland Research title: Model driven visualisation of large scale relational networks Supervisor: Dr. Aaron Quigley, Co-supervisor: Prof. Paddy Nixon Summary: The visualisation of relational data has thus far been largely concerned with handling small to medium size collections of data. In the case of large scale relational networks the convention of using nodes and edges is no longer practical. The simple reason for this is that the resolution of the screen provides a restriction on how many items of data can be displayed simultaneously. While techniques can be developed to provide visualisations of such networks for the purposes of exposition of some known relationship or where the frame provided by a well defined question simplifies the visualisation problem, the question of visualisation for purely exploratory purposes remains. Using optimisation techniques to specify constraints on how the underlying data is visualised, tools will be developed to allow the user to arbitrarily introduce and retract parameters so as to impose hypothetical models on the visualisation to reveal unanticipated structure in the data. Initial research will look into how various graph abstraction techniques handle multidimensional data. Particular emphasis is given to geometric methods. For example how quadtree or Voronoi partitioning of the underlying data space deals with various clustering schemes such as Principal Component Analysis (PCA), Multidimensional Scaling (MDS) and non-linear clustering methods such as Kohonen networks will be analysed.

Brian Amberg

Universität Basel, Switzerland Research title: Real time editing of face expressions in video streams Supervisor: Prof. Thomas Vetter, Microsoft Research supervisor: Prof. Andrew Blake Summary: The aim of this PhD project is the extraction and manipulation of facial expressions in video streams. To achieve this, an expression model of the subject in the video stream will be built. This model is fitted to the stream in order to extract expression, pose and lightning conditions. The model will be able to synthesise new expressions from the parameter set. This allows to render a modified face back into the video stream, enabling applications like expression change, exaggeration, or diminution. To achieve this aim several obstacles from different fields have to be overcome. 1) An expression model is needed that separates identity from expression. 2) Real time fitting of the model to the video stream can only be achieved by combining a multitude of methods on different scales of the problem. Efficient tracking of the ROI can reduce the search space by focusing on parts of each frame. Time and space coherence in a video sequence must be exploited to achieve a speedup against single frame operations. A probabilistic model of facial deformations must be learned to predict the parameter set for future frames. 3) Rendering the modified face seamlessly into a video stream requires the removal of the existing face and in painting of background regions that were occluded in the original image. Additionally techniques for the smoothing of cut-out edges are needed to remove artifacts.

Damian Serrano-Garcia

Universidad Politecnica de Madrid, Spain Research title: High performance database replication for storage area networks Supervisor: Prof. Marta Patiño-Martínez Summary: New developments in database replication protocols and new architectures such as system and storage area networks (SANs) have changed dramatically the limits of the scalability for eager database replication. In this project Damien will build upon these developments to produce a highly scalable replicated database. SANs will be exploited to attain low overhead replica coordination. New correctness criteria such as 1-copy snapshot isolation will be exploited to enhance scalability by removing read-write conflicts. Finally, autonomic reconfiguration and optimisation will be exploited to maximise performance.

Dan Dobre

Technische Universität Darmstadt, Germany Research title: Group communication protocols & replication for Web services Supervisor: Prof. Neeraj Suri Summary: The project focus is on exploiting different properties of message exchange patterns at the communication/application level to derive fault-tolerant atomic broadcast protocols facilitating fast message delivery. Currently, Dan is investigating the combined application of distinct oracles (for example, a leader and a weak atomic broadcast oracle) to achieve a better coverage with respect to liveness and/or to expedite message delivery. An interesting issue is to what extent such ‘hybrid’ protocols meet the lower bound on delivery latency in the normal case. Eventually the protocols will be migrated from the static/crash model to dynamic/byzantine models.

Daniele Quercia

University College London, United Kingdom Research title: Distributed trust models for pervasive computing Supervisor: Dr. Stephen Hailes, Co-supervisor: Licia Capra Summary: Pervasive computing environments must be able to provide resources and services without relying on a centralised infrastructure. This would not be possible in the absence of appropriate security mechanisms. Fundamental to the creation of security are mechanisms for assigning trust to different devices: without trust, devices cannot collaborate effectively, and without collaboration, the pervasive computing vision cannot be made a reality. Given the importance of this area, researchers have been designing distributed trust models for some time. However, the focus of such work has usually been on the creation of very general trust-based frameworks that are sufficiently imprecise that they are both difficult to apply and almost impossible to falsify in real-life scenarios. In contrast, the proposed research takes a more rigorously scientific approach. It is comprised of the following steps: (i) select a set of concrete real-life scenarios that both make use of pervasive devices and benefit from a computational trust framework; (ii) deduce and formalise the trust challenges posed by each scenario (iii) design computational models that address those trust challenges (iv) evaluate these models both qualitatively and quantitatively against criteria drawn from the initial domain-specific investigation.

David Stynes (opens in new tab)

University College Cork, Ireland Research title: Adversarial constraint solving Supervisor: Dr. Ken Brown, Microsoft Research supervisor: Dr. Youssef Hamadi Summary: The aim of the research is the combination of constraint satisfaction techniques with artificial intelligence game playing and/or mathematical game theory techniques to tackle combinatorial problems in which two or more self-motivated agents with different goals must produce a joint solution.

Fabien Corblin

Université Joseph Fourier, France Research title: Modelling, inference and simulation of biological networks using constraint logic programming (CLP) Supervisor: Prof. Laurent Trilling, Co-supervisor: Dr. Éric Fanchon, CEA-CNRS, Microsoft Research supervisor: Dr. Youssef Hamadi Summary: Computer tools are needed in systems biology to analyse qualitatively the dynamics of interaction networks, to discover the organisation of the cell’s molecular component. In this context, Fabien’s objective is to develop a general tool based on a unique specification allowing the exploration of the model’s parameters’ and behaviours’ properties, by a mix of inference and simulation. This work is based on the multi-valued asynchronous networks proposed by R. Thomas and E. Snoussi (1989, 1993). This formalism, which can be seen as a discrete abstraction of a special class of piecewise affine differential equations, allows a qualitative analysis of the dynamic behaviour of such systems. This formalism has been recently extended by de Jong et al. (2004) to take into account trajectories which are confined in the neighbourhood of discretisation thresholds (‘sliding modes’). The goal of this research is to investigate how a formal description of such a biological switching network can be exploited through an implementation in CLP (Constraint Logic Programming) in order to obtain the variety of functionalities desired. This tool will be applied to the construction of several biological networks.

Florian Schroff

University of Oxford, United Kingdom Research title: Multi-Class Image Interrogator Supervisor: Andrew Zisserman, Microsoft Research Supervisor: Antonio Criminisi Summary: The first part of this project focused on object recognition itself, more specifically segmentation in the sense of assigning a class-label to each pixel in the image. The second part of the project focuses on “Harvesting Image Databases from the Web”. The goal is to retrieve large numbers of images form the web for a specified object class. The user only has to specify the object-class and no further user interaction is required. This work returns images for a specified object class with higher precision than common image search engines. Further focus lies in multi-class object recognition employing hierarchical models. In order to be scalable to many object classes it seems to be very important to share features between object classes. This idea will be investigated together with the related idea of using hierarchical models. Hierarchical models could provide a generic mean to share more complicated high level features. Which kind of hierarchy is useful in the object recognition domain, e.g. hierarchies of object classes, parts that assemble an object class, or just abstract features, will be analysed.

Georg Weissenbacher (opens in new tab)

University of Oxford Research title: Formal verification techniques for code optimisation Supervisor: Prof. Daniel Kröning, Microsoft Research supervisor: Prof. Sir Tony Hoare Summary: Recent efforts like Tony Hoare’s Verifying Compiler Grand Challenge suggest that there is a consensus in the formal verification community that compilers and formal verification tools will eventually coalesce. In compilers, code optimisation is often based on relatively imprecise (conservative) results of light-weight analysis algorithms, while formal verification tools typically perform less efficient but significantly more accurate analyses. Georg will investigate how the detailed information that is gained by using formal verification techniques can be used to increase efficiency as well as quality of the code generated by compilers

Giorgio Gianforme

Università Roma Tre, Italy Research title: Schema and data mapping and transformation: foundations and application Supervisor: Prof. Paolo Atzeni Summary: Many application settings involve the need to exchange information between heterogeneous frameworks. In the database world, different systems are often used to handle data, following different models, and therefore data and their description need to be translated from one to another. Model Management is a high level-approach to solving such meta data problems. A major operator in model management is ModelGen, which translates schemas from a source to a target model. Given a source data model M1 (e.g., the ER model), a target data model M2 (e.g., SQL DDL or XML Schema), and a source schema S1 expressed in M1, ModelGen generates a target schema S2 in M2. The current results in the ModelGen project of the database group of ‘Università Roma Tre’ can be the basis for various research challenges. Giorgio’s research will consider three main directions: First, there is the need for a formal support to the validity of the approach (correctness and completeness); second, the customisation of the translations has to be studied, both at the schema and at the instance level; third, it will be important to study the validity and the adaptation of the approach to specific application domains, in cross-disciplinary settings.

Kai Kohlhoff

University of Cambridge, United Kingdom Research title: Experimental and computational studies of free energy landscapes for protein aggregation Supervisor: Dr. Michele Vendruscolo, Co-supervisor: Prof. Martin Zacharias, International University Bremen Summary: Deposits of misfolded proteins in cells or in intracellular space play a significant role in a number of severe medical disorders, such as Alzheimer’s and Parkinson’s. The underlying phenomena that cause misfolding and the formation of protein aggregates and amyloid fibrils are not yet well understood. Using data from NMR and X-ray crystallography techniques, Kai is interested in combining experimental measurements with computational methods to improve speed and detail of protein folding simulations. Identifying and applying new constraints to the conformational space of a protein will help finding the correct folding pathway on a protein’s free energy surface.

Konrad Kieling (opens in new tab)

Imperial College London, United Kingdom Research title: Linear optical quantum computing: novel architectures and computational assessment of performance Supervisor: Dr. Jens S. Eisert, Co-supervisor: Dr. Martin Plenio Summary: Among the different proposals for realization of quantum computation hardware, linear optical schemes are attractive due to simplicity of the required resources and only little decoherence. Unfortunately, non-unit success probabilities of the elementary gates are inherent in this scheme. In this project the potentials and limits of the linear optics toolbox will be investigated in the context of resource preparation for one-way computation. Further, general properties of the gates that can be built with these ingredients will be studied. With less restrictive rules the problems of the scheme being probabilistic may be overcome. Therefore, possible extensions will be discussed, e.g. atoms that are coupled to the light field by means of cavities.

Loïc Fejoz

INRIA Lorraine (LORIA), France Research title: Provably correct lock-free data structures Supervisor: Dr. Stephan Merz, Microsoft Research supervisor: Dr. Tim Harris Summary: The aim of this thesis is to design a method for the development and verification of algorithms working on lock-free data structures. The idea will be to start with a formal behavioural specification of the data structure in a sequential setting and derive an implementation that can be used in concurrent applications via a series of refinement steps. These implementations are intended to be more efficient for modern hardware and software than traditional algorithms relying on central locks to protect concurrent modifications.

Michael Kaisser

University of Edinburgh, United Kingdom Research title: Enlisting syntactic and semantic resources for web-based question answering and fact extraction Supervisor: Prof. Bonnie Webber Summary: Michael Kaisser’s research challenge is the possibility of more direct Natural Language access to the vast sea of information expressed in Natural Language on the Web. Specifically, he is investigating how to search the Web for facts. He wants to develop a coherent linguistic approach to this task by exploiting (wherever possible) freely available linguistic tools and resources. His research area overlaps with what is usually called ‘Question Answering’. While common search engines are keyword-based, a Web search that starts with a question is much more specific and provides more information that can be used when searching for relevant answers or Web sites that contain these answers: Understanding a question syntactically and semantically can help to understand which results are relevant to a user’s search and which are not.

Philippe-Alexandre Pouille

Institut Curie, France Research title: In silico model for self-organised embryogenesis mechano-genetics interplay Supervisor: Dr. Emmanuel Farge Summary: The genetic control of embryonic morphogenetic movements at gastrulation is better and better described in developmental biology, especially in the early drosophila embryo. However, the relationship between genetically controlled active single cells shape changes and migration, and the consecutive generation of the multi-cellular ‘macroscopic’ embryonic phenotype remains to be quantitatively investigated. As well as the precise biomechanics leading to feedback mechanical induction of developmental gene expression modulation in response to these morphogenetic movements. The research project consists in generating a biomechanical in silico viscoelastic multi-cellular model of the embryonic active and reactive tissues, to study such self-organised embryogenesis mechano-genetic interplay participating to embryonic mechanical morphing and development.

Stewart Hickey

University of Limerick, Ireland Research title: Technologies for sustainable ICT Supervisor: Dr. Colin Fitzpatrick Summary: When evaluating the environmental performance of electronic products it is very important to consider all aspects of their life cycle. This ensures that any action to improve the performance contributes to the overall reduction in environmental impact and avoids the transfer of burdens from one life cycle phase to another. Most often with electronic products it is the Use Phase of the life cycle that is most environmentally damaging. This project will examine the life cycle of PC’s with an emphasis on the use phase. More specifically it will focus on the use of networks and software in reducing environmental impacts in a market friendly fashion.

Tony O’Donovan

University College Cork, Ireland Research title: Research in wireless sensor networks Supervisor: Prof. Cormac Sreenan Summary: Wireless sensor networks are collections of autonomous devices (nodes) with computational, sensing and wireless communication capabilities. Tony’s research will focus on the use of wireless sensor networks in the area of medical informatics. Most wireless sensor network research involves networks of hundreds or thousands of nodes and the difficulties entailed with networks of this scale. This project on the other hand, will be concerned with the issue of coordinating a small number of sensors to detect medical emergencies in a low latency manner, possibly coupled with actuators to mitigate the effects of the emergency condition. The main challenge will be achieving close to 100 percent reliable communication and ensuring the system is robust enough for use as a medical device.
Alban Rrustemi

University of Cambridge, United Kingdom Research title: Dense wired sensor networks Supervisor: Dr. Simon Moore, Microsoft Research supervisor: Dr. Ken Wood Summary: Computing fabrics could be constructed from small elements (for example, 1.5mm x 1.5mm chips) woven into clothing or embedded into the structure of some device. Communication amongst these elements and other components would allow useful systems to be constructed. Reconfiguration approaches will address the void between conventional architectures of microcontrollers + software at one extreme to field programmable gate microcontrollers + arrays (FPGAs) at the other. To enable prototypes to be made, Alban will draw on existing research and industrial expertise in the areas of sensors, chip packaging and power sources.

Anna Ritchie

University of Cambridge, United Kingdom Research title: Combining term-based and citation-based methods for enhanced information retrieval Supervisor: Dr. Simone Teufel, Microsoft Research supervisor: Prof. Stephen Robertson Summary: The aim of the project is to combine citation information with traditional term based IR information sources for more sophisticated search. In which form citation information is to be included is itself an object of research: one could use the citations themselves, the citation anchor text as context, and possibly discourse structure information of the segment where the citation occurs.

Dana N. Xu

University of Cambridge, United Kingdom Research title: Software tools for secure component programming Supervisor: Prof. Alan Mycroft, Microsoft Research supervisor: Dr. Simon Peyton-Jones Summary: With our ever-growing reliance on software systems – occasionally for life-critical situations, it is no longer safe to rely solely on the informal assurances of software testing. Dana proposes a new framework for software systems to be safely (and securely) built from software components. Several major verification techniques have been advocated over the last two decades, including model-checking, static analysis and advanced type theory. However, little attention has been paid to applying these techniques to component-based programming. In this project, Dana plans to explore both the foundations for secure component integration and practical tools to support its development.

Greg Hale

University of York, United Kingdom Research title: Qualitative and quantitative studies of ‘fun’ with interactive feature film based entertainment via mobile phones and Web sites Supervisor: Prof. Andrew Monk, Microsoft Research supervisor: Dr. Ken Wood Summary: This research project is an investigation of entertainment content experiences, focused particularly on movies and narrative based mobile games/ interactive stories. The objective is to create an integrated psychological framework of these content experiences, which in turn can inform content design. Work completed. The first empirical study involved a systematic qualitative investigation of people’s responses to a short film, using interview data. This work is being integrated into the relevant literature from psychology, human-computer interaction and cognitive approaches to film, around a specific focus of schemas. The work has resulted in two conference papers, presented at international conferences. Work remaining. The integrated framework will be used to analyse three successful movies from the same genre, a ‘systems’ investigation. The flow of schematically structured experience events in the films will be logged and mapped onto the psychological framework. A short film will be then created using the framework and tested on viewers, with interview data being gathered. Finally, the framework will be tested for robustness in a different context by analysing either an existing mobile game or interactive story.

Julia Lasserre

University of Cambridge, United Kingdom Research title: Bayesian object recognition using weakly labelled data Supervisor: Dr. Roberto Cipola, Microsoft Research supervisor: Prof. Christopher Bishop Summary: This PhD will focus on the problem of object recognition using weakly labelled data, it will build on recent developments in probabilistic modelling and Bayesian inference. A typical task, for example, will be to learn about and recognise say cars given only a pile of images containing cars and a pile not containing cars. At some level this is very feasible, but a general solution will be very challenging to find. A central point of visual perception is the classical problem of invariant object recognition: different appearances of an object can be seen as equivalent, but with changes in position, illumination, distortions, or partial occlusion by other objects. It can then take account of prior knowledge to do with geometrical transformations of objects, but can also to some extent learn invariant features from the training data. One approach which may prove very helpful involves capturing multiple images of the same object from different orientations and viewpoints. Prior knowledge of the image generation process including projective geometry can be used to exploit known geometrical transformations and invariants.

Mathieu Verbaere

University of Oxford, United Kingdom Research title: An extensible toolkit for refactoring Supervisor: Prof. Oege de Moor Summary: To enable developers to author their own refactoring transformations. Existing tools offer only a fixed menu of refactorings. Furthermore, even simple refactorings like ‘extract method’ are often incorrectly implemented, because the implementation does only syntactic and no semantic analysis. Mathieu aims to construct an extensible framework for easy and correct implementation of refactoring transformations at several levels: a set of static analyses that frequently occur in refactoring; an API for using these analyses, and applying the corresponding transformations to the source (retaining layout and comments).

Maurice Fallon

University of Cambridge, United Kingdom Research title: Multi-channel audio source localisation and tracking Supervisor: Dr. Simon Godsill, Microsoft Research supervisor: Prof. Andrew Blake Summary: In this project Maurice will study signal processing methods for enhancing audio signals obtained from microphone arrays. The methods will aim initially to improve on the current state of the art in echo cancellation, then develop into the areas of source separation and localisation. The approach will be based on time-frequency models of the audio signals and the inclusion of Bayesian prior information about coefficients across time and frequency in order to aid the enhancement process. A selection of methodologies will be pursued, including variational methods, particle filters and fast approximations to these.

PhD Scholarship Programme in EMEA

A Smart Care System for Healthcare using Contextual Reinforcement

Deep Learning for Graph-structured Data

Designing for Human Partnership with AI in Everyday Life: The Future of Work as a Case Study

Exploring motif-based design patterns for biological computation

Integrating machine learning in to the IDE

Privacy in Distributed and Collaborative Learning

Programming biological systems by reverse-engineering reaction-diffusion model

Reinforcement Learning for Enabling Next Generation Human-Machine Partnerships

AI for Teams: The Future of Assisted Collaborative Work

Compartmentalized RNA Computing using Protein-based Microcompartments

Enabling Repair in Goal-Directed Natural Language Personal Assistants

Intrinsically Motivated Exploration for Lifelong Deep Reinforcement Learning of Multiple Tasks

Machine Learning for Program Analysis and Defect Prediction

Mechanistic Model Misspecification

Optical Fabric for Zero-­‐latency memory disaggregation at Pb/s cluster-­‐ scale network

Optics for the Cloud: Cost-effective assembly for Hybrid Integrated Photonic Switches

Pixelated Spatial Light Modulator for Phase and Polarisation Modulation of Light (PPSLM)

Securely and Efficiently Supporting Managed Language Runtimes in Confidential Computing

Statistical Learning and Adaptive Observation in Clinical Prediction: Methodology and Applications

Reinforcement Learning for Enabling Next Generation Human-Machine Partnerships

Visual Fast Mapping

High-Level Synthesis of Neural Networks for FPGAs with LIFT

Computer Vision on the Edge

Learning Computing with Torino: a physical programming language inclusive of children with visual disabilities

Decoding the Network Logic Governing Resetting of Pluripotency

OutSider: Assessing and Mitigating Side-channel Leaks on Commodity Platforms

Shareable Dynamic Media in Hybrid Meetings

Training and Tuning Deep Neural Networks: Faster, Stronger, Better

Modelling Infective Exacerbations in Cystic Fibrosis

Human-Centred Machine Learning for Adaptive Agents with Vision

SMVRF: Secure Messaging Verifiably Realized in F*

Reinforcement Learning for Adaptive User Interaction

STARCH: SmarT ARchitectures for Data Center Switching

Power Efficient Rack-Scale Fabrics

Towards Ethical Development of Symbiotic Human-Machine Systems; creating ethical frameworks and solutions

Programmable Single-Cell Biocomputers with Scalable Signal Processing Capacity

Bit-level Accurate Reasoning and Interpolation

Rack-Scale Interconnect Fabrics for Disaggregated Memory

Dense Visual Tracking for Active Manipulation

Reducing the Annotation Tax of Programming Language Types Using Machine Learning and Big Code

Learning to Infer in Graphical Models and Probabilistic Programmes

Computational Design of Nonlinear Functions using Nucleic Acids

Lightweight Concurrency Modelling

Smart Molecular Biosensors: Design Principles and Foundational Technologies

Towards Total Immersion: Accurate Reconstruction of Lights, Materials and 3D Geometry from RGB, Depth and Motion

Modelling the Survival and Proliferation of Cancer Cells in Metastasis

Medical Image Analysis in the Cloud: Application to Early Stage Cancer Detection

Estimating the Credibility of Health Information on the Web

Conversational Search: Mathematical Modelling and Applications

Continuous Listening Services: Leveraging Sociological Understandings of Talk for Continuous Agent Input

Data-Efficient Reinforcement Learning from Image Pixels

A Highly Scalable Optical Switch Fabric for Data Centre Networks

Meaningful Metadata: The Things I Wish I Knew

Global Fitness Maximising Approaches to Evaluate the Trade-Offs Involved in the Evergreen and Deciduous Conundrum

Self-Optimising Internet Services

System-Level Support for Persistent Memory

Model Worlds—Interacting with Machines which Read the Future

Fast, Flexible and Functional Software

Designing the Design of the Internet of Things

Designing for Spatial Sharing—Architectural Thinking as an Approach to Technology Design

Constructing robust synthetic gene feedback systems using a combined experimental and Bayesian approach

Preference Elicitation and Mechanism Design for Complex Dynamic Systems

Controlled Malleability in Cryptography

Formal Decompositions of Strongly Coupled Systems

Capture of Real-World Non-Rigid Scenes from a Single RGB Camera

Image Quality Transfer

History-Aware Testing

Sequential Structured Prediction

Real World Data with Dependent Types: Integrity and Interoperation

Advancing Random Forests and Other Ensembles

Leveraging Data Reuse for Efficient Ranker Evaluation in Information Retrieval

Rethinking Resource Allocation in Data Centres: Optimization, Incentives, and Beyond

Large Scale Diverse Learning for Structured Output Prediction

GeoGraph: Efficient Geographically Distributed Graph Infrastructure

Statistical Models and Methods for Privacy Technologies

Ad-hoc Cross-Device Interactions Facilitating Small-Group Collaborative Explorations and Curation of Historic Documents

Sketching Algorithms for Massive Graphs and Matrices

TypeScript: The Next Generation

Why Do People Communicate?

Optical Fabric for Zero-‐latency memory disaggregation at Pb/s cluster-‐ scale network