{"id":611553,"date":"2020-06-12T10:55:43","date_gmt":"2020-06-12T17:55:43","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-group&p=611553"},"modified":"2025-01-21T06:43:45","modified_gmt":"2025-01-21T14:43:45","slug":"swiss-joint-research-center","status":"publish","type":"msr-group","link":"https:\/\/www.microsoft.com\/en-us\/research\/collaboration\/swiss-joint-research-center\/","title":{"rendered":"Swiss Joint Research Center"},"content":{"rendered":"
\n\t
\n\t\t
\n\t\t\t\"members\t\t<\/div>\n\t\t\n\t\t
\n\t\t\t\n\t\t\t
\n\t\t\t\t\n\t\t\t\t
\n\t\t\t\t\t\n\t\t\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n

Swiss Joint Research Center<\/h1>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n

The Swiss Joint Research Center (Swiss JRC) is a collaborative research engagement between Microsoft Research and the two universities that make up the Swiss Federal Institutes of Technology:\u00a0ETH Zurich (opens in new tab)<\/span><\/a>\u00a0and EPFL (opens in new tab)<\/span><\/a> in Lausanne. Since its inception in 2008, the Swiss JRC has supported dozens of collaborative research projects between Microsoft and the two academic partners. <\/p>\n\n\n\n

The Swiss JRC has close links with the Spatial AI Lab – Zurich<\/a>. <\/p>\n\n\n\n

<\/p>\n\n\n\n

<\/div>\n\n\n\n

Collaborators<\/h2>\n\n\n\n
\n
\n
\"EPFL (opens in new tab)<\/span><\/a><\/figure>\n<\/div>\n\n\n\n
\n
\"Center (opens in new tab)<\/span><\/a><\/figure>\n<\/div>\n\n\n\n
\n
\"ETH (opens in new tab)<\/span><\/a><\/figure>\n<\/div>\n\n\n\n
\n
\"ecocloud (opens in new tab)<\/span><\/a><\/figure>\n<\/div>\n<\/div>\n\n\n\n\n\n

EPFL projects (2022-2023)<\/h3>\n\n\n\n\n\n

EPFL PIs:<\/strong> Bruno Correia (and Michael Bronstein, Imperial)
Microsoft PIs:<\/strong> Max Welling, 
Chris Bishop<\/a>
PhD Students:<\/strong> Freyr Sverrisson, Arne Schneuing<\/p>\n\n\n\n

Proteins play a crucial role in every form of life. The function of proteins is largely determined by their 3D structure and the way they interact with other molecules. Understanding the mechanisms that govern protein structure and their interactions with other molecules is a holy grail of biology that also paves the way to ground-breaking new applications in biotechnology and medicine. Over the past three decades, large amounts of structural data on proteins have been made available to the wider scientific community. This has created opportunities for machine learning (ML) approaches to improve our understanding of the governing principles of these molecules, as well as to develop computational approaches for the design of novel proteins and small-molecule drugs. The three-dimensional structures of proteins and molecular objects are a natural fit for Geometric Deep Learning<\/em> (GDL). In this proposal, we will develop GDL-based approaches that describe molecular entities using point clouds annotated with descriptors capturing physical features (geometry and chemistry) that will be optimized to describe different aspects of proteins. Specifically, through the aims of this grant we will attempt to: capture dynamic features of protein surfaces (Aim 1); leverage the surface descriptors to condition the generation of small molecules to engage specific pockets (Aim 2); couple new structure prediction algorithms with surface descriptor optimization for the design of new functions in proteins (Aim 3). For the generative aspects of our application (designing new surfaces, small molecules, proteins), a common problem is that the spaces to be sampled are extremely large, and thus the expertise within the Microsoft Research team could be critical to reaching a functional solution. Specifically, the expertise in variational autoencoders, equivariant architectures and Bayesian optimization will be of major importance. 
In summary, we propose a novel approach powered by cutting-edge computational methods to model and design de novo<\/em> proteins, with enormous potential to help address problems in medicine and biotechnology.<\/p>\n\n\n\n\n\n

EPFL PIs:<\/strong> Alexander Mathis, Friedhelm Hummel, Silvestro Micera
Microsoft PI:<\/strong> 
Marc Pollefeys<\/a>
PhD Student:<\/strong> Haozhe Qi<\/p>\n\n\n\n

Despite many advances in neuroprosthetics and neurorehabilitation, the techniques to measure, personalize and thus optimize the functional improvements that patients gain with therapy are limited. Impairments are still assessed with standardized functional tests, which fail to capture everyday behaviour and quality of life, are poorly suited to personalization, and must be performed by trained health care professionals in a clinical environment. By leveraging recent advances in motion capture and hardware, we will create novel metrics to evaluate, personalize and improve the dexterity of patients in their everyday life. We will utilize the EPFL Smart Kitchen platform to assess the naturalistic kitchen behaviour of healthy subjects, upper-limb amputees and stroke patients, filmed from a head-mounted camera (Microsoft HoloLens). We will develop a computer vision pipeline that is capable of measuring hand-object interactions in patients\u2019 kitchens. Based on this novel, large-scale dataset collected in patients\u2019 kitchens, we will derive metrics that measure dexterity in the \u201cnatural world,\u201d as well as recovered and compensatory movements due to the pathology\/assistive device. We will also use those data to assess novel control strategies for neuroprosthetics and to design optimal, personalized rehabilitation treatment by leveraging virtual reality.<\/p>\n\n\n\n\n\n\n\n

EPFL PIs:<\/strong> Robert West, Valentin Hartmann, Maxime Peyrard
Microsoft PIs:<\/strong> 
Emre K\u0131c\u0131man<\/a>, Robert Sim<\/a>, Shruti Tople<\/a>
PhD Student:<\/strong> Valentin Hartmann<\/p>\n\n\n\n

As machine learning (ML) models are becoming more complex, there has been a growing interest in making use of decentrally generated data (e.g., from smartphones) and in pooling data from many actors. At the same time, however, privacy concerns about organizations collecting data have risen. As an additional challenge, decentrally generated data is often highly heterogeneous, thus breaking assumptions needed by standard ML models. Here, we propose to \u201ckill two birds with one stone\u201d by developing Invariant Federated Learning, a framework for training ML models without directly collecting data, while not only being robust to, but even benefiting from, heterogeneous data. For the problem of learning from distributed data, the Federated Learning (FL) framework has been proposed. Instead of sharing raw data, clients share model updates to help train an ML model on a central server. We combine this idea with the recently proposed Invariant Risk Minimization (IRM) approach, a solution for causal learning. IRM aims to build models that are robust to changes in the data distribution and provide better out-of-distribution (OOD) generalization by using data from different environments during training. This integrates naturally with FL, where each client may be seen as constituting its own environment. We seek to gain robustness to distributional changes and better OOD generalization, as compared to FL methods based on the standard empirical risk minimization. Previous work has further shown that causal models possess better privacy properties than associational models [26]. We will turn these theoretical insights into practical algorithms to, e.g., provide Differential Privacy guarantees for FL. 
The project proposed here integrates naturally with ideas pursued in the context of the Microsoft Turing Academic Program (MS-TAP), where the PI\u2019s lab is already collaborating with Microsoft (including Emre K\u0131c\u0131man, a co-author of this proposal) in order to make language models more robust via IRM.<\/p>\n\n\n\n\n\n\n\n
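As a rough illustration of the Federated Learning idea described above, in which clients share model updates rather than raw data, the following minimal Python sketch runs plain federated averaging on a toy least-squares problem. It is not the Invariant Federated Learning method proposed here: it contains no IRM penalty and no Differential Privacy mechanism. The function names (`local_update`, `federated_average`, `train`) and the toy data are assumptions made purely for illustration.

```python
from typing import List, Tuple

Weights = List[float]
Dataset = List[Tuple[List[float], float]]  # (features, target) pairs


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Hypothetical client step: a few gradient-descent epochs of
    least-squares regression on local data. Only weights are returned."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(data)
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def train(clients: List[Dataset], dim: int, rounds: int = 50) -> Weights:
    """Alternate local client updates and server-side averaging."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w


if __name__ == "__main__":
    # Two clients with differently distributed inputs, both generated
    # from the same underlying relation y = 2*x1 - x2 (toy data).
    clients = [
        [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)],
        [([2.0, 1.0], 3.0), ([1.0, 2.0], 0.0)],
    ]
    print(train(clients, dim=2))
```

In this sketch each client could be read as one "environment" in the IRM sense; an invariance-seeking variant would change the local objective, which is left out here.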

EPFL PI:<\/strong> Giuseppe Carleo
Microsoft PIs:<\/strong> Max Welling, 
Chris Bishop<\/a>, Matthias Troyer<\/a>
PhD Student:<\/strong> Jannes Nys<\/p>\n\n\n\n

The fundamental equations governing interacting quantum-mechanical matter in solids have been known for over 90 years. However, these equations are simply \u201cmuch too complicated to be soluble\u201d (Paul A. M. Dirac, 1929). Besides experiments, the main source of information available to us is computational methods that simulate these systems. Machine learning approaches based on artificial neural networks (NN) have recently been shown to be a powerful new tool for simulating systems governed by the laws of quantum mechanics. The leading approach in the field, pioneered by Carleo and Troyer, is known as neural quantum states and has been successfully applied to several model quantum systems. For these prototypical and simplified \u2013 yet hard to solve \u2013 models of interacting quantum matter, neural quantum states have shown state-of-the-art \u2013 or better \u2013 performance. Despite this success, however, the application of neural quantum states to the ab-initio simulation of solids and materials is largely unexplored, both theoretically and computationally. Compared to the method for quantum spin systems, this requires methods that intrinsically work on continuous degrees of freedom, rather than discrete ones. Examples of important systems that can be studied with continuous-space methods are crystals and several phases of matter that show a periodic lattice structure. In this project, we will introduce deep-learning-based approaches for the ab-initio simulation of solids, with a focus on imposing physical symmetries and scalability. With a powerful and efficient computational method to simulate continuous-space atomic quantum systems, we will be able to access unprecedented regimes of accuracy for the descriptions of materials, especially in two dimensions, where strong interactions are dominant.<\/p>\n\n\n\n\n\n\n\n

EPFL PI:<\/strong> Pascal Fua<\/p>\n\n\n\n

Microsoft PI:<\/strong> Chris Bishop<\/a>
Research Engineer: <\/strong>Benoit Gherardi<\/p>\n\n\n\n

We live in a three-dimensional world full of manufactured objects of ever-increasing complexity. To be functional, they require clever engineering and design. The search for energy-efficient designs of objects such as windmills exemplifies the challenges and promises of such engineering: the blades must have the right shapes to harness as much energy from the wind as possible by balancing lift and drag, and the whole assembly must be strong and light. With ever more powerful simulation techniques and the advent of digital sensors that enable precise measurements, shape engineering relies increasingly on the resulting algorithmic developments. As a result, Computer Aided Design (CAD) has become central to engineering but is not yet capable of addressing all the relevant issues simultaneously. Computer Vision and Computer Graphics are among the fields with the greatest potential for impact in CAD, especially given the remarkable progress that deep learning has fostered in these fields. For example, continuous deep implicit fields have recently emerged as one of the most promising 3D shape-modeling approaches for objects that can be represented by a single watertight surface.<\/p>\n\n\n\n

However, current approaches to modeling complex composite objects cannot jointly account for geometric, topological, and engineering constraints as well as performance requirements. To remedy this, we will build latent models that can be used to represent and optimize complex composite shapes while strictly enforcing compatibility constraints between their components and controllability constraints on the whole. A central focus will be on developing training methods that guarantee that the output of the deep networks we train strictly obeys these constraints, something that existing methods that rely on adding ad hoc loss functions cannot do. The results will be integrated into Microsoft\u2019s simulation platforms, AirSim and Bonsai, with a view to rapidly building and designing real-world robots.<\/p>\n\n\n\n\n\n\n\n

EPFL PIs:<\/strong> Edouard Bugnion, Mathias Payer
Microsoft PI:<\/strong> Adrien Ghosn
PhD Student:<\/strong> Charly Castes<\/p>\n\n\n\n

Confidential computing is an increasingly popular means to wider Cloud adoption. By offering confidential virtual machines and enclaves, Cloud service providers now host organizations, such as banks and hospitals, that abide by stringent legal requirements with regard to their clients\u2019 data confidentiality. These technologies foster sufficient trust to enable such clients to transition to the Cloud, while protecting themselves against a potentially compromised or malicious host. Unfortunately, confidential computing solutions depend on bleeding-edge emerging hardware that (1) takes a long time to roll out at Cloud scale and (2) as a recent technology, lacks a clear consensus on both the underlying hardware mechanisms and the exposed programming model, and is thus prone to frequent changes and potential security vulnerabilities. This proposal strives to explore the possibilities of building confidential systems without special hardware support. Instead, we will leverage existing commodity hardware that is already deployed in Cloud datacenters, combined with new programming language and formal method techniques, and identify how to provide similar or even more elaborate confidentiality and integrity guarantees than the existing confidential hardware. Achieving such a software\/hardware co-design will enable Cloud providers to deploy new Cloud products for confidential computing without waiting for either the standardization or the wide installation of confidential hardware. The key goal of this project is the design and implementation of a trusted, attested, and formally verified monitor acting as a trusted intermediary between resource managers, such as a Cloud hypervisor or an OS, and their clients, e.g., confidential virtual machines and applications. 
We plan to explore how commodity hardware features, such as hardware support for virtualization, can be leveraged in the implementation of such a solution with as little modification as possible to existing hypervisor implementations.<\/p>\n\n\n\n\n\n

<\/div>\n\n\n\n

ETH Zurich projects (2022-2023)<\/h3>\n\n\n\n\n\n

ETH Zurich PIs:<\/strong> Roi Poranne, Stelian Coros
Microsoft PIs:<\/strong> 
Jeffrey Delmerico<\/a>, Juan Nieto<\/a>, Marc Pollefeys<\/a>
PhD Student:<\/strong> Florian Kennel-Maushart<\/p>\n\n\n\n

Despite popular depictions in sci-fi movies and TV shows, robots remain limited in their ability to autonomously solve complex tasks. Indeed, even the most advanced commercial robots are only now just starting to navigate man-made environments while performing simple pick-and-place operations. In order to enable complex high-level behaviours, such as the abstract reasoning required to manoeuvre objects in highly constrained environments, we propose to leverage human intelligence and intuition. The challenge here is one of representation and communication. In order to communicate human insights about a problem to a robot, or to communicate a robot\u2019s plans and intent to a human, it is necessary to utilize representations of space, tasks, and movements that are mutually intelligible for both human and robot. This work will focus on the problem of single and multi-robot motion planning with human guidance, where a human assists a team of robots in solving a motion-based task that is beyond the reasoning capabilities of the robot systems. We will exploit the ability of Mixed Reality (MR) technology to communicate spatial concepts between robots and humans, and will focus our research efforts on exploring the representations, optimization techniques, and multi-robot task planning necessary to advance the ability of robots to solve complex tasks with human guidance.<\/p>\n\n\n\n\n\n

ETH Zurich PIs:<\/strong> Srdjan Capkun, Shweta Shinde
Microsoft PIs:<\/strong> 
Manuel Costa<\/a>, Cedric Fournet<\/a>, Stavros Volos<\/a>
PhD Student:<\/strong> Ivan Puddu<\/p>\n\n\n\n

The goal of this research project is to reduce the trust placed in the Cloud Service Provider by increasing the customer\u2019s control over the resources assigned to it in the cloud infrastructure. <\/p>\n\n\n\n

We plan to investigate this specifically in the context of a device or chiplet owned by the client and then placed within the cloud infrastructure, an \u201cEmbassy Hardware Device\u201d. Such a device would be able to control, manage, and retain access to the data while remaining inaccessible (in terms of data and control flow) to the Cloud Service Provider. Several research challenges need to be solved in order to develop an end-to-end working prototype.<\/p>\n\n\n\n\n\n\n\n

ETH Zurich PI:<\/strong> Onur Mutlu
Microsoft PIs:<\/strong> 
Stefan Saroiu<\/a>, Alec Wolman<\/a>, Mark Hill<\/a>, Thomas Moscibroda<\/a>
PhD Student<\/strong>: Giray Yaglikci<\/p>\n\n\n\n

DRAM is the prevalent technology used to architect main memory across a wide range of computing platforms. Unfortunately, DRAM suffers from the RowHammer vulnerability. RowHammer is caused by repeatedly accessing (i.e., hammering) a DRAM row: the electromagnetic interference that develops due to the rapid DRAM row activations causes bit flips in DRAM rows that are physically near the hammered row. Prior research demonstrates that the RowHammer vulnerability of DRAM chips worsens as DRAM cell size and cell-to-cell spacing shrink. Numerous works demonstrate RowHammer attacks that escalate user privileges, obtain private keys, manipulate sensitive data, and destroy the accuracy of neural networks. Given that the RowHammer vulnerability of modern DRAM chips is worsening and can be used to compromise a wide range of computing platforms, it is crucial to fundamentally understand and solve RowHammer to ensure secure and reliable DRAM operation. Our goal in this project is to<\/p>\n\n\n\n

    \n
  1. study the unexplored aspects of RowHammer through rigorous experiments on hundreds of real DRAM chips, and leverage all the understanding we develop to<\/li>\n\n\n\n
  2. experimentally analyze the security guarantees of existing RowHammer mitigation mechanisms (e.g., Target Row Refresh (TRR)),<\/li>\n\n\n\n
  3. craft more effective RowHammer access patterns, and<\/li>\n\n\n\n
  4. design completely secure, efficient, and low-cost RowHammer mitigation mechanisms.<\/li>\n<\/ol>\n\n\n\n\n\n

    ETH Zurich PI:<\/strong> Otmar Hilliges
    Microsoft PI:<\/strong> 
    Julien Valentin (opens in new tab)<\/span><\/a>
    PhD Student<\/strong>:
    Chen Guo (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

    Digital capture of human bodies is a rapidly growing research area in computer vision and computer graphics that puts scenarios such as life-like Mixed Reality (MR) virtual-social interactions into reach, albeit not without overcoming several challenging research problems. A core question in this respect is how to faithfully transmit a virtual copy of oneself so that a remote collaborator may perceive the interaction as immersive and engaging. To present a real alternative to face-to-face meetings, future AR\/VR systems will crucially depend on the following two core building blocks:<\/p>\n\n\n\n

      \n
    1. means to capture the 3D geometry and appearance (e.g., texture, lighting) of individuals with consumer-grade infrastructure (e.g., a single RGB-D camera) and with very little time and expertise and<\/li>\n\n\n\n
    2. means to represent the captured geometry and appearance information in a fashion that is suitable for photorealistic rendering under fine-grained control over the underlying factors such as pose and facial expressions amongst others.<\/li>\n<\/ol>\n\n\n\n

      In this project, we plan to develop novel methods to learn animatable representations of humans from \u2018cheap\u2019 data sources alone. Furthermore, we plan to extend our own recent work on animatable neural implicit surfaces, such that it can represent not only the geometry but also the appearance of subjects in high visual fidelity. Finally, we plan to study techniques to enforce geometric and temporal consistency in such methods to make them suitable for MR and other telepresence downstream applications.<\/p>\n\n\n\n\n\n\n\n

      ETH Zurich PI:<\/strong> Christian Holz
      Microsoft PI:<\/strong> 
      Tadas Baltrusaitis (opens in new tab)<\/span><\/a>
      PhD Student:<\/strong> Bj\u00f6rn Braun<\/p>\n\n\n\n

      The passive<\/em> measurement of cognitive stress and its impact on performance in cognitive tasks has huge potential for human-computer interaction (HCI) and affective computing, including workload optimization or \u201cflow\u201d understanding for future-of-work productivity scenarios, remote learning, automated tutor systems, as well as stress monitoring, mental health, and telehealth applications more generally. When cognitive demands exceed resources, people experience stress and task performance degrades. In this project, we will develop intelligent software experiences that reduce workers\u2019 stress and optimize their cognitive resources. We will develop sensing models that capture the body\u2019s autonomic nervous system (\u201cfight or flight\u201d) responses to cognitive demands in real time using information from multiple physiologic processes. These inputs will then help drive AI support that adapts to provide cognitive support while also maintaining autonomy (e.g., avoiding unnecessary and annoying interventions). Specifically, we will develop novel computer vision and signal processing approaches for measuring cardiovascular, respiratory, pupil\/ocular, and dermal changes using ubiquitous sensors. For desktop environments, we will develop, evaluate, and demonstrate our methods using non-contact sensing (the webcams built into PCs). For head-mounted displays, we will adapt our methods to utilize signals originating from the wearer\u2019s head using built-in headset sensors. In both cases, our developments will produce novel datasets, computational methods, and the results of in-situ evaluations in productivity scenarios. Using our novel methods, we will also investigate their implications for telehealth scenarios, which often contain cardiovascular and respiratory assessments. 
We will develop scenarios that guide the user while assessing these metrics and visually present the remote physician with the results for examination.<\/p>\n\n\n\n\n\n

      ETH Zurich PI:<\/strong> Sebastian Kozerke
      Microsoft PI:<\/strong> Michael Hansen
      PhD Student:<\/strong> Pietro Dirix<\/p>\n\n\n\n

      Cardiovascular Magnetic Resonance Imaging (MRI) has become a key imaging modality to diagnose, monitor and stratify patients suffering from a wide range of cardiovascular diseases. Using Flow MRI, time-resolved blood flow patterns can be quantified throughout the circulatory system providing information on the interplay of anatomical and hemodynamic conditions in health and disease.<\/p>\n\n\n\n

      Today, inference of Flow MRI data is based on data post-processing, which includes massive data reduction to yield metrics such as mean and peak flow, kinetic energy, and wall shear rates. As a consequence of the data reduction step, however, the wealth of information encoded in the data, including fundamental causal relations, is potentially missed. In addition, the dependency of the metrics on parameters of the measurement and image reconstruction process itself compromises the diagnostic yield and the reproducibility of the method, hence hampering further dissemination.<\/p>\n\n\n\n

      Here we propose to develop and implement a computational framework for Flow Tensor MRI data synthesis to train physics-based neural networks for image reconstruction and inference of the complex interplay of anatomy, coherent and incoherent flows in the aorta in-vivo. Using cloud-based, scalable computing resources, we will demonstrate that synthetically trained reconstruction and inference machines permit high-speed image reconstruction and inference to unravel complex structure-function relations using real-world in-vivo Flow Tensor MRI by exploiting the entirety of information contained in the data along with the information of the measurement process itself.<\/p>\n\n\n\n\n\n\n\n

      ETH Zurich PIs:<\/strong> Marco Tognon, Mike Allenspach, Nicholas Lawrence, Roland Siegwart
      Microsoft PIs:<\/strong> 
      Jeffrey Delmerico<\/a>, Juan Nieto<\/a>, Marc Pollefeys<\/a>
      PhD Student:<\/strong> Mike Allenspach<\/p>\n\n\n\n

      Our objective is to exploit recent developments in MR to enhance human capabilities with robotic assistance. Robots offer mobility and power but are not capable of performing complex tasks in challenging environments such as construction, contact-based inspection, cleaning, and maintenance. On the other hand, humans have excellent higher-order reasoning, and skilled workers have the experience and training to adapt to new circumstances quickly and effectively. However, they lack mobility and power. We aim to reduce this limitation by empowering human operators with the assistance and the capabilities provided by a robot system. This requires a human-robot interface that fully leverages the capabilities of both the human operator and the robot system. In this project we aim to explore the problem of shared autonomy for physical interaction tasks in shared physical workspaces. We will explore how an operator can effectively command a robot system using an MR interface over a range of autonomy levels, from low-level direct teleoperation to high-level task specification. We will develop methods for estimating the intent and comfort level of an operator to provide an intuitive and effective interface. Finally, we will explore how to pass information from the robot system back to the human operator for effective understanding of the robot\u2019s plans. We will prove the value of mixed reality interfaces by enhancing human capabilities with robot systems through effective, bilateral communication for a wide variety of complex tasks.<\/p>\n\n\n\n\n\n

      ETH Zurich PI:<\/strong> Shweta Shinde
      Microsoft PIs:<\/strong>
      Manuel Costa<\/a>, Cedric Fournet<\/a>, Stavros Volos<\/a>
      PhD Student:<\/strong> Mark Kuhne<\/p>\n\n\n\n

      The goal of this research project is to give visibility into whether any abuse is happening, particularly if it originates from untrusted software (e.g., Operating System, Hypervisor) or trusted-but-erroneous software (e.g., Trusted Execution Environment management).<\/p>\n\n\n\n

      The key idea is to have a small piece of trusted software check the runtime behavior of the untrusted and trusted-but-erroneous software. Such a minimal security monitor can restrict the privileged software\u2019s capabilities and visibility over the system while still adequately managing the resources.<\/p>\n\n\n\n\n\n\n\n

      ETH Zurich PI:<\/strong> Kaveh Razavi
      Microsoft PI:<\/strong> 
      Boris K\u00f6pf<\/a>
      PhD Student:<\/strong> Flavien Solt<\/p>\n\n\n\n

      There is currently a large gap between the capabilities of Electronic Design Automation (EDA) tools and what is required to detect various classes of microarchitectural vulnerabilities pre-silicon. This project aims to bridge this gap by leveraging recent advances in software testing to produce the necessary knowledge and tools for effective hardware testing. Our driving hypothesis is that if we can provide crucial information about the privilege and domain of instructions and\/or data in the microarchitecture during simulation or emulation, then we can easily detect many classes of microarchitectural vulnerabilities. As an example, with the right test cases, we could detect Meltdown-type vulnerabilities, since seemingly different variants all require an instruction that can access data from a different privilege domain.<\/p>\n\n\n\n\n\n\n\n

      ETH Zurich PI:<\/strong> Kenneth G. Paterson
      Microsoft PIs:<\/strong> 
      C\u00e9dric Fournet<\/a>, Esha Ghosh<\/a>, Michael Naehrig<\/a>
      PhD Student:<\/strong> Mia Fili\u0107<\/p>\n\n\n\n

      Probabilistic data structures (PDS) are now very widely used in practice in the era of \u201cbig data\u201d. They are used to process large data sets, often in a streaming setting, and to provide approximate answers to basic data exploration questions such as \u201cHas a particular bit-string in this data stream been encountered before?\u201d or \u201cHow many distinct bit-strings are there in this data set?\u201d. They are increasingly supported in systems like Microsoft Azure Data Explorer, Google BigQuery, Apache Spark, Presto and Redis, and there is an active research community working on PDS within computer science. Generally, PDS are designed to perform well \u201cin the average case\u201d, where the inputs are selected independently at random from some distribution. We refer to this as the non-adversarial setting. However, they are increasingly being used in adversarial settings, where the inputs can be chosen by an adversary interested in causing the PDS to perform badly in some way, e.g. creating many false positives for a Bloom filter, or underestimating the set cardinality for a cardinality estimator. In recent work, we performed an in-depth analysis of the HyperLogLog (HLL) PDS and its security under adversarial input. The proposed research will extend our prior work in three directions:<\/p>\n\n\n\n

        \n
      1. address the mergeability problem for HLL;<\/li>\n\n\n\n
      2. extend our simulation-based framework for studying the correctness and security of HLL to other PDS in adversarial settings;<\/li>\n\n\n\n
      3. study the specific case of cascaded Bloom filters, which have been proposed for use in CRLite, a privacy-preserving system for managing certificate revocation for the webPKI.<\/li>\n<\/ol>\n\n\n\n\n\n
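To make the notion of a probabilistic data structure more concrete, here is a minimal Bloom filter sketch in Python. It is a textbook-style illustration only, not the construction analysed in this project and not the one used by CRLite; the class name `BloomFilter`, its parameters, and the choice of SHA-256 with an index prefix as the hash family are assumptions made for this example.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Illustrative Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple (not space-efficient).
        self.bits = bytearray(num_bits)

    def _positions(self, item: bytes) -> Iterator[int]:
        # Derive num_hashes positions by prefixing the item with an index
        # (an assumed, simple way to obtain several hash functions).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


if __name__ == "__main__":
    bf = BloomFilter()
    bf.add(b"alpha")
    print(b"alpha" in bf, b"beta" in bf)

    # A deliberately undersized filter saturates quickly, after which
    # membership queries for items that were never added succeed too.
    tiny = BloomFilter(num_bits=8, num_hashes=2)
    for i in range(100):
        tiny.add(str(i).encode())
    print(b"never-added" in tiny)
```

The undersized filter in the usage example hints at why adversarial inputs matter: whoever controls the inserted items can push the filter toward saturation and so inflate the false-positive rate for everyone else.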
        <\/div>\n\n\n\n

        EPFL projects (2019-2021)<\/h3>\n\n\n\n\n\n

        EPFL PIs:<\/strong> Pascal Fua (opens in new tab)<\/span><\/a>, Mathieu Salzmann (opens in new tab)<\/span><\/a>
        Microsoft PIs:<\/strong> 
        Bugra Tekin (opens in new tab)<\/span><\/a>, Sudipta Sinha (opens in new tab)<\/span><\/a>, Federica Bogo (opens in new tab)<\/span><\/a>, Marc Pollefeys (opens in new tab)<\/span><\/a>
        PhD Student: <\/strong>
        Mengshi Qi (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

        In recent years, there has been tremendous progress in camera-based 6D object pose, hand pose, and human 3D pose estimation. These can now be done in real time but not yet to the level of accuracy required to properly capture how people interact with each other and with objects, which is a crucial component of modeling the world in which we live. For example, when someone grasps an object, types on a keyboard, or shakes someone else\u2019s hand, the position of their fingers with respect to what they are interacting with must be precisely recovered for the resulting models to be used by AR devices, such as the HoloLens device or consumer-level video see-through AR ones. This remains a challenge, especially given the fact that hands are often severely occluded in the egocentric views that are the norm in AR. We will, therefore, work on accurately capturing the interaction between hands and objects they touch and manipulate. At the heart of it will be the precise modeling of contact points and the resulting physical forces between interacting hands and objects. This is essential for two reasons. First, objects in contact exert forces on each other; their pose and motion can only be accurately captured and understood if reaction forces at contact points and areas are modeled jointly. Second, touch and touch-force devices, such as keyboards and touch-screens, are the most common human-computer interfaces, and by sensing contact and contact forces purely visually, everyday objects could be turned into tangible interfaces that react as if they were equipped with touch-sensitive electronics. For instance, a soft cushion could become a non-intrusive input device that, unlike virtual mid-air menus, provides natural force feedback. In this talk, I will present some of our preliminary results and discuss our research agenda for the year to come.<\/p>\n\n\n\n\n\n\n\n

        EPFL PIs:<\/strong> Robert West (opens in new tab)<\/span><\/a>, Arnaud Chiolero (opens in new tab)<\/span><\/a>
        Microsoft PIs:<\/strong> 
        Ryen White (opens in new tab)<\/span><\/a>, Eric Horvitz (opens in new tab)<\/span><\/a>, Emre Kiciman (opens in new tab)<\/span><\/a>
        PhD Student: <\/strong>
        Kristina Gligoric (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

        The overall goal of this project is to develop methods for monitoring, modeling, and modifying dietary habits and nutrition based on large-scale digital traces. We will leverage data from both EPFL and Microsoft, to shed light on dietary habits from different angles and at different scales: Our team has access to logs of food purchases made on the EPFL campus with the badges carried by all EPFL members. Via the Microsoft collaborators involved, we have access to Web usage logs from IE\/Edge and Bing, and via MSR\u2019s subscription to the Twitter firehose, we gain full access to a major social media platform. Our agenda broadly decomposes into three sets of research questions: (1) Monitoring and modeling: How to mine digital traces for spatiotemporal variation of dietary habits? What nutritional patterns emerge? And how do they relate to, and expand, the current state of research in nutrition? (2) Quantifying and correcting biases: The log data does not directly capture food consumption, but provides indirect proxies; these are likely to be affected by data biases, and correcting for those biases will be an integral part of this project. (3) Modifying dietary habits: Our lab is co-organizing an annual EPFL-wide event called the Act4Change challenge, whose goal is to foster healthy and sustainable habits on the EPFL campus. Our close involvement with Act4Change will allow us to validate our methods and findings on the ground via surveys and A\/B tests. Applications of our work will include new methods for conducting population nutrition monitoring, recommending better-personalized eating practices, optimizing food offerings, and minimizing food waste.<\/p>\n\n\n\n\n\n\n\n

        EPFL PI:<\/strong> Tobias J. Kippenberg (opens in new tab)<\/span><\/a>
        Microsoft PI:<\/strong> 
        Hitesh Ballani (opens in new tab)<\/span><\/a>
        PhD Student: <\/strong>
        Arslan Raja (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

        The substantial increase in optical data transmission and cloud computing has fueled research into new technologies that can increase communication capacity. Optical communication through fiber, which traditionally has been used for long haul fiber optical communication, is now also employed for short haul communication, even within data-centers. In a similar vein, the increasing capacity crunch in optical fibers, driven in particular by video streaming, can only be met by two degrees of freedom: spatial and wavelength division multiplexing. Spatial multiplexing refers to the use of optical fibers that have multiple cores, allowing the same carrier wavelength to be transmitted in multiple cores. Wavelength division multiplexing (WDM, or dense WDM, DWDM) refers to the use of multiple optical carriers on the same fiber. A key advantage of WDM is the ability to increase line-rates on existing legacy networks, without requiring changes to existing SMF28 single mode fibers. WDM is also expected to be employed in data-centers. Yet to date, WDM implementation within datacenters faces a key challenge: a CMOS compatible, power efficient source of multiple wavelengths. Existing solutions, such as multi-laser chips based on InP (as developed by Infinera), cannot be readily scaled to a larger number of carriers. As a result, the prevalent solution is to use a bank of multiple, individual laser modules. This approach is not viable for datacenters due to space and power constraints. Over the past years, a new technology developed by EPFL has rapidly matured: microresonator frequency combs, or microcombs, which satisfy these requirements. The potential of this new technology in telecommunications has recently been demonstrated with the use of microcombs for massively parallel coherent communication on the receiver and transmitter side. Yet to date the use of such microcombs in data-centers has not been addressed.<\/p>\n\n\n\n

          \n
        1. Kippenberg, T. J., Gaeta, A. L., Lipson, M. & Gorodetsky, M. L. Dissipative Kerr solitons in optical microresonators. Science 361, eaan8083 (2018).<\/li>\n\n\n\n
        2. Brasch, V. et al. Photonic chip\u2013based optical frequency comb using soliton Cherenkov radiation. Science aad4811 (2015). doi:10.1126\/science.aad4811<\/li>\n\n\n\n
        3. Marin-Palomo, P. et al. Microresonator-based solitons for massively parallel coherent optical communications. Nature 546, 274\u2013279 (2017).<\/li>\n\n\n\n
        4. Trocha, P. et al. Ultrafast optical ranging using microresonator soliton frequency combs. Science 359, 887\u2013891 (2018).<\/li>\n<\/ol>\n\n\n\n\n\n\n\n

          EPFL PIs:<\/strong> Edouard Bugnion (opens in new tab)<\/span><\/a>
          Microsoft PIs:<\/strong> 
          Irene Zhang (opens in new tab)<\/span><\/a>, Dan Ports (opens in new tab)<\/span><\/a>, Marios Kogias (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          Konstantinos Prasopoulos (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          The deployment of a web-scale application within a datacenter can comprise hundreds of software components, deployed on thousands of servers organized in multiple tiers and interconnected by commodity Ethernet switches. These versatile components communicate with each other via Remote Procedure Calls (RPCs), with the cost of an individual RPC service typically measured in microseconds. The end-user performance, availability and overall efficiency of the entire system are largely dependent on the efficient delivery and scheduling of these RPCs. Yet, these RPCs are ubiquitously deployed today on top of general-purpose transport protocols such as TCP. We propose to make RPCs first-class citizens of datacenter deployment. This requires a revisitation of the overall architecture, application API, and network protocols. Our research direction is based on a novel RPC-oriented protocol, R2P2, which separates control flow from data flow and provides in-network scheduling opportunities to tame tail latency. We are also building the tools that are necessary to scientifically evaluate microsecond-scale services.<\/p>\n\n\n\n\n\n

          <\/div>\n\n\n\n

          ETH Zurich projects (2019-2021)<\/h3>\n\n\n\n\n\n

          ETH Zurich PIs:<\/strong> Roland Siegwart (opens in new tab)<\/span><\/a>, Cesar Cadena (opens in new tab)<\/span><\/a>
          Microsoft PIs:<\/strong> 
          Johannes Sch\u00f6nberger (opens in new tab)<\/span><\/a>, Marc Pollefeys (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          Lukas Schmid (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          AR\/VR allow new and innovative ways of visualizing information and provide a very intuitive interface for interaction. At their core, they rely only on a camera and inertial measurement unit (IMU) setup or a stereo-vision setup to provide the necessary data, either of which is readily available on most commercial mobile devices. Early adoptions of this technology have already been deployed in the real estate business, sports, gaming, retail, tourism, transportation and many other fields. The current technologies in visual-aided motion estimation and mapping on mobile devices have three main requirements to produce highly accurate 3D metric reconstructions: (1) an accurate spatial and temporal calibration of the sensor suite, a procedure which is typically carried out with the help of external infrastructure, like calibration markers, and by following a set of predefined movements; (2) well-lit, textured environments and feature-rich, smooth trajectories; and (3) the continuous and reliable operation of all sensors involved. This project aims at relaxing these requirements, to enable continuous and robust lifelong mapping on end-user mobile devices. Thus, the specific objectives of this work are: 1. Formalize a modular and adaptable multi-modal sensor fusion framework for online map generation; 2. Improve the robustness of mapping and motion estimation by exploiting high-level semantic features; 3. Develop techniques for automatic detection and execution of sensor calibration in the wild. A modular SLAM (simultaneous localization and mapping) pipeline which is able to exploit all available sensing modalities can overcome the individual limitations of each sensor and increase the overall robustness of the estimation. 
Such an information-rich map representation allows us to leverage recent advances in semantic scene understanding, providing an abstraction from low-level geometric features \u2013 which are fragile to noise, sensing conditions and small changes in the environment \u2013 to higher-level semantic features that are robust against these effects. Using this complete map representation, we will explore new ways to detect miscalibrations and sensor failures, so that the SLAM process can be adapted online without the need for explicit user intervention.<\/p>\n\n\n\n\n\n\n\n

          ETH Zurich PI:<\/strong> Ce Zhang (opens in new tab)<\/span><\/a>
          Microsoft PI:<\/strong> 
          Matteo Interlandi (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          Bojan Karla\u0161 (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          The goal of this project is to mine ML.NET historical data such as user telemetry and logs to understand how ML.NET transformations and learners are used, and eventually to use this knowledge to automatically provide suggestions to data scientists using ML.NET. Suggestions can be in the form of: better or additional recipes for unexplored tasks (e.g., neural networks); auto-completion suggestions for pipelines directly authored, for example, in .NET or Python; and automatic generation of parameters and sweep strategies optimal for the task at hand. We will try to develop a solution that is extensible such that, if new tasks, algorithms, etc. are added to the library, suggestions will eventually be properly upgraded as well. Additionally, the tool will have to interface with ML.NET and make it easy to add new recipes coming either from users or the log mining tool.<\/p>\n\n\n\n\n\n\n\n

          ETH Zurich PI:<\/strong> Siyu Tang (opens in new tab)<\/span><\/a>
          Microsoft PIs:<\/strong> 
          Marc Pollefeys (opens in new tab)<\/span><\/a>, Federica Bogo (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          Siwei Zhang (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          Humans are social beings and frequently interact with one another, e.g. spending a large amount of their time being socially engaged, working in teams, or just being part of the crowd. Understanding human interaction from visual input is an important aspect of visual cognition and key to many applications including assistive robotics, human-computer interaction and AR\/VR. Despite rapid progress in estimating 3D pose and shape of a single person from RGB images, capturing and modelling human interactions is rather poorly studied in the literature. Particularly for first-person-view settings, the problem has drawn little attention from the computer vision community. We argue that it is essential for augmented reality glasses, e.g. Microsoft HoloLens, to capture and model the interactions between the camera wearer and others, as the interaction between humans characterises how they move, behave and perform tasks in a collaborative setting.<\/p>\n\n\n\n

          In this project, we aim to understand how to recognise and predict the interactions between humans in the first-person-view setting. To that end, we will create a 3D human-human interaction dataset where the goal is to capture rich and complex interaction signals including body and hand poses, facial expression and gaze directions using Microsoft Kinect and HoloLens. We will develop models that can recognise the dynamics of human interactions and even predict the motion and activities of the interacting humans. We believe such models will facilitate various downstream applications for augmented reality glasses, e.g. Microsoft HoloLens.<\/p>\n\n\n\n\n\n\n\n

          ETH Zurich PIs:<\/strong> Roland Siegwart (opens in new tab)<\/span><\/a>, Nicholas Lawrance (opens in new tab)<\/span><\/a>, Jen Jen Chung (opens in new tab)<\/span><\/a>
          Microsoft PIs:<\/strong> 
          Andrey Kolobov (opens in new tab)<\/span><\/a>, Debadeepta Dey (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          Florian Achermann (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          A major factor restricting the utility of UAVs is the amount of energy aboard, which limits the duration of their flights. Birds face largely the same problem, but they are adept at using their vision to aid in spotting \u2014 and exploiting \u2014 opportunities for extracting extra energy from the air around them. Project Altair aims at developing infrared (IR) sensing techniques for detecting, mapping and exploiting naturally occurring atmospheric phenomena called thermals for extending the flight endurance of fixed-wing UAVs. In this presentation, we will introduce our vision and goals for this project.<\/p>\n\n\n\n\n\n\n\n

          ETH Zurich PIs:<\/strong> Torsten Hoefler (opens in new tab)<\/span><\/a>, Renato Renner (opens in new tab)<\/span><\/a>
          Microsoft PIs:<\/strong> 
          Matthias Troyer (opens in new tab)<\/span><\/a>, Martin Roetteler (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          Niels Gleinig (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          QIRO will establish a new internal representation for compilation systems on quantum computers. Since quantum computation is still emerging, I will provide an introduction to the general concepts of quantum computation and a brief discussion of its strengths and weaknesses from a high-performance computing perspective. This talk is tailored for a computer science audience with basic (popular-science) or no background in quantum mechanics and will focus on the computational aspects. I will also discuss systems aspects of quantum computers and how to map quantum algorithms to their high-level architecture. I will close with the principles of practical implementation of quantum computers and outline the project.<\/p>\n\n\n\n\n\n\n\n

          ETH Zurich PI:<\/strong> Andreas Krause (opens in new tab)<\/span><\/a>
          Microsoft PI:<\/strong> 
          Katja Hofmann (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          David Lindner (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          Reinforcement learning (RL) is a promising paradigm in machine learning and has gained considerable attention in recent years, partly because of its successful application in previously unsolved challenging games like Go and Atari. While these are impressive results, applying reinforcement learning in most other domains, e.g. virtual personal assistants, self-driving cars or robotics, remains challenging. One key reason for this is the difficulty of specifying the reward function a reinforcement learning agent is intended to optimize. For instance, in a virtual personal assistant, the reward function might correspond to the user\u2019s satisfaction with the assistant\u2019s behavior and is difficult to specify as a function of observations (e.g. sensory information) available to the system. In such applications, an alternative to specifying the reward function is to actually query the user for the reward. This, however, is only feasible if the number of queries to the user is limited and the user\u2019s response can be provided in a natural way such that the system\u2019s queries are non-irritating. Similar problems arise in other application domains such as robotics in which, for instance, the true reward can only be obtained by actually deploying the robot but an approximation to the reward can be computed by a simulator. In this case, it is important to optimize the agent\u2019s behavior while simultaneously minimizing the number of costly deployments. This project\u2019s aim is to develop algorithms for these types of problems via scalable active reward learning for reinforcement learning. The project\u2019s focus is on scalability in terms of computational complexity (to scale to large real-world problems) and sample complexity (to minimize the number of costly queries).<\/p>\n\n\n\n\n\n\n\n

          ETH Zurich PIs:<\/strong> Stelian Coros (opens in new tab)<\/span><\/a>, Roi Poranne (opens in new tab)<\/span><\/a>
          Microsoft PIs:<\/strong> 
          Federica Bogo (opens in new tab)<\/span><\/a>, Bugra Tekin (opens in new tab)<\/span><\/a>, Marc Pollefeys (opens in new tab)<\/span><\/a>
          PhD Students:<\/strong> 
          Simon Zimmermann (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          With this project, we aim to accelerate the development of intelligent robots that can assist those in need with a variety of everyday tasks. People suffering from physical impairments, for example, often need help dressing or brushing their own hair. Skilled robotic assistants would allow these persons to live an independent lifestyle. Even such seemingly simple tasks, however, require complex manipulation of physical objects, advanced motion planning capabilities, as well as close interactions with human subjects. We believe the key to robots being able to undertake such societally important functions is learning from demonstration. The fundamental research question is, therefore, how can we enable human operators to seamlessly teach a robot how to perform complex tasks? The answer, we argue, lies in immersive telemanipulation. More specifically, we are inspired by the vision of James Cameron\u2019s Avatar, where humans are endowed with alternative embodiments. In such a setting, the human\u2019s intent must be seamlessly mapped to the motions of a robot as the human operator becomes completely immersed in the environment the robot operates in. To achieve this ambitious vision, many technologies must come together: mixed reality as the medium for robot-human communication, perception and action recognition to detect the intent of both the human operator and the human patient, motion retargeting techniques to map the actions of the human to the robot\u2019s motions, and physics-based models to enable the robot to predict and understand the implications of its actions.<\/p>\n\n\n\n\n\n\n\n

          ETH Zurich PI:<\/strong> Christian Holz (opens in new tab)<\/span><\/a>
          Microsoft PI:<\/strong> 
          Ken Hinckley (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          Hugo Romat (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          Over the past dozen years, touch input \u2013 seemingly well-understood \u2013 has become the predominant means of interacting with devices such as smartphones, tablets, and large displays. Yet we argue that much remains unknown \u2013 in the form of a seen but unnoticed vocabulary of natural touch \u2013 that suggests tremendous untapped potential. For example, touchscreens remain largely ignorant of the human activity, manual behavior, and context-of-use beyond the moment of finger-contact with the screen itself. In a sense, status quo interactions are trapped in a flatland of touch, while systems remain oblivious to the vibrant world of human behavior, activity, and movement that surrounds them. We posit that an entire vocabulary of naturally-occurring gestures \u2013 both in terms of the activity of the hands, as well as the subtle corresponding motion and compensatory movements of the devices themselves \u2013 exists in plain sight. Our intended outcome is to create a conceptual understanding as well as a deployable interactive system, both of which blend the naturally-occurring gestures \u2013 interactions users embody through their actions \u2013 with the explicit input through traditional touch operation.<\/p>\n\n\n\n\n\n\n\n

          ETH Zurich PI:<\/strong> Onur Mutlu (opens in new tab)<\/span><\/a>
          Microsoft PIs:<\/strong> 
          Kushagra Vaid (opens in new tab)<\/span><\/a>, Terry Grunzke (opens in new tab)<\/span><\/a>, Derek Chiou (opens in new tab)<\/span><\/a>
          PhD Student: <\/strong>
          Lois Orosa (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

          This project examines the architecture and management of next-generation data center storage devices within the context of realistic data-intensive workloads. The aim is to investigate novel techniques that can greatly improve performance, cost, and efficiency in real-world systems with real-world applications, breaking the barriers between the applications and devices, such that the software can much more effectively and efficiently manage the underlying storage devices that consist of (potentially different types of) flash memory, emerging SCM (storage class memory) technologies, and (potentially different types of) DRAM. We observe that there is a disconnect in the communication between applications\/software and the NVM devices: the interfaces and designs we currently have enable little communication of useful information from the application\/software level (including the kernel) to the NVM devices, and vice versa, causing significant performance and efficiency loss and likely fueling higher \u201cmanaged\u201d storage device costs because applications cannot even communicate their requirements to the devices. We aim to fundamentally examine the software-NVM interfaces as well as designs for the underlying storage devices to minimize the disconnect in communication and empower applications and system software to more effectively manage the underlying devices, optimizing important system-level metrics that are of interest to the system designer or the application (at different points in time of execution).<\/p>\n\n\n\n\n\n

          <\/div>\n\n\n\n

          EPFL projects (2017-2018)<\/h3>\n\n\n\n\n\n

          EPFL PIs:<\/strong> Babak Falsafi, Martin Jaggi
          Microsoft Co-PI:<\/strong> Eric Chung<\/p>\n\n\n\n

          Deep Neural Networks (DNNs) have emerged as algorithms of choice for many prominent machine learning tasks, including image analysis and speech recognition. In datacenters, DNNs are trained on massive datasets to improve prediction accuracy. While the computational demands for performing online inference in an already trained DNN can be furnished by commodity servers, training DNNs often requires computational density that is orders of magnitude higher than that provided by modern servers. As such, operators often use dedicated clusters of GPUs for training DNNs. Unfortunately, dedicated GPU clusters introduce significant additional acquisition costs, break the continuity and homogeneity of datacenters, and are inherently not scalable. FPGAs are appearing in server nodes either as daughter cards (e.g., Catapult) or coherent sockets (e.g., Intel HARP), providing a great opportunity to co-locate inference and training on the same platform. While these designs enable natural continuity for platforms, co-locating inference and training on a single node faces a number of key challenges. First, FPGAs inherently suffer from low computational density. Second, conventional training algorithms do not scale due to inherently high communication requirements. Finally, co-location may lead to contention, requiring mechanisms to prioritize inference over training. In this project, we will address these fundamental challenges in DNN inference\/training co-location on servers with integrated FPGAs. Our goals are:<\/p>\n\n\n\n