{"id":663087,"date":"2021-01-25T05:52:45","date_gmt":"2021-01-25T13:52:45","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-group&p=663087"},"modified":"2024-03-06T09:35:10","modified_gmt":"2024-03-06T17:35:10","slug":"inria-joint-centre","status":"publish","type":"msr-group","link":"https:\/\/www.microsoft.com\/en-us\/research\/collaboration\/inria-joint-centre\/","title":{"rendered":"Inria Joint Center"},"content":{"rendered":"
The Microsoft Research-Inria Joint Center (Inria JC) is a collaborative research partnership between Microsoft Research and Inria, the French public research institute in computer science. Since its creation in 2005, the Inria JC has been home to over 25 projects co-led by researchers from Inria and MSR. There are currently eleven active projects, addressing a wide range of cutting-edge research challenges.

Security is an area of particular focus, with projects on cryptography for blockchains, certification of the correctness of distributed algorithms, and the design of cryptographic protocols for the Internet with certified security properties.

Artificial intelligence is another main focus of the Inria JC, with projects on machine learning and computer vision (specifically for Mixed Reality devices such as HoloLens) and on alternative paradigms to reinforcement learning for continuously evolving and interacting agents. In computer vision, the Inria JC partners closely with the Mixed Reality & AI Zurich Lab.

<em>"Collaboration between computer science researchers from academia and industry has never made more sense than today, given the exciting challenges and opportunities offered by our field. The MSR-Inria JC empowers such collaborations between fellow researchers at Inria and Microsoft Research on topics from artificial intelligence to security and quantum computing."</em>

Laurent Massoulie, Inria JC Managing Director

The Inria JC is governed by a Management Committee consisting of representatives of Inria, Microsoft Research and Microsoft France. Among other things, the Management Committee oversees the progress of existing projects and identifies new project opportunities.

The Centre's objective is to pursue fundamental, long-term research in formal methods, software security, and the application of computer science research to the sciences.

<h2>People</h2>

<h3>Students and Postdocs</h3>
Matthieu Armando, Balthazar Bauer, Antoine Defourné, Ioannis Filippidis, Florian Groult, Yana Hasson, Hadrien Hendrikx, Antoine Plouviez, Marina Polubelova, Rémy Portelas, Mathieu Rita, Maria Laetitia Teodorescu, Boyao Zhou, Dimitri Zhukov

<h3>Microsoft Principal Investigators</h3>
Melissa Chase, Antoine Delignat-Lavaud, Cédric Fournet, Esha Ghosh, Martin de la Gorce, Chris Hawblitzel, Katja Hofmann, Markus Kuppe, Leslie Lamport, Jay Lorch, Marc Pollefeys, Jonathan Protzenko, Tahina Ramananandro, Aseem Rastogi, Srinath Setty, Sudipta Sinha, Paul Smolensky, Nikhil Swamy, Santiago Zanella-Béguelin

<h3>Inria Principal Investigators</h3>
Francis Bach, Karthikeyan Bhargavan, Edmond Boyer, Roberto Di Cosmo, Damien Doligez, Emmanuel Dupoux, Jean-Sébastien Franco, Ivan Laptev, Laurent Massoulie, Stephan Merz, Pierre-Yves Oudeyer, David Pointcheval, Josef Sivic, Stefano Zacchiroli

<h3>Management Board</h3>
François Cuny, Éric Fleury, Jean-Frédéric Gerbeau, Eneric Lopez, Laurent Massoulie, Bernard Ourghanlian, Evelyne Viegas

<h3>Address</h3>
2 rue Simone Iff
75012 PARIS

<h2>Current projects</h2>

[accordion]

[panel header="Acquisition and Synthesis of High-quality Four-dimensional Data"]

<strong>Inria PIs:</strong> Edmond Boyer, Jean-Sébastien Franco

<strong>MS PIs:</strong> Federica Bogo, Martin de la Gorce

<strong>PhD students:</strong> Matthieu Armando, Boyao Zhou

One aim of this project is the synthesis of 3D shape models from observations taken from several viewpoints. A key challenge here is removing noise from the acquired shape models; to that end, we develop denoising algorithms based on graph convolutional neural networks. A second aim is the completion of shape and motion information from partial observations, e.g. sparse point clouds, which we address with approaches combining Gaussian processes and deep learning.

[/panel]

[panel header="Autonomous Learning"]

<strong>Inria PI:</strong> Pierre-Yves Oudeyer

<strong>MSR PI:</strong> Katja Hofmann

<strong>PhD students:</strong> Rémy Portelas, Laetitia Teodorescu

This project develops the foundations of a novel machine learning framework, Intrinsically Motivated Goal Exploration Processes (IMGEPs), focused on the challenge of autonomously learning diverse repertoires of skills in open, changing environments. It is an alternative to the classical reinforcement learning framework, which is currently reaching both conceptual and practical limits, and it is grounded in a decade of work modelling mechanisms of curiosity-driven learning and development in human infants. In particular, it aims to develop algorithms enabling i) incremental learning of goal-space representations and associated internal goal-achievement reward functions, so as to foster efficient discovery of a diversity of skills; and ii) a social peer to guide curiosity-driven learners using natural language, both making exploration more efficient by leveraging language input and letting external users tap, through language-based queries, into skills the machine has learned autonomously.
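The loop below is a minimal, illustrative sketch of such a goal-exploration process, written against hypothetical `env` and `goal_space` interfaces; the project's actual algorithms (learned goal spaces, richer policies, language guidance) go well beyond it.

```python
import random

def imgep(env, goal_space, n_bootstrap=50, n_iterations=1000):
    """Toy Intrinsically Motivated Goal Exploration Process.
    Assumed interfaces: env.rollout(params) -> outcome in goal space;
    goal_space.sample() -> goal; goal_space.distance(a, b) -> float."""
    library = []
    for _ in range(n_bootstrap):                       # random motor babbling
        params = [random.uniform(-1, 1) for _ in range(4)]
        library.append((params, env.rollout(params)))
    for _ in range(n_iterations):
        goal = goal_space.sample()                     # self-generated goal
        # Reuse the policy whose past outcome landed closest to the goal...
        params, _ = min(library, key=lambda e: goal_space.distance(e[1], goal))
        # ...and mutate it slightly to explore around that skill.
        params = [p + 0.1 * random.gauss(0, 1) for p in params]
        library.append((params, env.rollout(params)))
    return library                                     # repertoire of (policy, outcome) pairs
```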
[/panel]

[panel header="Cryptography for the Blockchain"]

<strong>Inria PI:</strong> David Pointcheval

<strong>MSR PIs:</strong> Melissa Chase, Esha Ghosh, Santiago Zanella

<strong>PhD students:</strong> Balthazar Bauer, Antoine Plouviez

The project aims to advance the state of the art in blockchain technology. Blockchains, as exemplified by Bitcoin, Ethereum and ZCash, show promise for enabling guaranteed transactions without the need for a central authority. Among their objectives, researchers of the project are looking at making blockchains "greener", for instance by replacing the notion of proof of work that underlies Bitcoin with "proof of storage", and at developing new applications of blockchains, in the spirit of recent proposals to perform decentralized verification of, e.g., web certificates and public-key infrastructure.

[/panel]

[panel header="Flexible AI"]

<strong>Inria PI:</strong> Emmanuel Dupoux

<strong>MSR PI:</strong> Paul Smolensky

<strong>PhD student:</strong> Mathieu Rita

Recent progress in deep learning has led to neural networks able to achieve complex tasks in domains such as computer vision or natural language, successes attained by networks trained with massive amounts of data. However, the interactive and functional aspects of intelligence have been almost completely ignored: researchers are only beginning to introduce communication tools in multi-agent settings. This research studies the conditions under which populations of agents develop a communication system that helps them succeed in their tasks. From a scientific perspective, understanding the evolution of languages in communities of deep agents, and its emergent properties, can shed light on human language evolution. From an applied perspective, endowing deep networks with the ability to solve problems interactively, by communicating with each other and with us, should make them more flexible and useful in everyday life.

[/panel]

[panel header="Machine Learning for Distributed Environments"]

<strong>Inria PIs:</strong> Francis Bach, Laurent Massoulié

<strong>MSR PI:</strong> Sébastien Bubeck

<strong>PhD student:</strong> Hadrien Hendrikx

The project aims to develop efficient distributed ML algorithms suited to run, for instance, in the cloud.

Large-scale machine learning algorithms, whether supervised or unsupervised, are ideally run on a cloud platform with many cores organized hierarchically in datacenters. Important efficiency gains can be had by designing new algorithms tailored to exploit such distributed platforms. Specific objectives include the design of new algorithms for supervised learning (e.g., distributed, accelerated gradient descent) and unsupervised learning (e.g., distributed spectral analysis and graph clustering).
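As a toy illustration of the distributed flavour of these algorithms, here is a sketch (not project code) of gossip averaging, a standard primitive behind decentralized gradient methods: each node repeatedly mixes its local value with those of its neighbours, and all nodes converge to the global mean.

```python
import numpy as np

def gossip_average(values, neighbors, n_rounds=100):
    """values: (n_nodes, dim) array of local quantities (e.g. gradients);
    neighbors: list of neighbor-index lists describing the network graph.
    On a connected graph, every node converges to the global average."""
    x = np.array(values, dtype=float)
    for _ in range(n_rounds):
        x_new = x.copy()
        for i, nbrs in enumerate(neighbors):
            # Mix the node's own value with the average of its neighbors'.
            x_new[i] = 0.5 * x[i] + 0.5 * x[nbrs].mean(axis=0)
        x = x_new
    return x

# Four nodes on a ring, each holding one local estimate:
vals = [[1.0], [2.0], [3.0], [4.0]]
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
print(gossip_average(vals, ring)[:, 0])  # every entry approaches 2.5
```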
[/panel]

[panel header="Project Everest"]

<strong>Inria PI:</strong> Karthik Bhargavan

<strong>MSR PIs:</strong> Antoine Delignat-Lavaud, Cédric Fournet, Jonathan Protzenko, Nikhil Swamy, Santiago Zanella

<strong>Postdoc:</strong> Florian Groult

<strong>PhD student:</strong> Marina Polubelova

The project aims at building formally verified, high-performance, standards-compliant cryptographic components for securing Internet communications. These components can be used either as drop-in replacements for existing libraries and protocols, or to build verified secure subsystems and applications. By construction, they prevent the flaws and vulnerabilities that litter existing implementations and require frequent security patches; see, e.g., the 3SHAKE, FREAK and LOGJAM attacks uncovered as part of our research. More details can be found at the following sites:

https://project-everest.github.io/

https://www.fstar-lang.org/

https://mitls.org/

[/panel]

[panel header="Quantum Error Correction"]

<strong>Inria PI:</strong> Anthony Leverrier

<strong>MS PI:</strong> Nicolas Delfosse

Quantum hardware suffers from much higher error rates than classical devices, so extensive error correction is necessary to execute a large-scale quantum algorithm. Quantum LDPC codes could significantly reduce the cost of quantum error correction, but they need better decoders. With this project, we will explore the potential of the Union-Find decoder for decoding quantum LDPC codes.
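The decoder takes its name from the classic union-find (disjoint-set) data structure, which lets growing clusters of syndrome defects be merged and queried in near-linear time. A minimal sketch of that core primitive follows; the actual decoder adds cluster growth over the code's Tanner graph, which is not shown.

```python
class UnionFind:
    """Disjoint-set forest with path halving and union by size."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra                              # union by size
        self.size[ra] += self.size[rb]
```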
[/panel]

[panel header="Quantum Walks and Algorithms"]

<strong>Inria PI:</strong> Alain Sarlette

<strong>MS PIs:</strong> Martin Roetteler, Matthias Troyer

<strong>Postdoc:</strong> Matthys Rennela

The aim of this project is to accurately evaluate the resources required to implement quantum speedups based on the "quantum fast forwarding" (QFF) algorithm for Markov chains. The QFF routine operates at an intermediate level, between a graph-based quantum oracle and an output to be extracted from the generated Markov chain distribution. We will consider concrete applications to specify these oracle and output layers, evaluate the complexity of, and opportunities for, implementing them on a quantum computer, and thereby propose a full-chain quantum computation with graph oracles.

[/panel]

[panel header="Software Heritage and Software Productivity"]

<strong>Inria PIs:</strong> Roberto di Cosmo, Stefano Zacchiroli

<strong>MSR PI:</strong> Tom Zimmermann

Software Heritage is today the largest corpus of software source code artifacts, containing over 6 billion unique source code files and 1 billion commits from more than 90 million software projects. So far, empirical studies of software engineering have been conducted on much smaller repositories, and as such cannot be fully conclusive. The aim of this project is to reproduce, on Software Heritage, previous empirical studies relevant to improving developer productivity. In particular, it will analyze: the extent of large-scale code reuse in open-source software; the identification of topics in source code via semantic clustering; the relation between programming language and code quality; and the quality and productivity outcomes of continuous integration.

[/panel]

[panel header="TLA Proof System"]

<strong>Inria PIs:</strong> Damien Doligez, Stephan Merz

<strong>MSR PIs:</strong> Markus Kuppe, Leslie Lamport

<strong>Postdoc:</strong> Ioannis Filippidis

<strong>PhD student:</strong> Antoine Defourné

The project aims to develop a proof assistant enabling the certification of specifications expressed in TLA+.

TLA+, with the associated PlusCal algorithm language, was proposed by Leslie Lamport as a tool for developing and verifying algorithms for concurrent and distributed systems. These tools have gradually matured, have recently been used for industrial applications, and are now entering mainstream use for certifying properties of distributed systems. An associated proof assistant is an important complement to model checking for establishing properties of algorithm specifications expressed in TLA+. More details can be found at:

https://lamport.azurewebsites.net/tla/tla.html

https://tla.msr-inria.inria.fr

[/panel]

[panel header="Video Understanding for Personal Assistants"]

<strong>Inria PIs:</strong> Ivan Laptev, Josef Sivic

<strong>MS PIs:</strong> Marc Pollefeys, Johannes Schönberger

<strong>PhD students:</strong> Yana Hasson, Dimitri Zhukov

The project aims to develop tools for the HoloLens Mixed Reality device, in particular solutions for constructing visual cues to be superimposed on a real scene to help users carry out practical tasks (such as changing a car tyre). Ongoing work at Inria shows promise in automatically processing YouTube instruction videos for various tasks to identify their key stages and the visual signatures of these steps. It is a bold challenge to go from there to the actual production of visual cues, projected onto the HoloLens, that assist users in conducting chosen tasks.
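A sketch of the cross-modal matching this involves: given precomputed, L2-normalized embeddings of video frames and of textual step descriptions (both assumed here, and in reality learned), each instruction step is localized at its best-matching frame.

```python
import numpy as np

def localize_steps(frame_embs, step_embs):
    """frame_embs: (n_frames, d); step_embs: (n_steps, d), L2-normalized.
    Returns, for each instruction step, the index of its best-matching frame.
    (Real systems also enforce the temporal ordering of steps.)"""
    sim = frame_embs @ step_embs.T      # cosine similarity, frames x steps
    return sim.argmax(axis=0)           # one frame index per step
```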
[/panel]

[/accordion]
<h2>Archived projects</h2>

[accordion]

[panel header="4D Cardiac MR Images"]

<em>This project finished in 2013; it was then continued by "MediLearn".</em>
<h3>Goals of the project</h3>

Given a large database of cardiac images of patients, stored with an expert diagnosis, we wish to automatically select the cases most similar to the cardiac images of a new patient.
<h3>Application</h3>

This is important, for instance, for estimating the optimal time for an operation when a Tetralogy of Fallot condition is diagnosed.
<h3>Science</h3>

We want to be able to index images based on, e.g., the shape of the heart, the dynamics of the myocardium, or the presence of anomalies. This requires capturing the right level of shape, motion and appearance characteristics in a compact and efficient way, so as to enable fast indexing and retrieval even across hundreds of 4D datasets. Using state-of-the-art, efficient machine learning techniques is also of paramount importance. The design of both the visual features and the classification algorithms has to be informed by medical experts, to make sure the final system remains relevant from a clinical point of view.

Such a system would provide a strong innovation in cardiology, a learning tool for residents and young physicians, and a new tool to help physicians assess a diagnosis and plan a therapy. The method could emphasize local motion and/or local shape singularities (e.g., septal flash, a motion anomaly, vs. ventricle overload, a shape anomaly), with the possibility of actually weighting motion vs. shape features.
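A toy sketch of such weighted retrieval over precomputed descriptors (the descriptors themselves, and the real similarity models, are the hard part and are not shown):

```python
import numpy as np

def retrieve_similar_cases(query, database, w_shape=0.5, w_motion=0.5, k=5):
    """query: concatenated [shape | motion] descriptor of the new patient;
    database: one such row per archived case. The weights let clinicians
    emphasize motion anomalies vs. shape anomalies."""
    d = query.shape[0] // 2
    w = np.concatenate([np.full(d, w_shape), np.full(d, w_motion)])
    dists = np.sqrt((w * (database - query) ** 2).sum(axis=1))
    return np.argsort(dists)[:k]        # indices of the k most similar cases
```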
<h3>The MSRC-Inria collaboration</h3>

Key points to succeed in developing such a system are the following:
<h3>Our partnership</h3>

The close partnership between INRIA Sophia Antipolis, Microsoft Research Cambridge, and King's College London meets the expertise required by the points above, and could lead to a pioneering advance in the field of computer-aided clinical decision support for pediatric cardiology, and more generally in content-based retrieval of medical images.

The best strategy to implement this research agenda would be to rapidly hire a highly qualified PhD student, who would share his/her time between the three institutions, in order to quickly build expertise by leveraging the extensive knowledge of the three participating laboratories. The exact division of time would depend on the original qualifications of the candidate and on the progress made during the PhD research.

[/panel]

[panel header="A-Brain"]

<em>This project finished in 2013; it was then continued by Z-CloudFlow and MediLearn.</em>
<h3>Goals of the project</h3>

Joint acquisition of neuroimaging and genetic data on large cohorts of subjects is a new approach for assessing and understanding inter-individual variability, which has so far remained poorly understood. Since both the neuroimaging and the genetic domains involve a huge number of variables (of the order of 10<sup>6</sup>), statistically rigorous analyses of such data pose a computational challenge that cannot be addressed with conventional techniques: on the one hand, sophisticated regression techniques are needed to perform sensitive analyses of these large datasets; on the other hand, parameter optimization and statistical validation procedures (e.g., permutation tests, sketched below) carry a heavy computational cost. The computations, however, can easily be run in parallel.

In this project, researchers of the Parietal and KerData Inria teams jointly address this computational problem using cloud computing techniques on the Microsoft Azure environment. The two teams bring complementary expertise: KerData (Rennes) in scalable cloud data management, and Parietal (Saclay) in the analysis of neuroimaging and genetics data. The Map-Reduce programming model has recently arisen as a very effective approach to developing high-performance applications over very large distributed systems such as grids and, now, clouds. KerData has recently proposed a set of algorithms for data management, combining versioning with decentralized metadata management, to support scalable, efficient, fine-grain access to massive, distributed Binary Large OBjects (BLOBs) under heavy concurrency. The project investigates the benefits of integrating BlobSeer with Microsoft Azure storage services, and aims to evaluate the impact of using BlobSeer on Azure with large-scale application experiments such as the genetics-neuroimaging data comparisons addressed by Parietal.

This project thus brings together researchers in algorithmic and statistical analysis on the one hand, and researchers in data management for intensive computation on the other, working on the Microsoft Azure platform to unveil the relationships between genes and brain characteristics.
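As a toy illustration of why these validation procedures parallelize so well, here is a sketch of a permutation test for a single gene-image variable pair, with the independent permutations farmed out to worker processes; the interfaces are illustrative, and real analyses handle on the order of 10<sup>6</sup> such pairs.

```python
import numpy as np
from multiprocessing import Pool

def permuted_stat(args):
    x, y, seed = args
    rng = np.random.default_rng(seed)
    # Break the x-y pairing, then recompute the statistic under the null.
    return abs(np.corrcoef(rng.permutation(x), y)[0, 1])

def permutation_pvalue(x, y, n_perm=10000, n_workers=8):
    """p-value for the correlation between a genetic variable x and an
    imaging variable y; each permutation is an independent task."""
    observed = abs(np.corrcoef(x, y)[0, 1])
    with Pool(n_workers) as pool:
        null = pool.map(permuted_stat, [(x, y, s) for s in range(n_perm)])
    return (1 + sum(t >= observed for t in null)) / (1 + n_perm)
```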
[/panel]

[panel header="Adaptive Combinatorial Search for e-Science"]
<h3>Goals of the project</h3>

The goal of this research is to improve the applicability of constraint-based and heuristic-based solvers to complex scientific problems. As a large literature demonstrates, e-scientists already benefit from search procedures to tackle a wide variety of important problems. Unfortunately, these applications suffer from the limits of current solving technologies, which are poorly adapted to these new domains. One way to improve performance is to fine-tune the solver parameters, a tedious and time-consuming task that often requires knowledge of both the domain and the algorithm.

This approach is hardly applicable to e-Science, whose application fields are constantly growing. We claim that self-adaptation of a solver to the domain of interest is the only viable solution to this problem. Our goal in this project is to develop tools able to automatically choose the optimal parameter configuration of a given search algorithm for a given problem or class of problems (a minimal form of such tuning is sketched below).

This adaptation would allow us to deploy combinatorial search on new, complex scientific problems with good expected performance.
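The simplest form of such tuning is a random search over the solver's parameter space, as in the sketch below; the `solve` interface and parameter names are hypothetical, and the project targets far smarter, self-adaptive strategies.

```python
import random

def tune_solver(solve, instances, param_space, budget=100):
    """Random search over solver configurations.
    Assumed interface: solve(instance, **params) -> runtime in seconds.
    param_space: dict mapping a parameter name to its candidate values."""
    best_params, best_time = None, float("inf")
    for _ in range(budget):
        params = {k: random.choice(v) for k, v in param_space.items()}
        total = sum(solve(inst, **params) for inst in instances)
        if total < best_time:
            best_params, best_time = params, total
    return best_params

# e.g. tune_solver(solve, benchmark,
#                  {"restart_freq": [10, 100, 1000],
#                   "branching": ["dom/wdeg", "impact", "activity"]})
```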
[/panel]

[panel header="Dynamic Dictionary of Mathematical Functions"]

<em>This project finished in 2013.</em>

<h3>Project Summary</h3>

Our research project is concerned with the automatic derivation of properties and formulae concerning classical elementary and special functions. We are located in Orsay, within the Inria-Microsoft Research Joint Centre, with participants from the Algorithms project at Inria Rocquencourt. Please contact us for additional information, visits, internships, etc.
<h3>Overview</h3>

The aim of this project is to automate the computation of the numerous mathematical formulae needed in analysis. We intend to show that large parts of the code dealing with elementary and special functions in computer algebra systems can actually be generated automatically, in a uniform manner.

The starting point is to promote linear differential or difference equations to the status of data structures representing their solutions. New efficient algorithms operating directly on this data structure will be designed as part of this project. As a showcase, we intend to develop an encyclopedia of special functions available on the web. In addition to our existing prototype, this new version will provide dynamic entries depending on user queries; integral transforms (Laplace, Fourier, …); and more information on each function.

In order to achieve the desired efficiency, the project includes an important theoretical investment in the complexity of fundamental algorithms of computer algebra. Computer algebra is devoted to the study of effective mathematics and the complexity of the corresponding algorithms. The emphasis is on exact mathematical objects (integers, rational numbers, polynomials, algebraic numbers, …) rather than numerical approximations; the foundations lie in algebra. So far, most of the impact of this area has been in pure mathematics and in education.

The applications of computer algebra systems such as Maple or Mathematica to applied, discrete or experimental mathematics have often been restricted by either the scarcity of exact solutions or the huge memory requirements of practical problems. Our long-term project aims at improving this state of affairs and bringing computer algebra to a stage where it can generate efficient and guaranteed numerical code approximating the solutions of large classes of problems of real-life size. We pursue three main ideas rooted in computer science.
<h3>Equations as data structures</h3>

Very few equations admit explicit solutions. This does not mean that questions about their solutions cannot be answered; indeed, even when explicit solutions exist, some of their properties can be accessed more efficiently from the defining equations. By considering equations themselves as data structures, one is led to develop algorithms that construct equations satisfied by objects of interest, as well as algorithms that extract information from these equations. This idea is classical in the case of roots of polynomials, where the polynomials themselves are a good data structure for representing the roots; the algorithmic tools operating on this data structure are Euclidean division, the Euclidean algorithm and, for multivariate systems, Gröbner bases.

More recently, it has been realized (in large part by members of this project) that another important class of examples is provided by linear differential equations, where the tools are non-commutative analogues of the algorithms used for polynomials. To give an idea of the impact of this idea: approximately 60% of Abramowitz and Stegun's Handbook of Mathematical Functions (one of the most cited references in mathematics, physics, chemistry and the engineering sciences), covering the knowledge of the best specialists on about 100 mathematical functions, can be generated automatically by computer algebra programs using systems of linear differential equations as the basic data structure. This has been implemented in a prototype on-line encyclopedia.

Obvious advantages of computer generation are added interactivity, a virtually unbounded set of functions, and automatically generated symbolic or numerical code; a toy example is sketched below. The next steps in this area will be to add more interactivity to this prototype and to extend it to deal with integral transforms and more general parameterized integrals. In the longer term, another application of this data structure is the generation of good approximations for otherwise hard-to-evaluate multiple integrals arising in theoretical physics or chemistry.
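A toy instance of the idea: the equation y' = y with y(0) = 1 defines the exponential function, and reading the recurrence (n + 1)·c(n+1) = c(n) directly off the equation yields as many Taylor coefficients as desired, with no closed form ever required.

```python
from fractions import Fraction

def exp_taylor_coeffs(n_terms):
    """Taylor coefficients of the solution of y' = y, y(0) = 1.
    The ODE itself is the data structure: it encodes the recurrence
    (n + 1) * c[n + 1] = c[n] satisfied by the coefficients."""
    c = [Fraction(1)]
    for n in range(n_terms - 1):
        c.append(c[n] / (n + 1))
    return c

print([str(f) for f in exp_taylor_coeffs(5)])  # ['1', '1', '1/2', '1/6', '1/24']
```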
<h3>Mathematical objects stored as programs</h3>

Intermediate expression swell is a common disease of computer algebra algorithms: the integers or polynomials occurring in intermediate computations are much larger than the actual input and output, and most of the computation time is spent on them. In several important cases, the actual expressions of these objects are not needed for the full computation, only certain pieces of information about them. It has been realized that using, as a data structure, programs that evaluate these polynomials or integers (or more general mathematical objects) results in huge memory savings.

All the algorithms then need to be revisited, since arithmetic operations become inexpensive while equality testing becomes expensive. The results of carrying out this program for polynomial systems are spectacular: the worst-case complexity, which was exponential in the size of the result, has become only polynomial. This is the outcome of a long research program by the TERA group, to which some of us have contributed.

Another recent application of this idea to differential equations has led us to algorithms whose complexity is almost optimal (up to logarithmic factors) in the size of the result. Such results have an immediate impact throughout large parts of computer algebra.
<h3>Geometry of Complexity</h3>

Another source of computational complexity comes from the nature of the problems being attacked. The ill-conditioning of equations or matrices, which causes difficulties in numerical analysis, has a counterpart in the size of exact representations: typically, the existence of a nearby singular matrix, polynomial or polynomial system forces the representations of solutions to involve very large integers. The intrinsic complexity of a problem thus depends on how far (in a mathematically quantifiable sense) it lies from singularities. One is thus led to a theory of geometric computational complexity, in which global upper bounds are refined by a complexity measure depending on the geometry of the input data-structure space.

A lot of work remains to be done, both experimentally and theoretically, to develop geometry-aware algorithms. For instance, we have recently developed and precisely analyzed an algorithm that converges quadratically to solutions of systems, even in the presence of multiplicities or clusters of roots, by analyzing numerically and on the fly the nature of the nearby singularity. The goal is to enlarge as much as possible the class of problems for which answers can be computed in time almost linear in the size of appropriate data structures for both input and output. Again, the results obtained here are highly relevant beyond our project.
[/panel]
[panel header="Mathematical Components"]
<h3>Summary</h3>

The object of this project is to demonstrate that formalized mathematical theories can, like modern software, be built out of components. By components we mean modules that comprise both the static (objects and facts) and dynamic (proof and computation methods) contents of theories. We propose to develop a general platform for mathematical components, based on the Coq "Ssreflect" extension that was used to carry out the formalization of the Four Colour Theorem. We would validate the platform on two significant developments: a theory of efficient arithmetic, and the proof of the Odd Order Theorem. The former could be used in the resolution of several outstanding computer-proof challenges, among them the Kepler Conjecture, and the latter could become the first step in a new major challenge: the classification of finite simple groups.
[/panel]

[panel header="MediLearn"]
<h3>Project Scope</h3>

A long-standing problem of medical image analysis is the lack of annotated data (e.g., images of a tumour with its associated, expert-annotated regions). Such data is crucially important for training automatic classification techniques that can detect and delineate an anomaly in a new, previously unseen patient. Applications range from diagnosis to more accurate treatment, and from quantifying the effect of a drug to prevention.
<h3>Research</h3>

Recent trends in machine learning and computer vision demonstrate that, under certain constraints, it is possible to generate annotated training data synthetically; the classifiers trained on it are then shown to work well on previously unseen real test images. For instance, this technique is at the basis of the successful Microsoft Kinect sensor, where a body-part classifier was trained on millions of synthetically generated images of people in various poses.

In this project we wish to do the same for medical images. This is very timely for a number of reasons: for instance, realistic generative models are now available for, e.g., brain scans, brain tumors, beating hearts, and full-body CT scans. The question, then, is how well a system trained on such synthetic images works on real patient scans, and how to modify the algorithm and its features to make it more accurate; the sketch below shows the basic training-data pipeline this implies.

As part of this project, the generated synthetic training and testing sets and their annotations will also be made available to the whole scientific community. This will enable various research teams to take part in competitions and compare their algorithms on an equal basis.
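A sketch of that pipeline, assuming hypothetical `generator` and `labeler` interfaces for a generative model of scans whose internal ground truth supplies annotations for free:

```python
import numpy as np

def make_synthetic_training_set(generator, labeler, n=10000):
    """generator(rng) -> synthetic scan; labeler(scan) -> voxel-wise labels.
    Annotation is free because the generative model knows what it rendered."""
    rng = np.random.default_rng(0)
    pairs = []
    for _ in range(n):
        scan = generator(rng)                # e.g. a simulated brain with a tumor
        pairs.append((scan, labeler(scan)))  # ground-truth labels, no expert needed
    return pairs

# Train any off-the-shelf classifier on these pairs, then evaluate it on a
# small set of real, expert-annotated patient scans to measure the
# synthetic-to-real gap (the central question raised above).
```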
<h3>Neuroimaging</h3>

Neuroimaging is accumulating large functional MRI datasets that display, in spite of their low signal-to-noise ratio (SNR), brain activation patterns giving access to a meaningful representation of brain spatial organization. This accumulation is intensifying with new large-scale international initiatives such as the Human Connectome Project and the recently accepted Human Brain Project, as well as with existing open repositories of functional neuroimaging datasets. These datasets represent a very significant resource for the community, but require new analytic approaches to be fully exploited.

To provide a synthetic picture of the brain substrate of human cognition and its pathologies, we learn from large-scale datasets a brain atlas that adequately summarizes these functional activation maps, drawing from a large number of protocols and subjects. Once learned, such an atlas is extremely useful for understanding the large-scale functional organization of the brain: it is a tool for understanding brain segregation, i.e., the encoding of many cognitive parameters in different brain regions, as well as brain integration, i.e., how remote brain regions co-activate across subjects and experiments.

[/panel]

[panel header="Privacy-Friendly Services and Apps"]

The objective of this project is to re-imagine how modern on-line services and applications may be engineered to provide a higher degree of technical privacy protection. Supporting stronger privacy and user control involves a serious redesign of the key protocols and architectures that underlie these services, as well as the development of principled definitions of privacy, tools to realize them, and methods to evaluate the degree of protection afforded.

We consider two categories of services. The first consists of systems that process personal information but produce publicly disclosed outputs. A typical example is aggregating the data of multiple individuals and extracting higher-level information and statistics, such as the likelihood of customers buying two items together, or the correlation between a certain disease and obesity. For these features, the objective of this project is to devise generic techniques for computing and publishing such statistics while minimizing the exposure of personal information.

The second category of services both processes personal information and generates personal information. For example, the insurance premium to be paid by an individual on the basis of their health is both the result of processing sensitive information and sensitive itself. For these applications, we aim to devise generic architectures that (a) minimize the exposure of private information and (b) allow for flexible policies determining under what conditions the resulting sensitive information is shared, and with whom.

A fundamental objective, common to both categories, is to develop principled and robust definitions of privacy, as well as methods for evaluating the quality of protection offered by different proposed mechanisms. We intend to consider variants of differential privacy, cryptography and other established privacy technologies, augmented by rigorous proofs of correctness.

We plan to dedicate particular attention to the domain of location-based services.
We aim to design mechanisms that allow a user to obtain information from such services, such as points of interest or nearby friends, without exposing their precise location. To this end, we plan to develop obfuscation techniques providing strong privacy guarantees, such as geo-indistinguishability, a variant of differential privacy suited to geo-location; a toy version of the standard mechanism is sketched at the end of this panel.

Finally, we intend to prototype high-quality software tools for developing and evaluating privacy-friendly services. These include tools and libraries that implement high-value computations in a privacy-preserving manner; language-based tools, such as compilers and runtimes, to support building higher-level services; and platforms and APIs that support privacy features out of the box. These should be capable of inter-operating to produce larger and richer services that process location, medical, or financial data without exposing them to third parties. While this is not a core objective of the project, we hope to achieve synergies with other Microsoft Research-Inria projects that focus on automatic verification, to validate the code that implements protection mechanisms.
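For concreteness, here is a toy version of the standard geo-indistinguishability mechanism, planar Laplace noise: a uniform angle plus a Gamma(2, 1/ε)-distributed radius. Production use needs care with coordinate conversion and floating-point precision, which this sketch glosses over.

```python
import math
import random

def planar_laplace(lat, lon, epsilon):
    """Report a noisy location: uniform angle, Gamma(2, 1/epsilon) radius,
    i.e. the planar Laplace distribution with privacy parameter epsilon
    (here in 1/meters; the mean displacement is 2/epsilon meters)."""
    theta = random.uniform(0, 2 * math.pi)
    r = random.gammavariate(2, 1 / epsilon)          # radial noise, in meters
    dx, dy = r * math.cos(theta), r * math.sin(theta)
    # Rough equirectangular conversion from meters to degrees.
    return (lat + dy / 111_320,
            lon + dx / (111_320 * math.cos(math.radians(lat))))

print(planar_laplace(48.8566, 2.3522, epsilon=0.01))  # ~200 m mean displacement
```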
[/panel]

[panel header="ReActivity"]

<em>This project finished in 2011.</em>

[/panel]

[panel header="Image and Video Mining for Science and Humanities"]
<h3>Project Summary</h3>

We propose to focus on fundamental computer science research in computer vision and machine learning, and on its application to archaeology, cultural heritage preservation, and sociology. We validate our project through collaborations with researchers and practitioners in these fields.
<h3>Goals of the project</h3>

We address the following problems:

Mining historical collections of photographs and paintings, with applications to archaeology and cultural heritage preservation. This includes, for example, the quantitative analysis of environmental damage to wall paintings or mosaics over time, and the cross-indexing of 19th-century paintings of Pompeii with modern photographs.

Mining TV broadcasts, with applications to sociology. This includes automating the analysis and annotation of human actions and interactions in video segments to assist (and provide data for) studies of consumer trends in commercials, political event coverage in newscasts, and class- and gender-related behavior patterns in situation comedies, for example.

For every one of the problems we have in mind, indexing, searching and analyzing photo and video collections is a key issue. Recent advances in image analysis, computer vision, and machine learning promise an opportunity to automate these tasks, partly or completely (e.g., the annotation of photos and videos), and to access information whose extraction from images is simply beyond human capabilities (e.g., the indexing of very large image archives). To fulfil this promise, we propose to conduct fundamental research in object, scene, and activity modeling, learning, and recognition, and to validate it by developing computerized image and video mining tools at the service of the sciences and humanities.

[/panel]

[panel header="Secure Distributed Computations and their Proofs"]
We develop formal tools for programming distributed computations with effective security guarantees. Our goal is to enable programmers to express and prove high-level security properties with a reasonable amount of effort (sometimes automatically, sometimes with mechanical assistance) as part of the development process. These properties should hold in a hostile environment, with realistic (partial) trust assumptions on the principals and machines involved in the computation.

We are located in Orsay, within the Inria-Microsoft Research Joint Lab, with participants from the Programming Principles and Tools group at Microsoft Research Cambridge, the Moscova team at Inria Rocquencourt, and the Everest team at Inria Sophia-Antipolis. Please contact us for additional information, visits, internships, etc.
<h3>Programming Abstractions for Security</h3>

Over the last decade, as security expectations have evolved, the task of writing secure code has become more challenging and more widespread. A large amount of code now has security implications, and programmers often struggle to analyze them. In principle, a security analysis should involve a precise mapping from security goals to enforcement mechanisms, but these high-level goals are often left informal, whereas the lower-level enforcement mechanisms are largely hidden in library implementations.

Partly because their design predates security concerns, the abstractions embedded in programming languages are usually ineffective with regard to security. In a distributed system, for example, these abstractions are trustworthy only inasmuch as every machine involved in the computation "plays by the rules" and correctly implements them. This assumption is clearly unrealistic when some of those machines may be controlled by an adversary. Hence, for instance, a cautious programmer who relies on remote procedure calls should not reason about them as if they were local calls, and must instead look at the details of their implementation, including the network stack, the underlying protocols, and the system configuration. Also, due to functional demands such as flexibility and ease of deployment, the boundary between applications and protocols gets blurred: modern application code often needs to dynamically select protocols or change their configuration. Hence, even the experts who design the protocols may not be able to guarantee their security in advance, without knowledge of the application. In fact, most interesting high-level security properties (privacy, integrity, or secrecy invariants) necessarily depend on the whole application. As a result, those properties must somehow be reflected to the programmer, rather than hidden behind "transparent" abstractions.
<h3>Secure Implementations for Typed Session Abstractions</h3>

Distributed applications can be structured as parties that exchange messages according to pre-arranged communication patterns. These sessions (or contracts, or protocols) simplify distributed programming: when coding a role for a given session, each party just has to follow the intended message flow, under the assumption that the other parties are also compliant. In an adversarial setting, remote parties may not be trusted to play their roles. Hence, defensive implementations also have to monitor one another, in order to detect any deviation from the assigned roles of a session. This task involves low-level coding below the session abstractions, thus giving up most of their benefits.

We explore language-based support for sessions. We extend the ML language with session types that express flows of messages between roles, such that well-typed programs always play their roles. We compile session type declarations to cryptographic communication protocols that can shield programs from any low-level attempt by coalitions of remote peers to deviate from their roles. (…)
<h3>Cryptographic Enforcement of Information-Flow Security</h3>

We consider cryptographic enforcement mechanisms for imperative programs with untrusted components. In source programs, security depends on an abstract information-flow policy for accessing the shared memory. In their implementations, the shared memory is unprotected, and security depends instead on encryption and signing. We build compilers from source programs and policies to cryptographic implementations. To establish their correctness, we develop type systems that enforce a correct usage of cryptographic primitives against active adversaries, under standard (probabilistic, polynomial-time) security assumptions. (…)
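A rough, unverified Python analogue of the pattern, using the third-party `cryptography` package's authenticated encryption in place of a compiled policy; the project's compilers produce formally verified implementations, which this sketch is not.

```python
from cryptography.fernet import Fernet  # authenticated symmetric encryption

class ProtectedCell:
    """A shared-memory cell kept on untrusted storage: reads and writes go
    through authenticated encryption instead of an access-control policy."""
    def __init__(self):
        self._key = Fernet(Fernet.generate_key())  # held by trusted code only
        self._blob = None                          # lives in untrusted memory

    def write(self, data: bytes) -> None:
        self._blob = self._key.encrypt(data)       # confidentiality + integrity

    def read(self) -> bytes:
        # Raises cryptography.fernet.InvalidToken if the blob was tampered with.
        return self._key.decrypt(self._blob)
```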
<h3>Secure Audit Logs</h3>

Many protocols rely on secure audit logs to store traces of protocol runs. These logs can be used a posteriori to verify the compliance of each principal with its role: principals who attempt non-compliant actions can be blamed using the logged evidence. Secure logging is widely used in practice, and much research effort has been devoted to techniques for implementing logs securely and efficiently. Still, which data should be logged, and why? (…)
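One classic building block for implementing such logs is the hash chain, sketched below: each entry commits to its predecessor, so truncating or rewriting the log is detectable a posteriori. (A real secure log would also sign entries to authenticate their origin.)

```python
import hashlib
import json

def append_entry(log, event):
    """Append an event to a hash-chained audit log."""
    prev = log[-1]["digest"] if log else "0" * 64
    payload = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    log.append({"event": event, "prev": prev,
                "digest": hashlib.sha256(payload.encode()).hexdigest()})

def verify_chain(log):
    """Recompute every digest; any rewrite or truncation in the middle fails."""
    prev = "0" * 64
    for e in log:
        payload = json.dumps({"event": e["event"], "prev": prev}, sort_keys=True)
        if e["prev"] != prev or e["digest"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev = e["digest"]
    return True

log = []
append_entry(log, "alice: session start")
append_entry(log, "alice: sent message 1")
assert verify_chain(log)
```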
<h3>Automated Verification of Protocol Implementations</h3>

There has been much progress in formal methods and tools for cryptography, enabling, in principle, the automated verification of complex security protocols. In practice, however, these methods and tools remain difficult to apply: the verification of a system that uses a given protocol involves more than the cryptographic verification of an abstract model. For these reasons, we are interested in integrating protocol verifiers into the arsenal of software testing and verification tools.

In recent work, Bhargavan et al. advocate the automatic extraction and verification of cryptographic models from executable code, verifying protocol implementations written in F#. Their approach relies on sharing as much code as possible between implementations and models. We explore a similar approach to extend the benefits of computational cryptographic verification to protocol implementations. To this end, we develop a tool for extracting cryptographic models from executable code written in F#, then rely on CryptoVerif to obtain computational verification results for executable code.

As an extended case study, we verify implementations of TLS, one of the most widely deployed communication protocols.
<h3>CertiCrypt: Formal Proofs for Computational Cryptography</h3>

Provable security, whose origins can be traced back to the pioneering work of Goldwasser and Micali, advocates a mathematical approach based on complexity theory, in which the goals and requirements of cryptosystems are specified precisely, and the security proof is carried out rigorously, making all underlying assumptions explicit. In a typical provable-security setting, one reasons about effective adversaries, modelled as arbitrary probabilistic polynomial-time Turing machines, and about their probability of thwarting a security objective, e.g., secrecy. In a similar fashion, security assumptions about cryptographic primitives bound the probability that polynomial algorithms solve difficult problems, e.g., computing discrete logarithms. The security proof is performed by reduction: one shows that the existence of an effective adversary with a certain advantage in breaking security implies the existence of an effective algorithm contradicting the security assumptions.

The game-playing technique is a general method to structure and unify cryptographic proofs. Its central idea is to view the interaction between an adversary and the cryptosystem as a game, and to study game transformations that preserve security, thus allowing an initial game (one that explicitly encodes a security property) to be transformed into a game where the advantage of the adversary is easy to bound. Code-based techniques (CBT) are an instance of the game-playing technique that has been used successfully to verify state-of-the-art cryptographic schemes; see, e.g., Bellare and Rogaway. Their distinguishing feature is a code-centric view of games, security hypotheses and computational assumptions, which are expressed as (probabilistic, imperative, polynomial-time) programs. Under this view, game transformations become program transformations and can be justified rigorously by semantic means. Although CBT proofs are easier to verify and are more easily amenable to machine-checking, they go far beyond established theories of program equivalence and exhibit a rich and broad set of reasoning principles drawing from program verification, algebraic reasoning, and probability and complexity theory. Thus, despite the beneficial effect of their underlying framework, CBT proofs remain inherently difficult to verify. In an inspiring paper, Halevi argues that formal verification techniques are mandatory to improve trust in game-based proofs, going as far as describing an interface to a tool for computer-assisted code-based proofs. As a first step towards the completion of Halevi's programme, we develop CertiCrypt, a framework for constructing machine-checked code-based proofs in the Coq proof assistant. (…)
[/panel]

[panel header="Large-Scale Machine Learning"]
<h3>1) Large-scale convex optimization</h3>

Large-scale convex optimization for big data: many classification problems that practitioners face have large dimensions, with a large observation size p, a large number of observations n, and a large number of classes k. Supervised learning algorithms based on convex optimization (such as support vector machines or logistic regression) cannot yet robustly achieve the ideal complexity O(nk + np), i.e., the size of the data. To achieve this complexity, we propose several open avenues of research, whose impact on solving large problems with higher accuracy is immediate.

Specifically, we intend to focus on i) large-scale learning with latent structure; ii) optimal computational-statistical trade-offs; iii) large-scale learning with a large number of classes: scaling in log(k); iv) large-scale active learning; v) large-scale learning with a large number of classes: dealing with imbalance; and vi) robust stochastic approximation algorithms.
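For reference, the sketch below shows why plain stochastic gradient descent already sits at this frontier for binary logistic regression: each epoch touches every entry of the (n, p) data matrix exactly once, i.e., O(np) work. It is illustrative only, not one of the project's algorithms.

```python
import numpy as np

def sgd_logistic(X, y, epochs=5, lr=0.1):
    """Plain SGD for binary logistic regression, y in {0, 1}.
    One epoch costs O(n p): a single pass over the (n, p) data matrix."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        for i in np.random.permutation(n):
            pred = 1.0 / (1.0 + np.exp(-X[i] @ w))  # sigmoid
            w -= lr * (pred - y[i]) * X[i]          # gradient of the log loss
    return w
```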
<h3>2) Large-scale combinatorial optimization</h3>

Many problems in computer vision or natural language processing involve large-scale combinatorial structure, for example graph cuts or submodular functions. We propose several open avenues of research at the interface between combinatorial and convex optimization, whose main impact is to enlarge the set of problems that can be solved through machine learning.

In this context, we intend to work on i) large-scale learning with a large number of classes: leveraging the latent geometry; ii) the relationships between convex and combinatorial optimization; and iii) structured prediction without pain.
<h3>3) Sequential decision making for structured data</h3>

Multi-armed bandit (MAB) problems constitute a generic benchmark model for learning to make sequential decisions under uncertainty. They capture the trade-off between exploring decisions, to learn the statistical properties of the corresponding rewards, and exploiting the decisions that have generated the highest rewards so far. In this project, we aim to investigate bandit problems with a large set of available decisions and structured rewards. The project addresses bandit problems with known and unknown structure, and targets applications in online advertising, recommendation and ranking systems.
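The classic UCB1 strategy makes the exploration-exploitation trade-off concrete: pull the arm maximizing its empirical mean plus an optimism bonus that shrinks as the arm is sampled. It is a standard baseline; the project studies structured, large-decision-set variants that go beyond it.

```python
import math
import random

def ucb1(arms, horizon):
    """arms: list of zero-argument reward samplers. Pull the arm maximizing
    mean + sqrt(2 ln t / n): rarely tried arms get an exploration bonus."""
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for t in range(1, horizon + 1):
        if t <= len(arms):
            a = t - 1                      # initialization: pull each arm once
        else:
            a = max(range(len(arms)),
                    key=lambda i: sums[i] / counts[i]
                                  + math.sqrt(2 * math.log(t) / counts[i]))
        sums[a] += arms[a]()               # observe the reward
        counts[a] += 1
    return counts

# Two Bernoulli arms: the 0.7 arm ends up pulled far more often.
print(ucb1([lambda: float(random.random() < 0.4),
            lambda: float(random.random() < 0.7)], horizon=5000))
```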
[/panel]

[/accordion]