We are rethinking how systems can be designed to empower artificial intelligence (AI) and deep learning researchers, developers, and users.
Deep Learning Model Training
One example is the Gandiva cluster scheduler project. Deep learning training jobs are compute-hungry, so they are typically scheduled on clusters of high-performance, expensive GPGPUs (general-purpose graphics processing units). However, the schedulers that manage these deep learning jobs today are borrowed from the big-data world and treat the jobs as black boxes.
Unlike big-data jobs, deep learning training jobs have several unique characteristics, such as a highly repetitive, predictable mini-batch structure. Gandiva is an introspective cluster scheduler that “understands and looks into” deep learning jobs. This enables Gandiva to deliver the early feedback that AI developers need while simultaneously improving the efficiency of GPU cluster usage. For more details, refer to our OSDI 2018 paper.
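As a rough illustration of what an introspective scheduler can do with this structure, the sketch below time-slices a single GPU among training jobs, suspending each job only at a mini-batch boundary, where GPU memory usage is lowest and suspend/resume is cheap. This is a minimal conceptual sketch, not Gandiva's implementation; the `Job` fields, the quantum length, and the round-robin policy are all illustrative assumptions.

```python
# Conceptual sketch (NOT the Gandiva implementation): exploit the
# repetitive mini-batch structure of deep learning training jobs by
# suspending a job only at a mini-batch boundary, where GPU memory
# usage is at its minimum and suspend/resume is cheap.
from dataclasses import dataclass
from collections import deque

@dataclass
class Job:
    name: str
    minibatch_ms: float          # profiled time per mini-batch (assumed known)
    remaining_minibatches: int

def time_slice(jobs, quantum_ms=1000.0):
    """Round-robin one GPU among jobs: each turn runs roughly one
    quantum, rounded to a whole number of mini-batches so suspension
    always happens at a mini-batch boundary."""
    queue = deque(jobs)
    timeline = []                 # (job name, mini-batches run) per turn
    while queue:
        job = queue.popleft()
        # Run at least one mini-batch, at most about a quantum's worth.
        n = max(1, int(quantum_ms // job.minibatch_ms))
        n = min(n, job.remaining_minibatches)
        job.remaining_minibatches -= n
        timeline.append((job.name, n))
        if job.remaining_minibatches > 0:
            queue.append(job)     # suspend at the boundary and requeue
    return timeline

print(time_slice([Job("resnet", 200.0, 7), Job("lstm", 450.0, 3)]))
```

Because the scheduler knows each job's mini-batch duration, it can pick suspension points deliberately rather than preempting a black-box process at an arbitrary moment.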
Deep Learning Model Inference
Another example is the Privado “Private AI” project. Our goal is secure and practical deep learning model inference in the cloud. Consider a company that has trained a new state-of-the-art deep learning model that can accurately detect diseases from x-rays. The company wants to host the model in the cloud for its customers. However, the company does not want to reveal its model parameters, which are its intellectual property. Similarly, the company’s customers may not want to reveal their x-rays, which are sensitive data. How, then, can the cloud host such an inference service while simultaneously guaranteeing privacy to both the company and its customers?
We have built the Privado system, which takes a deep learning model and, with zero developer effort, converts it into a cloud-based inference service. Privado guarantees privacy to both the model owner and its customers. It ensures efficiency by leveraging trusted hardware such as Intel’s Software Guard eXtensions (SGX). Simultaneously, Privado ensures privacy by designing the inference software to be resistant to the access-based side-channel attacks to which SGX can be susceptible. See our arXiv paper for more details.
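To give a flavor of what resisting access-based side channels involves, the sketch below shows the classic data-oblivious lookup pattern: a direct `table[idx]` leaks the secret index through the sequence of memory addresses accessed, so instead every entry is read and the result is selected arithmetically. This is purely illustrative of the general technique and is not Privado's code; Python is used only for clarity, since a real enclave implementation would need genuinely constant-time primitives in a lower-level language.

```python
# Conceptual sketch (NOT Privado's code): a data-oblivious table lookup.
# A direct table[secret_idx] reveals secret_idx through the memory access
# pattern. Here we touch every entry and combine them with a branch-free
# arithmetic select, so the addresses accessed are independent of the secret.
def oblivious_lookup(table, secret_idx):
    result = 0
    for i, value in enumerate(table):
        # mask is 1 exactly when i == secret_idx; every entry is still read.
        # (A hardened implementation would also compute this comparison
        # with constant-time operations.)
        mask = int(i == secret_idx)
        result += mask * value
    return result

embedding = [10, 20, 30, 40]
print(oblivious_lookup(embedding, 2))   # prints 30
```

The same idea, applied systematically to the data-dependent accesses in an inference pipeline, is what makes an enclave-hosted model resistant to an observer who can watch its access patterns.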