{"id":964476,"date":"2023-08-29T05:48:00","date_gmt":"2023-08-29T12:48:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=964476"},"modified":"2023-08-29T05:48:00","modified_gmt":"2023-08-29T12:48:00","slug":"litepred-transferable-and-scalable-latency-prediction-for-hardware-aware-neural-architecture-search","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/litepred-transferable-and-scalable-latency-prediction-for-hardware-aware-neural-architecture-search\/","title":{"rendered":"LitePred: Transferable and Scalable Latency Prediction for Hardware-Aware Neural Architecture Search"},"content":{"rendered":"
Hardware-Aware Neural Architecture Search (NAS) has demonstrated success in automating the design of affordable deep neural networks (DNNs) for edge platforms by incorporating inference latency into the search process. However, accurately and efficiently predicting DNN inference latency on diverse edge platforms remains a significant challenge. Current approaches require several days to construct a new latency predictor for each platform, which is prohibitively time-consuming and impractical.
In this paper, we propose LitePred, a lightweight approach for accurately predicting DNN inference latency on new platforms with minimal adaptation data by transferring existing predictors. LitePred builds on two key techniques: (i) a Variational Autoencoder (VAE) data sampler to sample high-quality training and adaptation data that conforms to the model distributions in NAS search spaces, overcoming the out-of-distribution challenge; and (ii) a latency distribution-based similarity detection method to identify the most similar pre-existing latency predictors for the new target platform, reducing the adaptation data required while achieving high prediction accuracy (a sketch of this step follows below). Extensive experiments on …
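To make technique (ii) concrete, the following is a minimal Python sketch of latency distribution-based similarity detection. It assumes a small probe set of models whose latencies have been measured on the new target platform and previously recorded on each existing predictor's platform; the function names, the normalization step, and the use of the 1D Wasserstein distance as the distribution distance are illustrative assumptions, not the paper's exact protocol.

```python
# Illustrative sketch of latency distribution-based similarity detection.
# The probe-set protocol and the Wasserstein distance are assumptions made
# for this example; the paper may use a different distance or procedure.
import numpy as np
from scipy.stats import wasserstein_distance


def most_similar_predictor(new_latencies, predictor_pool):
    """Pick the pre-existing predictor whose stored latency distribution
    is closest to the latencies measured on the new platform.

    new_latencies  : 1D array of latencies of a small probe set of models,
                     measured on the new target platform.
    predictor_pool : dict mapping platform name -> (predictor, 1D array of
                     latencies of the same probe set on that platform).
    """
    def normalize(x):
        # Compare distribution *shape* rather than the absolute speed of
        # each platform, so a uniformly faster device can still match.
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / (x.std() + 1e-8)

    target = normalize(new_latencies)
    best_name, best_dist = None, float("inf")
    for name, (_predictor, latencies) in predictor_pool.items():
        dist = wasserstein_distance(target, normalize(latencies))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, predictor_pool[best_name][0]
```

In LitePred's workflow, the predictor selected this way would then be adapted with a small amount of data sampled by the VAE sampler of technique (i), which is what keeps the per-platform adaptation cost far below building a predictor from scratch.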