{"id":66786,"date":"2023-03-01T15:31:40","date_gmt":"2023-03-01T14:31:40","guid":{"rendered":"https:\/\/www.microsoft.com\/en-gb\/industry\/blog\/?p=66786"},"modified":"2023-03-01T15:31:44","modified_gmt":"2023-03-01T14:31:44","slug":"simplifying-ml-deployment-with-azures-managed-endpoints","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-gb\/industry\/blog\/technetuk\/2023\/03\/01\/simplifying-ml-deployment-with-azures-managed-endpoints\/","title":{"rendered":"Simplifying ML Deployment with Azure’s Managed Endpoints"},"content":{"rendered":"
<\/figure>\n\n\n\n

Deploying machine learning models in the cloud is becoming increasingly important as businesses utilise artificial intelligence (AI) and machine learning (ML) to gain a competitive edge. However, with so many options available, it can be difficult to know where to start. <\/p>\n\n\n\n

To generate predictions from a machine learning model, there are two main approaches: real-time (online) inference and batch (offline) inference. Real-time inference involves generating predictions in real-time, while batch inference generates predictions based on a batch of observations. <\/p>\n\n\n\n

[Diagrams illustrating real-time (online) inference and batch (offline) inference] <\/p>\n\n\n\n

The method of deploying machine learning models will vary depending on whether the use case is real-time or batch inference. Options for deployment in the cloud include managed online endpoints, Kubernetes online endpoints, Azure Container Instance, and Azure Functions. The right choice will depend on the specific needs of your project. <\/p>\n\n\n\n

In this article, we will take a closer look at managed online endpoints and explore their features and benefits. Whether you are new to machine learning or looking to deploy your models, this article will provide an introduction and step-by-step guide to help you get started with managed online endpoints using Azure Machine Learning Studio. We will develop a machine learning model using Azure AutoML and demonstrate how to deploy the trained model to an online endpoint. <\/p>\n\n\n\n

Managed Endpoints \u2013 What are they? <\/h3>\n\n\n\n

Managed endpoints are a fully managed service provided by Microsoft Azure that allows for quick and easy deployment of machine learning models in the cloud. <\/p>\n\n\n\n

With managed endpoints, you do not need to worry about the technical details of deploying the models. Typically, when deploying models for real-time or online inference, you would need to prepare and specify a scoring script, an environment and the infrastructure. With managed endpoints, Azure takes care of all of this for you, allowing you to deploy models in a turnkey manner without worrying about infrastructure and environment setup. <\/p>\n\n\n\n

Understanding the Key Components of Managed Endpoints <\/h3>\n\n\n\n

Endpoints<\/strong> <\/p>\n\n\n\n

An endpoint in Azure Machine Learning is a URL that allows you to access a machine learning model and make predictions. There are two types of endpoints: online endpoints and batch endpoints. Online endpoints provide real-time predictions, meaning you can receive a prediction response immediately after sending a request to the endpoint. Batch endpoints, on the other hand, allow you to send a batch of requests for predictions, which can be processed in parallel and returned as a batch of predictions. <\/p>\n\n\n\n

Online<\/strong> Endpoints<\/strong> <\/p>\n\n\n\n

In Azure, online endpoints come in two forms: managed online endpoints and Kubernetes online endpoints. Managed online endpoints are designed to make deployment simple, with Azure handling all the technical details. Kubernetes online endpoints provide more control and customisation, as they allow you to deploy models and serve online endpoints on your own fully configured and managed Kubernetes cluster, with support for CPUs or GPUs. The differences between the two options are detailed\u202fhere<\/a>. <\/p>\n\n\n\n
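As a sketch of how lightweight the managed option is, the Azure ML CLI (v2) lets you describe a managed online endpoint in a few lines of YAML; the endpoint name below is a placeholder:

```yaml
# endpoint.yml - a minimal managed online endpoint definition (CLI v2)
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint   # placeholder name; must be unique within the Azure region
auth_mode: key      # key-based authentication; aml_token is an alternative
```

Creating the endpoint is then a single command, for example `az ml online-endpoint create -f endpoint.yml`.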

Online endpoint deployment is ideal for applications that require real-time predictions, such as fraud detection, predictive maintenance, personalisation and predictive pricing. <\/p>\n\n\n\n

Deployments<\/strong> <\/p>\n\n\n\n

Deployment in Azure Machine Learning involves taking a trained and tested machine learning model and making it available for real-world use through a production environment. Azure handles the necessary infrastructure setup and configurations, providing an endpoint for accessing the model. <\/p>\n\n\n\n
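For illustration, a deployment under an endpoint can also be described in YAML with the CLI v2. The endpoint, model and deployment names below are placeholders for this sketch:

```yaml
# deployment.yml - a sketch of a managed online deployment (CLI v2)
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: model-1                   # placeholder deployment name
endpoint_name: my-endpoint      # placeholder endpoint created earlier
model: azureml:my-model:1       # a registered model, e.g. from an AutoML run
instance_type: Standard_DS3_v2  # VM size for the hosting compute
instance_count: 1               # number of instances behind this deployment
```

For MLflow-format models (which AutoML produces), the scoring script and environment can typically be omitted, as Azure infers them from the model itself.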

Each endpoint in Azure Machine Learning can host multiple deployments, allowing you to deploy different versions of a machine learning model or multiple models for different purposes. One way to visualise multiple deployments on a single endpoint is to think of an endpoint as a website, and each deployment as a different web page on the site. The endpoint provides a single URL that can be used to access any of the web pages hosted on the site.  <\/p>\n\n\n\n

An online endpoint can load balance traffic across its deployments, directing a chosen percentage of incoming prediction requests to each one. This allows you to test and compare the performance of different versions of a machine learning model, or to gradually roll out a new model to production. For example, the diagram shows an online endpoint with two models: Model 1 is configured to receive 60% of incoming traffic, while Model 2 receives the remaining 40%. <\/p>\n\n\n

[Diagram: an online endpoint routing 60% of traffic to Model 1 and 40% to Model 2]<\/figure>\n\n\n\n
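The 60/40 split described above can be expressed declaratively on the endpoint itself; the endpoint and deployment names here are placeholders:

```yaml
# endpoint.yml - traffic split across two deployments (CLI v2)
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint
traffic:
  model-1: 60   # majority of requests go to the current model
  model-2: 40   # remaining traffic goes to the candidate model
```

Equivalently, `az ml online-endpoint update --name my-endpoint --traffic "model-1=60 model-2=40"` adjusts the split without redeploying either model.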

Importance of Azure managed online endpoints in cloud computing <\/h3>\n\n\n\n

Azure managed online endpoints provide several benefits. These are: <\/p>\n\n\n\n