Updated on 2024-07-11 GMT+08:00

Model Deployment

ModelArts manages both models and services, so that images built on mainstream AI frameworks and models from multiple vendors can be managed in a unified manner.

Deploying an AI model and rolling it out at scale are generally complex tasks.

For example, in a smart transportation project, the trained model needs to be deployed to the cloud, to edge nodes, and to devices. Deploying the model on devices takes time and effort, such as adapting it to cameras of different specifications from different vendors. ModelArts supports one-click deployment of a trained model to various devices for different application scenarios. In addition, it provides a secure and reliable one-stop deployment mode for individual developers, enterprises, and device manufacturers.

Figure 1 Process of deploying a model
  • Models can be deployed as real-time inference services or batch inference tasks on the cloud.
  • The real-time inference service features high concurrency, low latency, and elastic scaling, and supports multi-model gray release and A/B testing.
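As an illustration, a deployed real-time inference service is typically invoked over HTTPS with a token-authenticated JSON request. The sketch below is a minimal, hypothetical example: the endpoint URL, token, and payload schema are placeholders rather than actual ModelArts values; the real endpoint and authentication details are shown on the service details page after deployment.

```python
import json

# Hypothetical endpoint; the real URL comes from the console after the
# real-time service is deployed.
ENDPOINT = "https://example.com/v1/infers/demo-service"

def build_inference_request(token: str, instances: list) -> dict:
    """Assemble the URL, headers, and JSON body for one inference call."""
    return {
        "url": ENDPOINT,
        "headers": {
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # placeholder auth token for the caller
        },
        "body": json.dumps({"instances": instances}),
    }

# Sending the request requires a live service and the `requests` package:
# import requests
# req = build_inference_request(token, [[5.1, 3.5, 1.4, 0.2]])
# resp = requests.post(req["url"], headers=req["headers"], data=req["body"])
# print(resp.json())
```

Batch inference tasks follow a similar pattern, except that input data is read from and results are written to storage rather than exchanged per request.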