Help Center > ModelArts > Tool Guide > Model Deployment

Model Deployment

In PyCharm ToolKit, you can deploy a trained model as a real-time service in a few clicks.

Currently, models can only be deployed as real-time services. Batch services and edge services are not supported.

Background

  • Model training has been completed, and the training job status is Successful.
  • Models from existing training jobs on ModelArts on the public cloud can also be deployed as services in PyCharm ToolKit in a few clicks.
  • Before deploying a trained model, you need to develop inference scripts and configuration files. For details about the development specifications of inference scripts and configuration files, see Model Package Specifications, Specifications for Compiling the Model Configuration File, and Specifications for Compiling Model Inference Code.
  • Inference code and configuration files must be stored in the model output path. To obtain the model output path of a training job running in PyCharm, double-click the job version and check the value of Training Output Path in the ModelArts Training Job area. The model output path is an OBS path.
    Figure 1 Model output path
  • The newly deployed service is displayed in the Real-Time Services list on the Service Deployment page of the ModelArts console. You can also manage real-time services on the ModelArts management console, for example, using them for prediction or starting and stopping them.
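Per the Model Package Specifications referenced above, the inference code and the configuration file are placed in the model directory under the training output path. The layout below is an illustrative sketch; the bucket and file names shown (other than config.json and customize_service.py, which follow the specifications) are hypothetical examples:

```
<training output path>/          # OBS path, e.g. obs://my-bucket/output/V0001/
└── model/                       # model directory read during deployment
    ├── saved_model.pb           # trained model file(s); name depends on the framework
    ├── config.json              # model configuration file
    └── customize_service.py     # model inference code
```

If any of these files are missing from the model directory, the deployment task cannot load the model as a real-time service.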

Deploying a Model

After completing model training, compiling inference code and configuration files, and uploading them to the OBS path (the model directory in the training output path), you can perform the following steps to deploy a model as a real-time service:

  1. In the ModelArts Explorer area, right-click the training job version, and choose Deploy to Service from the shortcut menu.
    Figure 2 Deploy to Service
  2. In the displayed dialog box, enter information required for deploying a model as a service. For details about the parameters, see Table 1.
    Table 1 Parameters for model deployment

    Parameter

    Description

    Service Name

    Name of a real-time service. Set this parameter as prompted.

    Auto Stop

    After this parameter is enabled and the auto stop time is set, a service automatically stops at the specified time.

    After this function is enabled, the service automatically stops 1 hour later by default. You can set auto stop time based on site requirements, for example, 5 hours later.

    Model Source

    If you start a deployment task using a specific training job version, Training Job and Model Path will be automatically filled in. You can also modify them based on site requirements.

    • Training Job: name and version of a training job
    • Model Path: OBS path for storing a trained model

    Specifications

    Select the resource specifications used for deploying the real-time service. Currently, the following specifications are supported:

    • CPU: 2 U 8 GiB
    • CPU: 2 U 8 GiB, GPU: 1 x P4

    Compute Nodes

    Set the number of compute nodes. If you set Compute Nodes to 1, the standalone computing mode is used. If you set Compute Nodes to a value greater than 1, the distributed computing mode is used. Select a computing mode based on the actual requirements.

    Environment Variables

    Set environment variables and inject them into the container instance. Use semicolons (;) to separate multiple environment variables, each in the form VARIABLE=value.

    Figure 3 Setting parameters for model deployment
  3. After setting the parameters, click OK to start model deployment. After the deployment task is started, the deployment status is displayed in Event Log in the lower left corner.

    Deploying a model as a real-time service takes some time. After the deployment is complete, you can click a link to quickly switch to the real-time service page on the ModelArts management console. Note that the HUAWEI CLOUD account and password are required for your first login.

    For deployed services, prediction can be performed only on the ModelArts management console on the public cloud.

    Figure 4 Deployment status
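Table 1 states that multiple environment variables are separated by semicolons. Assuming each entry takes the common VARIABLE=value form (the exact format is defined by ModelArts, not by this sketch), the way such a string maps to individual variables can be illustrated in Python:

```python
def parse_env_vars(raw: str) -> dict:
    """Split a semicolon-separated environment-variable string,
    e.g. "LOG_LEVEL=info;WORKERS=2", into a name-to-value dict."""
    result = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing or doubled semicolon
        # partition() splits on the FIRST '=', so values may contain '='
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in environment variable: {pair!r}")
        result[name.strip()] = value
    return result

print(parse_env_vars("LOG_LEVEL=info;WORKERS=2"))
```

For example, entering LOG_LEVEL=info;WORKERS=2 in the Environment Variables field would inject two variables, LOG_LEVEL and WORKERS, into the container instance.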