Updated on 2024-10-29 GMT+08:00

Creating a Service Deployment Phase

Description

This phase integrates ModelArts service management capabilities so that a workflow can deploy and update services. The application scenarios are as follows:

  • Deploying a model as a web service
  • Updating an existing service (gray update supported)

Parameter Overview

You can use ServiceStep to create a service deployment phase. The following tables describe the parameters of ServiceStep.

Table 1 ServiceStep

| Parameter | Description | Mandatory | Data Type |
| --- | --- | --- | --- |
| name | Name of the service deployment phase. The name contains a maximum of 64 characters, including only letters, digits, underscores (_), and hyphens (-). It must start with a letter and must be unique in a workflow. | Yes | str |
| inputs | Inputs of the service deployment phase. | No | ServiceInput or ServiceInput list |
| outputs | Outputs of the service deployment phase. | Yes | ServiceOutput or ServiceOutput list |
| title | Title for frontend display. | No | str |
| description | Description of the service deployment phase. | No | str |
| policy | Phase execution policy. | No | StepPolicy |
| depend_steps | Dependent phases. | No | Step or Step list |
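The naming rule for name above can be expressed as a short regular expression. The sketch below is purely illustrative of the documented constraint and is not part of the ModelArts SDK:

```python
import re

# Documented rule: a maximum of 64 characters, only letters, digits,
# underscores (_), and hyphens (-), and the name must start with a letter.
STEP_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,63}")

def is_valid_step_name(name: str) -> bool:
    """Return True if the name satisfies the documented phase-name rule."""
    return STEP_NAME_PATTERN.fullmatch(name) is not None
```

Uniqueness within the workflow still has to be checked separately, since it depends on the other phases defined in the same workflow.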

Table 2 ServiceInput

| Parameter | Description | Mandatory | Data Type |
| --- | --- | --- | --- |
| name | Input name of the service deployment phase. The name can contain a maximum of 64 characters, including only letters, digits, underscores (_), and hyphens (-), and must start with a letter. The input name must be unique within a step. | Yes | str |
| data | Input data object of the service deployment phase. | Yes | Model list or service object. Currently, only ServiceInputPlaceholder, ServiceData, and ServiceUpdatePlaceholder are supported. |

Table 3 ServiceOutput

| Parameter | Description | Mandatory | Data Type |
| --- | --- | --- | --- |
| name | Output name of the service deployment phase. The name can contain a maximum of 64 characters, including only letters, digits, underscores (_), and hyphens (-), and must start with a letter. The output name must be unique within a step. | Yes | str |
| service_config | Configurations for service deployment. | Yes | ServiceConfig |

Table 4 ServiceConfig

| Parameter | Description | Mandatory | Data Type |
| --- | --- | --- | --- |
| infer_type | Inference mode. Options: real-time (default): the model is deployed as a web service for real-time inference. batch: a batch service performs inference on batch data and stops automatically after processing completes. edge: the model is deployed as a web service on an edge node through IEF; create the edge node on IEF beforehand. | Yes | str |
| service_name | Service name, 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. If this parameter is not specified, a service name is generated automatically. | No | str, Placeholder |
| description | Service description, which contains a maximum of 100 characters. Blank by default. | No | str |
| vpc_id | ID of the VPC where the real-time service instance is deployed. Blank by default, in which case ModelArts allocates a dedicated VPC to each user and users are isolated from one another. To access other service components in the VPC of a service instance, set this parameter to the ID of that VPC. A configured VPC cannot be changed. If both vpc_id and cluster_id are configured, only the dedicated resource pool takes effect. | No | str |
| subnet_network_id | Subnet ID. Blank by default. Mandatory when vpc_id is configured. Enter the network ID shown in the subnet details on the VPC management console. A subnet provides dedicated network resources that are isolated from other networks. | No | str |
| security_group_id | Security group ID. Blank by default. Mandatory when vpc_id is configured. A security group is a virtual firewall that provides secure network access control policies for service instances. The security group must contain at least one inbound rule permitting TCP requests from source address 0.0.0.0/0 on port 8080. | No | str |
| cluster_id | ID of a dedicated resource pool. Blank by default, meaning that no dedicated resource pool is used. When deploying services in a dedicated resource pool, ensure that the cluster is running properly. Once this parameter is configured, the cluster's network configuration is used and vpc_id does not take effect. If both this parameter and cluster_id in the real-time config are configured, cluster_id in the real-time config takes precedence. | No | str |
| additional_properties | Additional configurations. | No | dict |
| apps | Application names used for app authentication during service deployment. Multiple application names can be entered. | No | str, Placeholder, list |
| envs | Environment variables. | No | dict |

Example:

example = wf.steps.ServiceConfig()
# This object is used in the output of the service deployment phase.

If there is no special requirement, use the default values.
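For orientation, the defaults described in Table 4 can be summarized as a flat mapping. The sketch below is plain Python for illustration only; the field names mirror the table, not the SDK's internal or wire format:

```python
# Illustrative defaults from Table 4 (not the SDK's internal representation).
default_service_config = {
    "infer_type": "real-time",   # real-time | batch | edge
    "service_name": None,        # generated automatically when omitted
    "description": "",           # blank by default
    "vpc_id": None,              # blank: ModelArts allocates a dedicated VPC
    "subnet_network_id": None,   # mandatory once vpc_id is set
    "security_group_id": None,   # mandatory once vpc_id is set
    "cluster_id": None,          # blank: no dedicated resource pool
    "additional_properties": {},
    "apps": [],
    "envs": {},
}
```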

Examples

There are three scenarios:

  • Deploying a real-time service
  • Modifying a real-time service
  • Getting the inference address from the service deployment phase

Deploying a Real-Time Service

import modelarts.workflow as wf
# Use ServiceStep to define a service deployment phase and specify a model for service deployment.

# Define model name parameters.
model_name = wf.Placeholder(name="placeholder_name", placeholder_type=wf.PlaceholderType.STR)

service_step = wf.steps.ServiceStep(
    name="service_step",  # Phase name: up to 64 characters (letters, digits, underscores, and hyphens only), starting with a letter and unique within the workflow
    title="Deploying a Service",  # Title
    inputs=wf.steps.ServiceInput(
        name="si_service_ph",
        data=wf.data.ServiceInputPlaceholder(
            name="si_placeholder1",
            # Only the model name specified here can be used in the running state.
            # Use the same model name as model_name in the model registration phase.
            model_name=model_name
        )
    ),  # ServiceStep inputs
    outputs=wf.steps.ServiceOutput(name="service_output")  # ServiceStep outputs
)

workflow = wf.Workflow(
    name="service-step-demo",
    desc="this is a demo workflow",
    steps=[service_step]
)

Modifying a Real-Time Service

Scenario: When you use a new model version to update an existing service, ensure that the new model version uses the same model name as the model deployed in that service.

import modelarts.workflow as wf
# Use ServiceStep to define a service deployment phase and specify a model for service update.

# Define model name parameters.
model_name = wf.Placeholder(name="placeholder_name", placeholder_type=wf.PlaceholderType.STR)

# Define a service object.
service = wf.data.ServiceUpdatePlaceholder(name="placeholder_name")

service_step = wf.steps.ServiceStep(
    name="service_step",  # Phase name: up to 64 characters (letters, digits, underscores, and hyphens only), starting with a letter and unique within the workflow
    title="Service Update",  # Title
    inputs=[
        wf.steps.ServiceInput(
            name="si2",
            data=wf.data.ServiceInputPlaceholder(
                name="si_placeholder2",
                # Only the model name specified here can be used in the running state.
                model_name=model_name
            )
        ),
        # This input is configured when the workflow is running. You can also use
        # wf.data.ServiceData(service_id="fake_service") for the data field.
        wf.steps.ServiceInput(name="si_service_data", data=service)
    ],  # ServiceStep inputs
    outputs=wf.steps.ServiceOutput(name="service_output")  # ServiceStep outputs
)

workflow = wf.Workflow(
    name="service-step-demo",
    desc="this is a demo workflow",
    steps=[service_step]
)

Getting the Inference Address from the Service Deployment Phase

The service deployment phase supports the output of the inference address. You can use the get_output_variable("access_address") method to obtain the output and use it in subsequent phases.

  • For services deployed in the public resource pool, you can use access_address to obtain the inference address registered on the public network from the output.
  • For services deployed in a dedicated resource pool, you can get the internal inference address from the output using cluster_inner_access_address, in addition to the public inference address. The internal address can only be accessed by other inference services.
    import modelarts.workflow as wf
    
    # Define model name parameters.
    sub_model_name = wf.Placeholder(name="si_placeholder1", placeholder_type=wf.PlaceholderType.STR)
    
    sub_service_step = wf.steps.ServiceStep(
        name="sub_service_step",  # Phase name: up to 64 characters (letters, digits, underscores, and hyphens only), starting with a letter and unique within the workflow
        title="Subservice",  # Title
        inputs=wf.steps.ServiceInput(
            name="si_service_ph",
            data=wf.data.ServiceInputPlaceholder(name="si_placeholder1", model_name=sub_model_name)
        ),  # ServiceStep inputs
        outputs=wf.steps.ServiceOutput(name="service_output")  # ServiceStep outputs
    )
    
    
    main_model_name = wf.Placeholder(name="si_placeholder2", placeholder_type=wf.PlaceholderType.STR)
    
    # Obtain the inference address output by the subservice and pass it to the main service through envs.
    main_service_config = wf.steps.ServiceConfig(
        infer_type="real-time",
        envs={"infer_address": sub_service_step.outputs["service_output"].get_output_variable("access_address")}
    )
    
    main_service_step = wf.steps.ServiceStep(
        name="main_service_step",  # Phase name: up to 64 characters (letters, digits, underscores, and hyphens only), starting with a letter and unique within the workflow
        title="Main service",  # Title
        inputs=wf.steps.ServiceInput(
            name="si_service_ph",
            data=wf.data.ServiceInputPlaceholder(name="si_placeholder2", model_name=main_model_name)
        ),  # ServiceStep inputs
        outputs=wf.steps.ServiceOutput(name="service_output", service_config=main_service_config),  # ServiceStep outputs
        depend_steps=sub_service_step
    )
    
    workflow = wf.Workflow(
        name="service-step-demo",
        desc="this is a demo workflow",
        steps=[sub_service_step, main_service_step]
    )
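At run time, each key in envs becomes an environment variable visible to the deployed service's inference code. A minimal sketch of how the main service could read the subservice address; the variable name infer_address matches the envs key in the example above, and the surrounding inference-code structure is omitted:

```python
import os

def get_subservice_address(default: str = "") -> str:
    # "infer_address" is the key set in envs of main_service_config above.
    return os.environ.get("infer_address", default)
```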

Configuring Information for Deploying a Synchronous Service

After the service deployment phase starts in the development state (usually a notebook instance), configure the deployment information on the console as follows.

  1. On the ModelArts console, choose Development Workspace > Workflow from the navigation pane.
  2. Configure the information after the service deployment phase is started. After the configuration, click Next.

Configuring Information for Deploying an Asynchronous Service

  1. On the ModelArts console, choose Workflow from the navigation pane.
  2. Configure the information after the service deployment phase is started. Select an asynchronous inference AI application and a version, and configure service startup parameters. After the configuration, click Next.

After you select the required AI application and version, the system automatically matches the service startup parameters.