Updated on 2024-10-29 GMT+08:00

Creating a Service Deployment Phase

Description

This phase integrates ModelArts service management capabilities so that a workflow can deploy and update services. The application scenarios are as follows:

  • Deploying a model as a web service
  • Updating an existing service (gray update supported)

Parameter Overview

You can use ServiceStep to create a service deployment phase. The following tables describe the parameters of ServiceStep.

Table 1 ServiceStep

| Parameter | Description | Mandatory | Data Type |
| --- | --- | --- | --- |
| name | Name of the service deployment phase. The name contains a maximum of 64 characters, including only letters, digits, underscores (_), and hyphens (-). It must start with a letter and must be unique in a workflow. | Yes | str |
| inputs | Inputs of the service deployment phase. | No | ServiceInput or ServiceInput list |
| outputs | Outputs of the service deployment phase. | Yes | ServiceOutput or ServiceOutput list |
| title | Title for frontend display. | No | str |
| description | Description of the service deployment phase. | No | str |
| policy | Phase execution policy. | No | StepPolicy |
| depend_steps | Dependent phases. | No | Step or Step list |
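The naming rule for name above can be expressed as a short regular expression. The sketch below is purely illustrative of the documented constraint and is not part of the ModelArts SDK:

```python
import re

# Documented rule: a maximum of 64 characters, only letters, digits,
# underscores (_), and hyphens (-), and the name must start with a letter.
STEP_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,63}")

def is_valid_step_name(name: str) -> bool:
    """Return True if the name satisfies the documented phase-name rule."""
    return STEP_NAME_PATTERN.fullmatch(name) is not None
```

Uniqueness within the workflow still has to be checked separately, since it depends on the other phases defined in the same workflow.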

Table 2 ServiceInput

| Parameter | Description | Mandatory | Data Type |
| --- | --- | --- | --- |
| name | Input name of the service deployment phase. The name can contain a maximum of 64 characters, including only letters, digits, underscores (_), and hyphens (-), and must start with a letter. The input name must be unique within a step. | Yes | str |
| data | Input data object of the service deployment phase. | Yes | Model list or service object. Currently, only ServiceInputPlaceholder, ServiceData, and ServiceUpdatePlaceholder are supported. |

Table 3 ServiceOutput

| Parameter | Description | Mandatory | Data Type |
| --- | --- | --- | --- |
| name | Output name of the service deployment phase. The name can contain a maximum of 64 characters, including only letters, digits, underscores (_), and hyphens (-), and must start with a letter. The output name must be unique within a step. | Yes | str |
| service_config | Configurations for service deployment. | Yes | ServiceConfig |

Table 4 ServiceConfig

| Parameter | Description | Mandatory | Data Type |
| --- | --- | --- | --- |
| infer_type | Inference mode. Options: real-time (default): the model is deployed as a web service for real-time inference. batch: a batch service performs inference on batch data and stops automatically after processing completes. edge: the model is deployed as a web service on an edge node through IEF; create the edge node on IEF beforehand. | Yes | str |
| service_name | Service name, 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. If this parameter is not specified, a service name is generated automatically. | No | str, Placeholder |
| description | Service description, which contains a maximum of 100 characters. Blank by default. | No | str |
| vpc_id | ID of the VPC where the real-time service instance is deployed. Blank by default, in which case ModelArts allocates a dedicated VPC to each user and users are isolated from one another. To access other service components in the VPC of a service instance, set this parameter to the ID of that VPC. A configured VPC cannot be changed. If both vpc_id and cluster_id are configured, only the dedicated resource pool takes effect. | No | str |
| subnet_network_id | Subnet ID. Blank by default. Mandatory when vpc_id is configured. Enter the network ID shown in the subnet details on the VPC management console. A subnet provides dedicated network resources that are isolated from other networks. | No | str |
| security_group_id | Security group ID. Blank by default. Mandatory when vpc_id is configured. A security group is a virtual firewall that provides secure network access control policies for service instances. The security group must contain at least one inbound rule permitting TCP requests from source address 0.0.0.0/0 on port 8080. | No | str |
| cluster_id | ID of a dedicated resource pool. Blank by default, meaning that no dedicated resource pool is used. When deploying services in a dedicated resource pool, ensure that the cluster is running properly. Once this parameter is configured, the cluster's network configuration is used and vpc_id does not take effect. If both this parameter and cluster_id in the real-time config are configured, cluster_id in the real-time config takes precedence. | No | str |
| additional_properties | Additional configurations. | No | dict |
| apps | Application names used for app authentication during service deployment. Multiple application names can be entered. | No | str, Placeholder, list |
| envs | Environment variables. | No | dict |

Example:

example = wf.steps.ServiceConfig()
# This object is used in the output of the service deployment phase.

If there is no special requirement, use the default values.
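For orientation, the defaults described in Table 4 can be summarized as a flat mapping. The sketch below is plain Python for illustration only; the field names mirror the table, not the SDK's internal or wire format:

```python
# Illustrative defaults from Table 4 (not the SDK's internal representation).
default_service_config = {
    "infer_type": "real-time",   # real-time | batch | edge
    "service_name": None,        # generated automatically when omitted
    "description": "",           # blank by default
    "vpc_id": None,              # blank: ModelArts allocates a dedicated VPC
    "subnet_network_id": None,   # mandatory once vpc_id is set
    "security_group_id": None,   # mandatory once vpc_id is set
    "cluster_id": None,          # blank: no dedicated resource pool
    "additional_properties": {},
    "apps": [],
    "envs": {},
}
```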

Examples

There are three scenarios:

  • Deploying a real-time service
  • Modifying a real-time service
  • Getting the inference address from the service deployment phase

Deploying a Real-Time Service

import modelarts.workflow as wf
# Use ServiceStep to define a service deployment phase and specify a model for service deployment.

# Define model name parameters.
model_name = wf.Placeholder(name="placeholder_name", placeholder_type=wf.PlaceholderType.STR)

service_step = wf.steps.ServiceStep(
    name="service_step",  # Phase name: up to 64 characters (letters, digits, underscores, and hyphens only), starting with a letter and unique within the workflow
    title="Deploying a Service",  # Title
    inputs=wf.steps.ServiceInput(
        name="si_service_ph",
        data=wf.data.ServiceInputPlaceholder(
            name="si_placeholder1",
            # Only the model name specified here can be used in the running state.
            # Use the same model name as model_name in the model registration phase.
            model_name=model_name
        )
    ),  # ServiceStep inputs
    outputs=wf.steps.ServiceOutput(name="service_output")  # ServiceStep outputs
)

workflow = wf.Workflow(
    name="service-step-demo",
    desc="this is a demo workflow",
    steps=[service_step]
)

Modifying a Real-Time Service

Scenario: When you use a new model version to update an existing service, ensure that the new model version uses the same model name as the model deployed in that service.

import modelarts.workflow as wf
# Use ServiceStep to define a service deployment phase and specify a model for service update.

# Define model name parameters.
model_name = wf.Placeholder(name="placeholder_name", placeholder_type=wf.PlaceholderType.STR)

# Define a service object.
service = wf.data.ServiceUpdatePlaceholder(name="placeholder_name")

service_step = wf.steps.ServiceStep(
    name="service_step",  # Phase name: up to 64 characters (letters, digits, underscores, and hyphens only), starting with a letter and unique within the workflow
    title="Service Update",  # Title
    inputs=[
        wf.steps.ServiceInput(
            name="si2",
            data=wf.data.ServiceInputPlaceholder(
                name="si_placeholder2",
                # Only the model name specified here can be used in the running state.
                model_name=model_name
            )
        ),
        # This input is configured when the workflow is running. You can also use
        # wf.data.ServiceData(service_id="fake_service") for the data field.
        wf.steps.ServiceInput(name="si_service_data", data=service)
    ],  # ServiceStep inputs
    outputs=wf.steps.ServiceOutput(name="service_output")  # ServiceStep outputs
)

workflow = wf.Workflow(
    name="service-step-demo",
    desc="this is a demo workflow",
    steps=[service_step]
)

Getting the Inference Address from the Service Deployment Phase

The service deployment phase supports the output of the inference address. You can use the get_output_variable("access_address") method to obtain the output and use it in subsequent phases.

  • For services deployed in the public resource pool, you can use access_address to obtain the inference address registered on the public network from the output.
  • For services deployed in a dedicated resource pool, you can get the internal inference address from the output using cluster_inner_access_address, in addition to the public inference address. The internal address can only be accessed by other inference services.
    import modelarts.workflow as wf
    
    # Define model name parameters.
    sub_model_name = wf.Placeholder(name="si_placeholder1", placeholder_type=wf.PlaceholderType.STR)
    
    sub_service_step = wf.steps.ServiceStep(
        name="sub_service_step",  # Phase name: up to 64 characters (letters, digits, underscores, and hyphens only), starting with a letter and unique within the workflow
        title="Subservice",  # Title
        inputs=wf.steps.ServiceInput(
            name="si_service_ph",
            data=wf.data.ServiceInputPlaceholder(name="si_placeholder1", model_name=sub_model_name)
        ),  # ServiceStep inputs
        outputs=wf.steps.ServiceOutput(name="service_output")  # ServiceStep outputs
    )
    
    
    main_model_name = wf.Placeholder(name="si_placeholder2", placeholder_type=wf.PlaceholderType.STR)
    
    # Obtain the inference address output by the subservice and pass it to the main service through envs.
    main_service_config = wf.steps.ServiceConfig(
        infer_type="real-time",
        envs={"infer_address": sub_service_step.outputs["service_output"].get_output_variable("access_address")}
    )
    
    main_service_step = wf.steps.ServiceStep(
        name="main_service_step",  # Phase name: up to 64 characters (letters, digits, underscores, and hyphens only), starting with a letter and unique within the workflow
        title="Main service",  # Title
        inputs=wf.steps.ServiceInput(
            name="si_service_ph",
            data=wf.data.ServiceInputPlaceholder(name="si_placeholder2", model_name=main_model_name)
        ),  # ServiceStep inputs
        outputs=wf.steps.ServiceOutput(name="service_output", service_config=main_service_config),  # ServiceStep outputs
        depend_steps=sub_service_step
    )
    
    workflow = wf.Workflow(
        name="service-step-demo",
        desc="this is a demo workflow",
        steps=[sub_service_step, main_service_step]
    )
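At run time, each key in envs becomes an environment variable visible to the deployed service's inference code. A minimal sketch of how the main service could read the subservice address; the variable name infer_address matches the envs key in the example above, and the surrounding inference-code structure is omitted:

```python
import os

def get_subservice_address(default: str = "") -> str:
    # "infer_address" is the key set in envs of main_service_config above.
    return os.environ.get("infer_address", default)
```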

Configuring Information for Deploying a Synchronous Service

After the service deployment phase starts in the development state (usually a notebook instance), configure the deployment information on the console as follows.

  1. On the ModelArts console, choose Development Workspace > Workflow from the navigation pane.
  2. Configure the information after the service deployment phase is started. After the configuration, click Next.

Configuring Information for Deploying an Asynchronous Service

  1. On the ModelArts console, choose Workflow from the navigation pane.
  2. Configure the information after the service deployment phase is started. Select an asynchronous inference AI application and a version, and configure service startup parameters. After the configuration, click Next.

After you select the required AI application and version, the system automatically matches the service startup parameters.