Deploying a Model as a Service
Function
This API is used to deploy a model as a service.
URI
POST /v1/{project_id}/services
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |
Request Body
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| service_name | Yes | String | Service name. The value can contain 1 to 64 visible characters. Only letters, Chinese characters, digits, hyphens (-), and underscores (_) are allowed. |
| description | No | String | Service description, which contains a maximum of 100 characters. By default, this parameter is left blank. |
| infer_type | Yes | String | Inference mode. The value can be real-time, batch, or edge. |
| workspace_id | No | String | ID of the workspace to which a service belongs. The default value is 0, indicating the default workspace. |
| vpc_id | No | String | ID of the VPC to which a real-time service instance is deployed. By default, this parameter is left blank. In this case, ModelArts allocates a dedicated VPC to each user so that users are isolated from each other. To access other service components in the VPC of a service instance, set this parameter to the ID of the corresponding VPC. Once a VPC is configured, it cannot be modified. If vpc_id and cluster_id are configured together, only the dedicated cluster parameter takes effect. |
| subnet_network_id | No | String | ID of a subnet. By default, this parameter is left blank. This parameter is mandatory when vpc_id is configured. Enter the network ID displayed in the subnet details on the VPC management console. A subnet provides dedicated network resources that are isolated from other networks. |
| security_group_id | No | String | ID of a security group. By default, this parameter is left blank. This parameter is mandatory when vpc_id is configured. A security group is a virtual firewall that provides secure network access control policies for service instances. The security group must contain at least one inbound rule that permits TCP requests from source address 0.0.0.0/0 on port 8080. |
| cluster_id | No | String | ID of a dedicated cluster. By default, this parameter is left blank, indicating that no dedicated cluster is used. When deploying services to a dedicated cluster, ensure that the cluster status is normal. After this parameter is set, the network configuration of the cluster is used, and the vpc_id parameter does not take effect. If both this parameter and cluster_id in the real-time config are configured, cluster_id in the real-time config takes precedence. |
| config | Yes | config array corresponding to infer_type | Model running configuration. If infer_type is batch or edge, only one model can be configured; if multiple models are uploaded, the first model is used by default. If infer_type is real-time, multiple models can be configured and assigned weights based on service requirements, but the models cannot share the same version number. |
| schedule | No | schedule array | Service scheduling configuration, which can be configured only for real-time services. By default, this parameter is not used and the service runs continuously. For details, see Table 6. |
| additional_properties | No | Map<String, Object> | Additional service attributes, which facilitate service management. For details, see Table 7. |
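A minimal pre-flight check for the service_name rule above can be sketched as follows. The regex (and the `is_valid_service_name` helper) is an assumption about how the "letters, Chinese characters, digits, hyphens, underscores, 1 to 64 characters" rule might be enforced client-side, not part of the ModelArts API.

```python
import re

# Assumed client-side check for the service_name constraint in the table:
# 1-64 characters; letters, Chinese characters, digits, hyphens, underscores.
# \u4e00-\u9fa5 covers the common CJK unified ideographs.
NAME_RE = re.compile(r"^[A-Za-z0-9_\-\u4e00-\u9fa5]{1,64}$")

def is_valid_service_name(name):
    """Return True if name satisfies the documented service_name rule."""
    return bool(NAME_RE.match(name))
```

Validating locally before the POST gives a clearer error than a rejected request.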
config parameters when infer_type is real-time:

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| model_id | Yes | String | Model ID |
| weight | Yes | Integer | Traffic weight allocated to a model. This parameter is mandatory only when infer_type is set to real-time. The sum of all weights must be 100. |
| specification | Yes | String | Resource specifications. Select specifications based on service requirements. |
| instance_count | Yes | Integer | Number of instances deployed for a model. The value must be greater than 0. |
| envs | No | Map<String, String> | Environment variable key-value pairs required for running a model. By default, this parameter is left blank. |
| cluster_id | No | String | ID of the dedicated resource pool. By default, this parameter is left blank, indicating that no dedicated resource pool is used. After this parameter is set, the network configuration of the cluster is used, and the vpc_id parameter does not take effect. |
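The two rules in this table (weights sum to 100, instance_count greater than 0) can be checked before sending the request. `validate_realtime_config` is a hypothetical helper sketched here for illustration, not part of the ModelArts API.

```python
# Assumed pre-flight validation of a real-time "config" array, based on the
# rules in the table above.
def validate_realtime_config(config):
    """Raise ValueError if the config array breaks the documented rules."""
    total = 0
    for entry in config:
        if entry["instance_count"] <= 0:
            raise ValueError("instance_count must be greater than 0")
        total += int(entry["weight"])  # the samples pass weight as a string
    if total != 100:
        raise ValueError(f"weights must sum to 100, got {total}")

# Mirrors the multi-version sample below: 70/30 traffic split.
validate_realtime_config([
    {"model_id": "xxxmodel-idxxx", "weight": "70", "instance_count": 1},
    {"model_id": "xxxxxx", "weight": "30", "instance_count": 1},
])  # passes silently
```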
config parameters when infer_type is batch:

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| model_id | Yes | String | Model ID |
| specification | Yes | String | Resource flavor. Available flavors: modelarts.vm.cpu.2u and modelarts.vm.gpu.p4 |
| instance_count | Yes | Integer | Number of instances deployed for a model |
| envs | No | Map<String, String> | Environment variable key-value pairs required for running a model. By default, this parameter is left blank. |
| src_type | No | String | Data source type. This parameter can be set to ManifestFile. By default, it is left blank, indicating that only files in the src_path directory are read. If it is set to ManifestFile, src_path must be a specific manifest file path. Multiple data paths can be specified in the manifest file. For details, see Manifest File Specifications. |
| src_path | Yes | String | OBS path of the input data of a batch job |
| dest_path | Yes | String | OBS path of the output data of a batch job |
| req_uri | Yes | String | Inference path of a batch job. The input parameters and input data vary with the inference path. |
| mapping_type | Yes | String | Mapping type of the input data. The value can be file or csv. |
| mapping_rule | No | Map | Mapping between input parameters and CSV data. This parameter is mandatory only when mapping_type is set to csv. The mapping rule is derived from the input parameters (input_params) in the model configuration file config.json. When type is string, number, integer, or boolean, the index parameter must be configured; see the csv sample below for a concrete mapping. The index must be an integer starting from 0; if it does not comply with this rule, the parameter is ignored in the request. Once the mapping rule is configured, the corresponding CSV data must be separated by commas (,). |
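To make the index mechanism concrete, the sketch below shows how the zero-based "index" fields in a mapping_rule line up columns of one comma-separated CSV line with named model inputs. `build_req_data` is a hypothetical helper for illustration; it is not part of the ModelArts API, which applies the mapping server-side.

```python
import csv
import io

def build_req_data(mapping_props, csv_line):
    """Map one comma-separated CSV line to a req_data dict using each
    input's zero-based "index" from the mapping rule."""
    row = next(csv.reader(io.StringIO(csv_line)))
    return {name: float(row[spec["index"]]) for name, spec in mapping_props.items()}

# Property fragment shaped like the csv sample below: input1 reads column 4,
# input2 reads column 3, input3 reads column 2.
props = {
    "input1": {"type": "number", "index": 4},
    "input2": {"type": "number", "index": 3},
    "input3": {"type": "number", "index": 2},
}
build_req_data(props, "5,4,3,2,1")  # {'input1': 1.0, 'input2': 2.0, 'input3': 3.0}
```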
config parameters when infer_type is edge:

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| model_id | Yes | String | Model ID |
| specification | Yes | String | Resource flavor. Currently, modelarts.vm.cpu.2u and modelarts.vm.gpu.p4 are available. |
| envs | No | Map<String, String> | Environment variable key-value pairs required for running a model. By default, this parameter is left blank. |
| nodes | Yes | String array | Edge node ID array |
schedule parameters:

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| type | Yes | String | Scheduling type. Currently, only the value stop is supported. |
| time_unit | Yes | String | Scheduling time unit, for example, HOURS. |
| duration | Yes | Integer | Value that maps to the time unit. For example, if the task should stop after two hours, set time_unit to HOURS and duration to 2. |
additional_properties parameters:

| Parameter | Type | Description |
|---|---|---|
| smn_notification | smn_notification structure | SMN message notification structure, used to notify the user of service status changes. For details, see Table 8. |
Response Body

| Parameter | Type | Description |
|---|---|---|
| service_id | String | Service ID |
Samples
The following shows how to deploy different types of services.
- Sample request: Creating a real-time service

  ```json
  POST https://endpoint/v1/{project_id}/services
  {
    "service_name": "mnist",
    "description": "mnist service",
    "infer_type": "real-time",
    "config": [
      {
        "model_id": "xxxmodel-idxxx",
        "weight": "100",
        "specification": "modelarts.vm.cpu.2u",
        "instance_count": 1
      }
    ]
  }
  ```
- Sample request: Creating a real-time service and configuring multi-version traffic distribution

  ```json
  {
    "service_name": "mnist",
    "description": "mnist service",
    "infer_type": "real-time",
    "config": [
      {
        "model_id": "xxxmodel-idxxx",
        "weight": "70",
        "specification": "modelarts.vm.cpu.2u",
        "instance_count": 1,
        "envs": {
          "model_name": "mxnet-model-1",
          "load_epoch": "0"
        }
      },
      {
        "model_id": "xxxxxx",
        "weight": "30",
        "specification": "modelarts.vm.cpu.2u",
        "instance_count": 1
      }
    ]
  }
  ```
- Sample request: Creating a real-time service in a dedicated resource pool with custom specifications

  ```json
  {
    "service_name": "realtime-demo",
    "description": "",
    "infer_type": "real-time",
    "cluster_id": "8abf68a969c3cb3a0169c4acb24b0000",
    "config": [
      {
        "model_id": "eb6a4a8c-5713-4a27-b8ed-c7e694499af5",
        "weight": "100",
        "cluster_id": "8abf68a969c3cb3a0169c4acb24b0000",
        "specification": "custom",
        "custom_spec": {
          "cpu": 1.5,
          "memory": 7500,
          "gpu_p4": 0,
          "ascend_a310": 0
        },
        "instance_count": 1
      }
    ]
  }
  ```
- Sample request: Creating a real-time service and setting it to automatically stop

  ```json
  {
    "service_name": "service-demo",
    "description": "demo",
    "infer_type": "real-time",
    "config": [
      {
        "model_id": "xxxmodel-idxxx",
        "weight": "100",
        "specification": "modelarts.vm.cpu.2u",
        "instance_count": 1
      }
    ],
    "schedule": [
      {
        "type": "stop",
        "time_unit": "HOURS",
        "duration": 1
      }
    ]
  }
  ```
- Sample request: Creating a batch service and setting mapping_type to file

  ```json
  {
    "service_name": "batchservicetest",
    "description": "",
    "infer_type": "batch",
    "config": [
      {
        "model_id": "598b913a-af3e-41ba-a1b5-bf065320f1e2",
        "specification": "modelarts.vm.cpu.2u",
        "instance_count": 1,
        "src_path": "https://infers-data.obs.cn-north-4.myhuaweicloud.com/xgboosterdata/",
        "dest_path": "https://infers-data.obs.cn-north-4.myhuaweicloud.com/output/",
        "req_uri": "/",
        "mapping_type": "file"
      }
    ]
  }
  ```
- Sample request: Creating a batch service and setting mapping_type to csv

  ```json
  {
    "service_name": "batchservicetest",
    "description": "",
    "infer_type": "batch",
    "config": [
      {
        "model_id": "598b913a-af3e-41ba-a1b5-bf065320f1e2",
        "specification": "modelarts.vm.cpu.2u",
        "instance_count": 1,
        "src_path": "https://infers-data.obs.cn-north-4.myhuaweicloud.com/xgboosterdata/",
        "dest_path": "https://infers-data.obs.cn-north-4.myhuaweicloud.com/output/",
        "req_uri": "/",
        "mapping_type": "csv",
        "mapping_rule": {
          "type": "object",
          "properties": {
            "data": {
              "type": "object",
              "properties": {
                "req_data": {
                  "type": "array",
                  "items": [
                    {
                      "type": "object",
                      "properties": {
                        "input5": { "type": "number", "index": 0 },
                        "input4": { "type": "number", "index": 1 },
                        "input3": { "type": "number", "index": 2 },
                        "input2": { "type": "number", "index": 3 },
                        "input1": { "type": "number", "index": 4 }
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    ]
  }
  ```

  The format of the inference request body described in mapping_rule is as follows:

  ```json
  {
    "data": {
      "req_data": [
        {
          "input1": 1,
          "input2": 2,
          "input3": 3,
          "input4": 4,
          "input5": 5
        }
      ]
    }
  }
  ```
- Sample response

  ```json
  {
    "service_id": "10eb0091-887f-4839-9929-cbc884f1e20e"
  }
  ```
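The first sample request can be sent from Python with the standard library alone. This is a minimal sketch: the endpoint, project ID, and token are placeholders, and a real call needs a valid IAM token in the X-Auth-Token header.

```python
import json
import urllib.request

def deploy_service(endpoint, project_id, token, body):
    """POST /v1/{project_id}/services and return the new service_id."""
    req = urllib.request.Request(
        f"{endpoint}/v1/{project_id}/services",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["service_id"]

# Request body from the first real-time sample above.
body = {
    "service_name": "mnist",
    "description": "mnist service",
    "infer_type": "real-time",
    "config": [{
        "model_id": "xxxmodel-idxxx",
        "weight": "100",
        "specification": "modelarts.vm.cpu.2u",
        "instance_count": 1,
    }],
}
# service_id = deploy_service("https://endpoint", "{project_id}", token, body)
```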
Status Code
For details about the status code, see Table 1.