Obtaining Service Details
Function
This API is used to obtain the details of a model service based on the service ID.
Debugging
You can debug this API through automatic authentication in API Explorer or use the SDK sample code generated by API Explorer.
URI
GET /v1/{project_id}/services/{service_id}
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For details, see Obtaining a Project ID and Name. |
| service_id | Yes | String | Service ID. You can obtain it from the response body returned when the service is created, or by calling the API for querying the service list, which returns the services owned by the current user. The service_id field indicates the service ID. |
Request Parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Auth-Token | Yes | String | User token. It can be obtained by calling the IAM API that is used to obtain a user token. The value of X-Subject-Token in the response header is the user token. |
Response Parameters
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| service_id | String | Service ID |
| service_name | String | Service name |
| description | String | Service description |
| tenant | String | Tenant to which the service belongs |
| project | String | Project to which the service belongs |
| owner | String | User to which the service belongs |
| publish_at | Number | Latest service release time, in milliseconds since 1970-01-01 00:00:00 UTC. |
| infer_type | String | Inference mode, for example, real-time or batch. |
| workspace_id | String | Workspace ID |
| cluster_id | String | ID of the dedicated resource pool used by a real-time or batch service, or ID of the edge resource pool used by an edge service. This parameter is returned only when a dedicated resource pool or edge resource pool is configured. |
| vpc_id | String | ID of the VPC to which the real-time service instance belongs. This parameter is available when the network configuration is customized. |
| subnet_network_id | String | ID of the subnet to which the real-time service instance belongs. This parameter is available when the network configuration is customized. |
| security_group_id | String | Security group to which the real-time service instance belongs. This parameter is available when the network configuration is customized. |
| status | String | Service status, for example, running, deploying, or failed. |
| progress | Integer | Deployment progress. This parameter is available when the status is deploying. |
| error_msg | String | Error message. When status is failed, an error message carrying the failure cause is returned. |
| config | Array of QueryServiceConfig objects | Service configuration. If a service is shared, only model_id, model_name, and model_version are returned. |
| access_address | String | Access address of an inference request. This parameter is valid only when infer_type is set to real-time and the service is deployed. |
| bind_access_address | String | Request address of a custom domain name. This parameter is available after a domain name is bound. |
| invocation_times | Number | Total number of service calls |
| failed_times | Number | Number of failed service calls |
| is_shared | Boolean | Whether a service is subscribed |
| shared_count | Number | Number of subscribed services |
| schedule | Array of Schedule objects | Service scheduling configuration. If this parameter is not configured, no value is returned. |
| update_time | Number | Time when the configuration used by the current service was last updated, in milliseconds since 1970-01-01 00:00:00 UTC. |
| debug_url | String | Online debugging address of a real-time service. This parameter is available only when the model supports online debugging and there is only one instance. |
| due_time | Number | Time when an online service automatically stops, in milliseconds since 1970-01-01 00:00:00 UTC. If automatic stop is not configured, this parameter is not returned. |
| operation_time | Number | Operation time of a request |
| transition_at | Number | Time when the service status changes |
| is_free | Boolean | Whether a free-of-charge flavor is used |
| additional_properties | Map<String,String> | Additional service attribute |
| pool_name | String | Resource pool ID of the elastic cluster in the AI dedicated resource pool used by the real-time or batch service. This parameter is returned only when a dedicated resource pool is configured. |
| load_balancer_policy | String | Backend ELB forwarding policy that can be set for synchronous real-time services. The value can be ROUND_ROBIN (weighted round robin), LEAST_CONNECTIONS (weighted least connections), or SOURCE_IP (source IP address algorithm). |
| priority | Integer | Preemption priority. The value ranges from 1 to 3. Setting the preemption priority guarantees the scheduling of high-priority services. It can be set when infer_type is real-time or batch. |
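The status-related fields above depend on one another: progress accompanies the deploying status, error_msg accompanies failed, and access_address is meaningful only for a deployed real-time service. A minimal Python sketch of how a client might read them together follows. The function name describe_service_state and the fallback strings are hypothetical, and the status values are only those mentioned on this page.

```python
def describe_service_state(detail):
    """Return a one-line summary of the status fields in a service-details response."""
    status = detail.get("status", "unknown")
    if status == "deploying":
        # progress is documented as available while the service is deploying.
        progress = detail.get("progress")
        return f"deploying ({progress}%)" if progress is not None else "deploying"
    if status == "failed":
        # error_msg carries the failure cause when status is failed.
        return f"failed: {detail.get('error_msg', 'no error message returned')}"
    if status == "running" and detail.get("infer_type") == "real-time":
        # access_address is only meaningful for a deployed real-time service.
        address = detail.get("access_address")
        return f"running at {address}" if address else "running"
    # Any other status is passed through unchanged (assumption of this sketch).
    return status
```

Because optional fields may be absent from the response, the sketch reads every field with dict.get rather than indexing.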

QueryServiceConfig
| Parameter | Type | Description |
|---|---|---|
| model_version | String | Model version |
| finished_time | Number | Task completion time, in milliseconds since 1970-01-01 00:00:00 UTC. This parameter is not returned before the task is complete. |
| custom_spec | CustomSpec object | Customized resource specification configuration. This parameter is returned only when specification is set to custom. |
| envs | Map<String,String> | Environment variable key-value pairs required for running a model |
| specification | String | Resource specifications, for example, modelarts.vm.cpu.2u, modelarts.vm.gpu.pnt004, or modelarts.vm.ai1.snt3. If this parameter is set to custom, a customized flavor is used, which is described by the custom_spec field. |
| weight | Integer | Traffic weight allocated to a model |
| source_type | String | Model source. This parameter is returned when a model is created using ExeML. The value is auto. |
| model_id | String | Model ID |
| src_path | String | OBS path for storing the input data of a batch task, for example, https://xxx.obs.myhwclouds.com/image/. |
| req_uri | String | Inference path invoked in a batch task, for example, /. |
| mapping_type | String | Mapping type of the input data, which can be file or csv |
| start_time | Number | Task start time, in milliseconds since 1970-01-01 00:00:00 UTC. This parameter is not returned before the task starts. |
| cluster_id | String | ID of the dedicated resource pool or edge resource pool used by the service instance. This parameter is returned only when a dedicated resource pool or edge resource pool is configured. |
| nodes | Array of Nodes objects | Edge node information. This parameter is returned only when ModelArts edge nodes are configured. |
| mapping_rule | Object | Mapping between input parameters and CSV data. This parameter is mandatory only when mapping_type is set to csv. |
| model_name | String | Model name |
| src_type | String | Data source type. This parameter is returned only when ManifestFile is used. |
| dest_path | String | OBS path to the output data of a batch job, for example, https://xxx.obs.myhwclouds.com/res/. |
| instance_count | Integer | Number of instances deployed for a model |
| status | String | Service status, for example, notReady. |
| scaling | Boolean | Whether auto scaling is enabled |
| support_debug | Boolean | Whether a model supports online debugging |
| additional_properties | Map<String,ModelAdditionalProperties> | Additional model deployment attribute |
| pool_name | String | Resource pool ID of the elastic cluster in the AI dedicated resource pool used by the service instance. This parameter is returned only when a dedicated resource pool is configured. |
| affinity | ServiceAffinity object | Service affinity information |
| specification_details | ServiceSpecificationDetails object | Flavor details. |
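Each entry of config carries a weight, the traffic weight allocated to that model. As a hedged illustration, the Python sketch below turns those weights into per-model traffic fractions. Treating weight as a relative share and normalizing by the sum is an assumption of this sketch, and traffic_shares is a hypothetical helper, not part of the API.

```python
def traffic_shares(config):
    """Map (model_name, model_version) to its fraction of traffic, based on weight."""
    # Entries without a weight (for example, in a shared service's trimmed
    # config) count as zero here.
    total = sum(item.get("weight", 0) for item in config)
    if total <= 0:
        return {}
    return {
        (item["model_name"], item["model_version"]): item.get("weight", 0) / total
        for item in config
    }
```

For a shared service, where only model_id, model_name, and model_version are returned, no weights are present and the sketch returns an empty mapping.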

CustomSpec
| Parameter | Type | Description |
|---|---|---|
| gpu_p4 | Float | Number of GPUs, which can be a decimal. The value cannot be smaller than 0, and the third decimal place is rounded off. This parameter is optional and is not used by default. |
| memory | Integer | Memory in MB, which must be an integer |
| cpu | Float | Number of CPU cores, which can be a decimal. The value cannot be smaller than 0.01, and the third decimal place is rounded off. |
| ascend_a310 | Integer | Number of Ascend chips. This parameter is optional and is not used by default. Either this parameter or gpu_p4 is configured. |

Nodes
| Parameter | Type | Description |
|---|---|---|
| memory | Integer | Memory size, in MB |
| os_version | String | OS version of a node |
| cpu | Integer | Number of CPU cores |
| created_at | String | Creation time, in the format of YYYY-MM-DDThh:mm:ss (UTC) |
| description | String | Node description. |
| message | String | Reason returned when instance_status is failed or notReady. |
| predict_url | String | Inference URL of a node |
| enable_gpu | Boolean | Whether to enable GPUs |
| gpu_num | Integer | Number of GPUs |
| host_ips | Array of strings | Host IP addresses of a node |
| updated_at | String | Update time, in the format of YYYY-MM-DDThh:mm:ss (UTC) |
| node_label | String | Node label |
| os_type | String | OS type of a node |
| name | String | Name of an edge node |
| os_name | String | OS name of a node |
| arch | String | Node architecture |
| id | String | Edge node ID |
| instance_status | String | Running status of a model instance on the node, for example, failed or notReady. |
| state | String | Host status, which can be RUNNING, FAIL, or UNCONNECTED |
| deployment_num | Integer | Number of application instances deployed on a node |
| host_name | String | Host name of a node |

ModelAdditionalProperties
| Parameter | Type | Description |
|---|---|---|
| log_volume | Array of LogVolume objects | Host directory mounting. This parameter takes effect only if a dedicated resource pool is used. If a public resource pool is used to deploy services, this parameter cannot be configured. Otherwise, an error will occur. |
| max_surge | Float | The value must be greater than 0. If this parameter is not set, the default value 1 is used. If the value is less than 1, it indicates the percentage of instances to be added during the rolling upgrade. If the value is greater than 1, it indicates the maximum number of instances to be added during the rolling upgrade. |
| max_unavailable | Float | The value must be greater than 0. If this parameter is not set, the default value 0 is used. If the value is less than 1, it indicates the percentage of instances that can be scaled in during the rolling upgrade. If the value is greater than 1, it indicates the number of instances that can be scaled in during the rolling upgrade. |
| termination_grace_period_seconds | Integer | Graceful stop period of a container. |
| persistent_volumes | Array of PersistentVolumes objects | Persistent storage mounting. |
| dew_secret | DewSecret object | DEW secret. |
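The max_surge and max_unavailable descriptions above distinguish values below 1 (a percentage of instances) from values above 1 (an instance count). The Python sketch below shows one way a client might interpret such a value. The helper name resolve_rolling_value, the handling of exactly 1 as a count, and the rounding direction are all assumptions of this sketch, since the page does not specify them.

```python
import math


def resolve_rolling_value(value, instance_count, round_up=True):
    """Turn a max_surge or max_unavailable setting into an instance count.

    Values below 1 are read as a fraction of instance_count; values of 1 or
    more are read as an absolute number of instances. Treating exactly 1 as
    a count and the rounding direction are assumptions of this sketch.
    """
    if value < 0:
        raise ValueError("value must not be negative")
    if value < 1:
        scaled = value * instance_count
        # Rounding is not documented; the caller chooses the direction.
        return math.ceil(scaled) if round_up else math.floor(scaled)
    return int(value)
```

For example, under these assumptions a max_surge of 0.5 on a four-instance deployment resolves to two additional instances, while a max_surge of 2 resolves to two regardless of the instance count.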

LogVolume
| Parameter | Type | Description |
|---|---|---|
| host_path | String | Log path to be mapped on the host |
| mount_path | String | Path to the logs in the container |

PersistentVolumes
| Parameter | Type | Description |
|---|---|---|
| name | String | Volume name. |
| mount_path | String | Mount path of a volume in the container, for example, /tmp. The container path must not be a system directory, such as / or /var/run. Otherwise, an exception occurs. It is good practice to mount the volume to an empty directory. If the directory is not empty, ensure that it contains no files that affect container startup. Otherwise, such files will be replaced, and the container will fail to start and the workload will fail to be created. |
| storage_type | String | Mount type: sfs_turbo. |
| source_address | String | Mounting source path. The value is the SFS Turbo ID when an EFS file is mounted. |

ServiceAffinity
| Parameter | Type | Description |
|---|---|---|
| node_affinity | NodeAffinity object | Set this parameter when node affinity is used. |

NodeAffinity
| Parameter | Type | Description |
|---|---|---|
| mode | String | Node affinity mode. required indicates strong affinity: a service instance can be scheduled only to a specified node, and scheduling fails if the specified node does not exist. preferred indicates weak affinity: a service instance tends to be scheduled to a specified node, but is scheduled to another node if the specified node does not meet the scheduling conditions. |
| pool_infos | Array of AffinityPoolInfo objects | Affinity policy for a specified cluster, including the nodes in the cluster. |

AffinityPoolInfo
| Parameter | Type | Description |
|---|---|---|
| pool_name | String | Cluster name. The cluster name must be in the outer pool_name. |
| nodes | Array of AffinityNodeInfo objects | Affinity node list |

AffinityNodeInfo
| Parameter | Type | Description |
|---|---|---|
| name | String | Node name, which corresponds to the private IP address of the node. |

ServiceSpecificationDetails
| Parameter | Type | Description |
|---|---|---|
| display_en | String | Flavor name, in English. |
| display_cn | String | Flavor name, in Chinese. |
| category | String | Flavor type, which can be CPU (no accelerator cards), GPU (GPU accelerator cards), or NPU (NPU accelerator cards). |
| cpu_info | CpuDisplayInfo object | CPU details. |
| memory_info | MemoryDisplayInfo object | Memory details. |
| gpu_info | GpuDisplayInfo object | GPU details. |
| npu_info | NpuDisplayInfo object | NPU details. |

CpuDisplayInfo
| Parameter | Type | Description |
|---|---|---|
| cpu | Double | Number of CPU cores. |
| arch | String | CPU architecture, which can be x86 or Arm. |

MemoryDisplayInfo
| Parameter | Type | Description |
|---|---|---|
| memory | Integer | Memory capacity. |
| unit | String | Memory unit, for example, MB or GB. |

GpuDisplayInfo
| Parameter | Type | Description |
|---|---|---|
| gpu | Double | Number of GPUs. |
| brand | String | GPU vendor. |
| version | String | GPU model. |
| memory | Integer | GPU memory capacity. |
| unit | String | GPU memory unit, for example, MB or GB. |

NpuDisplayInfo
| Parameter | Type | Description |
|---|---|---|
| npu | Integer | Number of NPUs. |
| brand | String | NPU vendor. |
| version | String | NPU model. |
| memory | Integer | NPU memory capacity. |
| unit | String | NPU memory unit, for example, MB or GB. |

Schedule
| Parameter | Type | Description |
|---|---|---|
| duration | Integer | Value corresponding to the time unit. For example, if the task stops after two hours, time_unit is HOURS and duration is 2. |
| time_unit | String | Scheduling time unit. Possible values are DAYS, HOURS, and MINUTES. |
| type | String | Scheduling type. Currently, the value can only be stop, indicating that the task automatically stops after a specified period of time. |
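A Schedule object pairs duration with time_unit, so a client that wants an actual time span has to combine the two. The Python sketch below shows one possible conversion. The helper name schedule_to_timedelta is hypothetical, and only the three time_unit values listed above are handled.

```python
from datetime import timedelta

# Keyword accepted by timedelta for each documented time_unit value.
_UNIT_TO_KWARG = {"DAYS": "days", "HOURS": "hours", "MINUTES": "minutes"}


def schedule_to_timedelta(schedule):
    """Convert one Schedule object (duration + time_unit) into a timedelta."""
    unit = schedule["time_unit"]
    if unit not in _UNIT_TO_KWARG:
        raise ValueError(f"unsupported time_unit: {unit!r}")
    return timedelta(**{_UNIT_TO_KWARG[unit]: schedule["duration"]})
```

With the two-hour example from the table, schedule_to_timedelta({"type": "stop", "time_unit": "HOURS", "duration": 2}) yields a span of two hours.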
Example Requests
GET https://{endpoint}/v1/{project_id}/services/{service_id}
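As a rough illustration, the request above could be assembled in Python with the standard library. The function name build_service_detail_request and any endpoint or ID values passed to it are assumptions made for this sketch. Only the path layout and the X-Auth-Token header come from this page.

```python
import urllib.parse
import urllib.request


def build_service_detail_request(endpoint, project_id, service_id, token):
    """Build (but do not send) the GET request for one service.

    endpoint is a placeholder host name; substitute the real endpoint
    of your region.
    """
    # Percent-encode the path parameters so unusual characters cannot break the URL.
    project = urllib.parse.quote(project_id, safe="")
    service = urllib.parse.quote(service_id, safe="")
    url = f"https://{endpoint}/v1/{project}/services/{service}"
    # X-Auth-Token carries the user token obtained from IAM.
    return urllib.request.Request(url, headers={"X-Auth-Token": token}, method="GET")
```

Sending the request would then be a matter of passing it to urllib.request.urlopen and decoding the JSON body. Error handling and retries are omitted from this sketch.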
Example Responses
Status code: 200
Service Details
{
  "service_id" : "f76f20ba-78f5-44e8-893a-37c8c600c02f",
  "service_name" : "service-demo",
  "tenant" : "xxxxx",
  "project" : "xxxxx",
  "owner" : "xxxxx",
  "publish_at" : 1585809231902,
  "update_time" : 1585809358259,
  "infer_type" : "real-time",
  "status" : "running",
  "progress" : 100,
  "access_address" : "https://xxxxx.apigw.xxxxx.com/v1/infers/088458d9-5755-4110-97d8-1d21065ea10b/f76f20ba-78f5-44e8-893a-37c8c600c02f",
  "cluster_id" : "088458d9-5755-4110-97d8-1d21065ea10b",
  "workspace_id" : "0",
  "additional_properties" : { },
  "is_shared" : false,
  "invocation_times" : 0,
  "failed_times" : 0,
  "shared_count" : 0,
  "operation_time" : 1586249085447,
  "config" : [
    {
      "model_id" : "044ebf3d-8bf4-48df-bf40-bad0e664c1e2",
      "model_name" : "jar-model",
      "model_version" : "1.0.1",
      "specification" : "custom",
      "custom_spec" : { },
      "status" : "notReady",
      "weight" : 100,
      "instance_count" : 1,
      "scaling" : false,
      "envs" : { },
      "additional_properties" : { },
      "support_debug" : false
    }
  ],
  "transition_at" : 1585809231902,
  "is_free" : false
}
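Several fields in the response are millisecond timestamps counted from 1970-01-01 UTC, which are hard to read as raw numbers. The Python sketch below converts the ones whose unit this page documents. The names ms_to_datetime and readable_timestamps are hypothetical, and operation_time and transition_at are left out because their unit is not stated here.

```python
import json
from datetime import datetime, timedelta, timezone

# Top-level fields documented as millisecond timestamps counted from 1970-01-01 UTC.
MS_TIMESTAMP_FIELDS = ("publish_at", "update_time", "due_time")

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms):
    """Convert epoch milliseconds to an aware UTC datetime without float rounding."""
    return _EPOCH + timedelta(milliseconds=ms)


def readable_timestamps(body):
    """Parse a response body (JSON text) and return its timestamp fields as datetimes."""
    detail = json.loads(body)
    return {
        name: ms_to_datetime(detail[name])
        for name in MS_TIMESTAMP_FIELDS
        if name in detail
    }
```

Applied to the example response, publish_at (1585809231902) corresponds to 2020-04-02 06:33:51.902 UTC. Fields that are not returned, such as due_time when automatic stop is not configured, are skipped.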
Status Codes
| Status Code | Description |
|---|---|
| 200 | Service Details |
Error Codes
See Error Codes.