Viewing Service Details
After an AI application is deployed as a real-time service, you can access the service page to view its details.
- Log in to the ModelArts management console and choose Service Deployment > Real-Time Services.
- On the Real-Time Services page, click the name of the target service. The service details page is displayed.
You can view the service name, status, and other information. For details, see Table 1.
Table 1 Real-time service parameters

Parameter | Description |
---|---|
Name | Name of the real-time service. |
Status | Status of the real-time service. |
Source | AI application source of the real-time service. |
Service ID | Real-time service ID. |
Description | Service description, which can be edited after you click the edit button on the right side. |
Resource Pool | Resource pool specifications used by the service. |
Custom Settings | Customized configurations based on real-time service versions. This allows version-based traffic distribution policies and configurations. Enable this option and click View Settings to customize the settings. For details, see Modifying Customized Settings. |
Traffic Limit | Maximum number of times a service can be accessed within a second. |
WebSocket | Whether to upgrade to the WebSocket service. |
- Switch between tabs on the details page of a real-time service to view more details. For details, see Table 2.
Table 2 Details of a real-time service
Parameter
Description
Usage Guides
This page displays the API URL, AI application information, input parameters, and output parameters. You can copy the API URL and use it to call the service.
Prediction
You can perform real-time prediction on this page. For details, see Testing the Deployed Service.
Configuration Updates
This page displays Current Configurations and Update History.
- Current Configurations: AI application name, version, status, deployed resource pool, compute node specifications, traffic ratio, number of compute nodes, and deployment timeout interval. You can deploy a dedicated resource pool on this page, and the resource pool information is displayed.
- Update History: historical AI application information.
Monitoring
This page displays resource usage and AI application calls.
- Resource Usage: includes the used and available CPU, memory, and GPU resources.
- AI Application Calls: indicates the number of AI application calls. The statistics collection starts after the AI application status changes to Ready. (This parameter is not displayed for WebSocket services.)
Event
This page displays key operations during service use, such as the service deployment progress, detailed causes of deployment exceptions, and time points when a service is started, stopped, or modified.
Events are retained for one month and then automatically cleared.
For details about how to view events of a service, see Viewing Service Events.
Logs
This page displays the logs of each AI application in the service. You can view logs generated in the last 5 minutes, the last 30 minutes, the last hour, or a user-defined time range.
When defining a time range, select the start time and end time.
Follow these rules when searching logs:
- Do not enter strings that contain any of the following delimiters: ,'";=()[]{}@&<>/:\n\t\r.
- Enter keywords for exact search. A keyword is a word between two adjacent delimiters.
- Enter keywords for fuzzy search. For example, you can enter error, er?or, rro*, or er*r.
- Enter phrases for exact search. For example, Start to refresh.
- Before enabling this function, you can combine keywords with AND (&&) or OR (||). For example, query logs&&erro* or query logs||erro*. After enabling this function, you can combine keywords with AND or OR. For example, query logs AND erro* or query logs OR erro*.
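The keyword and wildcard rules above can be illustrated with a minimal Python sketch. It is not the log service's implementation. The helper names are hypothetical, the trailing period in the delimiter list is read as sentence punctuation rather than a delimiter, and ? and * are assumed to behave like shell-style wildcards (one character and any run of characters, respectively).

```python
import re
from fnmatch import fnmatchcase

# Delimiters listed in the search rules; the last three are the
# newline, tab, and carriage-return characters.
DELIMITERS = ",'\";=()[]{}@&<>/:\n\t\r"
_SPLIT_RE = re.compile("[" + re.escape(DELIMITERS) + "]+")


def has_delimiter(query: str) -> bool:
    """Return True if a search string contains a delimiter (not allowed)."""
    return any(ch in DELIMITERS for ch in query)


def keywords(log_line: str) -> list[str]:
    """Split a log line into keywords: the text between adjacent delimiters."""
    return [part for part in _SPLIT_RE.split(log_line) if part]


def fuzzy_match(keyword: str, pattern: str) -> bool:
    """Assumed semantics: ? is one character, * is any run of characters."""
    return fnmatchcase(keyword, pattern)


line = "level=error;msg=Start to refresh"
print(keywords(line))
print(has_delimiter("Start to refresh"), has_delimiter("a=b"))
print(fuzzy_match("error", "er?or"), fuzzy_match("error", "er*r"))
```

Checking a search string with has_delimiter before submitting it is one way to catch input that the first rule forbids.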
Modifying Customized Settings
A customized configuration rule consists of the configuration condition (Setting), access version (Version), and customized running parameters (including Setting Name and Setting Value).
You can configure different settings with customized running parameters for different versions of a real-time service.
Customized configuration rules are listed in descending order of priority. You can change a rule's priority by dragging it to a new position in the list.
After a rule is matched, the system will no longer match subsequent rules. A maximum of 10 configuration rules can be configured.
Parameter | Mandatory | Description |
---|---|---|
Setting | Yes | Rule condition, written as a Spring Expression Language (SpEL) expression. Only the equal, matches, and hashCode expressions on character strings are supported. |
Version | Yes | Access version for a customized service configuration rule. When a rule is matched, the request is sent to the real-time service of this version. |
Setting Name | No | Key of a customized running parameter, consisting of a maximum of 128 characters. Configure this parameter if the HTTP message header is used to carry customized running parameters to a real-time service. |
Setting Value | No | Value of a customized running parameter, consisting of a maximum of 256 characters. Configure this parameter if the HTTP message header is used to carry customized running parameters to a real-time service. |
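The matching behavior described above (rules checked in priority order, evaluation stopping at the first match, and a limit of 10 rules) can be sketched in Python as follows. This is an illustrative model only, not how ModelArts implements it. The Rule class, the validate and first_match helpers, and the version numbers are hypothetical, and a Python callable stands in for the SpEL condition.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_RULES = 10        # a maximum of 10 configuration rules
MAX_NAME_LEN = 128    # Setting Name length limit
MAX_VALUE_LEN = 256   # Setting Value length limit


@dataclass
class Rule:
    setting: Callable[[dict], bool]      # stands in for the SpEL condition
    version: str                         # version that serves matching requests
    setting_name: Optional[str] = None   # optional running-parameter key
    setting_value: Optional[str] = None  # optional running-parameter value


def validate(rules: list[Rule]) -> None:
    """Enforce the documented limits on rule count and parameter lengths."""
    if len(rules) > MAX_RULES:
        raise ValueError("a maximum of 10 configuration rules is allowed")
    for rule in rules:
        if rule.setting_name and len(rule.setting_name) > MAX_NAME_LEN:
            raise ValueError("Setting Name exceeds 128 characters")
        if rule.setting_value and len(rule.setting_value) > MAX_VALUE_LEN:
            raise ValueError("Setting Value exceeds 256 characters")


def first_match(rules: list[Rule], request: dict) -> Optional[Rule]:
    """Rules are ordered by descending priority; stop at the first match."""
    for rule in rules:
        if rule.setting(request):
            return rule
    return None


rules = [
    Rule(lambda r: r.get("DOMAIN_NAME") == "User A", "0.0.2", "config", "A"),
    Rule(lambda r: r.get("DOMAIN_NAME", "").startswith("User"), "0.0.1"),
]
validate(rules)
matched = first_match(rules, {"DOMAIN_NAME": "User A"})
print(matched.version, matched.setting_name, matched.setting_value)
```

Because evaluation stops at the first match, the more specific rule is placed first. If the two rules were swapped, User A would be routed by the broader rule instead.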
Customized settings can be used in the following scenarios:
- If multiple versions of a real-time service are deployed for gray release, customized settings can be used to distribute traffic by user.
Table 4 Built-in variables

Built-in Variable | Description |
---|---|
DOMAIN_NAME | Account name that is used to call an inference request |
DOMAIN_ID | Account ID that is used to call an inference request |
PROJECT_NAME | Project name that is used to call an inference request |
PROJECT_ID | Project ID that is used to call an inference request |
USER_NAME | Username that is used to call an inference request |
USER_ID | User ID that is used to call an inference request |
The pound sign (#) indicates that a variable is referenced. The character string to match must be enclosed in single quotation marks.
#{Built-in variable} == 'Character string'
#{Built-in variable} matches 'Regular expression'
- Example 1:
If the account name for invoking the inference request is User A, the specified version is matched.
#DOMAIN_NAME == 'User A'
- Example 2:
If the account name in the inference request starts with op, the specified version is matched.
#DOMAIN_NAME matches 'op.*'
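To make the two condition forms concrete, the following Python sketch evaluates them against a set of built-in variable values. It is a toy evaluator, not a SpEL implementation. The evaluate function is hypothetical, a missing variable is assumed never to match, and matches is assumed to require the regular expression to cover the whole value, which is why re.fullmatch is used.

```python
import re

# Toy parser for the two condition forms shown above:
#   #VARIABLE == 'string'   and   #VARIABLE matches 'regex'
_CONDITION = re.compile(r"^#(\w+)\s+(==|matches)\s+'(.*)'$")


def evaluate(condition: str, variables: dict) -> bool:
    parsed = _CONDITION.match(condition.strip())
    if parsed is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    name, operator, operand = parsed.groups()
    value = variables.get(name)
    if value is None:
        return False  # assumption: a missing variable never matches
    if operator == "==":
        return value == operand
    # Assumption: matches must cover the whole value, hence fullmatch.
    return re.fullmatch(operand, value) is not None


caller = {"DOMAIN_NAME": "User A"}
print(evaluate("#DOMAIN_NAME == 'User A'", caller))
print(evaluate("#DOMAIN_NAME matches 'op.*'", caller))
print(evaluate("#DOMAIN_NAME matches 'op.*'", {"DOMAIN_NAME": "op_team"}))
```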
Table 5 Common regular expressions
Character
Description
.
Matches any single character except \n. To match any character including \n, use (.|\n).
*
Matches the preceding subexpression zero or more times. For example, zo* can match z and zoo.
+
Matches the preceding subexpression one or more times. For example, zo+ can match zo and zoo, but cannot match z.
?
Matches the preceding subexpression zero or one time. For example, do(es)? can match do or does.
^
Matches the start of the input string.
$
Matches the end of the input string.
{n}
Matches exactly n occurrences of the preceding subexpression, where n is a non-negative integer. For example, o{2} cannot match the o in Bob, but can match the two o's in food.
x|y
Matches x or y. For example, z|food can match z or food, and (z|f)ood can match zood or food.
[xyz]
Character set. Matches any single character in the set. For example, [abc] can match the a in plain.
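The examples in Table 5 can be checked locally before a rule is saved. The sketch below uses Python's re module rather than the engine that evaluates the rule, on the assumption that these basic constructs behave the same way in both. The list of checks is illustrative.

```python
import re

# (pattern, text, is a whole-string match expected?)
checks = [
    (r"zo*", "z", True),
    (r"zo*", "zoo", True),
    (r"zo+", "zo", True),
    (r"zo+", "z", False),
    (r"do(es)?", "do", True),
    (r"do(es)?", "does", True),
    (r"z|food", "food", True),
    (r"(z|f)ood", "zood", True),
    (r"(z|f)ood", "food", True),
    (r".", "\n", False),       # . does not match a newline
    (r"(.|\n)", "\n", True),   # (.|\n) does
]

results = [
    (re.fullmatch(pattern, text) is not None) == expected
    for pattern, text, expected in checks
]
print(all(results))

# Searching inside a longer string, as in the {n} and [xyz] examples.
print(re.search(r"o{2}", "Bob"))              # None: no two consecutive o's
print(re.search(r"o{2}", "food").group())     # oo
print(re.search(r"[abc]", "plain").group())   # a
```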
Figure 1 Traffic distribution by user
- If multiple versions of a real-time service are deployed for gated launch, customized settings can be used to access different versions through the header.
Use the #HEADER_ prefix to reference a request header as a condition.
#HEADER_{key} == '{value}'
#HEADER_{key} matches '{value}'
- Example 1:
If the header of an inference HTTP request contains version and its value is 0.0.1, the condition is met. Otherwise, the condition is not met.
#HEADER_version == '0.0.1'
- Example 2:
If the header of an inference HTTP request contains testheader and the value starts with mock, the rule is matched.
#HEADER_testheader matches 'mock.*'
- Example 3:
If the header of an inference HTTP request contains uid and the hash code of its value satisfies the following expression, the rule is matched.
#HEADER_uid.hashCode() % 100 < 10
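The three header conditions can be modeled in Python to see which requests they would select. This is a sketch under stated assumptions, not the ModelArts implementation. The function names are hypothetical, headers are looked up by exact name for simplicity, matches is assumed to require a whole-value match, and hashCode() and % are assumed to follow Java semantics (a signed 32-bit string hash and a remainder that takes the sign of the dividend).

```python
import re


def java_string_hashcode(text: str) -> int:
    """Reproduce Java's String.hashCode() as a signed 32-bit integer."""
    data = text.encode("utf-16-be")  # Java hashes UTF-16 code units
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def java_remainder(a: int, b: int) -> int:
    """Java's % truncates toward zero, so the result takes the sign of a."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def example_1(headers: dict) -> bool:
    # #HEADER_version == '0.0.1'
    return headers.get("version") == "0.0.1"


def example_2(headers: dict) -> bool:
    # #HEADER_testheader matches 'mock.*'
    value = headers.get("testheader")
    return value is not None and re.fullmatch(r"mock.*", value) is not None


def example_3(headers: dict) -> bool:
    # #HEADER_uid.hashCode() % 100 < 10
    uid = headers.get("uid")
    if uid is None:
        return False
    return java_remainder(java_string_hashcode(uid), 100) < 10


print(example_1({"version": "0.0.1"}))
print(example_2({"testheader": "mock-01"}))
print(java_string_hashcode("abc"), example_3({"uid": "abc"}))
```

If the expression is evaluated with Java semantics, as assumed here, a negative hash code yields a negative remainder, which also satisfies < 10. The condition may therefore select more than 10% of uid values, which is worth checking against real traffic before relying on it for a precise split.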
Figure 2 Using the header to access different versions
- If a real-time service version supports different runtime configurations, you can use Setting Name and Setting Value to specify customized runtime parameters so that different users can use different running configurations.
Example:
When user A accesses the AI application, configuration A is used. When user B accesses the AI application, configuration B is used. When a running configuration is matched, ModelArts adds a header to the request that carries the customized running parameters specified by Setting Name and Setting Value.
Figure 3 Customized running parameters added for a customized configuration rule