Creating an Inference Endpoint
Before you create an inference service, you must create an inference endpoint. When you create the endpoint, you set its maximum number of resources. You can then create inference services on the endpoint. The total number of resources used by all inference services on the endpoint cannot exceed the endpoint's maximum, which lets you control the endpoint's resource usage.
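The capacity rule above can be pictured with a minimal sketch. It is illustrative only: the function name `fits_endpoint` and its parameters are hypothetical and are not part of any Huawei Cloud API. The platform enforces the limit itself.

```python
def fits_endpoint(max_resources: int, existing_services: list[int], new_service: int) -> bool:
    """Return True if adding a service keeps the endpoint within its maximum.

    Hypothetical helper: max_resources is the endpoint's maximum number of
    resources, existing_services lists the resources of services already on
    the endpoint, and new_service is the resources requested by the new one.
    """
    return sum(existing_services) + new_service <= max_resources


# An endpoint capped at 10 resources already hosts services using 4 and 3.
print(fits_endpoint(10, [4, 3], 3))  # total would be 10, within the maximum
print(fits_endpoint(10, [4, 3], 4))  # total would be 11, over the maximum
```

Choosing the maximum up front therefore bounds how many services, and of what size, the endpoint can later hold.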
Prerequisites
- You have a valid Huawei Cloud account.
- You have at least one workspace available.
Procedure
- Log in to the Workspace Management Console.
- Select a created workspace, click Access Workspace, and choose Resources and Assets > Inference Endpoint.
- Click Create Inference Endpoint in the upper right corner. Set the endpoint name, description, resource specifications, and resource quantities as described in Table 1, and then click Create Now.
Table 1 Parameters for creating an inference endpoint

| Parameter | Description |
| --- | --- |
| Endpoint Name | Mandatory. Name of the inference endpoint. The name contains 1 to 64 characters and must be unique. Only letters, digits, underscores (_), hyphens (-), periods (.), and spaces are allowed. |
| Description | Optional. Description of the inference endpoint. The value contains 0 to 1,024 characters. Special characters such as ^!<>=&"' are not supported. |
| Compute Unit Type | Filters the available resource specifications. |
| Resource Specifications | Mandatory. Resource specifications of the inference endpoint. Different resource specifications support different models. |
| Pre-warmed Resources | Number of pre-warmed resources of the inference endpoint. Currently, only 0 is supported. |
| Maximum Number of Resources | Mandatory. Maximum number of resources of the inference endpoint. The value ranges from 1 to 1,000 and cannot be less than the number of pre-warmed resources. |
- Choose Resources and Assets > Inference Endpoint > My Endpoint to view the created endpoint.
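The constraints in Table 1 can be checked before you fill in the form. The following is a minimal sketch, not an official client: `validate_endpoint_form` and its parameter names are hypothetical, "letters" is assumed to mean ASCII letters, and the forbidden-character set contains only the examples listed in the table, which may not be exhaustive. Name uniqueness can only be verified by the console.

```python
import re

# Letters, digits, underscores, hyphens, periods, and spaces; 1 to 64 characters.
# Assumption: "letters" is limited to ASCII letters in this sketch.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_\-. ]{1,64}")

# Only the examples given in Table 1; the real list may be longer.
FORBIDDEN_DESCRIPTION_CHARS = set("^!<>=&\"'")


def validate_endpoint_form(name, description="", prewarmed=0, max_resources=1):
    """Return a list of problems found in the form values (empty if none)."""
    errors = []
    if not NAME_PATTERN.fullmatch(name):
        errors.append("Endpoint Name: 1 to 64 allowed characters required")
    if len(description) > 1024:
        errors.append("Description: at most 1,024 characters")
    if FORBIDDEN_DESCRIPTION_CHARS & set(description):
        errors.append("Description: contains unsupported special characters")
    if prewarmed != 0:
        errors.append("Pre-warmed Resources: only 0 is currently supported")
    if not 1 <= max_resources <= 1000:
        errors.append("Maximum Number of Resources: must be 1 to 1,000")
    if max_resources < prewarmed:
        errors.append("Maximum Number of Resources: below pre-warmed resources")
    return errors


print(validate_endpoint_form("my-endpoint.v1", "demo endpoint", 0, 10))
print(validate_endpoint_form("bad/name", "a<b", 0, 2000))
```

The function returns every problem it finds instead of stopping at the first one, so all fields can be corrected in a single pass.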