
Using APIs to Call a Third-Party Model

After a pre-trained or trained model is deployed, you can call it using the text dialog API. A third-party inference service can be invoked using Pangu inference APIs (V1 inference APIs) or OpenAI APIs (V2 inference APIs). The two versions use different authentication modes, and their request and response bodies differ slightly.

Table 1 NLP model inference APIs

  API Type            API URI
  V1 Inference API    /v1/{project_id}/deployments/{deployment_id}/chat/completions
  V2 Inference API    /api/v2/chat/completions
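
For orientation, a complete request URL is the service's base address followed by the API URI in Table 1. A sketch with a hypothetical base address (the real call path is obtained from the console, as described below):

  V1: https://pangu.example.com/v1/{project_id}/deployments/{deployment_id}/chat/completions
  V2: https://pangu.example.com/api/v2/chat/completions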

Obtaining the Call Path

  1. Log in to ModelArts Studio Large Model Development Platform. In the My Spaces area, click the required workspace.
    Figure 1 My Spaces
  2. Obtain the call path.
    In the navigation pane, choose Model Development > Model Deployment.
    • Obtain the call path of a deployed model. On the My service tab page, click the name of a model in the Running state. On the API call tab page, obtain the model call path and call the model using the method described on that tab, as shown in Figure 2.
      Figure 2 Obtaining the call path of the deployed model

    • Obtain the call path of the preset service. On the Preset service tab page, select the NLP model to be called and click Call Path. In the Call Path dialog box, obtain the model call path, as shown in Figure 3.
      Figure 3 Obtaining the call path of a preset service

    • Obtain the call path of a model in edge deployment mode. On the My service tab page, click the name of a model in the Running state. On the Details tab page, obtain the model call path.

      Load balancing mode:

      The model path is http://{ELB IP address}:{ELB load port}/{API URL}/{Inference API URL}. The ELB IP address must be the corresponding public IP address. The following figure shows how to obtain each part.

      Node mode:

      The model path is http://{Node IP address}:{Host port}/{Inference API URL}. The node IP address is the IP address of the worker node in the edge pool. The following figure shows how to obtain each part.
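
      For illustration only, with hypothetical addresses and ports (substitute the values shown in your console; the V1 URI from Table 1 stands in for {Inference API URL}, and infer for the {API URL} segment):

      Load balancing mode: http://192.0.2.10:8080/infer/v1/{project_id}/deployments/{deployment_id}/chat/completions
      Node mode: http://192.0.2.20:30080/v1/{project_id}/deployments/{deployment_id}/chat/completions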

Using Postman to Call APIs

  1. Create a POST request in Postman and enter the model call path. For details, see Obtaining the Call Path.
  2. There are two authentication modes for calling APIs: token authentication and API key authentication. Token authentication cannot be used when an API service deployed by a user is opened to other users; in that case, use API key authentication.
    Set request header parameters by referring to Table 2.
    Table 2 Request header parameters

    Token-based authentication
      Content-Type: application/json
      X-Auth-Token: Obtain the token by following the instructions provided in section "Calling REST APIs for Authentication" > "Token-based Authentication" in API Reference.

    API key authentication for V1 inference APIs
      Content-Type: application/json
      X-Apig-AppCode: API key. To obtain the API key, perform the following steps:
        1. Log in to ModelArts Studio and access the required workspace.
        2. In the navigation pane, choose System Management & Stats > Application Access. On the displayed page, click Create Application Access in the upper right corner.
        3. In the application configuration area, select a deployed model and click OK.
        4. Obtain the API key in the API Key column on the application access page.

    API key authentication for V2 inference APIs
      Content-Type: application/json
      Authorization: A character string consisting of Bearer and the API key obtained from the created application access, separated by a space. For example: Bearer d59******9C3.

    Figure 4 shows how to set the request header parameters for token authentication.

    Figure 4 Setting request parameters
  3. Click Body, select raw, refer to the following code, and enter the request body.
    {
        "messages": [
            {
                "content": "Introduce the Yangtze River and its typical fish species."
            }
        ],
        "temperature": 0.9,
        "max_tokens": 600
    }
  4. Click Send in Postman to send the request. If the returned status code is 200, the NLP model API is successfully called. An equivalent programmatic call is sketched below.
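
The same request can be sent without Postman. The following is a minimal sketch in Python using the requests library, assuming token-based authentication; the call path and token are placeholders to be replaced with your own values:

    import requests

    # Placeholders: substitute the call path obtained from the console and a valid token.
    CALL_PATH = "https://pangu.example.com/v1/{project_id}/deployments/{deployment_id}/chat/completions"
    TOKEN = "<your-token>"

    # Request headers for token-based authentication (see Table 2).
    headers = {
        "Content-Type": "application/json",
        "X-Auth-Token": TOKEN,
    }

    # Request body from step 3.
    payload = {
        "messages": [
            {"content": "Introduce the Yangtze River and its typical fish species."}
        ],
        "temperature": 0.9,
        "max_tokens": 600,
    }

    response = requests.post(CALL_PATH, headers=headers, json=payload)

    # A 200 status code indicates the NLP model API was called successfully.
    print(response.status_code)
    print(response.json())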

API Key Authentication

If an inference service deployed by a user needs to be opened to other users, the original token-based authentication is not supported. In this case, API key authentication can be used to call APIs.

API key authentication is a lightweight mechanism that authenticates an API call by adding the X-Apig-AppCode parameter (value: the API key) to the HTTP request header. The API service verifies only the API key.

To obtain the API key, perform the following steps:

  1. Log in to ModelArts Studio and access the required workspace.
  2. In the navigation pane, choose System Management & Stats > Application Access. On the displayed page, click Create Application Access in the upper right corner.
  3. In the Associated Services area, select All services or specify a deployed inference service, set the application access name and description, and click OK.
  4. Obtain the API key in the API Key column on the application access page.
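
Once the API key is available, the request headers from Table 2 apply. The following is a minimal sketch in Python, assuming a hypothetical key and placeholder call paths (substitute your own values):

    import requests

    # Placeholder: the key from the API Key column on the application access page.
    API_KEY = "<your-api-key>"

    # V1 inference APIs: pass the API key in the X-Apig-AppCode header.
    v1_headers = {
        "Content-Type": "application/json",
        "X-Apig-AppCode": API_KEY,
    }

    # V2 inference APIs: pass "Bearer " followed by the API key in the Authorization header.
    v2_headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + API_KEY,
    }

    payload = {
        "messages": [
            {"content": "Introduce the Yangtze River and its typical fish species."}
        ],
        "temperature": 0.9,
        "max_tokens": 600,
    }

    # Placeholder call paths: replace with the paths obtained from the console.
    v1_resp = requests.post(
        "https://pangu.example.com/v1/{project_id}/deployments/{deployment_id}/chat/completions",
        headers=v1_headers, json=payload)
    v2_resp = requests.post(
        "https://pangu.example.com/api/v2/chat/completions",
        headers=v2_headers, json=payload)
    print(v1_resp.status_code, v2_resp.status_code)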