Updated on 2025-09-15 GMT+08:00

Initiating an Invocation Request

Function

This API is used to invoke a deployed large model inference instance and initiate an inference request. The API is synchronous; there is no accompanying asynchronous variant. It performs content moderation and shields content that does not meet the requirements. You can disable this function.

URI

POST /v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations

Table 1 Path Parameters

Parameter

Mandatory

Type

Description

workspace_id

Yes

String

Definition: Workspace ID.

Constraints: N/A.

Range: 1 to 36 characters. Only letters, digits, and hyphens (-) are allowed.

Default Value: N/A.

instance_id

Yes

String

Definition: Instance ID. For details about how to obtain an instance ID, see [Obtaining an Inference Instance ID](dataartsfabric_03_0025.xml).

Constraints: N/A.

Range: 1 to 36 characters. Only letters, digits, and hyphens (-) are allowed.

Default Value: N/A.
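Both path parameters share the same format constraint (1 to 36 characters; only letters, digits, and hyphens). A minimal sketch of building the invocation URI with client-side validation, assuming a placeholder endpoint:

```python
import re

# Pattern from Table 1: 1 to 36 characters, letters, digits, and hyphens only.
PATH_PARAM_PATTERN = re.compile(r"^[A-Za-z0-9-]{1,36}$")

def build_invocation_uri(endpoint: str, workspace_id: str, instance_id: str) -> str:
    """Build the invocation URI, validating both path parameters first."""
    for name, value in (("workspace_id", workspace_id), ("instance_id", instance_id)):
        if not PATH_PARAM_PATTERN.match(value):
            raise ValueError(f"{name} must be 1-36 letters, digits, or hyphens: {value!r}")
    return (f"{endpoint}/v1/workspaces/{workspace_id}"
            f"/services/instances/{instance_id}/invocations")
```

Validating locally surfaces malformed IDs before the request leaves the client, instead of relying on a 400 response.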

Request Parameters

Table 2 Request header parameters

Parameter

Mandatory

Type

Description

X-Auth-Token

No

String

Definition: Tenant token.

Constraints: N/A.

Range: N/A.

Default Value: N/A.

Table 3 Request body parameters

Parameter

Mandatory

Type

Description

messages

No

Array of ChatMessage objects

  • Definition: Messages comprising the conversation.

  • Constraints: N/A.

  • Range: [1, 100000].

  • Default Value: N/A.

max_tokens

No

Integer

  • Definition: Maximum number of tokens to generate in a chat completion. If this parameter is set to 0, the default value 4096 is used. The value range for the R1 model is [0, 32K], and that for the V3 model is [0, 16K]. This parameter cannot be set together with max_completion_tokens; otherwise, an error is reported.

  • Constraints: The total length of input tokens and generated tokens is constrained by the model's context length.

  • Range: N/A.

  • Default Value: N/A.

temperature

No

Double

Definition: Sampling temperature, which controls the randomness of the output. A larger value (for example, 0.8) produces more random output, while a smaller value (for example, 0.2) produces more focused and deterministic output.

Constraints: N/A.

Range: [0, 2].

Default Value: 1.

logit_bias

No

Object

  • Definition: A map in which each key is a token ID from the vocabulary (obtained using the tokenization API) as an integer, and each value is the bias for that token as a floating-point number. This parameter adjusts the probability of specified tokens in the model output so that the output better suits your requirements. This parameter is currently unavailable.

  • Constraints: N/A.

  • Range: N/A.

  • Default Value: N/A.

top_p

No

Double

  • Definition: Nucleus sampling strategy, which is used to control the range of tokens the AI model considers based on the cumulative probability. If the value is 0, the model considers only the token with the largest logarithmic probability.

  • Constraints: N/A.

  • Range: [0, 1].

  • Default Value: 1.

stream

No

Boolean

Definition: Whether to return a streaming response. If enabled, messages are returned incrementally as they are generated (for an interactive effect). If disabled, the complete response is returned at once.

Constraints: N/A.

Range: true or false.

Default Value: N/A.

frequency_penalty

No

Double

Definition: Frequency penalty, which controls word repetition in the generated text and prevents certain words or phrases from appearing too frequently. A positive value penalizes new tokens based on their existing frequency in the text, reducing the likelihood that the model repeats the same line verbatim.

Constraints: N/A.

Range: [-2.0, 2.0].

Default Value: N/A.

presence_penalty

No

Double

Definition: Presence penalty, which controls topic repetition and discourages the model from repeatedly discussing the same topic or viewpoint. Positive values penalize new tokens based on whether they have appeared in the text so far, increasing the model's likelihood of moving on to new topics.

Constraints: N/A.

Range: [-2.0, 2.0].

Default Value: N/A.

n

No

Integer

Definition: Number of chat completion options to generate for each input message. Note that you will be billed based on the total number of tokens generated across all options. Set n to 1 to minimize costs.

Constraints: N/A.

Range: N/A.

Default Value: N/A.
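The constraints in Table 3 (the mutual exclusion of max_tokens and max_completion_tokens, and the documented ranges for temperature and top_p) can be enforced before sending the request. A hedged sketch of assembling the request body; max_completion_tokens is mentioned only in the max_tokens constraint and is not listed in the table itself:

```python
def build_request_body(messages, max_tokens=None, max_completion_tokens=None,
                       temperature=None, top_p=None, stream=None):
    """Assemble a request body per Table 3, enforcing the documented constraints."""
    if max_tokens is not None and max_completion_tokens is not None:
        # Table 3: setting both parameters causes the server to report an error.
        raise ValueError("max_tokens and max_completion_tokens cannot both be set")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be in [0, 1]")
    body = {"messages": messages}
    for key, value in (("max_tokens", max_tokens),
                       ("max_completion_tokens", max_completion_tokens),
                       ("temperature", temperature),
                       ("top_p", top_p),
                       ("stream", stream)):
        if value is not None:
            body[key] = value
    return body
```

Optional parameters are simply omitted from the body, so the service applies its own defaults.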

Table 4 ChatMessage

Parameter

Mandatory

Type

Description

role

Yes

String

Definition: Role of the message author.

Constraints: N/A.

Range: 1 to 64 characters, excluding the following characters: !<>=&'"

Default Value: N/A.

content

No

Object

  • Definition: Message content, which can be represented as either a string or an array of objects.

  • Constraints: N/A.

  • Range: N/A.

  • Default Value: N/A.

name

No

String

Participant name, which is used to distinguish between participants of the same role.

tool_calls

No

Array of MessageToolCall objects

Tool calls generated by a model. At least one of the content and tool_calls fields must be non-empty.

tool_call_id

No

String

ID of the tool call generated by a model.

Table 5 MessageToolCall

Parameter

Mandatory

Type

Description

id

Yes

String

Tool call ID.

type

Yes

String

Tool type. Currently, the value can only be function.

function

Yes

Function object

Function called by a model.

Table 6 Function

Parameter

Mandatory

Type

Description

name

Yes

String

Name of the function called by a model.

arguments

Yes

String

Arguments of the function to be called, which are generated by a model and in JSON format. Note that a model does not always generate valid JSON and may assume arguments that are not defined in your function schema. Before calling a function, verify these arguments in the code.
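As the description above notes, the arguments string generated by the model is not guaranteed to be valid JSON and may contain parameters your function schema does not define. A minimal validation sketch; the expected_params set stands in for your own function schema and is not part of this API:

```python
import json

def parse_tool_arguments(arguments: str, expected_params: set):
    """Parse a Function.arguments string, rejecting invalid JSON or unknown keys."""
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model produced invalid JSON arguments: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("arguments must decode to a JSON object")
    unknown = set(parsed) - expected_params
    if unknown:
        raise ValueError(f"arguments contain parameters not in the schema: {unknown}")
    return parsed
```

Only arguments that survive this check should be forwarded to the actual function call.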

Response Parameters

Status code: 200

A chat completion is created.

Status code: 400

Table 7 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Status code: 401

Table 8 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Status code: 404

Table 9 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Status code: 408

Table 10 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Status code: 500

Table 11 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Example Requests

Invoke a deployed large model inference instance and initiate an inference request. The following is an example request.

POST https://{endpoint}/v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations

{
  "messages" : [ {
    "role" : "user",
    "content" : "Summarize the development of LLMs in 2023."
  } ]
}
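The example request above can be issued from the standard library. A sketch using urllib; the endpoint, IDs, and token are placeholders you must supply:

```python
import json
import urllib.request

def make_invocation_request(endpoint, workspace_id, instance_id, token, messages):
    """Prepare the POST request shown in the example above (urllib sketch)."""
    url = (f"{endpoint}/v1/workspaces/{workspace_id}"
           f"/services/instances/{instance_id}/invocations")
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )

# Sending the request (requires a reachable endpoint and a valid token):
# with urllib.request.urlopen(make_invocation_request(...)) as resp:
#     print(json.load(resp))
```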

Example Responses

Status code: 200

A chat completion is created.

{
  "route_id" : "ac8111bf-3601-4905-8ddd-b41d3e636a4e"
}

Status code: 400

BadRequest

{
  "error_code" : "common.01000001",
  "error_msg" : "failed to read http request, please check your input, code: 400, reason: Type mismatch., cause: TypeMismatchException"
}

Status code: 401

Unauthorized

{
  "error_code" : "APIG.1002",
  "error_msg" : "Incorrect token or token resolution failed"
}

Status code: 403

Forbidden

{
  "error" : {
    "code" : "403",
    "message" : "X-Auth-Token is invalid in the request",
    "title" : "Forbidden"
  },
  "error_code" : 403,
  "error_msg" : "X-Auth-Token is invalid in the request",
  "title" : "Forbidden"
}

Status code: 404

NotFound

{
  "error_code" : "common.01000001",
  "error_msg" : "response status exception, code: 404"
}

Status code: 408

Request Time-out

{
  "error_code" : "common.00000408",
  "error_msg" : "timeout exception occurred"
}

Status code: 500

InternalServerError

{
  "error_code" : "common.00000500",
  "error_msg" : "internal error"
}

Status Codes

Status Code

Description

200

A chat completion is created.

400

BadRequest

401

Unauthorized

403

Forbidden

404

NotFound

408

Request Time-out

500

InternalServerError
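The error bodies shown above share the error_code / error_msg / solution_msg shape from Tables 7 to 11. A small sketch for surfacing them uniformly on the client; treating 408 and 500 as retryable is an assumption, not a guarantee from this API:

```python
import json

RETRYABLE_STATUS = {408, 500}  # Request Time-out and InternalServerError

def summarize_error(status: int, body: str) -> str:
    """Turn an error response body (Tables 7-11) into a log-friendly line."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return f"HTTP {status}: unparseable error body"
    code = payload.get("error_code", "unknown")
    msg = payload.get("error_msg", "")
    hint = " (retryable)" if status in RETRYABLE_STATUS else ""
    return f"HTTP {status} [{code}]: {msg}{hint}"
```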

Error Codes

See Error Codes.