Updated on 2025-09-15 GMT+08:00

Initiating an Invocation Request

Function

This API is used to invoke a deployed large model inference instance and initiate an inference request. The API is synchronous; there is no accompanying asynchronous variant. It performs content moderation and shields content that does not meet the requirements. You can disable this function.

URI

POST /v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations

Table 1 Path Parameters

Parameter

Mandatory

Type

Description

workspace_id

Yes

String

Definition: Workspace ID.

Constraints: N/A.

Range: 1 to 36 characters. Only letters, digits, and hyphens (-) are allowed.

Default Value: N/A.

instance_id

Yes

String

Definition: Instance ID. For details about how to obtain an instance ID, see [Obtaining an Inference Instance ID](dataartsfabric_03_0025.xml).

Constraints: N/A.

Range: 1 to 36 characters. Only letters, digits, and hyphens (-) are allowed.

Default Value: N/A.
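Both path parameters share the same format constraint (1 to 36 characters; only letters, digits, and hyphens). A minimal sketch of building the invocation URI with client-side validation, assuming a placeholder endpoint:

```python
import re

# Pattern from Table 1: 1 to 36 characters, letters, digits, and hyphens only.
PATH_PARAM_PATTERN = re.compile(r"^[A-Za-z0-9-]{1,36}$")

def build_invocation_uri(endpoint: str, workspace_id: str, instance_id: str) -> str:
    """Build the invocation URI, validating both path parameters first."""
    for name, value in (("workspace_id", workspace_id), ("instance_id", instance_id)):
        if not PATH_PARAM_PATTERN.match(value):
            raise ValueError(f"{name} must be 1-36 letters, digits, or hyphens: {value!r}")
    return (f"{endpoint}/v1/workspaces/{workspace_id}"
            f"/services/instances/{instance_id}/invocations")
```

Validating locally surfaces malformed IDs before the request leaves the client, instead of relying on a 400 response.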

Request Parameters

Table 2 Request header parameters

Parameter

Mandatory

Type

Description

X-Auth-Token

No

String

Definition: Tenant token.

Constraints: N/A.

Range: N/A.

Default Value: N/A.

Table 3 Request body parameters

Parameter

Mandatory

Type

Description

messages

No

Array of ChatMessage objects

  • Definition: Messages comprising the conversation.

  • Constraints: N/A.

  • Range: [1, 100000].

  • Default Value: N/A.

max_tokens

No

Integer

  • Definition: Maximum number of tokens to generate in a chat completion. If this parameter is set to 0, the default value 4096 is used. The value range for the R1 model is [0, 32K], and that for the V3 model is [0, 16K]. This parameter cannot be set together with max_completion_tokens; otherwise, an error is reported.

  • Constraints: The total length of input tokens and generated tokens is constrained by the model's context length.

  • Range: N/A.

  • Default Value: N/A.

temperature

No

Double

Definition: Sampling temperature, which controls the randomness of the output. A larger value (for example, 0.8) produces more random output, while a smaller value (for example, 0.2) produces more focused and deterministic output.

Constraints: N/A.

Range: [0, 2].

Default Value: 1.

logit_bias

No

Object

  • Definition: A map in which each key is a token ID from the vocabulary (obtained using the tokenization API) as an integer, and each value is the bias for that token as a floating-point number. This parameter adjusts the probability of specified tokens in the model output so that the output better suits your requirements. This parameter is currently unavailable.

  • Constraints: N/A.

  • Range: N/A.

  • Default Value: N/A.

top_p

No

Double

  • Definition: Nucleus sampling strategy, which is used to control the range of tokens the AI model considers based on the cumulative probability. If the value is 0, the model considers only the token with the largest logarithmic probability.

  • Constraints: N/A.

  • Range: [0, 1].

  • Default Value: 1.

stream

No

Boolean

Definition: Whether to return a streaming response. If enabled, messages are returned incrementally as they are generated (for an interactive effect). If disabled, the complete response is returned at once.

Constraints: N/A.

Range: true or false.

Default Value: N/A.

frequency_penalty

No

Double

Definition: Frequency penalty, which controls word repetition in the generated text and prevents certain words or phrases from appearing too frequently. A positive value penalizes new tokens based on their existing frequency in the text, reducing the likelihood that the model repeats the same line verbatim.

Constraints: N/A.

Range: [-2.0, 2.0].

Default Value: N/A.

presence_penalty

No

Double

Definition: Presence penalty, which controls topic repetition and discourages the model from repeatedly discussing the same topic or viewpoint. Positive values penalize new tokens based on whether they have appeared in the text so far, increasing the model's likelihood of moving on to new topics.

Constraints: N/A.

Range: [-2.0, 2.0].

Default Value: N/A.

n

No

Integer

Definition: Number of chat completion options to generate for each input message. Note that you will be billed based on the total number of tokens generated across all options. Set n to 1 to minimize costs.

Constraints: N/A.

Range: N/A.

Default Value: N/A.
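The constraints in Table 3 (the mutual exclusion of max_tokens and max_completion_tokens, and the documented ranges for temperature and top_p) can be enforced before sending the request. A hedged sketch of assembling the request body; max_completion_tokens is mentioned only in the max_tokens constraint and is not listed in the table itself:

```python
def build_request_body(messages, max_tokens=None, max_completion_tokens=None,
                       temperature=None, top_p=None, stream=None):
    """Assemble a request body per Table 3, enforcing the documented constraints."""
    if max_tokens is not None and max_completion_tokens is not None:
        # Table 3: setting both parameters causes the server to report an error.
        raise ValueError("max_tokens and max_completion_tokens cannot both be set")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be in [0, 1]")
    body = {"messages": messages}
    for key, value in (("max_tokens", max_tokens),
                       ("max_completion_tokens", max_completion_tokens),
                       ("temperature", temperature),
                       ("top_p", top_p),
                       ("stream", stream)):
        if value is not None:
            body[key] = value
    return body
```

Optional parameters are simply omitted from the body, so the service applies its own defaults.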

Table 4 ChatMessage

Parameter

Mandatory

Type

Description

role

Yes

String

Definition: Role of the message author.

Constraints: N/A.

Range: 1 to 64 characters, excluding the following characters: !<>=&'"

Default Value: N/A.

content

No

Object

  • Definition: Message content, which can be represented as either a string or an array of objects.

  • Constraints: N/A.

  • Range: N/A.

  • Default Value: N/A.

name

No

String

Participant name, which is used to distinguish between participants of the same role.

tool_calls

No

Array of MessageToolCall objects

Tool calls generated by a model. At least one of the content and tool_calls fields must be non-empty.

tool_call_id

No

String

ID of the tool call generated by a model.

Table 5 MessageToolCall

Parameter

Mandatory

Type

Description

id

Yes

String

Tool call ID.

type

Yes

String

Tool type. Currently, the value can only be function.

function

Yes

Function object

Function called by a model.

Table 6 Function

Parameter

Mandatory

Type

Description

name

Yes

String

Name of the function called by a model.

arguments

Yes

String

Arguments of the function to be called, which are generated by a model and in JSON format. Note that a model does not always generate valid JSON and may assume arguments that are not defined in your function schema. Before calling a function, verify these arguments in the code.
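As the description above notes, the arguments string generated by the model is not guaranteed to be valid JSON and may contain parameters your function schema does not define. A minimal validation sketch; the expected_params set stands in for your own function schema and is not part of this API:

```python
import json

def parse_tool_arguments(arguments: str, expected_params: set):
    """Parse a Function.arguments string, rejecting invalid JSON or unknown keys."""
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model produced invalid JSON arguments: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("arguments must decode to a JSON object")
    unknown = set(parsed) - expected_params
    if unknown:
        raise ValueError(f"arguments contain parameters not in the schema: {unknown}")
    return parsed
```

Only arguments that survive this check should be forwarded to the actual function call.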

Response Parameters

Status code: 200

A chat completion is created.

Status code: 400

Table 7 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Status code: 401

Table 8 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Status code: 404

Table 9 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Status code: 408

Table 10 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Status code: 500

Table 11 Response body parameters

Parameter

Type

Description

error_code

String

Definition: Error code.

Constraints: N/A.

Range: [8, 36].

Default Value: N/A.

error_msg

String

Definition: Error message.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

solution_msg

String

Definition: Solution description.

Constraints: N/A.

Range: [2, 4096].

Default Value: N/A.

Example Requests

Invoke a deployed large model inference instance and initiate an inference request. The following is an example request.

POST https://{endpoint}/v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations

{
  "messages" : [ {
    "role" : "user",
    "content" : "Summarize the development of LLMs in 2023."
  } ]
}
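The example request above can be issued from the standard library. A sketch using urllib; the endpoint, IDs, and token are placeholders you must supply:

```python
import json
import urllib.request

def make_invocation_request(endpoint, workspace_id, instance_id, token, messages):
    """Prepare the POST request shown in the example above (urllib sketch)."""
    url = (f"{endpoint}/v1/workspaces/{workspace_id}"
           f"/services/instances/{instance_id}/invocations")
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )

# Sending the request (requires a reachable endpoint and a valid token):
# with urllib.request.urlopen(make_invocation_request(...)) as resp:
#     print(json.load(resp))
```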

Example Responses

Status code: 200

A chat completion is created.

{
  "route_id" : "ac8111bf-3601-4905-8ddd-b41d3e636a4e"
}

Status code: 400

BadRequest

{
  "error_code" : "common.01000001",
  "error_msg" : "failed to read http request, please check your input, code: 400, reason: Type mismatch., cause: TypeMismatchException"
}

Status code: 401

Unauthorized

{
  "error_code" : "APIG.1002",
  "error_msg" : "Incorrect token or token resolution failed"
}

Status code: 403

Forbidden

{
  "error" : {
    "code" : "403",
    "message" : "X-Auth-Token is invalid in the request",
    "title" : "Forbidden"
  },
  "error_code" : 403,
  "error_msg" : "X-Auth-Token is invalid in the request",
  "title" : "Forbidden"
}

Status code: 404

NotFound

{
  "error_code" : "common.01000001",
  "error_msg" : "response status exception, code: 404"
}

Status code: 408

Request Time-out

{
  "error_code" : "common.00000408",
  "error_msg" : "timeout exception occurred"
}

Status code: 500

InternalServerError

{
  "error_code" : "common.00000500",
  "error_msg" : "internal error"
}

Status Codes

Status Code

Description

200

A chat completion is created.

400

BadRequest

401

Unauthorized

403

Forbidden

404

NotFound

408

Request Time-out

500

InternalServerError
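The error bodies shown above share the error_code / error_msg / solution_msg shape from Tables 7 to 11. A small sketch for surfacing them uniformly on the client; treating 408 and 500 as retryable is an assumption, not a guarantee from this API:

```python
import json

RETRYABLE_STATUS = {408, 500}  # Request Time-out and InternalServerError

def summarize_error(status: int, body: str) -> str:
    """Turn an error response body (Tables 7-11) into a log-friendly line."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return f"HTTP {status}: unparseable error body"
    code = payload.get("error_code", "unknown")
    msg = payload.get("error_msg", "")
    hint = " (retryable)" if status in RETRYABLE_STATUS else ""
    return f"HTTP {status} [{code}]: {msg}{hint}"
```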

Error Codes

See Error Codes.