Initiating an Invocation Request
Function
This API is used to invoke a deployed large model inference instance and initiate an inference request. It is a synchronous API with no accompanying APIs. The API performs content moderation and shields content that does not meet the requirements; you can disable this function if needed.
URI
POST /v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations
Parameter | Mandatory | Type | Description
---|---|---|---
workspace_id | Yes | String | Definition: Workspace ID. Constraints: N/A. Range: 1 to 36 characters; only letters, digits, and hyphens (-) are allowed. Default Value: N/A.
instance_id | Yes | String | Definition: Instance ID. For details about how to obtain an instance ID, see [Obtaining an Inference Instance ID](dataartsfabric_03_0025.xml). Constraints: N/A. Range: 1 to 36 characters; only letters, digits, and hyphens (-) are allowed. Default Value: N/A.
Request Parameters
Request header parameters:

Parameter | Mandatory | Type | Description
---|---|---|---
X-Auth-Token | No | String | Definition: Tenant token. Constraints: N/A. Range: N/A. Default Value: N/A.
Request body parameters:

Parameter | Mandatory | Type | Description
---|---|---|---
messages | No | Array of ChatMessage objects | 
max_tokens | No | Integer | 
temperature | No | Double | Definition: Sampling temperature, a number that adjusts the degree of randomness in the output. A larger value (for example, 0.8) produces more random output, while a smaller value (for example, 0.2) produces more focused and deterministic output. Constraints: N/A. Range: [0, 2]. Default Value: 1.
logit_bias | No | Object | 
top_p | No | Double | 
stream | No | Boolean | Definition: Whether streaming responses are supported. If enabled, messages are returned line by line (for an interactive effect). If disabled, all messages are returned at once. Constraints: N/A. Range: true or false. Default Value: N/A.
frequency_penalty | No | Double | Definition: Frequency penalty, which controls word repetition in the text and prevents certain words or phrases from appearing too frequently in the generated output. A positive value penalizes new tokens based on their existing frequency in the text, reducing the likelihood of the model repeating the same line word for word. Constraints: N/A. Range: [-2.0, 2.0]. Default Value: N/A.
presence_penalty | No | Double | Definition: Presence penalty, which controls topic repetition in the text and prevents the conversation from repeatedly returning to the same topic or viewpoint. A positive value penalizes new tokens based on whether they have already appeared in the text, increasing the model's likelihood of moving on to new topics. Constraints: N/A. Range: [-2.0, 2.0]. Default Value: N/A.
n | No | Integer | Definition: Number of chat completion options to generate for each input message. Note that you are billed based on the total number of tokens generated across all options. Set n to 1 to minimize costs. Constraints: N/A. Range: N/A. Default Value: N/A.
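For reference, the following is a minimal Python sketch of how a request body using these fields might be assembled; the specific values are illustrative assumptions, not recommendations.

# Sketch: assemble a chat request body using the documented fields.
# All parameter values below are illustrative only.
payload = {
    "messages": [
        {"role": "user", "content": "Summarize the development of LLMs in 2023."}
    ],
    "max_tokens": 512,          # cap on generated tokens (assumed value)
    "temperature": 0.2,         # [0, 2]; lower values are more deterministic
    "top_p": 0.9,
    "n": 1,                     # one completion option to minimize token billing
    "stream": False,            # True streams the reply line by line
    "frequency_penalty": 0.0,   # [-2.0, 2.0]
    "presence_penalty": 0.0     # [-2.0, 2.0]
}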
ChatMessage object parameters:

Parameter | Mandatory | Type | Description
---|---|---|---
role | Yes | String | Definition: Role. Constraints: N/A. Range: 1 to 64 characters, excluding the following characters: !<>=&''' Default Value: N/A.
content | No | Object | 
name | No | String | Model information, which is used to distinguish participants of the same role.
tool_calls | No | Array of MessageToolCall objects | Tool calls generated by a model. At least one of the content and tool_calls fields must be non-empty.
tool_call_id | No | String | ID of the tool call generated by a model.
MessageToolCall object parameters:

Parameter | Mandatory | Type | Description
---|---|---|---
id | Yes | String | Tool call ID.
type | Yes | String | Tool type. Currently, the value can only be function.
function | Yes | Function object | Function called by a model.
Function object parameters:

Parameter | Mandatory | Type | Description
---|---|---|---
name | Yes | String | Name of the function called by a model.
arguments | Yes | String | Arguments of the function to be called, generated by the model in JSON format. Note that a model does not always generate valid JSON and may produce arguments that are not defined in your function schema. Verify these arguments in your code before calling the function.
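Because the arguments string is model-generated and may be invalid or contain undeclared parameters, the following hedged sketch shows one way a client could validate a returned tool call and send the result back as a tool message. The run_tool dispatcher is a hypothetical placeholder; only the fields documented above (id, type, function.name, function.arguments, role, tool_call_id, content) come from this API.

import json

def handle_tool_call(tool_call: dict) -> dict:
    # Validate a MessageToolCall object and build the follow-up "tool" message.
    if tool_call.get("type") != "function":
        raise ValueError("unsupported tool call type: %r" % tool_call.get("type"))

    name = tool_call["function"]["name"]
    try:
        # The arguments string may be invalid JSON or include parameters
        # that your function schema does not define, so parse defensively.
        arguments = json.loads(tool_call["function"]["arguments"])
    except json.JSONDecodeError as exc:
        raise ValueError(f"model returned invalid JSON arguments: {exc}")

    result = run_tool(name, arguments)  # run_tool is a hypothetical dispatcher

    # Echo tool_call_id so the model can match the result to its call.
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }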
Response Parameters
Status code: 200
A chat completion is created.
Status code: 400
Parameter | Type | Description
---|---|---
error_code | String | Definition: Error code. Constraints: N/A. Range: [8, 36]. Default Value: N/A.
error_msg | String | Definition: Error message. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
solution_msg | String | Definition: Solution description. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
Status code: 401
Parameter | Type | Description
---|---|---
error_code | String | Definition: Error code. Constraints: N/A. Range: [8, 36]. Default Value: N/A.
error_msg | String | Definition: Error message. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
solution_msg | String | Definition: Solution description. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
Status code: 404
Parameter | Type | Description
---|---|---
error_code | String | Definition: Error code. Constraints: N/A. Range: [8, 36]. Default Value: N/A.
error_msg | String | Definition: Error message. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
solution_msg | String | Definition: Solution description. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
Status code: 408
Parameter | Type | Description
---|---|---
error_code | String | Definition: Error code. Constraints: N/A. Range: [8, 36]. Default Value: N/A.
error_msg | String | Definition: Error message. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
solution_msg | String | Definition: Solution description. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
Status code: 500
Parameter | Type | Description
---|---|---
error_code | String | Definition: Error code. Constraints: N/A. Range: [8, 36]. Default Value: N/A.
error_msg | String | Definition: Error message. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
solution_msg | String | Definition: Solution description. Constraints: N/A. Range: [2, 4096]. Default Value: N/A.
Example Requests
Invoke a deployed large model inference instance and initiate an inference request. The following is an example of the request parameters.
POST https://{endpoint}/v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations

{
  "messages" : [ {
    "role" : "user",
    "content" : "Summarize the development of LLMs in 2023."
  } ]
}
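The same request expressed as a Python sketch (assuming the requests library is available; the endpoint, token, workspace ID, and instance ID placeholders must be replaced with real values):

import requests

endpoint = "https://{endpoint}"    # replace with the actual service endpoint
workspace_id = "{workspace_id}"    # 1 to 36 characters: letters, digits, hyphens
instance_id = "{instance_id}"      # 1 to 36 characters: letters, digits, hyphens
token = "{X-Auth-Token}"           # tenant token

url = (f"{endpoint}/v1/workspaces/{workspace_id}"
       f"/services/instances/{instance_id}/invocations")

resp = requests.post(
    url,
    headers={"X-Auth-Token": token},
    json={
        "messages": [
            {"role": "user",
             "content": "Summarize the development of LLMs in 2023."}
        ]
    },
    timeout=60,
)
print(resp.status_code, resp.json())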
Example Responses
Status code: 200
A chat completion is created.
{ "route_id" : "ac8111bf-3601-4905-8ddd-b41d3e636a4e" }
Status code: 400
BadRequest
{ "error_code" : "common.01000001", "error_msg" : "failed to read http request, please check your input, code: 400, reason: Type mismatch., cause: TypeMismatchException" }
Status code: 401
Unauthorized
{ "error_code" : "APIG.1002", "error_msg" : "Incorrect token or token resolution failed" }
Status code: 403
Forbidden
{ "error" : { "code" : "403", "message" : "X-Auth-Token is invalid in the request", "title" : "Forbidden" }, "error_code" : 403, "error_msg" : "X-Auth-Token is invalid in the request", "title" : "Forbidden" }
Status code: 404
NotFound
{ "error_code" : "common.01000001", "error_msg" : "response status exception, code: 404" }
Status code: 408
Request Time-out
{ "error_code" : "common.00000408", "error_msg" : "timeout exception occurred" }
Status code: 500
InternalServerError
{ "error_code" : "common.00000500", "error_msg" : "internal error" }
Status Codes
Status Code | Description
---|---
200 | A chat completion is created.
400 | BadRequest
401 | Unauthorized
403 | Forbidden
404 | NotFound
408 | Request Time-out
500 | InternalServerError
Error Codes
See Error Codes.