DeepSeek
Function
DeepSeek API is an API service based on the DeepSeek model. It supports text-based interaction in multiple scenarios and can quickly generate high-quality dialogues, copy, and stories, making it suitable for scenarios such as text summarization, intelligent Q&A, and content creation.
URI
The NLP inference service can be invoked through Pangu inference APIs (V1 inference APIs) or OpenAI-compatible APIs (V2 inference APIs).
The V1 and V2 APIs use different authentication modes, and their request and response bodies differ slightly.
| API Type | API URI |
|---|---|
| V1 inference API | POST /v1/{project_id}/deployments/{deployment_id}/chat/completions |
| V2 inference API | POST /api/v2/chat/completions |
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Obtaining the Project ID. |
| deployment_id | Yes | String | Model deployment ID. For details about how to obtain the deployment ID, see Obtaining the Model Deployment ID. |
Request Parameters
The authentication modes of the V1 and V2 inference APIs are different, and the request and response parameters are also different. The details are as follows:
Header parameters
- The V1 inference API supports both token-based authentication and API key authentication. The request header parameters for the two authentication modes are as follows:
- Table 3 lists the request header parameters for Token-based Authentication.
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Auth-Token | Yes | String | User token, used to obtain the permissions required to call APIs. The token is the value of X-Subject-Token in the response header in Authentication. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
- For details about the request header parameters in API Key Authentication mode, see Table 4.
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Apig-AppCode | Yes | String | API key, used to obtain the permissions required to call APIs. The API key is the value of X-Apig-AppCode in the response header in API Key Authentication. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
- The V2 inference API supports only API key authentication. For details about the request header parameters, see Table 5.
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| Authorization | Yes | String | A string consisting of Bearer and the API key obtained from the created application access. A space is required between Bearer and the API key, for example, Bearer d59******9C3. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
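For reference, the header variants above can be assembled as plain dictionaries. The following Python sketch shows all three; every credential value is a placeholder, not a real token or key.

```python
# Minimal sketch of the three request-header variants described above.
# All credential values are placeholders.

# V1 inference API, token-based authentication
v1_token_headers = {
    "Content-Type": "application/json",
    "X-Auth-Token": "<IAM token, i.e. the X-Subject-Token response header>",
}

# V1 inference API, API key authentication
v1_appcode_headers = {
    "Content-Type": "application/json",
    "X-Apig-AppCode": "<API key>",
}

# V2 inference API, API key authentication only.
# Note the required space between "Bearer" and the key.
v2_headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <API key>",
}
```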
Request body parameters
The request body parameters of the V1 and V2 inference APIs are the same, as described in Table 6.
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| messages | Yes | Array of ChatCompletionMessageParam objects | Multi-turn dialogue question-answer pairs, each containing the role and content attributes. The messages parameter helps the model generate a proper response based on the dialogue context. |
| model | Yes | String | ID of the model to be used. Set this parameter based on the deployed model. The value can be DeepSeek-R1 or DeepSeek-V3. |
| stream | No | Boolean | Whether to enable streaming mode. The streaming protocol is Server-Sent Events (SSE). To enable streaming, set the value to true. With streaming enabled, the API sends the generated text to the client in real time instead of sending all text at once after generation is complete. Default value: false |
| temperature | No | Float | Controls the diversity and creativity of the generated text: a floating-point number that controls the randomness of sampling. A lower temperature produces more deterministic output; a higher temperature, for example 0.9, produces more creative output. The value 0 indicates greedy sampling, and values greater than 1 are very likely to produce unusable output. temperature is one of the key parameters affecting the output quality and diversity of an LLM. Other parameters, such as top_p, can also adjust the model's behavior and preferences, but do not set temperature and top_p at the same time. Minimum value: 0 (a value greater than or equal to 1e-5 is recommended). Maximum value: 1.0. Default value: 1.0 |
| top_p | No | Float | Nucleus sampling parameter. As an alternative to adjusting the sampling temperature, the model considers only the tokens within the top_p probability mass; 0.1 means that only tokens in the top 10% of probability mass are considered. You are advised to change either this value or temperature, but not both. Value range: (0.0, 1.0]. Default value: 0.8. NOTE: A token is the smallest unit of text a model can work with; it can be a word or part of a word. An LLM converts input and output text into tokens, generates a probability distribution over candidate tokens, and then samples tokens from that distribution. |
| max_tokens | No | Integer | Maximum number of output tokens for the generated text. The total length of the input text plus the generated text cannot exceed the maximum length the model can process. Minimum value: 1. Maximum value: 8192. Default value: 4096 |
| presence_penalty | No | Float | Controls how the LLM handles tokens that have already appeared in the generated text. With a positive value, the model penalizes such tokens and tends to generate tokens that have not appeared before, that is, to move on to new topics. Minimum value: -2. Maximum value: 2. Default value: 0 (the parameter does not take effect) |
| frequency_penalty | No | Float | Controls how the model penalizes new tokens based on how frequently they have already appeared in the generated text. A positive value penalizes tokens that have been used frequently, making exact repetition of words or phrases less likely. Minimum value: -2. Maximum value: 2. Default value: 0 (the parameter does not take effect) |
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| role | Yes | String | Role in a dialogue. The default value range is system, user, assistant, tool, and function; custom values are supported. To have the model answer as a specific persona, set role to system; if no specific persona is needed, set it to user. In responses, the value is fixed to assistant. In a dialogue request, role needs to be set only once. |
| content | Yes | String | Content of a dialogue turn, which can be any text, measured in tokens. A multi-turn dialogue cannot contain more than 20 content fields in the messages parameter. Minimum length: 1. Maximum length: the token length supported by the model. Default value: None |
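To make the request body concrete, here is a minimal Python sketch that sends a non-streaming request to the V2 inference API using the requests library. The endpoint and API key are placeholders, and the sampling values are illustrative only (note that temperature is set without top_p, per the guidance above).

```python
import requests

# Placeholders: substitute your own endpoint and API key.
ENDPOINT = "https://{endpoint}/api/v2/chat/completions"
API_KEY = "<API key>"

payload = {
    "model": "DeepSeek-V3",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Hello"},
    ],
    # Optional sampling parameters; per the table above, do not
    # combine temperature with top_p.
    "temperature": 0.7,
    "max_tokens": 1024,
}

resp = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Content-Type": "application/json",
             "Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```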
Response Parameters
Non-streaming
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| id | String | Uniquely identifies each response. The value is in the format "chatcmpl-{random_uuid()}". |
| object | String | The value is fixed to chat.completion. |
| created | Integer | Time when the response was generated, in seconds. |
| model | String | ID of the requested model. |
| choices | Array of ChatCompletionResponseChoice objects | List of generated texts. |
| usage | UsageInfo object | Token usage of the dialogue. This parameter helps you track model usage and prevent the model from generating excessive tokens. |
| prompt_logprobs | Object | Log probability information of the input text and the corresponding tokens. Default value: null |
| Parameter | Type | Description |
|---|---|---|
| message | ChatMessage object | Generated text. |
| index | Integer | Index of the generated text, starting from 0. |
| finish_reason | String | Reason why the model stopped generating tokens. Value: stop, length, content_filter, tool_calls, or insufficient_system_resource. stop: the model stopped after completing the task or encountering a predefined stop sequence. length: the output reached the model's context length limit or the max_tokens limit. content_filter: the output was filtered by filter conditions. tool_calls: the model decided to call an external tool (function or API) to complete the task. insufficient_system_resource: generation was interrupted because system inference resources were insufficient. Default value: stop |
| logprobs | Object | Evaluation metric indicating the confidence of the inference output. Default value: null |
| stop_reason | Union[Integer, String] | Token ID or string that caused the model to stop generating. If the EOS token is encountered, the default value is returned; if a string or token ID specified in the request's stop parameter is encountered, that string or token ID is returned. This is not a standard OpenAI API field but is supported by the vLLM API. Default value: None |
| Parameter | Type | Description |
|---|---|---|
| prompt_tokens | Number | Number of tokens in the user prompt. |
| total_tokens | Number | Total number of tokens in the dialogue request. |
| completion_tokens | Number | Number of answer tokens generated by the inference model. |
| Parameter | Type | Description |
|---|---|---|
| role | String | Role that generated the message. The value is fixed to assistant. |
| content | String | Content of the dialogue. Minimum length: 1. Maximum length: the token length supported by the model. |
| reasoning_content | String | Reasoning steps that led to the final conclusion (the model's thinking process). NOTE: This parameter is available only for the DeepSeek-R1 model. |
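Putting the response tables together, a caller might unpack a non-streaming response as in the sketch below. The helper function is hypothetical; reasoning_content is read with .get() because it is returned only by DeepSeek-R1.

```python
def print_answer(response_json: dict) -> None:
    """Print the generated text, optional reasoning, and token usage."""
    choice = response_json["choices"][0]
    message = choice["message"]

    # reasoning_content is present only for DeepSeek-R1, so read it defensively.
    reasoning = message.get("reasoning_content")
    if reasoning:
        print("Reasoning:", reasoning)

    print("Answer:", message["content"])
    print("Finish reason:", choice["finish_reason"])

    usage = response_json["usage"]
    print(f"Tokens: prompt={usage['prompt_tokens']} "
          f"completion={usage['completion_tokens']} "
          f"total={usage['total_tokens']}")
```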
Streaming (with stream set to true)
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| data | CompletionStreamResponse object | If stream is set to true, messages generated by the model are returned in streaming mode: the generated text is returned incrementally, and each data field contains part of the generated text until all data fields have been returned. |
| Parameter | Type | Description |
|---|---|---|
| id | String | Unique identifier of the dialogue. |
| created | Integer | Unix timestamp (in seconds) when the chat was created. Every chunk in a streaming response carries the same timestamp. |
| model | String | Name of the model that generated the completion. |
| object | String | Object type, which is chat.completion.chunk. |
| choices | Array of objects | List of completion choices generated by the model. |
| Parameter | Type | Description |
|---|---|---|
| index | Integer | Index of the completion in the list of completion choices generated by the model. |
| finish_reason | String | Reason why the model stopped generating tokens. Value: stop, length, content_filter, tool_calls, or insufficient_system_resource. stop: the model stopped after completing the task or encountering a predefined stop sequence. length: the output reached the model's context length limit or the max_tokens limit. content_filter: the output was filtered by filter conditions. tool_calls: the model decided to call an external tool (function or API) to complete the task. insufficient_system_resource: generation was interrupted because system inference resources were insufficient. |
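The streaming format above can be consumed by iterating over SSE lines, stripping the data: prefix, and stopping at [DONE]. The following Python sketch does this for the V2 API and also falls back to the V1 message field; the endpoint and API key are placeholders.

```python
import json
import requests

def stream_chat(endpoint: str, api_key: str, user_text: str) -> str:
    """Send a streaming request and reassemble the generated text."""
    payload = {
        "model": "DeepSeek-V3",
        "messages": [{"role": "user", "content": user_text}],
        "stream": True,
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    pieces = []
    with requests.post(endpoint, json=payload, headers=headers,
                       stream=True, timeout=300) as resp:
        resp.raise_for_status()
        for raw in resp.iter_lines(decode_unicode=True):
            if not raw or not raw.startswith("data:"):
                continue  # skip blank keep-alives and event: lines
            data = raw[len("data:"):].strip()
            if data == "[DONE]":
                break
            chunk = json.loads(data)
            for choice in chunk.get("choices", []):
                # V2 puts increments in "delta"; V1 uses "message" instead.
                delta = choice.get("delta") or choice.get("message") or {}
                if delta.get("content"):
                    pieces.append(delta["content"])
    return "".join(pieces)
```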
Status code: 400
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Error code. |
| error_msg | String | Error message. |
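A minimal sketch of surfacing these fields when a 400 is returned (the helper name is hypothetical):

```python
import requests

def call_and_check(endpoint: str, payload: dict, headers: dict) -> dict:
    """POST the request and raise with error_code/error_msg on a 400."""
    resp = requests.post(endpoint, json=payload, headers=headers, timeout=60)
    if resp.status_code == 400:
        err = resp.json()
        # error_code and error_msg as described in the table above
        raise RuntimeError(f"{err['error_code']}: {err['error_msg']}")
    resp.raise_for_status()
    return resp.json()
```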
Example Request
- Non-streaming
V1 inference API:

POST https://{endpoint}/v1/{project_id}/alg-infer/3rdnlp/service/{deployment_id}/v1/chat/completions

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
    "model": "DeepSeek-V3",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ]
}

V2 inference API:

POST https://{endpoint}/api/v2/chat/completions

Request Header:
Content-Type: application/json
Authorization: Bearer 201ca68f-45f9-4e19-8fa4-831e...

Request Body:
{
    "model": "DeepSeek-V3",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ]
}
- Streaming (with stream set to true)
V1 inference API:

POST https://{endpoint}/v1/{project_id}/alg-infer/3rdnlp/service/{deployment_id}/v1/chat/completions

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
    "model": "DeepSeek-V3",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ],
    "stream": true
}

V2 inference API:

POST https://{endpoint}/api/v2/chat/completions

Request Header:
Content-Type: application/json
Authorization: Bearer 201ca68f-45f9-4e19-8fa4-831e...

Request Body:
{
    "model": "DeepSeek-V3",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ],
    "stream": true
}
Example Response
Status code: 200
OK
- Non-streaming Q&A response
{ "id": "chat-9a75fc02e45d48db94f94ce38277beef", "object": "chat.completion", "created": 1743403365, "model": "DeepSeek-V3", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hello. How can I help you?", "tool_calls": [] }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 64, "total_tokens": 73, "completion_tokens": 9 } }
- Non-streaming Q&A response with chain of thought
{ "id": "81c34733-0e7c-4b4b-a044-1e1fcd54b8db", "model": "deepseek-r1_32k", "created": 1747485310, "choices": [ { "index": 0, "message": { "role": "assistant", "content": "\n\nHello. Nice to meet you. Is there anything I can do for you?", "reasoning_content": "Hmm, the user just sent a short "Hello", which is a greeting in Chinese. First, I need to confirm their needs. They might want to test the reply or have a specific question. Also, I need to consider whether to respond in English, but since the user used Chinese, it's more appropriate to reply in Chinese.\n\nThen, I need to ensure the reply is friendly and complies with the guidelines, without involving sensitive content. The user may expect further conversation or need help with a question. At this point, I should maintain an open-ended response, inviting them to raise specific questions or needs. For example, I could say, "Hello! Nice to meet you, is there anything I can help you with?" This is both polite and proactive in offering assistance.\n\nAdditionally, avoid using any format or markdown, keeping it natural and concise. The user might be new to this platform and unfamiliar with how to ask questions, so a more encouraging tone might be better. Check for any spelling or grammatical errors to ensure the reply is correct and error-free.\n" "tool_calls": [ ] }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 184, "prompt_tokens": 6, "total_tokens": 190 } }
- Streaming Q&A response
Response body of the V1 inference API:

data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"message":{"role":"assistant"},"logprobs":null,"finish_reason":null}]}
data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"message":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}
data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"message":{"content":", can I help you?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}]}
data:[DONE]
Response body of the V2 inference API:

data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"delta":{"role":"assistant"},"logprobs":null,"finish_reason":null}]}
data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}
data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"delta":{"content":". Can I help you?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}]}
data:[DONE]
- Streaming Q&A response with chain of thought
Response body of the V1 inference API:

data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":6,"completion_tokens":0}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"Hmm"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":7,"completion_tokens":1}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":", "},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":8,"completion_tokens":2}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"user sent "},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":10,"completion_tokens":4}}
...
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"Generate"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":185,"completion_tokens":179}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"final"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":186,"completion_tokens":180}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"reply.\n"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":188,"completion_tokens":182}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"\n\nHello."},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":191,"completion_tokens":185}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":". Nice"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":193,"completion_tokens":187}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"to meet"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":194,"completion_tokens":188}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"you"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":195,"completion_tokens":189}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":". What"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":197,"completion_tokens":191}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"can I do"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":199,"completion_tokens":193}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"for you"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":201,"completion_tokens":195}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"? "},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
data:[DONE]
Response body of the V2 inference API:

data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":6,"completion_tokens":0}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"Hmm"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":7,"completion_tokens":1}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":","},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":8,"completion_tokens":2}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"user sent"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":10,"completion_tokens":4}}
...
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"Generate"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":185,"completion_tokens":179}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"final"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":186,"completion_tokens":180}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"reply.\n"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":188,"completion_tokens":182}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"\n\nHello"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":191,"completion_tokens":185}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":". Nice"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":193,"completion_tokens":187}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"to meet"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":194,"completion_tokens":188}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"you"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":195,"completion_tokens":189}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":197,"completion_tokens":191}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"Can I help"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":199,"completion_tokens":193}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"your"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":201,"completion_tokens":195}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
data:[DONE]
- Streaming Q&A response when the content is not approved
event:moderation
data:{"suggestion":"block","reply":"As an AI language model, my goal is to provide help and information in a positive, proactive, and safe manner. Your question is beyond my answer range."}
data:[DONE]
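Because moderation verdicts arrive under a separate SSE event type, a stream consumer should track event: lines so that a moderation payload is not parsed as a completion chunk. A sketch under that assumption (the helper and state dictionary are hypothetical):

```python
import json

def handle_sse_line(raw_line: str, state: dict) -> None:
    """Track SSE event types so moderation payloads are handled separately."""
    if raw_line.startswith("event:"):
        state["event"] = raw_line[len("event:"):].strip()
        return
    if raw_line.startswith("data:"):
        data = raw_line[len("data:"):].strip()
        if data == "[DONE]":
            state["done"] = True
        elif state.get("event") == "moderation":
            verdict = json.loads(data)
            # suggestion == "block" means the content was not approved
            print("Moderation:", verdict["suggestion"], "-", verdict["reply"])
            state["event"] = None
        else:
            state.setdefault("chunks", []).append(json.loads(data))
```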
Status Codes
For details, see Status Codes.
Error Codes
For details, see Error Codes.