Updated on 2025-11-03 GMT+08:00

Third-Party NLP Models

Function

The third-party NLP model API is an API service based on the DeepSeek and Qwen models. It supports text-based interaction in multiple scenarios and can quickly generate high-quality dialogues, copy, and stories. It is suitable for scenarios such as text summarization, intelligent Q&A, and content creation.

URI

The NLP inference service can be invoked using Pangu inference APIs (V1 inference APIs) or OpenAI-compatible APIs (V2 inference APIs).

The authentication modes of V1 and V2 APIs are different, and the request body and response body of V1 and V2 APIs are slightly different.

Table 1 NLP inference APIs

V1 inference API: POST /v1/{project_id}/deployments/{deployment_id}/chat/completions

V2 inference API: POST /api/v2/chat/completions

Table 2 Path parameters of the V1 inference API

Parameter

Mandatory

Type

Description

project_id

Yes

String

Definition

Project ID. For details about how to obtain a project ID, see Obtaining the Project ID.

Constraints

N/A

Range

N/A

Default Value

N/A

deployment_id

Yes

String

Definition

Model deployment ID. For details about how to obtain the deployment ID, see Obtaining the Model Deployment ID.

Constraints

N/A

Range

N/A

Default Value

N/A

Request Parameters

The authentication modes of the V1 and V2 inference APIs are different, and the request and response parameters are also different. The details are as follows:

Header parameters

  1. The V1 inference API supports both token-based authentication and API key authentication. The request header parameters for the two authentication modes are as follows:
Table 3 Request header parameters (token-based authentication)

Parameter

Mandatory

Type

Description

X-Auth-Token

Yes

String

Definition

User token.

Used to obtain the permission required to call APIs. The token is the value of X-Subject-Token in the response header in Token-based Authentication.

Constraints

N/A

Range

N/A

Default Value

N/A

Content-Type

Yes

String

Definition

MIME type of the request body.

Constraints

N/A

Range

N/A

Default Value

application/json

Table 4 Request header parameters (API key authentication)

Parameter

Mandatory

Type

Description

X-Apig-AppCode

Yes

String

Definition

API key.

Used to obtain the permission required to call APIs. The API key is obtained as described in API key authentication.

Constraints

N/A

Range

N/A

Default Value

N/A

Content-Type

Yes

String

Definition

MIME type of the request body.

Constraints

N/A

Range

N/A

Default Value

application/json

  2. The V2 inference API supports only API key authentication. For details about the request header parameters, see Table 5.
Table 5 Request header parameters of V2 inference API (OpenAI-compatible API key authentication)

Parameter

Mandatory

Type

Description

Authorization

Yes

String

Definition

A string consisting of Bearer and the API key obtained from the created application access. A space is required between Bearer and the API key, for example, Bearer d59******9C3.

Constraints

N/A

Range

N/A

Default Value

N/A

Content-Type

Yes

String

Definition

MIME type of the request body.

Constraints

N/A

Range

N/A

Default Value

application/json
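
For clarity, the header variants above can be assembled as follows. This is a minimal Python sketch; the endpoint, token, and API key values are placeholders, not real credentials.

# V1 inference API, token-based authentication (Table 3)
v1_token_headers = {
    "Content-Type": "application/json",
    "X-Auth-Token": "<value of X-Subject-Token from the IAM response>",
}

# V1 inference API, API key authentication (Table 4)
v1_apikey_headers = {
    "Content-Type": "application/json",
    "X-Apig-AppCode": "<API key>",
}

# V2 inference API, OpenAI-compatible API key authentication (Table 5)
v2_headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <API key>",  # a space is required after "Bearer"
}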

Request body parameters

The request body parameters of the V1 and V2 inference APIs are the same, as described in Table 6.

Table 6 Request body parameters

Parameter

Mandatory

Type

Description

messages

Yes

Array of ChatCompletionMessageParam objects

Definition

Multi-turn dialogue question-answer pairs, which contain the role and content attributes.

  • role indicates the role in a dialogue. The value can be system or user.

    If you want the model to answer questions as a specific persona, set role to system. If you do not use a specific persona, set role to user. In a dialogue request, the system role needs to be set only once.

  • content indicates the content of a dialogue, which can be any text.

The messages parameter helps the model to generate a proper response based on the context in the dialogue.

Constraints

Array length: 1–20

Range

N/A

Default Value

N/A

model

Yes

String

Definition

ID of the model to be used. Set this parameter based on the deployed model. The value can be DeepSeek-R1 or DeepSeek-V3.

Constraints

N/A

Range

N/A

Default Value

N/A

stream

No

Boolean

Definition

Whether to enable the streaming mode. The streaming protocol is Server-Sent Events (SSE).

To enable the streaming mode, set the value to true. When the streaming mode is enabled, the API sends generated text to the client in real time as it is produced, instead of sending all text at once after generation is complete.

Constraints

N/A

Range

N/A

Default Value

false

temperature

No

Float

Definition

Controls the diversity and creativity of the generated text.

A floating-point number that controls the randomness of sampling. A lower temperature produces more deterministic outputs, while a higher temperature, for example, 0.9, produces more creative outputs. Values close to 0 approach greedy sampling, and values greater than 1 are very likely to degrade output quality.

temperature is one of the key parameters that affect the output quality and diversity of an LLM. Other parameters, such as top_p, can also be used to adjust the behavior and preferences of the LLM. However, do not set temperature and top_p at the same time.

Constraints

N/A

Range

(0, 1]

Default Value

1.0

top_p

No

Float

Definition

Nucleus sampling parameter. As an alternative to sampling with temperature, the model considers only the tokens whose cumulative probability mass falls within top_p. For example, 0.1 means that only the tokens in the top 10% of probability mass are considered. You are advised to adjust either this value or temperature, but not both.

NOTE:

A token is the smallest unit of text a model can work with. A token can be a whole word or part of a word. An LLM converts input and output text into tokens, generates a probability distribution over the possible next tokens, and then samples tokens according to that distribution.

Constraints

N/A

Range

(0.0, 1.0]

Default Value

0.8

max_tokens

No

Integer

Definition

Maximum number of output tokens for the generated text.

Constraints

The total length of the input text plus the generated text cannot exceed the maximum length that the model can process.

Range

  • Minimum value: 1
  • Maximum value: 8192

Default Value

4096

presence_penalty

No

Float

Definition

Controls how the LLM processes new tokens. If a token has already appeared in the generated text, the model will penalize this token when generating text. If the value of presence_penalty is a positive number, the model tends to generate new tokens that have not appeared before. That is, the model tends to talk about new topics.

Constraints

N/A

Range

  • Minimum value: -2
  • Maximum value: 2

Default Value

0 (indicating that the parameter does not take effect)

frequency_penalty

No

Float

Definition

Controls how the model penalizes new tokens based on their existing frequency in the generated text. If a token already appears frequently in the generated text, the model penalizes it when generating further text. A positive value penalizes tokens that have already been used frequently, making the model less likely to repeat words or phrases verbatim.

Constraints

N/A

Range

  • Minimum value: -2
  • Maximum value: 2

Default Value

0 (indicating that the parameter does not take effect)

Table 7 ChatCompletionMessageParam

Parameter

Mandatory

Type

Description

role

Yes

String

Definition

Role in a dialogue. The supported values are system, user, assistant, tool, and function. Customization is supported.

If you want the model to answer questions as a specific persona, set role to system. If you do not use a specific persona, set role to user. In a dialogue request, the system role needs to be set only once.

When this parameter is returned in a response, the value is fixed to assistant.

Constraints

N/A

Range

system, user, assistant, tool, and function

Default Value

N/A

content

Yes

String

Definition

Content of a dialogue, which can be any text.

Constraints

A multi-turn dialogue cannot contain more than 20 content fields in the messages parameter.

Minimum length: 1

Maximum length: token length supported by different models.

Range

N/A

Default Value

N/A
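
The request body parameters above translate directly into a JSON payload. The following is a minimal Python sketch of a non-streaming V2 request; the endpoint and API key are placeholders, and the system/user messages are illustrative.

import requests

url = "https://{endpoint}/api/v2/chat/completions"  # replace {endpoint} with your endpoint
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <API key>",
}
body = {
    "model": "DeepSeek-V3",
    "messages": [
        # Optional persona; the system role needs to be set only once.
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Hello"},
    ],
    "max_tokens": 1024,
    "temperature": 0.7,  # adjust either temperature or top_p, not both
    "stream": False,
}
response = requests.post(url, headers=headers, json=body, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])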

Response Parameters

Non-streaming

Status code: 200

Table 8 Response body parameters

Parameter

Type

Description

id

String

Definition

Uniquely identifies each response.

Constraints

The value is in the format of "chatcmpl-{random_uuid()}".

Range

N/A

Default Value

N/A

object

String

Definition

The value must be chat.completion.

Constraints

N/A

Range

N/A

Default Value

N/A

created

Integer

Definition

Unix timestamp (in seconds) when the response was generated.

Constraints

N/A

Range

N/A

Default Value

N/A

model

String

Definition

ID of the model used for the request.

Constraints

N/A

Range

N/A

Default Value

N/A

choices

Array of ChatCompletionResponseChoice objects

Definition

List of generated text.

Constraints

N/A

Range

N/A

Default Value

N/A

usage

UsageInfo object

Definition

Token usage statistics for the dialogue request. You can use this information to monitor token consumption and prevent the model from generating excessive tokens.

Constraints

N/A

Range

N/A

Default Value

N/A

prompt_logprobs

Object

Definition

Logarithmic probability information of the input text and the corresponding tokens.

Constraints

N/A

Range

N/A

Default Value

null

Table 9 ChatCompletionResponseChoice

Parameter

Type

Description

message

ChatMessage object

Definition

Generated text.

Constraints

N/A

Range

N/A

Default Value

N/A

index

Integer

Definition

Index of the generated text, starting from 0.

Constraints

N/A

Range

N/A

Default Value

N/A

finish_reason

String

Definition

Reason why the model stops generating tokens.

Constraints

N/A

Range

[stop, length, content_filter, tool_calls, insufficient_system_resource]

  • stop: The model stops generating text after the task is complete or when a pre-defined stop sequence is encountered.
  • length: The output length reaches the context length limit of the model or the max_tokens limit.
  • content_filter: The output content is filtered due to filter conditions.
  • tool_calls: The model determines to call an external tool (function/API) to complete the task.
  • insufficient_system_resource: Generation is interrupted because system inference resources are insufficient.

Default Value

stop

logprobs

Object

Definition

Evaluation metric, indicating the confidence value of the inference output.

Constraints

N/A

Range

N/A

Default Value

null

stop_reason

Union[Integer, String]

Definition

Token ID or character string that instructs the model to stop generating. If the EOS token is encountered, the default value is returned. If the string or token ID in the stop parameter specified in the user request is encountered, the corresponding string or token ID is returned. This parameter is not a standard field of the OpenAI API but is supported by the vLLM API.

Constraints

N/A

Range

N/A

Default Value

None

Table 10 UsageInfo

Parameter

Type

Description

prompt_tokens

Number

Definition

Number of tokens contained in the user prompt.

Constraints

N/A

Range

N/A

Default Value

N/A

total_tokens

Number

Definition

Total number of tokens in a dialogue request, that is, the sum of prompt_tokens and completion_tokens.

Constraints

N/A

Range

N/A

Default Value

N/A

completion_tokens

Number

Definition

Number of answer tokens generated by the inference model.

Constraints

N/A

Range

N/A

Default Value

N/A

Table 11 ChatMessage

Parameter

Type

Description

role

String

Definition

Role that generates the message. The value must be assistant.

Constraints

N/A

Range

assistant

Default Value

assistant

content

String

Definition

Content of a dialog.

Constraints

Minimum length: 1

Maximum length: token length supported by different models.

Range

N/A

Default Value

N/A

reasoning_content

String

Definition

Reasoning steps that led to the final conclusion (thinking process of the model).

Constraints

This parameter is available only for the DeepSeek-R1 model.

Range

N/A

Default Value

N/A
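
Based on Tables 8 to 11, a non-streaming response can be unpacked as follows. This is a sketch only; it assumes data is the parsed JSON response body (for example, response.json() from the request sketch above).

def summarize_response(data: dict) -> None:
    choice = data["choices"][0]
    message = choice["message"]
    print("answer:", message["content"])
    # reasoning_content is available only for the DeepSeek-R1 model
    if message.get("reasoning_content"):
        print("reasoning:", message["reasoning_content"])
    print("finish_reason:", choice["finish_reason"])  # e.g., stop or length
    usage = data["usage"]
    print("tokens: prompt={} completion={} total={}".format(
        usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"]))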

Streaming (with stream set to true)

Status code: 200

Table 12 Data units output in streaming mode

Parameter

Type

Description

data

CompletionStreamResponse object

Definition

If stream is set to true, messages generated by the model will be returned in streaming mode. The generated text is returned incrementally. Each data field contains a part of the generated text until all data fields are returned.

Constraints

N/A

Range

N/A

Default Value

N/A

Table 13 CompletionStreamResponse

Parameter

Type

Description

id

String

Definition

Unique identifier of the dialogue.

Constraints

N/A

Range

N/A

Default Value

N/A

created

Integer

Definition

Unix timestamp (in seconds) when the chat was created. The timestamps of each chunk in the streaming response are the same.

Constraints

N/A

Range

N/A

Default Value

N/A

model

String

Definition

Name of the model that generates the completion.

Constraints

N/A

Range

N/A

Default Value

N/A

object

String

Definition

Object type, which is chat.completion.chunk.

Constraints

N/A

Range

N/A

Default Value

N/A

choices

Array of ChatCompletionResponseStreamChoice objects

Definition

A list of completion choices generated by the model.

Constraints

N/A

Range

N/A

Default Value

N/A

Table 14 ChatCompletionResponseStreamChoice

Parameter

Type

Description

index

Integer

Definition

Index of the completion in the completion choice list generated by the model.

Constraints

N/A

Range

N/A

Default Value

N/A

finish_reason

String

Definition

Reason why the model stops generating tokens.

Constraints

N/A

Range

[stop, length, content_filter, tool_calls, insufficient_system_resource]

  • stop: The model stops generating text after the task is complete or when a pre-defined stop sequence is encountered.
  • length: The output length reaches the context length limit of the model or the max_tokens limit.
  • content_filter: The output content is filtered due to filter conditions.
  • tool_calls: The model determines to call an external tool (function/API) to complete the task.
  • insufficient_system_resource: Generation is interrupted because system inference resources are insufficient.

Default Value

N/A
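
Putting Tables 12 to 14 together, a streaming response can be consumed line by line over SSE. The sketch below uses the V2 API and is illustrative only: the endpoint and API key are placeholders, and it accepts either the V2 "delta" field or the V1 "message" field in each chunk.

import json
import requests

url = "https://{endpoint}/api/v2/chat/completions"
headers = {"Content-Type": "application/json", "Authorization": "Bearer <API key>"}
body = {
    "model": "DeepSeek-V3",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,
}

with requests.post(url, headers=headers, json=body, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data:"):
            continue  # skip blank keep-alive lines and event: lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            part = choice.get("delta") or choice.get("message") or {}
            # for DeepSeek-R1, reasoning_content streams first, then content
            print(part.get("reasoning_content") or part.get("content") or "",
                  end="", flush=True)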

Status code: 400

Table 15 Response body parameters

Parameter

Type

Description

error_code

String

Error code

error_msg

String

Error message

Example Request

  • Non-streaming
    V1 inference API
    POST https://{endpoint}/v1/{project_id}/alg-infer/3rdnlp/service/{deployment_id}/v1/chat/completions
    
    Request Header:   
    Content-Type: application/json   
    X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...      
    
    Request Body:
    {
      "model":"DeepSeek-V3",
      "messages":[
        {
          "role":"user",
          "content": "Hello"
        }]
    }
    V2 inference API
    POST https://{endpoint}/api/v2/chat/completions
    
    Request Header:   
    Content-Type: application/json   
    Authorization: Bearer 201ca68f-45f9-4e19-8fa4-831e...  
    
    Request Body:
    {
      "model":"DeepSeek-V3",
      "messages":[
        {
          "role":"user",
          "content": "Hello"
        }]
    }
  • Streaming (with stream set to true)
    V1 inference API
    POST https://{endpoint}/v1/{project_id}/alg-infer/3rdnlp/service/{deployment_id}/v1/chat/completions
    
    Request Header:   
    Content-Type: application/json   
    X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...      
    
    Request Body:
    {
      "model":"DeepSeek-V3",
      "messages":[
        {
          "role":"user",
          "content": "Hello"
        }],
      "stream":true
    }
    V2 inference API
    POST https://{endpoint}/api/v2/chat/completions
    
    Request Header:   
    Content-Type: application/json   
    Authorization: Bearer 201ca68f-45f9-4e19-8fa4-831e...  
    
    Request Body:
    {
      "model":"DeepSeek-V3",
      "messages":[
        {
          "role":"user",
          "content": "Hello"
        }],
      "stream":true
    }

Example Response

Status code: 200

OK

  • Non-streaming Q&A response
     {
        "id": "chat-9a75fc02e45d48db94f94ce38277beef",
        "object": "chat.completion",
        "created": 1743403365,
        "model": "DeepSeek-V3",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "Hello! How can I help you?",
                    "tool_calls": []
                },
                "finish_reason": "stop"
            }
        ],
        "usage": {
            "prompt_tokens": 64,
            "total_tokens": 73,
            "completion_tokens": 9
        }
    }
  • Non-streaming Q&A response with chain of thought
    {
        "id": "81c34733-0e7c-4b4b-a044-1e1fcd54b8db",
        "model": "deepseek-r1_32k",
        "created": 1747485310,
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "\n\nHello! Nice to meet you. Is there anything I can do for you?",
                    "reasoning_content": "Hmm, the user just sent a short "Hello", which is a greeting in Chinese. First, I need to confirm their needs. They might  want to test the reply or have a specific question. Also, I need to consider whether to respond in English, but since the user used Chinese, it's more appropriate to reply in Chinese.\n\nThen, I need to ensure the reply is friendly and complies with the guidelines, without involving sensitive content. The user may expect further conversation or need help with a question. At this point, I should maintain an open-ended response, inviting them to raise specific questions or needs. For example, I could say, "Hello! Nice to meet you, is there anything I can help you with?" This is both polite and proactive in offering assistance.\n\nAdditionally, avoid using any format or markdown, keeping it natural and concise. The user might be new to this platform and unfamiliar with how to ask questions, so a more encouraging tone might be better. Check for any spelling or grammatical errors to ensure the reply is correct and error-free.\n"
                    "tool_calls": [
                    ]
                },
                "finish_reason": "stop"
            }
        ],
        "usage": {
            "completion_tokens": 184,
            "prompt_tokens": 6,
            "total_tokens": 190
        }
    }
  • Streaming Q&A response
    Response body of the V1 inference API
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"message":{"role":"assistant"},"logprobs":null,"finish_reason":null}]
    
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices": [{"index":0,"message":{"content": "Hello"},"logprobs":null,"finish_reason":null}]}
    
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices": [{"index":0,"message":{"content":", can I help you?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}]}
    
    data:[DONE]
    Response body of the V2 inference API
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"delta":{"role":"assistant"},"logprobs":null,"finish_reason":null}]
    
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices": [{"index":0,"delta":{"content": "Hello"},"logprobs":null,"finish_reason":null}]}
    
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices": [{"index":0,"delta":{"content":". Can I help you?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}]}
    
    data:[DONE]
  • Streaming Q&A response with chain of thought
    Response body of the V1 inference API
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":6,"completion_tokens":0}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"reasoning_content": "Hmm"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":7,"completion_tokens":1}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":", "},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":8,"completion_tokens":2}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"reasoning_content":"user sent "},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":10,"completion_tokens":4}}
    
    ...
    
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"reasoning_content": "Generate"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":185,"completion_tokens":179}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"reasoning_content": "final"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":186,"completion_tokens":180}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"reasoning_content":"reply.\n"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":188,"completion_tokens":182}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"content":"\n\nHello."},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":191,"completion_tokens":185}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"content":". Nice"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":193,"completion_tokens":187}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"content": "to meet"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":194,"completion_tokens":188}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": ["{\"index\":0,\"message\":{\"content\":\"you\"},\"logprobs\":null,\"finish_reason\":null}"],\"usage\":{\"prompt_tokens\":6,\"total_tokens\":195,\"completion_tokens\":189}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"content":". What"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":197,"completion_tokens":191}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"content":"can I do"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":199,"completion_tokens":193}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"content":"for you"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":201,"completion_tokens":195}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"? "},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
    
    data:[DONE]
    Response body of the V2 inference API
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":6,"completion_tokens":0}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"reasoning_content":"Hmm"},"logprobs":null,"finish_reason":null}], "usage":{"prompt_tokens":6,"total_tokens":7,"completion_tokens":1}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":","},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":8,"completion_tokens":2}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"reasoning_content":"user sent"},"logprobs":null,"finish_reason":null}], "usage":{"prompt_tokens":6,"total_tokens":10,"completion_tokens":4}}
    
    ...
    
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"reasoning_content": "Generate"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":185,"completion_tokens":179}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"delta":{"reasoning_content":"final"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":186,"completion_tokens":180}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"delta":{"reasoning_content":"reply.\n"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":188,"completion_tokens":182}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"delta":{"content":"\n\nHello"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":191,"completion_tokens":185}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"delta":{"content":". Nice"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":193,"completion_tokens":187}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content": "to meet"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":194,"completion_tokens":188}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content":"you"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":195,"completion_tokens":189}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":197,"completion_tokens":191}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content": "Can I help"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":199,"completion_tokens":193}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content": "your"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":201,"completion_tokens":195}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
    
    data:[DONE]
  • Streaming Q&A response when the content is not approved
    event:moderation
    data:{"suggestion":"block","reply":"As an AI language model, my goal is to provide help and information in a positive, proactive, and safe manner. Your question is beyond my answer range."}
    
    data:[DONE]
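
Because the V2 inference API follows the OpenAI chat-completions schema, it should also be callable through the OpenAI Python SDK. This is a hedged sketch, assuming the endpoint accepts standard bearer authentication; base_url and the API key are placeholders.

from openai import OpenAI

client = OpenAI(
    base_url="https://{endpoint}/api/v2",  # the SDK appends /chat/completions
    api_key="<API key>",
)
completion = client.chat.completions.create(
    model="DeepSeek-V3",
    messages=[{"role": "user", "content": "Hello"}],
)
print(completion.choices[0].message.content)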

Status Codes

For details, see Status Codes.

Error Codes

For details, see Error Codes.