Updated on 2025-07-28 GMT+08:00

DeepSeek

Function

DeepSeek API is an API service based on the DeepSeek models. It supports text-based interaction in multiple scenarios and can quickly generate high-quality dialogues, copywriting, and stories. It is suitable for scenarios such as text summarization, intelligent Q&A, and content creation.

URI

The NLP inference service can be invoked using Pangu inference APIs (V1 inference APIs) or OpenAI APIs (V2 inference APIs).

The authentication modes of V1 and V2 APIs are different, and the request body and response body of V1 and V2 APIs are slightly different.

Table 1 NLP inference APIs

| API Type | API URI |
| --- | --- |
| V1 inference API | POST /v1/{project_id}/deployments/{deployment_id}/chat/completions |
| V2 inference API | POST /api/v2/chat/completions |

Table 2 Path parameters of the V1 inference API

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Obtaining the Project ID. |
| deployment_id | Yes | String | Model deployment ID. For details about how to obtain the deployment ID, see Obtaining the Model Deployment ID. |

Request Parameters

The authentication modes of the V1 and V2 inference APIs are different, and the request and response parameters are also different. The details are as follows:

Header parameters

  1. The V1 inference API supports both token-based authentication and API key authentication. The request header parameters for the two authentication modes are as follows:

Table 3 Request header parameters (token-based authentication)

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| X-Auth-Token | Yes | String | User token, used to obtain the permission required to call APIs. The token is the value of X-Subject-Token in the response header in Authentication. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |

Table 4 Request header parameters (API key authentication)

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| X-Apig-AppCode | Yes | String | API key, used to obtain the permission required to call APIs. The API key is the value of X-Apig-AppCode in the response header in API Key Authentication. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |

  2. The V2 inference API supports only API key authentication. For details about the request header parameters, see Table 5.

Table 5 Request header parameters of the V2 inference API (OpenAI-compatible API key authentication)

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| Authorization | Yes | String | A string consisting of Bearer and the API key obtained from created application access, with a space between Bearer and the API key. For example: Bearer d59******9C3. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
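For orientation, the following is a minimal sketch of how the two authentication modes map onto HTTP headers, assuming the Python requests library. The endpoint, project ID, deployment ID, token, and API key are placeholders, not real values.

```python
import requests  # pip install requests

# Hypothetical placeholders -- substitute your own endpoint, project ID,
# deployment ID, IAM token, and API key.
ENDPOINT = "https://example-endpoint.com"
PROJECT_ID = "your-project-id"
DEPLOYMENT_ID = "your-deployment-id"

# V1 inference API: token-based authentication.
v1_url = f"{ENDPOINT}/v1/{PROJECT_ID}/deployments/{DEPLOYMENT_ID}/chat/completions"
v1_headers = {
    "Content-Type": "application/json",
    "X-Auth-Token": "your-iam-token",  # value of X-Subject-Token from Authentication
}

# V2 inference API: OpenAI-compatible API key authentication.
v2_url = f"{ENDPOINT}/api/v2/chat/completions"
v2_headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer your-api-key",  # a space is required after "Bearer"
}

body = {"model": "DeepSeek-V3", "messages": [{"role": "user", "content": "Hello"}]}
resp = requests.post(v2_url, headers=v2_headers, json=body, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```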

Request body parameters

The request body parameters of the V1 and V2 inference APIs are the same, as described in Table 6.

Table 6 Request body parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| messages | Yes | Array of ChatCompletionMessageParam objects | Multi-turn dialogue question-answer pairs, each containing the role and content attributes. role indicates the role in a dialogue and can be system or user: to have the model answer as a specific persona, set role to system; otherwise, set it to user. The system role needs to be set only once in a dialogue request. content is the content of a dialogue turn and can be any text. The messages parameter helps the model generate a proper response based on the dialogue context. |
| model | Yes | String | ID of the model to be used. Set this parameter based on the deployed model. The value can be DeepSeek-R1 or DeepSeek-V3. |
| stream | No | Boolean | Whether to enable the streaming mode. The streaming protocol is Server-Sent Events (SSE). Set the value to true to enable streaming: the API then sends generated text to the client in real time instead of returning all of it once generation is complete. Default value: false |
| temperature | No | Float | Controls the diversity and creativity of the generated text. A floating-point number that controls the randomness of sampling: a lower temperature produces more deterministic output, and a higher temperature (for example, 0.9) produces more creative output. The value 0 indicates greedy sampling, and values greater than 1 are very likely to produce unusable output. temperature is one of the key parameters affecting the output quality and diversity of an LLM; other parameters, such as top_p, can also adjust the model's behavior, but do not set temperature and top_p at the same time. Minimum value: 0 (a value greater than or equal to 1e-5 is recommended). Maximum value: 1.0. Default value: 1.0 |
| top_p | No | Float | Nucleus sampling parameter. As an alternative to adjusting the sampling temperature, the model considers only the tokens that make up the top_p probability mass; 0.1 means that only the tokens in the top 10% of probability mass are considered. You are advised to adjust either top_p or temperature, but not both. Value range: (0.0, 1.0]. Default value: 0.8. NOTE: A token is the smallest unit of text a model can work with; it can be a word or part of a word. An LLM converts input and output text into tokens, generates a probability distribution over possible next tokens, and then samples from that distribution. |
| max_tokens | No | Integer | Maximum number of output tokens for the generated text. The total length of the input text plus the generated text cannot exceed the maximum length that the model can process. Minimum value: 1. Maximum value: 8192. Default value: 4096 |
| presence_penalty | No | Float | Controls how the LLM handles tokens that have already appeared in the generated text: such tokens are penalized during further generation. A positive value makes the model more likely to produce tokens that have not appeared before, that is, to move on to new topics. Minimum value: -2. Maximum value: 2. Default value: 0 (the parameter does not take effect) |
| frequency_penalty | No | Float | Controls how the model penalizes tokens based on how frequently they already occur in the generated text. A positive value penalizes tokens that have been used frequently, making the model less likely to repeat words or phrases verbatim. Minimum value: -2. Maximum value: 2. Default value: 0 (the parameter does not take effect) |
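To illustrate how these parameters fit together, the following is a sketch of a request body that exercises the optional sampling controls. The values are illustrative only; note that temperature and top_p should not be set together.

```python
# Illustrative request body combining the optional controls above.
body = {
    "model": "DeepSeek-V3",
    "messages": [{"role": "user", "content": "Write a two-line product slogan."}],
    "stream": False,           # set to True for SSE streaming
    "temperature": 0.7,        # lower than the default 1.0 -> more deterministic
    # "top_p": 0.8,            # alternative to temperature; do not set both
    "max_tokens": 512,         # cap on generated tokens (1-8192)
    "presence_penalty": 0.5,   # positive -> favor new topics
    "frequency_penalty": 0.3,  # positive -> discourage verbatim repetition
}
```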

Table 7 ChatCompletionMessageParam

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| role | Yes | String | Role in a dialogue. The default value range is system, user, assistant, tool, and function; custom values are also supported. To have the model answer as a specific persona, set role to system; otherwise, set it to user. The system role needs to be set only once in a dialogue request. When this parameter is returned, the value is fixed to assistant. |
| content | Yes | String | Content of a dialogue turn, which can be any text. A multi-turn dialogue cannot contain more than 20 content fields in the messages parameter. Minimum length: 1. Maximum length: the token length supported by the model. Default value: None |
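The following sketch shows a well-formed messages array for a multi-turn dialogue, with the system persona set once at the start. The dialogue content is invented for illustration.

```python
# A multi-turn messages array: the system persona is set once, then user
# and assistant turns alternate. The array must not exceed 20 content fields.
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "What is nucleus sampling?"},
    {"role": "assistant", "content": "Sampling restricted to the smallest set of tokens whose cumulative probability exceeds top_p."},
    {"role": "user", "content": "And how does temperature differ?"},
]
```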

Response Parameters

Non-streaming

Status code: 200

Table 8 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| id | String | Uniquely identifies each response. The value is in the format "chatcmpl-{random_uuid()}". |
| object | String | The value is fixed to chat.completion. |
| created | Integer | Unix timestamp (in seconds) when the response was generated. |
| model | String | ID of the model that served the request. |
| choices | Array of ChatCompletionResponseChoice objects | List of generated completions. |
| usage | UsageInfo object | Token usage for the dialogue. This parameter helps you track model usage and prevent the model from generating excessive tokens. |
| prompt_logprobs | Object | Log probability information of the input text and the corresponding tokens. Default value: null |

Table 9 ChatCompletionResponseChoice

| Parameter | Type | Description |
| --- | --- | --- |
| message | ChatMessage object | Generated text. |
| index | Integer | Index of the generated text, starting from 0. |
| finish_reason | String | Reason why the model stopped generating tokens. The value can be stop, length, content_filter, tool_calls, or insufficient_system_resource. stop: the model stopped after completing the task or encountering a predefined stop sequence. length: the output reached the model's context length limit or the max_tokens limit. content_filter: the output was filtered by filter conditions. tool_calls: the model decided to call an external tool (function/API) to complete the task. insufficient_system_resource: generation was interrupted because system inference resources are insufficient. Default value: stop |
| logprobs | Object | Evaluation metric indicating the confidence of the inference output. Default value: null |
| stop_reason | Union[Integer, String] | Token ID or string that caused generation to stop. If the EOS token is encountered, the default value is returned; if a string or token ID specified in the stop parameter of the request is encountered, that string or token ID is returned. This field is not part of the standard OpenAI API but is supported by the vLLM API. Default value: None |

Table 10 UsageInfo

| Parameter | Type | Description |
| --- | --- | --- |
| prompt_tokens | Number | Number of tokens in the user prompt. |
| total_tokens | Number | Total number of tokens in the dialogue request. |
| completion_tokens | Number | Number of answer tokens generated by the inference model. |

Table 11 ChatMessage

| Parameter | Type | Description |
| --- | --- | --- |
| role | String | Role that generated the message. The value is fixed to assistant. |
| content | String | Content of the dialogue. Minimum length: 1. Maximum length: the token length supported by the model. |
| reasoning_content | String | Reasoning steps that led to the final conclusion (the model's thinking process). NOTE: This parameter is available only for the DeepSeek-R1 model. |
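Putting Tables 8 to 11 together, a client might unpack a non-streaming response as in the following sketch. extract_answer is a hypothetical helper written for this page, not part of the API.

```python
# Hypothetical helper: pull the answer, and for DeepSeek-R1 the reasoning
# trace, out of a parsed non-streaming response body.
def extract_answer(response_body: dict) -> tuple[str, str | None]:
    choice = response_body["choices"][0]
    message = choice["message"]
    if choice.get("finish_reason") == "length":
        # Output was cut off by max_tokens or the model's context limit.
        print("warning: truncated output")
    # reasoning_content is returned only by the DeepSeek-R1 model.
    return message["content"], message.get("reasoning_content")
```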

Streaming (with stream set to true)

Status code: 200

Table 12 Data units output in streaming mode

| Parameter | Type | Description |
| --- | --- | --- |
| data | CompletionStreamResponse object | If stream is set to true, messages generated by the model are returned in streaming mode. The generated text is returned incrementally, and each data field contains part of the generated text until all data fields have been returned. |

Table 13 CompletionStreamResponse

| Parameter | Type | Description |
| --- | --- | --- |
| id | String | Unique identifier of the dialogue. |
| created | Integer | Unix timestamp (in seconds) when the chat was created. All chunks in a streaming response carry the same timestamp. |
| model | String | Name of the model that generated the completion. |
| object | String | Object type, which is chat.completion.chunk. |
| choices | Array of ChatCompletionResponseStreamChoice objects | List of completion choices generated by the model. |

Table 14 ChatCompletionResponseStreamChoice

| Parameter | Type | Description |
| --- | --- | --- |
| index | Integer | Index of the completion in the list of completion choices generated by the model. |
| finish_reason | String | Reason why the model stopped generating tokens. The value can be stop, length, content_filter, tool_calls, or insufficient_system_resource. stop: the model stopped after completing the task or encountering a predefined stop sequence. length: the output reached the model's context length limit or the max_tokens limit. content_filter: the output was filtered by filter conditions. tool_calls: the model decided to call an external tool (function/API) to complete the task. insufficient_system_resource: generation was interrupted because system inference resources are insufficient. |
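A streaming client reads the SSE data lines until data:[DONE] arrives. The following sketch, again assuming the Python requests library, yields the incremental text from either V1 chunks (which use message) or V2 chunks (which use delta); stream_chat is a hypothetical helper.

```python
import json
import requests  # pip install requests

def stream_chat(url: str, headers: dict, body: dict):
    """Yield incremental text from an SSE response (stream set to true)."""
    with requests.post(url, headers=headers, json={**body, "stream": True},
                       stream=True, timeout=300) as resp:
        resp.raise_for_status()
        for raw in resp.iter_lines(decode_unicode=True):
            if not raw or not raw.startswith("data:"):
                continue  # skip blank keep-alives and event: lines
            payload = raw[len("data:"):].strip()
            if payload == "[DONE]":
                break  # end of the stream
            chunk = json.loads(payload)
            for choice in chunk.get("choices", []):
                # V2 chunks carry a "delta" object; V1 chunks use "message".
                part = choice.get("delta") or choice.get("message") or {}
                text = part.get("content") or part.get("reasoning_content")
                if text:
                    yield text

# Usage: print("".join(stream_chat(v2_url, v2_headers, body)))
```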

Status code: 400

Table 15 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| error_code | String | Error code. |
| error_msg | String | Error message. |

Example Request

  • Non-streaming
    V1 inference API:
    POST https://{endpoint}/v1/{project_id}/alg-infer/3rdnlp/service/{deployment_id}/v1/chat/completions
    
    Request Header:   
    Content-Type: application/json   
    X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...      
    
    Request Body:
    {
      "model":"DeepSeek-V3",
      "messages":[
        {
          "role":"user",
          "content": "Hello"
        }]
    }
    V2 inference API:
    POST https://{endpoint}/api/v2/chat/completions
    
    Request Header:   
    Content-Type: application/json   
    Authorization: Bearer 201ca68f-45f9-4e19-8fa4-831e...  
    
    Request Body:
    {
      "model":"DeepSeek-V3",
      "messages":[
        {
          "role":"user",
          "content": "Hello"
        }]
    }
  • Streaming (with stream set to true)
    V1 inference API:
    POST https://{endpoint}/v1/{project_id}/alg-infer/3rdnlp/service/{deployment_id}/v1/chat/completions
    
    Request Header:   
    Content-Type: application/json   
    X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...      
    
    Request Body:
    {
      "model":"DeepSeek-V3",
      "messages":[
        {
          "role":"user",
          "content": "Hello"
        }],
      "stream":true
    }
    V2 inference API:
    POST https://{endpoint}/api/v2/chat/completions
    
    Request Header:   
    Content-Type: application/json   
    Authorization: Bearer 201ca68f-45f9-4e19-8fa4-831e...  
    
    Request Body:
    {
      "model":"DeepSeek-V3",
      "messages":[
        {
          "role":"user",
          "content": "Hello"
        }],
      "stream":true
    }

Example Response

Status code: 200

OK

  • Non-streaming Q&A response
     {
        "id": "chat-9a75fc02e45d48db94f94ce38277beef",
        "object": "chat.completion",
        "created": 1743403365,
        "model": "DeepSeek-V3",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "Hello. How can I help you?",
                    "tool_calls": []
                },
                "finish_reason": "stop"
            }
        ],
        "usage": {
            "prompt_tokens": 64,
            "total_tokens": 73,
            "completion_tokens": 9
        }
    }
  • Non-streaming Q&A response with chain of thought
    {
        "id": "81c34733-0e7c-4b4b-a044-1e1fcd54b8db",
        "model": "deepseek-r1_32k",
        "created": 1747485310,
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "\n\nHello. Nice to meet you. Is there anything I can do for you?",
                    "reasoning_content": "Hmm, the user just sent a short "Hello", which is a greeting in Chinese. First, I need to confirm their needs. They might  want to test the reply or have a specific question. Also, I need to consider whether to respond in English, but since the user used Chinese, it's more appropriate to reply in Chinese.\n\nThen, I need to ensure the reply is friendly and complies with the guidelines, without involving sensitive content. The user may expect further conversation or need help with a question. At this point, I should maintain an open-ended response, inviting them to raise specific questions or needs. For example, I could say, "Hello! Nice to meet you, is there anything I can help you with?" This is both polite and proactive in offering assistance.\n\nAdditionally, avoid using any format or markdown, keeping it natural and concise. The user might be new to this platform and unfamiliar with how to ask questions, so a more encouraging tone might be better. Check for any spelling or grammatical errors to ensure the reply is correct and error-free.\n"
                    "tool_calls": [
                    ]
                },
                "finish_reason": "stop"
            }
        ],
        "usage": {
            "completion_tokens": 184,
            "prompt_tokens": 6,
            "total_tokens": 190
        }
    }
  • Streaming Q&A response
    Response body of the V1 inference API
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"message":{"role":"assistant"},"logprobs":null,"finish_reason":null}]
    
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices": [{"index":0,"message":{"content": "Hello"},"logprobs":null,"finish_reason":null}]}
    
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices": [{"index":0,"message":{"content":", can I help you?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}]}
    
    data:[DONE]
    Response body of the V2 inference API
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"delta":{"role":"assistant"},"logprobs":null,"finish_reason":null}]
    
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices": [{"index":0,"delta":{"content": "Hello"},"logprobs":null,"finish_reason":null}]}
    
    data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices": [{"index":0,"delta":{"content":". Can I help you?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}]}
    
    data:[DONE]
  • Streaming Q&A response with chain of thought
    Response body of the V1 inference API
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":6,"completion_tokens":0}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"reasoning_content": "Hmm"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":7,"completion_tokens":1}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":", "},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":8,"completion_tokens":2}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"reasoning_content":"user sent "},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":10,"completion_tokens":4}}
    
    ...
    
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"reasoning_content": "Generate"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":185,"completion_tokens":179}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"reasoning_content": "final"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":186,"completion_tokens":180}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"reasoning_content":"reply.\n"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":188,"completion_tokens":182}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"content":"\n\nHello."},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":191,"completion_tokens":185}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"content":". Nice"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":193,"completion_tokens":187}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"content": "to meet"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":194,"completion_tokens":188}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": ["{\"index\":0,\"message\":{\"content\":\"you\"},\"logprobs\":null,\"finish_reason\":null}"],\"usage\":{\"prompt_tokens\":6,\"total_tokens\":195,\"completion_tokens\":189}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"message":{"content":". What"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":197,"completion_tokens":191}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"content":"can I do"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":199,"completion_tokens":193}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"message":{"content":"for you"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":201,"completion_tokens":195}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"? "},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
    
    data:[DONE]
    Response body of the V2 inference API
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":6,"completion_tokens":0}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"reasoning_content":"Hmm"},"logprobs":null,"finish_reason":null}], "usage":{"prompt_tokens":6,"total_tokens":7,"completion_tokens":1}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":","},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":8,"completion_tokens":2}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"reasoning_content":"user sent"},"logprobs":null,"finish_reason":null}], "usage":{"prompt_tokens":6,"total_tokens":10,"completion_tokens":4}}
    
    ...
    
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"reasoning_content": "Generate"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":185,"completion_tokens":179}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"delta":{"reasoning_content":"final"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":186,"completion_tokens":180}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"delta":{"reasoning_content":"reply.\n"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":188,"completion_tokens":182}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"delta":{"content":"\n\nHello"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":191,"completion_tokens":185}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [ {"index":0,"delta":{"content":". Nice"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":193,"completion_tokens":187}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content": "to meet"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":194,"completion_tokens":188}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content":"you"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":195,"completion_tokens":189}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":197,"completion_tokens":191}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content": "Can I help"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":199,"completion_tokens":193}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices": [{"index":0,"delta":{"content": "your"},"logprobs":null,"finish_reason":null}] ,"usage":{"prompt_tokens":6,"total_tokens":201,"completion_tokens":195}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
    
    data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
    
    data:[DONE]
  • Streaming Q&A response when the content is not approved
    event:moderation
    data:{"suggestion":"block","reply":"As an AI language model, my goal is to provide help and information in a positive, proactive, and safe manner. Your question is beyond my answer range."}
    
    data:[DONE]

Status Codes

For details, see Status Codes.

Error Codes

For details, see Error Codes.