DeepSeek
Function
DeepSeek API is an API service based on the DeepSeek model. It supports text-based interaction in multiple scenarios and can quickly generate high-quality dialogues, copy, and stories, making it suitable for scenarios such as text summarization, intelligent Q&A, and content creation.
URI
The NLP inference service can be invoked through Pangu inference APIs (V1 inference APIs) or OpenAI-compatible APIs (V2 inference APIs).
The V1 and V2 APIs use different authentication modes, and their request and response bodies differ slightly.
| API Type | API URI |
|---|---|
| V1 inference API | POST /v1/{project_id}/deployments/{deployment_id}/chat/completions |
| V2 inference API | POST /api/v2/chat/completions |
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Obtaining the Project ID. |
| deployment_id | Yes | String | Model deployment ID. For details about how to obtain the deployment ID, see Obtaining the Model Deployment ID. |
Request Parameters
The authentication modes of the V1 and V2 inference APIs are different, and the request and response parameters are also different. The details are as follows:
Header parameters
- The V1 inference API supports both token-based authentication and API key authentication. The request header parameters for the two authentication modes are as follows:
- Table 3 lists the request header parameters for Token-based Authentication.
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Auth-Token | Yes | String | User token, used to obtain the permissions required to call APIs. The token is the value of X-Subject-Token in the response header in Authentication. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
- For details about the request header parameters in API Key Authentication mode, see Table 4.
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| X-Apig-AppCode | Yes | String | API key, used to obtain the permissions required to call APIs. The API key is the value of X-Apig-AppCode in the response header in API Key Authentication. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
- The V2 inference API supports only API key authentication. For details about the request header parameters, see Table 5.
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| Authorization | Yes | String | A string consisting of Bearer and the API key obtained from the created application access. A space is required between Bearer and the API key, for example, Bearer d59******9C3. |
| Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
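For reference, the header variants above can be assembled as plain dictionaries. The following Python sketch shows all three; every credential value is a placeholder, not a real token or key.

```python
# Minimal sketch of the three request-header variants described above.
# All credential values are placeholders.

# V1 inference API, token-based authentication
v1_token_headers = {
    "Content-Type": "application/json",
    "X-Auth-Token": "<IAM token, i.e. the X-Subject-Token response header>",
}

# V1 inference API, API key authentication
v1_appcode_headers = {
    "Content-Type": "application/json",
    "X-Apig-AppCode": "<API key>",
}

# V2 inference API, API key authentication only.
# Note the required space between "Bearer" and the key.
v2_headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <API key>",
}
```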
Request body parameters
The request body parameters of the V1 and V2 inference APIs are the same, as described in Table 6.
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| messages | Yes | Array of ChatCompletionMessageParam objects | Multi-turn dialogue question-answer pairs, each containing the role and content attributes. The messages parameter helps the model generate a proper response based on the dialogue context. |
| model | Yes | String | ID of the model to be used. Set this parameter based on the deployed model. The value can be DeepSeek-R1 or DeepSeek-V3. |
| stream | No | Boolean | Whether to enable streaming mode. The streaming protocol is Server-Sent Events (SSE). To enable streaming, set the value to true. With streaming enabled, the API sends the generated text to the client in real time instead of sending all text at once after generation is complete. Default value: false |
| temperature | No | Float | Controls the diversity and creativity of the generated text: a floating-point number that controls the randomness of sampling. A lower temperature produces more deterministic output; a higher temperature, for example 0.9, produces more creative output. The value 0 indicates greedy sampling, and values greater than 1 are very likely to produce unusable output. temperature is one of the key parameters affecting the output quality and diversity of an LLM. Other parameters, such as top_p, can also adjust the model's behavior and preferences, but do not set temperature and top_p at the same time. Minimum value: 0 (a value greater than or equal to 1e-5 is recommended). Maximum value: 1.0. Default value: 1.0 |
| top_p | No | Float | Nucleus sampling parameter. As an alternative to adjusting the sampling temperature, the model considers only the tokens within the top_p probability mass; 0.1 means that only tokens in the top 10% of probability mass are considered. You are advised to change either this value or temperature, but not both. Value range: (0.0, 1.0]. Default value: 0.8. NOTE: A token is the smallest unit of text a model can work with; it can be a word or part of a word. An LLM converts input and output text into tokens, generates a probability distribution over candidate tokens, and then samples tokens from that distribution. |
| max_tokens | No | Integer | Maximum number of output tokens for the generated text. The total length of the input text plus the generated text cannot exceed the maximum length the model can process. Minimum value: 1. Maximum value: 8192. Default value: 4096 |
| presence_penalty | No | Float | Controls how the LLM handles tokens that have already appeared in the generated text. With a positive value, the model penalizes such tokens and tends to generate tokens that have not appeared before, that is, to move on to new topics. Minimum value: -2. Maximum value: 2. Default value: 0 (the parameter does not take effect) |
| frequency_penalty | No | Float | Controls how the model penalizes new tokens based on how frequently they have already appeared in the generated text. A positive value penalizes tokens that have been used frequently, making exact repetition of words or phrases less likely. Minimum value: -2. Maximum value: 2. Default value: 0 (the parameter does not take effect) |
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| role | Yes | String | Role in a dialogue. The default value range is system, user, assistant, tool, and function; custom values are supported. To have the model answer as a specific persona, set role to system; if no specific persona is needed, set it to user. In responses, the value is fixed to assistant. In a dialogue request, role needs to be set only once. |
| content | Yes | String | Content of a dialogue turn, which can be any text, measured in tokens. A multi-turn dialogue cannot contain more than 20 content fields in the messages parameter. Minimum length: 1. Maximum length: the token length supported by the model. Default value: None |
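To make the request body concrete, here is a minimal Python sketch that sends a non-streaming request to the V2 inference API using the requests library. The endpoint and API key are placeholders, and the sampling values are illustrative only (note that temperature is set without top_p, per the guidance above).

```python
import requests

# Placeholders: substitute your own endpoint and API key.
ENDPOINT = "https://{endpoint}/api/v2/chat/completions"
API_KEY = "<API key>"

payload = {
    "model": "DeepSeek-V3",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Hello"},
    ],
    # Optional sampling parameters; per the table above, do not
    # combine temperature with top_p.
    "temperature": 0.7,
    "max_tokens": 1024,
}

resp = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Content-Type": "application/json",
             "Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```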
Response Parameters
Non-streaming
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| id | String | Uniquely identifies each response. The value is in the format "chatcmpl-{random_uuid()}". |
| object | String | The value is fixed to chat.completion. |
| created | Integer | Time when the response was generated, in seconds. |
| model | String | ID of the requested model. |
| choices | Array of ChatCompletionResponseChoice objects | List of generated texts. |
| usage | UsageInfo object | Token usage of the dialogue. This parameter helps you track model usage and prevent the model from generating excessive tokens. |
| prompt_logprobs | Object | Log probability information of the input text and the corresponding tokens. Default value: null |
| Parameter | Type | Description |
|---|---|---|
| message | ChatMessage object | Generated text. |
| index | Integer | Index of the generated text, starting from 0. |
| finish_reason | String | Reason why the model stopped generating tokens. Value: stop, length, content_filter, tool_calls, or insufficient_system_resource. stop: the model stopped after completing the task or encountering a predefined stop sequence. length: the output reached the model's context length limit or the max_tokens limit. content_filter: the output was filtered by filter conditions. tool_calls: the model decided to call an external tool (function or API) to complete the task. insufficient_system_resource: generation was interrupted because system inference resources were insufficient. Default value: stop |
| logprobs | Object | Evaluation metric indicating the confidence of the inference output. Default value: null |
| stop_reason | Union[Integer, String] | Token ID or string that caused the model to stop generating. If the EOS token is encountered, the default value is returned; if a string or token ID specified in the request's stop parameter is encountered, that string or token ID is returned. This is not a standard OpenAI API field but is supported by the vLLM API. Default value: None |
| Parameter | Type | Description |
|---|---|---|
| prompt_tokens | Number | Number of tokens in the user prompt. |
| total_tokens | Number | Total number of tokens in the dialogue request. |
| completion_tokens | Number | Number of answer tokens generated by the inference model. |
| Parameter | Type | Description |
|---|---|---|
| role | String | Role that generated the message. The value is fixed to assistant. |
| content | String | Content of the dialogue. Minimum length: 1. Maximum length: the token length supported by the model. |
| reasoning_content | String | Reasoning steps that led to the final conclusion (the model's thinking process). NOTE: This parameter is available only for the DeepSeek-R1 model. |
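Putting the response tables together, a caller might unpack a non-streaming response as in the sketch below. The helper function is hypothetical; reasoning_content is read with .get() because it is returned only by DeepSeek-R1.

```python
def print_answer(response_json: dict) -> None:
    """Print the generated text, optional reasoning, and token usage."""
    choice = response_json["choices"][0]
    message = choice["message"]

    # reasoning_content is present only for DeepSeek-R1, so read it defensively.
    reasoning = message.get("reasoning_content")
    if reasoning:
        print("Reasoning:", reasoning)

    print("Answer:", message["content"])
    print("Finish reason:", choice["finish_reason"])

    usage = response_json["usage"]
    print(f"Tokens: prompt={usage['prompt_tokens']} "
          f"completion={usage['completion_tokens']} "
          f"total={usage['total_tokens']}")
```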
Streaming (with stream set to true)
Status code: 200
| Parameter | Type | Description |
|---|---|---|
| data | CompletionStreamResponse object | If stream is set to true, messages generated by the model are returned in streaming mode: the generated text is returned incrementally, and each data field contains part of the generated text until all data fields have been returned. |
| Parameter | Type | Description |
|---|---|---|
| id | String | Unique identifier of the dialogue. |
| created | Integer | Unix timestamp (in seconds) when the chat was created. Every chunk in a streaming response carries the same timestamp. |
| model | String | Name of the model that generated the completion. |
| object | String | Object type, which is chat.completion.chunk. |
| choices | Array of objects | List of completion choices generated by the model. |
| Parameter | Type | Description |
|---|---|---|
| index | Integer | Index of the completion in the list of completion choices generated by the model. |
| finish_reason | String | Reason why the model stopped generating tokens. Value: stop, length, content_filter, tool_calls, or insufficient_system_resource. stop: the model stopped after completing the task or encountering a predefined stop sequence. length: the output reached the model's context length limit or the max_tokens limit. content_filter: the output was filtered by filter conditions. tool_calls: the model decided to call an external tool (function or API) to complete the task. insufficient_system_resource: generation was interrupted because system inference resources were insufficient. |
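The streaming format above can be consumed by iterating over SSE lines, stripping the data: prefix, and stopping at [DONE]. The following Python sketch does this for the V2 API and also falls back to the V1 message field; the endpoint and API key are placeholders.

```python
import json
import requests

def stream_chat(endpoint: str, api_key: str, user_text: str) -> str:
    """Send a streaming request and reassemble the generated text."""
    payload = {
        "model": "DeepSeek-V3",
        "messages": [{"role": "user", "content": user_text}],
        "stream": True,
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    pieces = []
    with requests.post(endpoint, json=payload, headers=headers,
                       stream=True, timeout=300) as resp:
        resp.raise_for_status()
        for raw in resp.iter_lines(decode_unicode=True):
            if not raw or not raw.startswith("data:"):
                continue  # skip blank keep-alives and event: lines
            data = raw[len("data:"):].strip()
            if data == "[DONE]":
                break
            chunk = json.loads(data)
            for choice in chunk.get("choices", []):
                # V2 puts increments in "delta"; V1 uses "message" instead.
                delta = choice.get("delta") or choice.get("message") or {}
                if delta.get("content"):
                    pieces.append(delta["content"])
    return "".join(pieces)
```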
Status code: 400
| Parameter | Type | Description |
|---|---|---|
| error_code | String | Error code. |
| error_msg | String | Error message. |
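A minimal sketch of surfacing these fields when a 400 is returned (the helper name is hypothetical):

```python
import requests

def call_and_check(endpoint: str, payload: dict, headers: dict) -> dict:
    """POST the request and raise with error_code/error_msg on a 400."""
    resp = requests.post(endpoint, json=payload, headers=headers, timeout=60)
    if resp.status_code == 400:
        err = resp.json()
        # error_code and error_msg as described in the table above
        raise RuntimeError(f"{err['error_code']}: {err['error_msg']}")
    resp.raise_for_status()
    return resp.json()
```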
Example Request
- Non-streaming
V1 inference API:

POST https://{endpoint}/v1/{project_id}/alg-infer/3rdnlp/service/{deployment_id}/v1/chat/completions

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
    "model": "DeepSeek-V3",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ]
}

V2 inference API:

POST https://{endpoint}/api/v2/chat/completions

Request Header:
Content-Type: application/json
Authorization: Bearer 201ca68f-45f9-4e19-8fa4-831e...

Request Body:
{
    "model": "DeepSeek-V3",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ]
}
- Streaming (with stream set to true)
V1 inference API:

POST https://{endpoint}/v1/{project_id}/alg-infer/3rdnlp/service/{deployment_id}/v1/chat/completions

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
    "model": "DeepSeek-V3",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ],
    "stream": true
}

V2 inference API:

POST https://{endpoint}/api/v2/chat/completions

Request Header:
Content-Type: application/json
Authorization: Bearer 201ca68f-45f9-4e19-8fa4-831e...

Request Body:
{
    "model": "DeepSeek-V3",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ],
    "stream": true
}
Example Response
Status code: 200
OK
- Non-streaming Q&A response
{ "id": "chat-9a75fc02e45d48db94f94ce38277beef", "object": "chat.completion", "created": 1743403365, "model": "DeepSeek-V3", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hello. How can I help you?", "tool_calls": [] }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 64, "total_tokens": 73, "completion_tokens": 9 } }
- Non-streaming Q&A response with chain of thought
{ "id": "81c34733-0e7c-4b4b-a044-1e1fcd54b8db", "model": "deepseek-r1_32k", "created": 1747485310, "choices": [ { "index": 0, "message": { "role": "assistant", "content": "\n\nHello. Nice to meet you. Is there anything I can do for you?", "reasoning_content": "Hmm, the user just sent a short "Hello", which is a greeting in Chinese. First, I need to confirm their needs. They might want to test the reply or have a specific question. Also, I need to consider whether to respond in English, but since the user used Chinese, it's more appropriate to reply in Chinese.\n\nThen, I need to ensure the reply is friendly and complies with the guidelines, without involving sensitive content. The user may expect further conversation or need help with a question. At this point, I should maintain an open-ended response, inviting them to raise specific questions or needs. For example, I could say, "Hello! Nice to meet you, is there anything I can help you with?" This is both polite and proactive in offering assistance.\n\nAdditionally, avoid using any format or markdown, keeping it natural and concise. The user might be new to this platform and unfamiliar with how to ask questions, so a more encouraging tone might be better. Check for any spelling or grammatical errors to ensure the reply is correct and error-free.\n" "tool_calls": [ ] }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 184, "prompt_tokens": 6, "total_tokens": 190 } }
- Streaming Q&A response
Response body of the V1 inference API:

data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"message":{"role":"assistant"},"logprobs":null,"finish_reason":null}]}
data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"message":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}
data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"message":{"content":", can I help you?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}]}
data:[DONE]
Response body of the V2 inference API:

data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"delta":{"role":"assistant"},"logprobs":null,"finish_reason":null}]}
data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}
data:{"id":"chat-97313a4bc0a342558364345de0380291","object":"chat.completion.chunk","created":1743404317,"model":"DeepSeek-V3","choices":[{"index":0,"delta":{"content":". Can I help you?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}]}
data:[DONE]
- Streaming Q&A response with chain of thought
Response body of the V1 inference API:

data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":6,"completion_tokens":0}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"Hmm"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":7,"completion_tokens":1}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":", "},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":8,"completion_tokens":2}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"user sent "},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":10,"completion_tokens":4}}
...
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"Generate"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":185,"completion_tokens":179}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"final"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":186,"completion_tokens":180}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"reasoning_content":"reply.\n"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":188,"completion_tokens":182}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"\n\nHello."},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":191,"completion_tokens":185}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":". Nice"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":193,"completion_tokens":187}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"to meet"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":194,"completion_tokens":188}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"you"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":195,"completion_tokens":189}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":". What"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":197,"completion_tokens":191}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"can I do"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":199,"completion_tokens":193}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"for you"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":201,"completion_tokens":195}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"message":{"content":"? "},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
data:[DONE]
Response body of the V2 inference API:

data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":6,"completion_tokens":0}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"Hmm"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":7,"completion_tokens":1}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":","},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":8,"completion_tokens":2}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"user sent"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":10,"completion_tokens":4}}
...
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"Generate"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":185,"completion_tokens":179}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"final"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":186,"completion_tokens":180}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"reasoning_content":"reply.\n"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":188,"completion_tokens":182}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"\n\nHello"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":191,"completion_tokens":185}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":". Nice"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":193,"completion_tokens":187}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"to meet"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":194,"completion_tokens":188}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"you"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":195,"completion_tokens":189}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":197,"completion_tokens":191}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"Can I help"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":199,"completion_tokens":193}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"your"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":201,"completion_tokens":195}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"content":"?"},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
data:{"id":"chat-cc897cfa872a4fc993a803bbddf9268a","object":"chat.completion.chunk","created":1747485542,"model":"DeepSeek-R1","choices":[],"usage":{"prompt_tokens":6,"total_tokens":203,"completion_tokens":197}}
data:[DONE]
- Streaming Q&A response when the content is not approved
event:moderation
data:{"suggestion":"block","reply":"As an AI language model, my goal is to provide help and information in a positive, proactive, and safe manner. Your question is beyond my answer range."}
data:[DONE]
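Because moderation verdicts arrive under a separate SSE event type, a stream consumer should track event: lines so that a moderation payload is not parsed as a completion chunk. A sketch under that assumption (the helper and state dictionary are hypothetical):

```python
import json

def handle_sse_line(raw_line: str, state: dict) -> None:
    """Track SSE event types so moderation payloads are handled separately."""
    if raw_line.startswith("event:"):
        state["event"] = raw_line[len("event:"):].strip()
        return
    if raw_line.startswith("data:"):
        data = raw_line[len("data:"):].strip()
        if data == "[DONE]":
            state["done"] = True
        elif state.get("event") == "moderation":
            verdict = json.loads(data)
            # suggestion == "block" means the content was not approved
            print("Moderation:", verdict["suggestion"], "-", verdict["reply"])
            state["event"] = None
        else:
            state.setdefault("chunks", []).append(json.loads(data))
```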
Status Codes
For details, see Status Codes.
Error Codes
For details, see Error Codes.