Updated on 2025-12-08 GMT+08:00

Creating an Interactive Dialog

Function

Creates an interactive dialog.

Calling Method

For details, see Calling APIs.

Authorization Information

Each account has all the permissions required to call all APIs, but IAM users must be assigned the required permissions. For details about the required permissions, see Permissions Policies and Supported Actions.

URI

POST /v1/{project_id}/smart-chat-rooms

Table 1 Path Parameters

Parameter

Mandatory

Type

Description

project_id

Yes

String

Project ID. For details about how to obtain the project ID, see Obtaining a Project ID.

Request Parameters

Table 2 Request header parameters

Parameter

Mandatory

Type

Description

X-Auth-Token

No

String

User token. This parameter is mandatory when token authentication is used.

You can obtain the token by calling the IAM API used to obtain a user token. The token is the value of X-Subject-Token in the response header.

Authorization

No

String

Authentication information. This parameter is mandatory for AK/SK authentication.

X-Sdk-Date

No

String

Time when the request is sent. This parameter is mandatory for AK/SK authentication.

The format is YYYYMMDD'T'HHMMSS'Z'.

X-Project-Id

No

String

Project ID. This parameter is mandatory for AK/SK authentication.

X-App-UserId

No

String

Third-party user ID, which does not allow Chinese characters.
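
The X-Sdk-Date format listed above can be produced with the Python standard library. The following is a minimal sketch. It assumes the timestamp is expressed in UTC, which the trailing Z suggests but this page does not state. The function name format_sdk_date is hypothetical.

```python
from datetime import datetime, timezone


def format_sdk_date(moment: datetime) -> str:
    """Format a datetime as YYYYMMDD'T'HHMMSS'Z'."""
    # Assumption: the value is expected in UTC, so convert before formatting.
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


# Fixed input so the result is reproducible.
example = format_sdk_date(datetime(2025, 12, 8, 9, 30, 0, tzinfo=timezone.utc))
```

For a live request, pass datetime.now(timezone.utc) instead of a fixed value.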

Table 3 Request body parameters

Parameter

Mandatory

Type

Description

room_name

Yes

String

Dialog name

room_description

No

String

Dialog description.

video_config

No

VideoConfig object

Video output configuration.

NOTE:
  • Intelligent interaction supports only codec=H264, bitrate, width, height, and frame_rate.

model_asset_id

No

String

Virtual avatar model asset ID.

voice_config

No

VoiceConfig object

Voice parameter settings.

NOTE:
  • This parameter is deprecated and will be removed. Use voice_config_list instead.

voice_config_list

No

Array of ChatVoiceConfig objects

Voice configuration parameter list.

robot_id

No

String

Bot ID. For details about how to obtain the ID, see Creating an Application.

billing_mode

No

String

Billing mode. The default value is CONCURRENCY.

  • CONCURRENCY: billed by concurrency

  • CLIENT: billed by access terminal

  • CLIENT_TOKENS: billed by access terminal (tokens)

reuse_resource

No

Boolean

Whether to allow the use of unallocated concurrent quota (cannot be reused in device mode). By default, the unallocated concurrent quota is not used.

concurrency

No

Integer

Definition:

Concurrent dialogs.

Constraints:

By default, this parameter is not specified. Only after specifying the number of concurrent dialogs can you start an intelligent interaction task.

Value range:

0~1024

client_nums

No

Integer

Definition:

Number of allowed access terminals.

Value range:

0~1024

default_language

No

String

Default language, which is used by the intelligent interaction APIs. Default value: CN

  • CN: Simplified Chinese

  • EN: English

  • ESP: Spanish (supported only outside China)

  • por: Portuguese (supported only outside China)

  • Arabic: Arabic (supported only outside China)

  • Thai: Thai (supported only outside China)

background_config

No

BackgroundConfigInfo object

Background setting. If background_config or background_type is set to null, the background setting is empty.

layer_config

No

Array of LayerConfig objects

Overlay configuration.

review_config

No

ReviewConfig object

Content review configuration

chat_subtitle_config

No

ChatSubtitleConfig object

Dialog subtitle configuration

chat_video_type

No

String

Interactive dialog device.

  • COMPUTER: PC

  • MOBILE: mobile phone

  • Hub: large screen

exit_mute_threshold

No

Integer

Definition:

Inactivity timeout after which a dialog automatically closes.

Value range:

0~1440

enable_semantic_action

No

Boolean

Whether to load model assets preferentially.

chat_resource_config

No

Array of ChatResourceConfig objects

Resource configuration.
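
The enumerations and value ranges in Table 3 can be checked on the client before a request is sent. The following is a minimal sketch that covers only the top-level fields whose allowed values are listed above. The function name validate_dialog_body and the wording of the messages are hypothetical, and the service may apply further checks that are not described here.

```python
# Allowed values copied from Table 3.
BILLING_MODES = {"CONCURRENCY", "CLIENT", "CLIENT_TOKENS"}
DEVICE_TYPES = {"COMPUTER", "MOBILE", "Hub"}
LANGUAGES = {"CN", "EN", "ESP", "por", "Arabic", "Thai"}

# Integer fields and their documented inclusive ranges.
INT_RANGES = {
    "concurrency": (0, 1024),
    "client_nums": (0, 1024),
    "exit_mute_threshold": (0, 1440),
}


def validate_dialog_body(body: dict) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    if not body.get("room_name"):
        problems.append("room_name is mandatory")
    for field, allowed in (
        ("billing_mode", BILLING_MODES),
        ("chat_video_type", DEVICE_TYPES),
        ("default_language", LANGUAGES),
    ):
        if field in body and body[field] not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")
    for field, (low, high) in INT_RANGES.items():
        if field in body and not low <= body[field] <= high:
            problems.append(f"{field} must be in {low}~{high}")
    return problems
```

Returning a list instead of raising on the first problem lets a caller report every issue at once.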

Table 4 VideoConfig

Parameter

Mandatory

Type

Description

clip_mode

No

String

Definition

Clipping mode of the output video.

Constraints

N/A

Range

  • RESIZE: video scaling

  • CROP: video cropping

Default value:

RESIZE

codec

No

String

Definition

Video encoding format and video file format.

Constraints

Only virtual avatar video production supports VP8 and QTRLE encoding. When QTRLE encoding is used, the text for text-based control must contain fewer than 1,500 characters, and the audio for audio-based control must be shorter than 5 minutes.

You can use QTRLE encoding only after being whitelisted.

Range

  • H264: H.264 encoding, MP4 file output

  • VP8: VP8 encoding, WebM file output

  • QTRLE: QTRLE encoding, MOV file output

Default Value

H264

bitrate

Yes

Integer

Definition

Average output bitrate. Unit: kbit/s

Constraints

  • Quality is prioritized for virtual avatar video production, which may exceed the preset bitrate.

  • Bitrate range for virtual avatar video production: [1000, 8000].

Default Value

N/A

Value range:

40~30000

width

Yes

Integer

Definition

Video width. Unit: pixel

Constraints

  • When clip_mode is set to RESIZE, the following resolutions are supported: 1920 x 1080, 1080 x 1920, 1280 x 720, 720 x 1280, 3840 x 2160, and 2160 x 3840. 4K is available only when the virtual avatar model supports 4K.

  • When clip_mode is set to CROP, (dx, dy) is the origin, and the width is the actual width of the reserved video.

  • Virtual avatar livestreaming and intelligent interaction support only 1080 x 1920 and 1920 x 1080.

Default Value

N/A

Value range:

0~3840

height

Yes

Integer

Definition

Video height.

Unit: pixel

Constraints

  • When clip_mode is set to RESIZE, the following resolutions are supported: 1920 x 1080, 1080 x 1920, 1280 x 720, 720 x 1280, 3840 x 2160, and 2160 x 3840.

  • When clip_mode is set to CROP, (dx, dy) is the origin, and the height is the actual height of the reserved video.

  • Virtual avatar livestreaming and intelligent interaction support only 1080 x 1920 and 1920 x 1080.

Default Value

N/A

Value range:

0~3840

frame_rate

No

String

Definition

Frame rate. Unit: FPS

Constraints

The virtual avatar video frame rate is fixed at 25 FPS.

Default value:

25

is_subtitle_enable

No

Boolean

Definition

Whether the output video is subtitled.

Constraints

Subtitles are not supported for virtual avatar livestreaming.

Range

  • true: subtitling enabled

  • false: subtitling disabled

Default value:

false

subtitle_config

No

SubtitleConfig object

Subtitle configuration.

dx

No

Integer

Definition

Horizontal coordinate of the pixel in the upper left corner of the cropped video.

NOTE:
The image layout size is based on the model resolution. For example, for a model with the resolution of 1920 x 1080, the value of dx ranges from 0 to 1920.

Constraints

This parameter takes effect when clip_mode is set to CROP.

Default Value

N/A

Value range:

-1920~3840

dy

No

Integer

Definition

Vertical coordinate of the pixel in the upper left corner of the cropped video.

NOTE:
The image layout size is based on the model resolution. For example, for a model with the resolution of 1920 x 1080, the value of dy ranges from 0 to 1080.

Constraints

This parameter takes effect when clip_mode is set to CROP.

Default Value

N/A

Value range:

-1920~3840

is_enable_super_resolution

No

Boolean

Definition

Whether super resolution is enabled for a video.

Constraints

This parameter is available only for virtual avatar video production.

Range

  • true: enabled

  • false: disabled

Default value:

false

is_end_at_first_frame

No

Boolean

Definition

Whether the end frame of a video is the same as the start frame. Set this parameter to true if multiple virtual avatar videos need to be seamlessly merged.

Constraints

This parameter is supported only for virtual avatar video production. This setting becomes invalid after an action tag is inserted during video production.

Range

  • true: enabled

  • false: disabled

Default value:

false

output_external_url

No

String

External URL to which a video file is uploaded.

NOTE:
  • You can upload a video to an external URL only after being whitelisted.

is_vocabulary_config_enable

No

Boolean

Definition

Whether to apply the pronunciation configuration of the current tenant.

Constraints

This parameter is available only for virtual avatar video production.

Range

  • true: enabled

  • false: disabled

Default value:

true
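
According to the notes in Table 3 and Table 4, intelligent interaction uses only codec=H264, bitrate, width, height, and frame_rate, and supports only 1920 x 1080 and 1080 x 1920. The following is a minimal sketch of a builder that enforces those limits on the client. The function names are hypothetical, and the bitrate check uses the general 40~30000 range because no narrower range is stated for intelligent interaction.

```python
# Resolutions listed for virtual avatar livestreaming and intelligent interaction.
SUPPORTED_RESOLUTIONS = {(1920, 1080), (1080, 1920)}


def is_supported_resolution(width: int, height: int) -> bool:
    """Check a width/height pair against the two supported resolutions."""
    return (width, height) in SUPPORTED_RESOLUTIONS


def interactive_video_config(width: int, height: int, bitrate: int) -> dict:
    """Build a video_config that contains only the supported fields."""
    if not is_supported_resolution(width, height):
        raise ValueError("only 1920 x 1080 and 1080 x 1920 are supported")
    if not 40 <= bitrate <= 30000:
        raise ValueError("bitrate must be in 40~30000 kbit/s")
    return {
        "codec": "H264",
        "bitrate": bitrate,
        "width": width,
        "height": height,
        "frame_rate": "25",  # frame_rate is documented as a String, default 25
    }
```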

Table 5 SubtitleConfig

Parameter

Mandatory

Type

Description

dx

No

Integer

Definition

X coordinate of the pixel in the lower left corner of the subtitle box.

Constraints

N/A

Default Value

N/A

Value range:

0~1920

dy

No

Integer

Definition

Y coordinate of the pixel in the lower left corner of the subtitle box.

Constraints

N/A

Default Value

N/A

Value range:

0~1920

h

No

Integer

Definition

Subtitle box height.

Constraints

The parameter h is used to facilitate the calculation of the coordinates in the upper left corner of the subtitle box. This parameter is not used by the backend.

Value range:

0~1920

w

No

Integer

Definition

Subtitle box width.

Constraints

  • The subtitle box width is fixed at 80% of the screen width.

  • The parameter w is used to facilitate the calculation of the coordinates in the upper left corner of the subtitle box. This parameter is not used by the backend.

Value range:

0~1920

font_name

No

String

Definition

Font. For details about the supported fonts, see Supported Fonts.

Constraints

N/A

Range

The value can contain 0 to 64 characters.

Default value:

HarmonyOS_Sans_SC_Black

font_size

No

Integer

Definition

Font size. The API accepts values from 0 to 120, but the effective range is 24 to 120. Use a value in the effective range.

Constraints

N/A

Value range:

0~120

Default value:

54

font_color

No

String

Definition

RGB color value of the subtitle font.

Constraints

None.

Range

The value has a fixed length and contains 0 to 7 characters.

Default value:

#FFFFFF

stroke_color

No

String

Definition

RGB color value of the subtitle font stroke.

Constraints

None.

Range

The value has a fixed length and contains 0 to 7 characters.

stroke_thickness

No

Float

Definition

Pixel value of the subtitle font stroke.

Constraints

None.

Value range:

0~50

opacity

No

Float

Definition

Subtitle font opacity. 0 indicates 100% transparency and 1 indicates 100% opacity. The default value is 1.

Constraints

None.

Value range:

0~1

Default value:

1
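
Two SubtitleConfig values are easy to get wrong: font_size, whose effective range (24 to 120) is narrower than the accepted range, and opacity, where 0 means fully transparent. The following is a minimal sketch of two client-side helpers with hypothetical names. This page does not say whether the service clamps or rejects a font size below 24, so the first helper clamps the value locally.

```python
def clamp_subtitle_font_size(requested: int) -> int:
    """Clamp a font size into the effective range 24 to 120."""
    return max(24, min(120, requested))


def opacity_from_transparency_percent(percent: float) -> float:
    """Convert a transparency percentage (0 to 100) to an opacity value (0 to 1)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # 0% transparent maps to opacity 1; 100% transparent maps to opacity 0.
    return 1 - percent / 100
```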

Table 6 VoiceConfig

Parameter

Mandatory

Type

Description

voice_asset_id

Yes

String

Definition

Timbre asset ID, which can be queried from the asset library.

For details about how to query voice IDs, see Querying Preset Voice IDs.

Constraints

N/A

Range

The value can contain 1 to 256 characters.

Default Value

N/A

speed

No

Integer

Definition

Speaking speed. 50 indicates 0.5x speaking speed, 100 indicates normal speaking speed, and 200 indicates 2x speaking speed.

The value 100 indicates the normal speaking speed of an adult, which is about 250 words per minute.

Constraints

N/A

Value range:

50~200

Default value:

100

pitch

No

Integer

Definition

Pitch.

Constraints

N/A

Value range:

50~200

Default value:

100

volume

No

Integer

Definition

Volume.

Constraints

N/A

Value range:

90~240

Default value:

140
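
The speed value is a percentage of normal speaking speed: 50 is 0.5x, 100 is 1x, and 200 is 2x. The following is a minimal sketch of converting a multiplier to that integer. The function name speed_from_multiplier is hypothetical.

```python
def speed_from_multiplier(multiplier: float) -> int:
    """Convert a speaking-speed multiplier (0.5 to 2.0) to the integer speed value."""
    speed = round(multiplier * 100)
    if not 50 <= speed <= 200:
        raise ValueError("speed must be in 50~200")
    return speed
```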

Table 7 ChatVoiceConfig

Parameter

Mandatory

Type

Description

voice_asset_id

No

String

Speech synthesis feature string

speed

No

Integer

Speaking speed. The value ranges from 50 to 200 and defaults to 100.

NOTE:
The value 100 indicates the normal speaking speed of an adult, which is about 250 words per minute.

Value range:

50~200

Default value:

100

pitch

No

Integer

Pitch. The value ranges from 50 to 200 and defaults to 100.

Value range:

50~200

Default value:

100

volume

No

Integer

Volume. The value ranges from 90 to 240 and defaults to 140.

Value range:

90~240

Default value:

140

provider

No

String

Third-party TTS vendor. Options:

  • XIMALAYA: Himalaya TTS

  • HUAWEI_EI: EI TTS

  • MOBVOI: Mobvoi TTS

language

No

String

Language type. Default value: CN

  • CN: Simplified Chinese

  • EN: English

  • ESP: Spanish (supported only outside China)

  • por: Portuguese (supported only outside China)

  • Arabic: Arabic (supported only outside China)

  • Thai: Thai (supported only outside China)

Default value:

CN
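
A voice_config_list value is an array of ChatVoiceConfig objects. The following is a minimal sketch of building such a list with the documented defaults and ranges. The function name chat_voice_config and the asset ID placeholders are hypothetical, and using one entry per language is an assumption about how the list is meant to be used.

```python
# Documented inclusive ranges for the integer fields of ChatVoiceConfig.
RANGES = {"speed": (50, 200), "pitch": (50, 200), "volume": (90, 240)}


def chat_voice_config(voice_asset_id, language="CN", speed=100, pitch=100, volume=140):
    """Build one ChatVoiceConfig entry, rejecting out-of-range values."""
    entry = {
        "voice_asset_id": voice_asset_id,
        "language": language,
        "speed": speed,
        "pitch": pitch,
        "volume": volume,
    }
    for field, (low, high) in RANGES.items():
        if not low <= entry[field] <= high:
            raise ValueError(f"{field} must be in {low}~{high}")
    return entry


# Placeholder asset IDs; replace them with IDs from your asset library.
voice_config_list = [
    chat_voice_config("<cn-voice-asset-id>"),
    chat_voice_config("<en-voice-asset-id>", language="EN", speed=110),
]
```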

Table 8 BackgroundConfigInfo

Parameter

Mandatory

Type

Description

background_type

Yes

String

Definition

Background type.

Constraints

N/A

Range

  • IMAGE: image background, which is used as the virtual avatar video background

  • COLOR: solid color background. The RGB value of the specified color is used as the virtual avatar video background.

Default Value

N/A

background_config

No

String

Definition

Background file URL.

Constraints

  • External URLs are allowed only for livestreaming. For other services, obtain a URL from the asset library.

  • This parameter is mandatory when background_type is set to IMAGE.

Range

The value contains 1 to 2,048 characters.

Default Value

N/A

background_color_config

No

String

Definition

RGB color value of a solid color background.

Constraints

This parameter is mandatory when background_type is set to COLOR.

Range

The value contains 0 to 16 characters.

Default value:

#FFFFFF

background_asset_id

No

String

Definition

Background asset ID.

NOTE:
If a background image is used, enter the image asset ID.

Constraints

N/A

Range

The value can contain 0 to 64 characters.

Default Value

N/A

background_image_config

No

BackgroundImageConfig object

Background image size and position setting.
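
Table 8 makes background_config mandatory for IMAGE backgrounds and background_color_config mandatory for COLOR backgrounds. The following is a minimal sketch of checking those conditional requirements on the client. The function name background_config_problems is hypothetical.

```python
def background_config_problems(cfg: dict) -> list[str]:
    """Return the conditional-requirement problems in a BackgroundConfigInfo dict."""
    problems = []
    kind = cfg.get("background_type")
    if kind not in ("IMAGE", "COLOR"):
        problems.append("background_type must be IMAGE or COLOR")
    if kind == "IMAGE" and not cfg.get("background_config"):
        problems.append("background_config is mandatory when background_type is IMAGE")
    if kind == "COLOR" and not cfg.get("background_color_config"):
        problems.append("background_color_config is mandatory when background_type is COLOR")
    return problems
```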

Table 9 BackgroundImageConfig

Parameter

Mandatory

Type

Description

dx

Yes

Integer

Definition

X-axis position of the pixel in the upper left corner of the background image. The upper left corner of the preview area is the origin (0, 0).

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Constraints

The background image must cover the entire preview area. That is, dx ≤ 0, dx + width ≥ 1920 in landscape mode, and dx + width ≥ 1080 in portrait mode.

Value range:

-5760~0

Default value:

0

dy

Yes

Integer

Definition

Y-axis position of the pixel in the upper left corner of the background image. The upper left corner of the preview area is the origin (0, 0).

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Constraints

The background image must cover the entire preview area. That is, dy ≤ 0, dy + height ≥ 1080 in landscape mode, and dy + height ≥ 1920 in portrait mode.

Value range:

-5760~0

Default value:

0

width

Yes

Integer

Definition

Width (in pixels) of the background image (relative to the preview area size).

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Constraints

The background image must cover the entire preview area. That is, width > 1080, dx + width ≥ 1920 in landscape mode, and dx + width ≥ 1080 in portrait mode.

Value range:

1~7680

height

Yes

Integer

Definition

Height (in pixels) of the background image (relative to the preview area size).

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Constraints

The background image must cover the entire preview area. That is, height > 1080, dy + height ≥ 1080 in landscape mode, and dy + height ≥ 1920 in portrait mode.

Value range:

1~7680
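
The coverage constraints in Table 9 reduce to four inequalities on dx, dy, width, and height. The following is a minimal sketch of that check. The function name covers_preview is hypothetical, and the sketch checks only the coverage inequalities, not the separate width > 1080 and height > 1080 conditions.

```python
def covers_preview(dx: int, dy: int, width: int, height: int, landscape: bool = True) -> bool:
    """Check whether a background image covers the whole preview area."""
    # Preview area: 1920 x 1080 in landscape mode, 1080 x 1920 in portrait mode.
    preview_w, preview_h = (1920, 1080) if landscape else (1080, 1920)
    return (
        dx <= 0
        and dy <= 0
        and dx + width >= preview_w
        and dy + height >= preview_h
    )
```

For example, an image placed at dx = -100 must be at least 2020 pixels wide to cover a landscape preview.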

Table 10 LayerConfig

Parameter

Mandatory

Type

Description

layer_type

Yes

String

Definition

Layer type.

Constraints

N/A

Range

  • HUMAN: person layer

  • IMAGE: image layer

  • VIDEO: video layer

  • TEXT: text layer

Default Value

N/A

asset_id

No

String

Definition

ID of the asset overlaid on a video. You do not need to set this parameter for external assets.

Constraints

N/A

Range

The value can contain 0 to 64 characters.

Default Value

N/A

group_id

No

String

Definition

Groups materials in multiple scenes. Materials with the same group_id share location information when they are applied globally.

Constraints

N/A

Range

The value can contain 0 to 64 characters.

Default Value

N/A

sequence_no

No

Integer

Definition

Overlay of the paragraph currently being shown. This field is forward compatible and optional.

This parameter is valid only for livestreaming.

Constraints

The paragraph is subject to sequence_no.

Default Value

N/A

Value range:

0~2147483647

position

No

LayerPositionConfig object

Layer position configuration.

size

No

LayerSizeConfig object

Layer size configuration.

rotation

No

LayerRotationConfig object

Overlay rotation configuration.

image_config

No

ImageLayerConfig object

Image layer configuration.

video_config

No

VideoLayerConfig object

Video overlay configuration.

text_config

No

TextLayerConfig object

Material text layer configuration.

Table 11 LayerPositionConfig

Parameter

Mandatory

Type

Description

dx

Yes

Integer

Definition

X-axis position of the pixel in the upper left corner of the image. The upper left corner of the image layout is the origin (0, 0).

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Constraints

The value is the pixel value relative to the image layout. It indicates only the layout position relationship and is irrelevant to the resolution of the output image.

Value range:

-1920~3840

Default value:

0

dy

Yes

Integer

Definition

Y-axis position of the pixel in the upper left corner of the image. The upper left corner of the image layout is the origin (0, 0).

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Constraints

The value is the pixel value relative to the image layout. It indicates only the layout position relationship and is irrelevant to the resolution of the output image.

Value range:

-1920~3840

Default value:

0

layer_index

Yes

Integer

Definition

Overlay sequence of an image, video, or person image.

NOTE:
The overlay sequence is an integer starting from 1 and incremented by 1.

Constraints

If multiple layers have the same layer_index, the stacking order among those layers is random.

Value range:

1~100

Default value:

100
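
Because layers that share a layer_index are stacked in random order, it can help to detect duplicates before the request is sent. The following is a minimal sketch. The function name duplicate_layer_indexes and the sample layers are hypothetical.

```python
from collections import Counter


def duplicate_layer_indexes(layer_config: list[dict]) -> list[int]:
    """Return the layer_index values used by more than one layer."""
    counts = Counter(
        layer["position"]["layer_index"]
        for layer in layer_config
        if "position" in layer  # position is optional in LayerConfig
    )
    return sorted(index for index, count in counts.items() if count > 1)


# Sample layer_config value; the IMAGE and TEXT layers collide on index 2.
example_layers = [
    {"layer_type": "HUMAN", "position": {"dx": 0, "dy": 0, "layer_index": 1}},
    {"layer_type": "IMAGE", "position": {"dx": 10, "dy": 10, "layer_index": 2}},
    {"layer_type": "TEXT", "position": {"dx": 5, "dy": 5, "layer_index": 2}},
    {"layer_type": "VIDEO"},
]
```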

Table 12 LayerSizeConfig

Parameter

Mandatory

Type

Description

width

No

Integer

Definition

Width (in pixels) of the image overlay (relative to the preview area size).

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Constraints

The value is the pixel value relative to the image layout. It indicates only the layout position relationship and is irrelevant to the resolution of the output image.

Value range:

1~7680

height

No

Integer

Definition

Height (in pixels) of the image overlay (relative to the preview area size).

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Constraints

The value is the pixel value relative to the image layout. It indicates only the layout position relationship and is irrelevant to the resolution of the output image.

Value range:

1~7680

Table 13 LayerRotationConfig

Parameter

Mandatory

Type

Description

angle

No

Integer

Definition

Rotation angle.

Range

0 to 360 degrees

Default Value

0 degrees

Constraints

The material is rotated around the center point.

Video materials cannot be rotated.

Value range:

0~360

Table 14 ImageLayerConfig

Parameter

Mandatory

Type

Description

image_url

No

String

Definition

Image file URL.

Constraints

  • External URLs are allowed only for livestreaming. For other services, obtain a URL from the asset library.

Range

The value contains 1 to 2,048 characters.

Default Value

N/A

Table 15 VideoLayerConfig

Parameter

Mandatory

Type

Description

video_url

No

String

Definition

Video file URL.

Constraints

  • External URLs are allowed only for livestreaming. For other services, obtain a URL from the asset library.

Range

The value contains 1 to 2,048 characters.

Default Value

N/A

video_cover_url

No

String

Definition

Video thumbnail file URL.

Constraints

  • External URLs are allowed only for livestreaming. For other services, obtain a URL from the asset library.

Range

The value contains 1 to 2,048 characters.

Default Value

N/A

loop_count

No

Integer

Definition

Number of times that a video is played cyclically.

Options:

  • 0: not played

  • -1: played cyclically

Constraints

N/A

Value range:

-1~100

Default value:

-1

video_sound

No

Integer

Definition

The percentage used to adjust the volume of the video overlay. The value ranges from 0 to 100.

The default value 0 indicates the audio is muted.

Constraints

N/A

Value range:

0~100

is_play_the_entire_video

No

Boolean

Definition

Whether to play the entire video. true indicates that the entire video is played. false indicates that the video stops playing when the inserted scene text or audio ends.

Options:

The default value is false.

Constraints

N/A

Table 16 TextLayerConfig

Parameter

Mandatory

Type

Description

text_context

No

String

Definition

Text of the text layer. The content must be encoded using Base64.

For example, to add the text watermark "测试文字水印" ("Test text watermark"), set text_context to 5rWL6K+V5paH5a2X5rC05Y2w, which is the Base64 encoding of the Chinese text in UTF-8.

Constraints

N/A

Range

The value contains 0 to 1,024 characters.

Default Value

N/A

font_name

No

String

Font. For details about the supported fonts, see Supported Fonts.

Constraints

N/A

Range

The value can contain 0 to 64 characters.

Default value:

HarmonyOS_Sans_SC_Black

font_size

No

Integer

Definition

Font size (in pixels). The API accepts values from 0 to 120, but the effective range is 4 to 120. Use a value in the effective range.

Constraints

N/A

Value range:

0~120

Default value:

16

font_color

No

String

Definition

Font color. RGB color value.

Constraints

N/A

Range

The value contains 0 to 16 characters.

Default value:

#FFFFFF
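
The text_context value must be Base64-encoded. The sample value given for text_context decodes, as UTF-8, to the Chinese text 测试文字水印 ("test text watermark"). The following is a minimal sketch of encoding and decoding such values. The function names are hypothetical, and UTF-8 is an assumption that is consistent with the sample value.

```python
import base64


def encode_text_context(text: str) -> str:
    """Encode layer text as Base64 (UTF-8 assumed) for text_context."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text_context(value: str) -> str:
    """Decode a text_context value back to readable text."""
    return base64.b64decode(value).decode("utf-8")
```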

Table 17 ReviewConfig

Parameter

Mandatory

Type

Description

no_need_review

No

Boolean

Content review whitelist. This feature is available only for users in the whitelist. The auto review policies apply to other users.

Table 18 ChatSubtitleConfig

Parameter

Mandatory

Type

Description

dx

No

Integer

Details:

X coordinate of the pixel in the upper left corner of the subtitle box.

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Value range:

0~1920

dy

No

Integer

Details:

Y coordinate of the pixel in the upper left corner of the subtitle box.

The video resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Value range:

0~1920

width

No

Integer

Details:

Width (in pixels) of the layer image (relative to the image layout size).

The image layout resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Value range:

1~7680

height

No

Integer

Details:

Height (in pixels) of the layer image (relative to the image layout size).

The image layout resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16).

Value range:

1~7680

Table 19 ChatResourceConfig

Parameter

Mandatory

Type

Description

resource_id

No

String

Resource ID

count_resource

No

Integer

Resource quantity

Value range:

0~10000

Response Parameters

Status code: 200

Table 20 Response header parameters

Parameter

Type

Description

X-Request-Id

String

Request ID.

Table 21 Response body parameters

Parameter

Type

Description

room_id

String

Interactive dialog ID

Status code: 400

Table 22 Response body parameters

Parameter

Type

Description

error_code

String

Error code.

error_msg

String

Error description.

Status code: 401

Table 23 Response body parameters

Parameter

Type

Description

error_code

String

Error code.

error_msg

String

Error description.

Status code: 404

Table 24 Response body parameters

Parameter

Type

Description

error_code

String

Error code.

error_msg

String

Error description.

Status code: 500

Table 25 Response body parameters

Parameter

Type

Description

error_code

String

Error code.

error_msg

String

Error description.

Example Requests

POST https://{endpoint}/v1/70b76xxxxxx34253880af501cdxxxxxx/smart-chat-rooms

{
  "room_name" : "The Legend of Nature",
  "room_description" : "Courseware"
}
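
The example request above can be assembled with the Python standard library. The following is a minimal sketch that uses token authentication and only builds the request without sending it. The function name, the example.com endpoint, and the token placeholder are hypothetical. The Content-Type header is an assumption because it is not listed in Table 2.

```python
import json
import urllib.request


def build_create_dialog_request(endpoint: str, project_id: str, token: str, body: dict):
    """Build (but do not send) the POST request that creates an interactive dialog."""
    url = f"https://{endpoint}/v1/{project_id}/smart-chat-rooms"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # assumption, not listed in Table 2
            "X-Auth-Token": token,
        },
        method="POST",
    )


# Placeholder endpoint and token; the project ID is the masked one from the example.
request = build_create_dialog_request(
    "example.com",
    "70b76xxxxxx34253880af501cdxxxxxx",
    "<token>",
    {"room_name": "The Legend of Nature", "room_description": "Courseware"},
)
```

To send the request, pass it to urllib.request.urlopen with a real endpoint and token.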

Example Responses

Status code: 200

Succeeded.

{
  "room_id" : "26f06524-4f75-4b3a-a853-b649a21aaf66"
}

Status code: 400

Parameter error. The response includes the error code and its description.

{
  "error_code" : "MSS.00000003",
  "error_msg" : "Invalid parameter"
}

Status code: 401

Authentication is not performed or fails.

{
  "error_code" : "MSS.00000001",
  "error_msg" : "Unauthorized"
}

Status code: 404

Not found.

{
  "error_code" : "MSS.00000002",
  "error_msg" : "Not Found"
}

Status code: 500

Internal service error.

{
  "error_code" : "MSS.00000004",
  "error_msg" : "Internal Error"
}
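
The example responses above have two shapes: a room_id on success, and an error_code with an error_msg otherwise. The following is a minimal sketch of handling both. The exception class and function name are hypothetical, and the sketch assumes that every non-200 response carries a JSON body like the examples.

```python
import json


class DialogApiError(Exception):
    """Raised when the API returns a status code other than 200."""

    def __init__(self, status, error_code, error_msg):
        super().__init__(f"{status} {error_code}: {error_msg}")
        self.status = status
        self.error_code = error_code
        self.error_msg = error_msg


def parse_create_dialog_response(status: int, body: str) -> str:
    """Return room_id for a 200 response; raise DialogApiError otherwise."""
    payload = json.loads(body)
    if status == 200:
        return payload["room_id"]
    raise DialogApiError(status, payload.get("error_code"), payload.get("error_msg"))
```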

Status Codes

Status Code

Description

200

Succeeded.

400

Parameter error. The response includes the error code and its description.

401

Authentication is not performed or fails.

404

Not found.

500

Internal service error.

Error Codes

See Error Codes.