Creating a Virtual Avatar Video Production Task
Function
Creates a virtual avatar video production task.
Calling Method
For details, see Calling APIs.
URI
POST /v1/{project_id}/2d-digital-human-videos
Parameter | Mandatory | Type | Description |
---|---|---|---|
project_id | Yes | String | Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |
Request Parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
X-Auth-Token | No | String | User token. This parameter is mandatory when token authentication is used. Obtain the token by calling the IAM API for obtaining a user token; the token is the value of X-Subject-Token in the response header. |
Authorization | No | String | Authentication information. This parameter is mandatory for AK/SK authentication. |
X-Sdk-Date | No | String | Time when the request is sent. This parameter is mandatory for AK/SK authentication. The format is YYYYMMDD'T'HHMMSS'Z'. |
X-Project-Id | No | String | Project ID. This parameter is mandatory for AK/SK authentication. |
X-App-UserId | No | String | Third-party user ID. Chinese characters are not allowed. |
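The header rules above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names sdk_date and token_headers are hypothetical, the Content-Type header is an assumption for a JSON body, and the AK/SK Authorization signature is not computed because this page does not describe the signing algorithm.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def sdk_date(now: Optional[datetime] = None) -> str:
    """Format a timestamp in UTC as YYYYMMDD'T'HHMMSS'Z' for X-Sdk-Date."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def token_headers(token: str, app_user_id: Optional[str] = None) -> Dict[str, str]:
    """Build headers for token authentication (hypothetical helper)."""
    # Content-Type is an assumption; it is not listed in the table above.
    headers = {"X-Auth-Token": token, "Content-Type": "application/json"}
    if app_user_id is not None:
        headers["X-App-UserId"] = app_user_id
    return headers
```

For AK/SK authentication, the value returned by sdk_date would go in X-Sdk-Date alongside Authorization and X-Project-Id.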
Parameter | Mandatory | Type | Description |
---|---|---|---|
script_id | No | String | Script ID. |
model_asset_id | No | String | Virtual avatar model asset ID, which can be queried from the asset library. |
voice_config | No | VoiceConfig object | Timbre configuration. |
video_config | No | VideoConfig object | Video output configuration. |
shoot_scripts | No | Array of ShootScriptItem objects | Shooting script list. |
output_asset_config | No | OutputAssetConfig object | Output asset information configuration. |
background_music_config | No | BackgroundMusicConfig object | Background music configuration. |
review_config | No | ReviewConfig object | Content review configuration. |
callback_config | No | CallBackConfig object | Callback configuration. |
Parameter | Mandatory | Type | Description |
---|---|---|---|
voice_asset_id | Yes | String | Details: Timbre asset ID, which can be queried from the asset library. Constraints: N/A Options: The value contains 1 to 256 characters. Default value: N/A |
speed | No | Integer | Details: Speaking speed. 50 indicates 0.5x speed, 100 indicates normal speed (that of an adult, about 150 words per minute), and 200 indicates 2x speed. Constraints: N/A Value range: 50 to 200 Default value: 100 |
pitch | No | Integer | Details: Pitch. Constraints: N/A Value range: 50 to 200 Default value: 100 |
volume | No | Integer | Details: Volume. Constraints: N/A Value range: 90 to 240 Default value: 140 |
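As an illustration of the ranges and defaults listed above, the following Python sketch builds a voice_config object and rejects out-of-range values before the request is sent. The function name build_voice_config is hypothetical, and client-side validation is an optional convenience, not something the API requires.

```python
from typing import Dict, Union

# (minimum, maximum, default) for each field, copied from the table above.
VOICE_RANGES = {
    "speed": (50, 200, 100),
    "pitch": (50, 200, 100),
    "volume": (90, 240, 140),
}


def build_voice_config(voice_asset_id: str, **levels: int) -> Dict[str, Union[str, int]]:
    """Return a voice_config dict, filling in defaults for omitted fields."""
    if not 1 <= len(voice_asset_id) <= 256:
        raise ValueError("voice_asset_id must contain 1 to 256 characters")
    config: Dict[str, Union[str, int]] = {"voice_asset_id": voice_asset_id}
    for name, (low, high, default) in VOICE_RANGES.items():
        value = levels.pop(name, default)
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        config[name] = value
    if levels:
        raise ValueError(f"unknown fields: {sorted(levels)}")
    return config
```

Under the mapping described for speed, a value of 150 would correspond to 1.5x the normal speaking speed.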
Parameter | Mandatory | Type | Description |
---|---|---|---|
clip_mode | No | String | Details: Clipping mode of the output video. Constraints: N/A Default value: RESIZE |
codec | Yes | String | Details: Video encoding format and video file format. Constraints: Only virtual avatar video production supports VP8 encoding. Default value: N/A |
bitrate | Yes | Integer | Details: Average output bitrate. Unit: kbit/s. Default value: N/A Value range: 40 to 30000 |
width | Yes | Integer | Details: Video width. Unit: pixel. Default value: N/A Value range: 0 to 3840 |
height | Yes | Integer | Details: Video height. Unit: pixel. Default value: N/A Value range: 0 to 3840 |
frame_rate | No | String | Details: Frame rate. Unit: FPS. Constraints: The virtual avatar video frame rate is fixed at 25 FPS. Default value: 25 |
is_subtitle_enable | No | Boolean | Details: Whether the output video is subtitled. Constraints: Subtitles are not supported for virtual avatar livestreaming. Default value: false |
subtitle_config | No | SubtitleConfig object | Subtitle configuration. |
disable_system_watermark | No | Boolean | Details: Whether the system watermark is disabled for the output video. Constraints: Currently, this parameter takes effect only for trustlisted tenants. Default value: false |
dx | No | Integer | Details: Horizontal coordinate of the pixel in the upper left corner of the cropped video. The image layout size is based on the model resolution. For example, for a model with a resolution of 1920 x 1080, the value of dx ranges from 0 to 1920. Constraints: This parameter takes effect when clip_mode is set to CROP. Default value: N/A Value range: -1920 to 3840 |
dy | No | Integer | Details: Vertical coordinate of the pixel in the upper left corner of the cropped video. The image layout size is based on the model resolution. For example, for a model with a resolution of 1920 x 1080, the value of dy ranges from 0 to 1080. Constraints: This parameter takes effect when clip_mode is set to CROP. Default value: N/A Value range: -1920 to 3840 |
is_enable_super_resolution | No | Boolean | Details: Whether super resolution is enabled for a video. Constraints: This parameter is available only for virtual avatar video production. Default value: false |
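The mandatory fields and value ranges of the video output configuration can be checked on the client side before a task is submitted. The following Python sketch shows one way to do that; check_video_config is a hypothetical helper, and it covers only the constraints stated in the table above.

```python
from typing import Any, Dict, List

REQUIRED = ("codec", "bitrate", "width", "height")
RANGES = {"bitrate": (40, 30000), "width": (0, 3840), "height": (0, 3840)}


def check_video_config(config: Dict[str, Any]) -> List[str]:
    """Return the problems found; an empty list means none were detected."""
    problems = [f"missing mandatory field: {name}" for name in REQUIRED if name not in config]
    for name, (low, high) in RANGES.items():
        value = config.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name}={value} is outside {low} to {high}")
    # frame_rate is a string in this API and is documented as fixed at 25 FPS.
    if config.get("frame_rate", "25") != "25":
        problems.append("frame_rate is fixed at 25 FPS for virtual avatar videos")
    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem in one pass.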
Parameter | Mandatory | Type | Description |
---|---|---|---|
dx | No | Integer | Details: Horizontal coordinate of the pixel in the lower left corner of the subtitle box. Constraints: N/A Default value: N/A Value range: 0 to 1920 |
dy | No | Integer | Details: Vertical coordinate of the pixel in the lower left corner of the subtitle box. Constraints: N/A Default value: N/A Value range: 0 to 1920 |
font_name | No | String | Details: Font. Constraints: N/A Options: The value contains 0 to 64 characters. Default value: HarmonyOS_Sans_SC_Black |
font_size | No | Integer | Details: Font size. The API accepts values from 0 to 120, but the effective range is 4 to 120; use a value within the effective range. Constraints: N/A Value range: 0 to 120 Default value: 54 |
h | No | Integer | Details: Subtitle box height. Constraints: The parameter h is used to facilitate the calculation of the coordinates in the upper left corner of the subtitle box. This parameter is not used in the background. Value range: 0 to 1920 |
w | No | Integer | Details: Subtitle box width. Value range: 0 to 1920 |
Parameter | Mandatory | Type | Description |
---|---|---|---|
sequence_no | No | Integer | Details: Script sequence number. Constraints: The sequence number of a script must be unique. Default value: N/A Value range: 0 to 2147483647 |
start_time | No | Float | Details: Start time relative to the content, in seconds. Constraints: Reserved field. Currently, only sequence_no needs to be set. Default value: N/A Value range: 0 to 2592000 |
end_time | No | Float | Details: End time relative to the content, in seconds. Constraints: Reserved field. Currently, only sequence_no needs to be set. Default value: N/A Value range: 0 to 2592000 |
shoot_script | Yes | ShootScript object | Performance script. |
subtitle_file_info | No | SubtitleFiles object | Subtitle file information. |
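Because each sequence_no must be unique and start_time and end_time are reserved, a shoot_scripts array can be generated from a list of texts by numbering the items in order. The Python sketch below illustrates this; build_shoot_scripts is a hypothetical helper and produces only text-driven scripts.

```python
from typing import Any, Dict, List


def build_shoot_scripts(texts: List[str]) -> List[Dict[str, Any]]:
    """Wrap each text in a script item with a unique, increasing sequence_no."""
    return [
        {
            "sequence_no": index,
            "shoot_script": {
                "script_type": "TEXT",
                "text_config": {"text": text},
            },
        }
        for index, text in enumerate(texts)
    ]
```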
Parameter | Mandatory | Type | Description |
---|---|---|---|
script_type | No | String | Details: Script type, that is, the control mode of video production. Constraints: N/A Options: TEXT (text control, that is, using TTS) and AUDIO (speech control). Default value: TEXT |
text_config | No | TextConfig object | Commentary configuration. |
audio_drive_action_config | No | Array of AudioDriveActionConfig objects | Action configuration for speech control. |
animation_config | No | Array of AnimationConfig objects | Action configuration. |
background_config | No | Array of BackgroundConfigInfo objects | Background configuration. |
emotion_config | No | Array of EmotionConfig objects | Emotion tag configuration. |
layer_config | No | Array of LayerConfig objects | Layer configuration. |
Parameter | Mandatory | Type | Description |
---|---|---|---|
text | Yes | String | Details: Script. Two modes are supported: plain text mode and tag mode. Constraints: The value can contain a maximum of 10,000 characters, excluding SSML tags. Options: The value contains 0 to 131,072 characters. Default value: N/A |
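The two length limits on text (10,000 characters excluding SSML tags, and 131,072 characters overall) can be approximated on the client side. The Python sketch below assumes that tags are delimited by angle brackets, as in standard SSML; the exact tag syntax accepted by the service is not described on this page, so treat the count as an estimate.

```python
import re

# Assumption: a tag is any run of text enclosed in angle brackets.
TAG_PATTERN = re.compile(r"<[^>]+>")
MAX_SPOKEN_CHARS = 10_000
MAX_TOTAL_CHARS = 131_072


def spoken_length(text: str) -> int:
    """Count the characters that remain after tags are removed."""
    return len(TAG_PATTERN.sub("", text))


def fits_text_limit(text: str) -> bool:
    """Check both the tag-free limit and the overall length limit."""
    return spoken_length(text) <= MAX_SPOKEN_CHARS and len(text) <= MAX_TOTAL_CHARS
```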
Parameter | Mandatory | Type | Description |
---|---|---|---|
action_tag | Yes | String | Action tag. |
action_name | No | String | Action name. |
action_start_time | Yes | Float | Action start time. Value range: 0 to 2592000 |
Parameter | Mandatory | Type | Description |
---|---|---|---|
background_type | Yes | String | Details: Background type. Constraints: N/A Default value: N/A |
background_title | No | String | Background title. |
human_position_2d | No | HumanPosition2D object | Position of a virtual avatar in the background image. If this parameter is not set, the virtual avatar is in the middle of the image by default. |
human_size_2d | No | HumanSize2D object | Size of a virtual avatar in the background image. |
background_cover_url | No | String | URL for downloading the thumbnail image of a video file. This parameter is valid only when the presentation material is a video. |
background_config | No | String | Details: Background file URL. Options: The value contains 1 to 2,048 characters. Default value: N/A |
background_color_config | No | String | Details: RGB color value of a solid color background. Constraints: This parameter is mandatory when background_type is set to COLOR. Options: The value contains 0 to 16 characters. Default value: #FFFFFF |
background_asset_id | No | String | Details: Background asset ID. If a background image is used, enter the image asset ID. Constraints: N/A Options: The value contains 0 to 64 characters. Default value: N/A |
Parameter | Mandatory | Type | Description |
---|---|---|---|
position | No | String | Position of a virtual avatar in the background image. If position_x and position_y are set, position does not take effect. Default value: MIDDLE |
position_x | No | Integer | X-axis position of the virtual avatar, that is, the X-axis pixel value of the center point at the bottom of the virtual avatar image. The resolution of the landscape (16:9) background image is 1920 x 1080 pixels. The resolution of the portrait (9:16) background image is 1080 x 1920 pixels. Value range: -1920 to 3840 |
position_y | No | Integer | Y-axis position of the virtual avatar, that is, the Y-axis pixel value of the center point at the bottom of the virtual avatar image. The resolution of the landscape (16:9) background image is 1920 x 1080 pixels. The resolution of the portrait (9:16) background image is 1080 x 1920 pixels. Value range: -1920 to 3840 |
Parameter | Mandatory | Type | Description |
---|---|---|---|
width | No | Integer | Width (in pixels) of a virtual avatar. The resolution of the landscape (16:9) background image is 1920 x 1080 pixels. The resolution of the portrait (9:16) background image is 1080 x 1920 pixels. Value range: 1 to 7680 |
height | No | Integer | Height (in pixels) of a virtual avatar. The resolution of the landscape (16:9) background image is 1920 x 1080 pixels. The resolution of the portrait (9:16) background image is 1080 x 1920 pixels. Value range: 1 to 7680 |
Parameter | Mandatory | Type | Description |
---|---|---|---|
emotion | No | String | Emotion tag configuration. Default value: HAPPY |
Parameter | Mandatory | Type | Description |
---|---|---|---|
layer_type | Yes | String | Details: Layer type. Constraints: N/A Default value: N/A |
asset_id | No | String | Details: ID of the asset overlaid on a video. You do not need to set this parameter for external assets. Constraints: N/A Options: The value contains 0 to 64 characters. Default value: N/A |
group_id | No | String | Details: Groups materials in multiple scenes. Materials with the same group_id share location information when they are applied globally. Constraints: N/A Options: The value contains 0 to 64 characters. Default value: N/A |
position | No | LayerPositionConfig object | Layer position configuration. |
size | No | LayerSizeConfig object | Layer size configuration. |
image_config | No | ImageLayerConfig object | Material image layer configuration. |
video_config | No | VideoLayerConfig object | Material video layer configuration. |
text_config | No | TextLayerConfig object | Material text layer configuration. |
Parameter | Mandatory | Type | Description |
---|---|---|---|
dx | Yes | Integer | Details: X-axis position of the pixel in the upper left corner of the image. The upper left corner of the image layout is at (0, 0). The image layout resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16). Constraints: The value is the pixel value relative to the image layout. It indicates only the layout position relationship and is independent of the resolution of the output image. Value range: -1920 to 3840 Default value: 0 |
dy | Yes | Integer | Details: Y-axis position of the pixel in the upper left corner of the image. The upper left corner of the image layout is at (0, 0). The image layout resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16). Constraints: The value is the pixel value relative to the image layout. It indicates only the layout position relationship and is independent of the resolution of the output image. Value range: -1920 to 3840 Default value: 0 |
layer_index | Yes | Integer | Details: Layer sequence of an image, video, or person image. The layer sequence is an integer starting from 1 and incremented by 1. Constraints: If duplicate layers exist, the overlay relationship between the duplicate layers is random. Value range: 1 to 100 Default value: 100 |
Parameter | Mandatory | Type | Description |
---|---|---|---|
width | No | Integer | Details: Width (in pixels) of the layer image (relative to the image layout size). The image layout resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16). Constraints: The value is the pixel value relative to the image layout. It indicates only the layout position relationship and is independent of the resolution of the output image. Value range: 1 to 7680 |
height | No | Integer | Details: Height (in pixels) of the layer image (relative to the image layout size). The image layout resolution is 1920 x 1080 in landscape mode (16:9) and 1080 x 1920 in portrait mode (9:16). Constraints: The value is the pixel value relative to the image layout. It indicates only the layout position relationship and is independent of the resolution of the output image. Value range: 1 to 7680 |
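Layer positions and sizes are expressed in layout pixels (1920 x 1080 in landscape mode, 1080 x 1920 in portrait mode), regardless of the output resolution. As a worked example, the Python sketch below computes the upper left corner that centers a layer in the layout. The helper name centered_position is hypothetical, and rounding down to an integer is a choice made for this sketch.

```python
from typing import Dict, Tuple

# Layout resolutions stated in the tables above.
LAYOUTS: Dict[str, Tuple[int, int]] = {
    "landscape": (1920, 1080),
    "portrait": (1080, 1920),
}


def centered_position(
    layer_width: float,
    layer_height: float,
    orientation: str = "landscape",
    layer_index: int = 1,
) -> Dict[str, int]:
    """Return dx, dy, and layer_index for a layer centered in the layout."""
    layout_width, layout_height = LAYOUTS[orientation]
    return {
        "dx": int((layout_width - layer_width) // 2),
        "dy": int((layout_height - layer_height) // 2),
        "layer_index": layer_index,
    }
```

For a 607.5 x 1080 layer in the landscape layout, this gives dx 656 and dy 0, the position used for the HUMAN layer in the example request.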
Parameter | Mandatory | Type | Description |
---|---|---|---|
image_url | No | String | Details: Image file URL. Constraints: N/A Options: The value contains 1 to 2,048 characters. Default value: N/A |
Parameter | Mandatory | Type | Description |
---|---|---|---|
video_url | No | String | Details: Video file URL. Constraints: N/A Options: The value contains 1 to 2,048 characters. Default value: N/A |
video_cover_url | No | String | Details: Video thumbnail file URL. Constraints: N/A Options: The value contains 1 to 2,048 characters. Default value: N/A |
loop_count | No | Integer | Details: Number of times that a video is played in a loop. Constraints: N/A Value range: -1 to 100 Default value: -1 |
Parameter | Mandatory | Type | Description |
---|---|---|---|
text_context | No | String | Details: Text of the text layer. The content must be encoded using Base64. For example, to add the text watermark "测试文字水印" (Test text watermark), set text_context to 5rWL6K+V5paH5a2X5rC05Y2w. Constraints: N/A Options: The value contains 0 to 1,024 characters. Default value: N/A |
font_name | No | String | Details: Font. Constraints: N/A Options: For details about the supported fonts, see Supported Fonts. Default value: HarmonyOS_Sans_SC_Black |
font_size | No | Integer | Details: Font size (in pixels). The API accepts values from 0 to 120, but the effective range is 4 to 120; use a value within the effective range. Constraints: N/A Value range: 0 to 120 Default value: 16 |
font_color | No | String | Details: Font color. RGB color value. Constraints: N/A Options: The value contains 0 to 16 characters. Default value: #FFFFFF |
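The Base64 value in the text_context example is the encoding of the UTF-8 bytes of the Chinese phrase for "Test text watermark". The Python sketch below reproduces it. The helper name encode_text_context is hypothetical, and applying the 1,024-character limit to the encoded value is an assumption based on the table above.

```python
import base64


def encode_text_context(text: str) -> str:
    """Base64-encode the UTF-8 bytes of the text for use as text_context."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    # Assumption: the documented length limit applies to the encoded value.
    if len(encoded) > 1024:
        raise ValueError("encoded text_context exceeds 1,024 characters")
    return encoded


print(encode_text_context("测试文字水印"))  # prints 5rWL6K+V5paH5a2X5rC05Y2w
```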
Parameter | Mandatory | Type | Description |
---|---|---|---|
text_subtitle_file | No | SubtitleFileInfo object | |
audio_subtitle_file | No | SubtitleFileInfo object | |
Parameter | Mandatory | Type | Description |
---|---|---|---|
subtitle_file_download_url | No | String | URL for downloading subtitle files. |
subtitle_file_upload_url | No | String | URL for uploading subtitle files. |
subtitle_file_state | No | String | Subtitle file generation status. |
job_id | No | String | Subtitle file generation task ID. |
Parameter | Mandatory | Type | Description |
---|---|---|---|
asset_name | Yes | String | Details: Output video asset name. Constraints: N/A Options: The value contains 0 to 256 characters. Default value: N/A |
Parameter | Mandatory | Type | Description |
---|---|---|---|
music_asset_id | No | String | Details: Music asset ID. Constraints: N/A Options: The value contains 0 to 64 characters. Default value: N/A |
volume | No | Integer | Details: Music volume. For example, 100 indicates that the volume is 100%, and 50 indicates that the volume is 50%. Constraints: N/A Value range: 0 to 100 Default value: 100 |
Parameter | Mandatory | Type | Description |
---|---|---|---|
no_need_review | No | Boolean | Whether to skip content review. This option takes effect only for trustlisted users; the automatic review policies apply to all other users. |
Parameter | Mandatory | Type | Description |
---|---|---|---|
callback_url | Yes | String | Callback URL. The callback request body is in JSON format and contains the following parameters: result (SUCCEED or FAILED), asset_id (asset ID), and job_id (task ID). |
auth_type | No | String | Authentication type. Default value: NONE |
key | No | String | Key. |
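A service that receives the callback needs to decode the JSON body and read the three documented fields. The Python sketch below shows one way to do this. The function name parse_callback is hypothetical, and the empty-string fallbacks for asset_id and job_id are a choice made for this sketch, since this page does not say whether those fields are always present.

```python
import json
from typing import Dict, Union


def parse_callback(body: Union[str, bytes]) -> Dict[str, str]:
    """Decode a callback body and return result, job_id, and asset_id."""
    payload = json.loads(body)
    result = payload.get("result")
    if result not in ("SUCCEED", "FAILED"):
        raise ValueError(f"unexpected result: {result!r}")
    return {
        "result": result,
        "job_id": payload.get("job_id", ""),
        "asset_id": payload.get("asset_id", ""),
    }
```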
Response Parameters
Status code: 200
Parameter | Type | Description |
---|---|---|
X-Request-Id | String | Request ID. |
Parameter | Type | Description |
---|---|---|
job_id | String | Task ID. |
Status code: 400
Parameter | Type | Description |
---|---|---|
error_code | String | Error code. |
error_msg | String | Error description. |
Status code: 401
Parameter | Type | Description |
---|---|---|
error_code | String | Error code. |
error_msg | String | Error description. |
Status code: 500
Parameter | Type | Description |
---|---|---|
error_code | String | Error code. |
error_msg | String | Error description. |
Example Requests
POST https://{endpoint}/v1/0d697589d98091f12f92c0073501cd79/2d-digital-human-videos

{
  "model_asset_id" : "0c7798664ee7178b3dba3bbef57c32e7",
  "voice_config" : {
    "voice_asset_id" : "394f3a27cd0b3d6164ca75c3db1edf6c",
    "speed" : 100,
    "pitch" : 100,
    "volume" : 140
  },
  "video_config" : {
    "codec" : "H264",
    "bitrate" : 5000,
    "width" : 1920,
    "height" : 1080,
    "frame_rate" : "25"
  },
  "shoot_scripts" : [ {
    "sequence_no" : 0,
    "shoot_script" : {
      "text_config" : {
        "text" : "Hello, everyone. I'm Yunling."
      },
      "background_config" : [ {
        "background_type" : "IMAGE",
        "background_config" : "https://{endpoint}/0d697589d98091f12f92c0073501cd79/c7885ffdfb347337a890208ca7fd07e3/34534f0262813a6838bdcfb8bc949af6.jpg?AccessKeyId=WTEZCVDFUF3XHXCTPIJ8&Expires=1686872878&Signature=zXGOEQlrgZ4yAUziwlGcdbXLPIM%3D"
      } ],
      "layer_config" : [ {
        "layer_type" : "HUMAN",
        "position" : {
          "dx" : 656,
          "dy" : 0,
          "layer_index" : 1
        },
        "size" : {
          "width" : 607.5,
          "height" : 1080
        }
      } ],
      "script_type" : "TEXT"
    }
  } ],
  "output_asset_config" : {
    "asset_name" : "Yunling's self-introduction."
  }
}
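A request like the one above could be assembled in Python with the standard library, as in the minimal sketch below. The function name build_create_request is hypothetical, token authentication is assumed, and the Content-Type header is an assumption for a JSON body. The endpoint, project ID, and token must be supplied by the caller.

```python
import json
import urllib.request
from typing import Any, Dict


def build_create_request(
    endpoint: str, project_id: str, token: str, body: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare, without sending, a POST request that creates a video task."""
    url = f"https://{endpoint}/v1/{project_id}/2d-digital-human-videos"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Auth-Token": token, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; a successful response body contains the job_id of the new task.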
Example Responses
Status code: 200
The information is returned when the request succeeds.
{ "job_id" : "26f06524-4f75-4b3a-a853-b649a21aaf66" }
Status code: 400
{ "error_code" : "MSS.00000003", "error_msg" : "Invalid parameter" }
Status code: 401
{ "error_code" : "MSS.00000001", "error_msg" : "Unauthorized" }
Status code: 500
{ "error_code" : "MSS.00000004", "error_msg" : "Internal Error" }
Status Codes
Status Code | Description |
---|---|
200 | The information is returned when the request succeeds. |
400 | Parameter error. The response includes the error code and its description. |
401 | Authentication is not performed or fails. |
500 | Internal service error. |
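A client can branch on the status code and read either job_id or the error fields from the response body. The Python sketch below illustrates this; interpret_response is a hypothetical helper, and it treats every status code other than 200 as a failure.

```python
import json
from typing import Tuple


def interpret_response(status: int, body: str) -> Tuple[bool, str]:
    """Return (True, job_id) on success or (False, error summary) otherwise."""
    payload = json.loads(body) if body else {}
    if status == 200:
        return True, payload["job_id"]
    code = payload.get("error_code", "unknown")
    message = payload.get("error_msg", "")
    return False, f"{status} {code}: {message}"
```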
Error Codes
See Error Codes.