Avatar Reply
This diagram element is used to configure virtual humans who respond to customers in intelligent video customer service. It can be used to play virtual human videos to customers in flows.
Diagram Element
Parameter Description
- Node Name: name of a node.
- Description: details of a node.
- Operation: The options are Initiate, TTS, and Call Ending.
Figure 1 Setting Operation for the Avatar Reply diagram element
- Initiate: A session with the virtual human service is initiated. You need to specify the virtual human to be used.
- TTS: The configured text is provided for the virtual human service for video synthesis and playback. Variables can be added to the reply text, and the virtual human service dynamically plays a voice based on the return values of the variables.
- Call Ending: The session with the virtual human service is ended.
- Avatar Image: This parameter is displayed when Operation is set to Initiate. Select a virtual human configured on the page.
Figure 2 Selecting a virtual human
- Reply Variable: This parameter is displayed when Operation is set to TTS. Enter the text variable to be used by the virtual human to reply.
Figure 3 Selecting TTS
- Reply Template: This parameter is displayed when Operation is set to TTS. Select a template whose Type is TTS under Configuration Center > Chatbot Management > Flow Configuration > Resource > Resource Template.
- Reply Variable: This parameter is displayed when Operation is set to TTS. Set Reply Variable to a variable of the Character type.
- Reply Resource: This parameter is displayed when Operation is set to TTS. Select a text resource whose Status is Approved under Configuration Center > Resource Management > Audio and Video.
- Reply Mode: This parameter is mandatory. The options are as follows:
- Playback only: Only the voice or video is played, and customer input does not need to be received. Generally, this option is selected for static voice playback.
- Interruption recognition: The customer needs to answer by voice. Generally, this option is selected for replying to text.
- Interruption by key presses: The keys pressed by the customer need to be obtained. If Reply Mode of the Robot Reply diagram element is set to Interruption by key presses and the key interaction result needs to be obtained, the Robot Reply diagram element cannot be directly connected to the Call Ending diagram element, and the Key Recognition and Semantic Recognition diagram elements cannot be directly connected before the Call Ending diagram element.
- Recognition and key presses: Both voice and key information can be received. The information received first is used for matching.
- Recognition after playback: The system starts to identify the voice or video only after the voice or video is played. If a customer speaks during the playback, the system does not receive the voice.
- Recognition and key presses after playback: The system starts to identify the voice or video and collect digits only after the voice or video is played. If a customer speaks or presses a key during the playback, no information can be received. If a customer speaks or presses a key after the playback, the information that is received first is used for matching.
- No interruption after digit collection: Keys can be pressed when the voice or video is played, but the playback is not interrupted.
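The reply modes above can be summarized as a small lookup table. The following is only an illustrative sketch, not the product's implementation: the names `ModeRule`, `REPLY_MODES`, and `accepts_input` are hypothetical, and the sketch assumes that No interruption after digit collection accepts key input only, which the text does not state explicitly.

```python
from typing import NamedTuple


class ModeRule(NamedTuple):
    voice: bool                # voice input is received
    keys: bool                 # key presses are received
    after_playback_only: bool  # input is ignored until playback ends


# Hypothetical summary of the Reply Mode options described above.
REPLY_MODES = {
    "Playback only": ModeRule(False, False, False),
    "Interruption recognition": ModeRule(True, False, False),
    "Interruption by key presses": ModeRule(False, True, False),
    "Recognition and key presses": ModeRule(True, True, False),
    "Recognition after playback": ModeRule(True, False, True),
    "Recognition and key presses after playback": ModeRule(True, True, True),
    # Assumption: only keys are collected in this mode.
    "No interruption after digit collection": ModeRule(False, True, False),
}


def accepts_input(mode: str, kind: str, playback_finished: bool) -> bool:
    """Return True if input of the given kind ("voice" or "key") is received."""
    if kind not in ("voice", "key"):
        raise ValueError(f"unknown input kind: {kind!r}")
    rule = REPLY_MODES[mode]
    allowed = rule.voice if kind == "voice" else rule.keys
    if not allowed:
        return False
    # Modes that recognize only after playback drop anything received earlier.
    return playback_finished or not rule.after_playback_only
```

For example, `accepts_input("Recognition after playback", "voice", False)` returns `False`, because speech during playback is not received in that mode.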
- Custom Variable Value: whether voice playback can be interrupted. If voice playback can be interrupted, set Minimum Voice Playing Duration.
- String true: Voice playback can be interrupted during recognition, and the minimum voice playback duration can be passed.
- String false: Recognition is performed after voice playback.
- Timeout Interval: timeout period, in seconds. If this parameter is not set, the default value 15 is used. If the waiting time exceeds the value of this parameter, the system determines that a timeout has occurred. The value range of this parameter varies with vendors. Currently, the maximum value is 180.
- If Reply Mode is set to Recognition after playback, the duration is measured from the time when the TTS voice playback ends to the time when the recognition stops.
- If Reply Mode is set to Interruption by key presses or Interruption recognition, the duration starts from the time when the voice playback ends. For example, if a customer does not speak after the TTS voice playback ends, the robot goes to the next diagram element when the timeout period ends.
- If Reply Mode is set to Playback only, the timeout period does not take effect.
Pay attention to the timeout period in the following scenarios: long TTS voice playback, long key input by the customer (such as an ID card number), and lengthy spoken responses from the customer. When Reply Mode is set to Interruption by key presses or Interruption recognition, if the timeout period is too short, the recognition may start before the voice playback ends, or the recognition may end before the customer finishes pressing keys. If the timeout period is too long, there may be a long silence after the customer finishes pressing keys, and the timeout notification is displayed only after a long wait.
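The timeout rules above can be sketched as a small helper. This is a minimal illustration under stated assumptions, not the system's actual logic: the function name `effective_timeout` is hypothetical, and the lower bound of 1 second is an assumption because the text gives only the default (15) and the current maximum (180).

```python
from typing import Optional

DEFAULT_TIMEOUT_S = 15  # default stated in the parameter description
MAX_TIMEOUT_S = 180     # current maximum stated in the parameter description


def effective_timeout(reply_mode: str, timeout_s: Optional[int] = None) -> Optional[int]:
    """Return the timeout in seconds, or None when it does not take effect."""
    if reply_mode == "Playback only":
        # The timeout period does not take effect for pure playback.
        return None
    if timeout_s is None:
        return DEFAULT_TIMEOUT_S
    # Assumption: values below 1 are rejected; the text states only the maximum.
    if not 1 <= timeout_s <= MAX_TIMEOUT_S:
        raise ValueError(f"timeout must be between 1 and {MAX_TIMEOUT_S} seconds")
    return timeout_s
```

For example, `effective_timeout("Interruption recognition")` falls back to the default of 15 seconds, while any value passed with Playback only is ignored.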
- ASR Advanced Settings: The options are Enable and Disable. The default value is Disable.
- Recognition Type: The option is Common. The default value is Common. This parameter is configurable when ASR Advanced Settings is set to Enable.
- Subscriber Silence Timeout Interval: The default value is 100, in seconds. The value ranges from 0 to 32000. This parameter is configurable when ASR Advanced Settings is set to Enable.
- Recognition Timeout Interval: The default value is 200, in seconds. The value ranges from 0 to 600. This parameter is configurable when ASR Advanced Settings is set to Enable.
- Subscriber Pause Timeout Interval: The default value is 500, in milliseconds. The value ranges from 300 to 2000. This parameter is configurable when ASR Advanced Settings is set to Enable.
- ASR Extended Parameter: Enter the data required by the ASR business on the IVR side, for example, the vendor information. The value is returned to the IVR system using the vendor parameter of the dialog interface.
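The value ranges of the ASR advanced settings can be checked before a flow is saved. The sketch below is an illustration only: the class `AsrAdvancedSettings` and its field names are hypothetical, while the defaults, units, and ranges are copied from the parameter descriptions above.

```python
from dataclasses import dataclass
from typing import List

# Inclusive (min, max) ranges as listed in the parameter descriptions.
_RANGES = {
    "silence_timeout_s": (0, 32000),
    "recognition_timeout_s": (0, 600),
    "pause_timeout_ms": (300, 2000),
}


@dataclass
class AsrAdvancedSettings:
    silence_timeout_s: int = 100      # Subscriber Silence Timeout Interval
    recognition_timeout_s: int = 200  # Recognition Timeout Interval
    pause_timeout_ms: int = 500       # Subscriber Pause Timeout Interval

    def validate(self) -> List[str]:
        """Return one message per out-of-range field (empty if all are valid)."""
        errors = []
        for name, (low, high) in _RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                errors.append(f"{name}={value} is outside {low}..{high}")
        return errors
```

Calling `AsrAdvancedSettings().validate()` returns an empty list, because the documented defaults fall inside the documented ranges.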
- TTS Advanced Settings: The options are Enable and Disable. The default value is Disable.
Figure 4 TTS Advanced Settings
- Speaker: Enter the speaker to be used by the virtual human. This parameter is available when TTS Advanced Settings is enabled.
- Voice Speed: Enter a value ranging from 0.5 to 1.5. Only one decimal place is supported. 1.0 indicates the normal voice speed. 0.5 indicates the slowest voice speed, and 1.5 indicates the fastest voice speed. This parameter is available when TTS Advanced Settings is enabled.
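The Voice Speed constraint (0.5 to 1.5, at most one decimal place) can be checked as follows. This is a minimal sketch: the helper name `is_valid_voice_speed` is hypothetical, and it uses `Decimal` so that values such as 1.25 are rejected without floating-point rounding issues.

```python
from decimal import Decimal, InvalidOperation


def is_valid_voice_speed(text: str) -> bool:
    """Check that text is a number from 0.5 to 1.5 with at most one decimal place."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return False
    if not value.is_finite():
        return False
    # An exponent below -1 means more than one digit after the decimal point.
    if value.as_tuple().exponent < -1:
        return False
    return Decimal("0.5") <= value <= Decimal("1.5")
```

Under these rules, "1.0" and "0.5" are accepted, while "1.25" (two decimal places) and "1.6" (out of range) are rejected.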
- Composite Video Configuration: Enable or disable this function. By default, this function is disabled.
Figure 5 Composite Video Configuration
- Actions: Enter the action to be used by the virtual human. This parameter is available when Composite Video Configuration is enabled.
- Picture and Video Settings: Configure Front Image, Backend Image, Front Video, and Backend Video.
Figure 6 Picture and Video Settings
- Front Image: Select the foreground image used during virtual human video synthesis. A maximum of five foreground images can be configured for a diagram element.
- Image: Select a 2D virtual human image configured on the page.
- Abscissa: Enter the horizontal coordinate for image display. The value is an integer greater than 0. The default value is 0.
- Ordinate: Enter the vertical coordinate for image display. The value is an integer greater than 0. The default value is 0.
- Scale: Enter the zoom ratio of the image to be displayed. Only one decimal place is supported. The value ranges from 0.5 to 1.0.
- Start Time (ms): Enter the time when the image starts to be displayed. The value is an integer greater than 0, in milliseconds. This parameter can be left blank.
- Display Duration (ms): Enter the duration for displaying the image. The value is an integer greater than 0, in milliseconds. This parameter can be left blank.
- Backend Image: Select the background image used during virtual human video synthesis. A maximum of five background images can be configured for a diagram element.
- Image: Select a 2D virtual human image configured on the page.
- Abscissa: Enter the horizontal coordinate for image display. The value is an integer greater than 0. The default value is 0.
- Ordinate: Enter the vertical coordinate for image display. The value is an integer greater than 0. The default value is 0.
- Scale: Enter the zoom ratio of the image to be displayed. Only one decimal place is supported. The value ranges from 0.5 to 1.0.
- Start Time (ms): Enter the time when the image starts to be displayed. The value is an integer greater than 0, in milliseconds. This parameter can be left blank.
- Display Duration (ms): Enter the duration for displaying the image. The value is an integer greater than 0, in milliseconds. This parameter can be left blank.
- Front Video: Select the foreground video used during virtual human video synthesis. Only one foreground video can be configured for a diagram element.
- Video: Select a 2D virtual human video configured on the page.
- Abscissa: Enter the horizontal coordinate for video display. The value is an integer greater than 0. The default value is 0.
- Ordinate: Enter the vertical coordinate for video display. The value is an integer greater than 0. The default value is 0.
- Scale: Enter the zoom ratio of the video to be displayed. Only one decimal place is supported. The value ranges from 0.5 to 1.0.
- Start Time (ms): Enter the time when the video starts to be displayed. The value is an integer greater than 0, in milliseconds. This parameter can be left blank.
- Backend Video: Select the background video used during virtual human video synthesis. Only one background video can be configured for a diagram element.
- Video: Select a 2D virtual human video configured on the page.
- Abscissa: Enter the horizontal coordinate for video display. The value is an integer greater than 0. The default value is 0.
- Ordinate: Enter the vertical coordinate for video display. The value is an integer greater than 0. The default value is 0.
- Scale: Enter the zoom ratio of the video to be displayed. Only one decimal place is supported. The value ranges from 0.5 to 1.0.
- Start Time (ms): Enter the time when the video starts to be displayed. The value is an integer greater than 0, in milliseconds. This parameter can be left blank.
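The limits in Picture and Video Settings (at most five foreground images, five background images, one foreground video, and one background video, each with a scale from 0.5 to 1.0) can be expressed as a simple check. The sketch below is illustrative only: `Overlay`, `CompositeConfig`, and `validate_composite` are hypothetical names, not part of the product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Overlay:
    resource: str                      # image or video selected for the overlay
    scale: float                       # zoom ratio, 0.5 to 1.0
    x: int = 0                         # Abscissa
    y: int = 0                         # Ordinate
    start_ms: Optional[int] = None     # Start Time (ms), may be left blank
    duration_ms: Optional[int] = None  # Display Duration (ms), images only


@dataclass
class CompositeConfig:
    front_images: List[Overlay] = field(default_factory=list)
    back_images: List[Overlay] = field(default_factory=list)
    front_videos: List[Overlay] = field(default_factory=list)
    back_videos: List[Overlay] = field(default_factory=list)


# Maximum number of overlays of each kind per diagram element.
_LIMITS = {"front_images": 5, "back_images": 5, "front_videos": 1, "back_videos": 1}


def validate_composite(config: CompositeConfig) -> List[str]:
    """Return a list of problems found in the overlay configuration."""
    errors = []
    for name, limit in _LIMITS.items():
        overlays = getattr(config, name)
        if len(overlays) > limit:
            errors.append(f"{name}: at most {limit} allowed, got {len(overlays)}")
        for overlay in overlays:
            if not 0.5 <= overlay.scale <= 1.0:
                errors.append(
                    f"{name}: scale {overlay.scale} of {overlay.resource!r} is outside 0.5..1.0"
                )
    return errors
```

A configuration with six foreground images, for instance, yields one error for the count even if every individual overlay is valid.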
Condition Branch Description
| Condition Branch | Description | Usage |
|---|---|---|
| SYSERROR_INNER | ODFS internal error | Triggered when an unknown error occurs in the ODFS. |
Using the Diagram Element
- Click the diagram element or drag it to the canvas. Before configuring the virtual human reply parameters, configure the virtual humans in advance so that you can select one of them in the Service Parameter area.
Typical Application Scenario
The following describes how to use the Avatar Reply diagram element to play a welcome tone to customers.
- Sign in to the AICC and choose .
- Configure an intelligent IVR flow.
- Choose and click New to add a simple flow.
- Click + in the Flow Variable area. In the dialog box that is displayed, set the variable name and data type. The default value of the variable will be played to the customer.
Figure 7 Adding a flow variable
Figure 8 Flow orchestration example
- Save and publish the flow.
- Choose and bind the flow to a robot.
- Choose . In the last column corresponding to the robot, click Test Call. In the test dialog box that is displayed, click Start Call to test the robot. If the robot automatically answers with the video path generated by the virtual human service, the configuration is successful.