Updated on 2025-10-27 GMT+08:00

Trying Text-based Dialogue in ModelArts Studio (MaaS)

In ModelArts Studio (MaaS), you can test any running model service on the Text-based Dialogue page and evaluate its inference results.

Constraints

This function is available only in the ME-Riyadh region.

Prerequisites

You have created an endpoint on the Real-Time Inference page. For details, see Creating an Endpoint on ModelArts Studio (MaaS).

Procedure

  1. Log in to the ModelArts Studio (MaaS) console and select the target region on the top navigation bar.
  2. Use either of the following methods to test the model:
    • Method 1
      1. In the navigation pane on the left, choose Text-based Dialogue.
      2. Click Select Service. On the Endpoint tab, select the target model service and click OK.
    • Method 2
      1. In the navigation pane on the left, choose Real-Time Inference.
      2. On the Endpoint tab, click Try in the Operation column of the target model service.
  3. In the upper right corner, click Parameters, then drag the sliders or enter values directly to configure the inference parameters. Click Reset to restore the default settings.
    Figure 1 Configuring inference parameters
    Table 1 Parameters

    Temperature

    Controls the randomness and creativity of the generated text. A higher temperature increases randomness.

    • Lower values produce more focused and deterministic outputs.
    • Higher values produce more random and creative outputs.

    Value range: 0 to 2

    Default value: varies by model; check the console for the actual value.

    Top P

    Adjusts the diversity of the generated text. A higher value results in more varied and creative output.

    • Lower values: fewer token types are available for output, making the output more deterministic.
    • Higher values: more token types are available for output, increasing diversity.

    Value range: 0.1 to 1

    Default value: varies by model; check the console for the actual value.

    Detailed explanation: top_p sets the size of the token candidate list. The smallest set of top-ranked tokens whose cumulative probability just exceeds P forms the candidate list, and the next token is randomly sampled from that list.

    Top K

    Controls the diversity of the output tokens by determining how many of the highest-ranking tokens are considered. A higher top_k value results in a richer variety of output tokens.

    • Lower values: fewer token types are available for output, making the output more deterministic.
    • Higher values: more token types are available for output, increasing diversity.

    Value range: 1 to 1000

    Default value: 20

    Detailed explanation: top_k retains the K tokens with the highest probability and randomly samples the next token from these K candidates. This limits the size of the candidate set while still maintaining some diversity in the output.
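To see how these three parameters interact, the following is a minimal Python sketch of a standard temperature/top-k/top-p sampling step. It is illustrative only; the actual sampler used by the model service may differ in details such as filtering order and tie-breaking.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=20, top_p=1.0):
    """Illustrative sampler combining temperature, top-k, and top-p.

    `logits` is a 1-D array of raw model scores, one per vocabulary token.
    This is a simplified sketch, not the service's actual implementation.
    """
    # Temperature scaling: lower values sharpen the distribution
    # (more deterministic), higher values flatten it (more random).
    scaled = logits / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()

    # Top-k filtering: keep only the k most probable tokens.
    order = np.argsort(probs)[::-1][:top_k]

    # Top-p (nucleus) filtering: keep the smallest prefix of the
    # sorted tokens whose cumulative probability exceeds p.
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    order = order[:cutoff]

    # Renormalize over the surviving candidates and sample one token.
    candidate_probs = probs[order] / probs[order].sum()
    return int(np.random.choice(order, p=candidate_probs))

# Example: a toy six-token vocabulary.
logits = np.array([2.0, 1.5, 1.0, 0.5, 0.2, 0.1])
print(sample_next_token(logits, temperature=0.7, top_k=3, top_p=0.9))
```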

  4. Enter your question in the dialog box or select one of the suggested prompts in the console, then review the response to test the model service.

    The model output does not reflect the platform's views. The platform does not guarantee the legality, authenticity, or accuracy of this information and assumes no responsibility for it.
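If you prefer to test the service programmatically rather than in the console, a request such as the following exercises the same parameters. This sketch assumes the endpoint exposes an OpenAI-compatible chat completions API; the base URL, API key, and model name are placeholders that you must replace with your endpoint's actual values.

```python
from openai import OpenAI  # pip install openai

# Placeholders: replace with your endpoint's base URL, API key, and model name.
client = OpenAI(
    base_url="https://<your-maas-endpoint>/v1",
    api_key="<your-api-key>",
)

response = client.chat.completions.create(
    model="<your-model-name>",
    messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
    temperature=0.7,           # randomness/creativity, range 0 to 2
    top_p=0.9,                 # nucleus sampling threshold, range 0.1 to 1
    extra_body={"top_k": 20},  # top_k is not a standard OpenAI field; pass it as an extra parameter
)
print(response.choices[0].message.content)
```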