Updated on 2025-07-02 GMT+08:00

Creating a Prompt Evaluation Task

Create an evaluation task to evaluate candidate prompts in batches. The procedure is as follows:

  1. Log in to ModelArts Studio and access a workspace.
  2. In the navigation pane, choose Agent Dev. On the displayed page, choose Prompt Engineering > Prompt dev.
  3. In the prompt engineering project list, locate the desired project and click writing in the Operation column.
  4. On the writing page, choose candidate from the navigation pane. In the candidate list, select the prompts you want to compare side by side and click Create Evaluation.
    Figure 1 Create Evaluation
  5. Select the evaluation case set and evaluation method.
    • Evaluation case set: The platform fills the variables from the selected dataset into the prompt being evaluated to form a complete prompt. The model then generates a result for this prompt.
    • Evaluation method: The platform uses the selected evaluation method to compare the model's generated result with the expected result and assigns a score.
    Figure 2 Creating a prompt evaluation task
  6. Click Confirm. The evaluation task is automatically executed.
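
The assemble-then-score flow described in step 5 can be sketched in a few lines of Python. This is only an illustration of the idea, not the platform's implementation: the function names (assemble_prompt, similarity_score, evaluate), the {name} placeholder syntax, the case dictionary layout, and the use of difflib string similarity as the evaluation method are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import Callable


def assemble_prompt(template: str, variables: dict[str, str]) -> str:
    """Fill the template's {placeholders} with one case's variables (assumed syntax)."""
    return template.format(**variables)


def similarity_score(generated: str, expected: str) -> float:
    """Return a score in [0, 1]; 1.0 means the two strings are identical.

    Illustrative metric only; the platform's evaluation methods may differ.
    """
    return SequenceMatcher(None, generated.strip(), expected.strip()).ratio()


def evaluate(
    template: str,
    cases: list[dict],
    model: Callable[[str], str],
) -> float:
    """Average score of one candidate prompt over all evaluation cases."""
    scores = []
    for case in cases:
        prompt = assemble_prompt(template, case["variables"])
        generated = model(prompt)  # hypothetical model call
        scores.append(similarity_score(generated, case["expected"]))
    return sum(scores) / len(scores) if scores else 0.0
```

Calling evaluate once per candidate prompt with the same cases and the same model gives one average score per candidate, which is what makes the comparison across candidates meaningful.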