Updated on 2026-03-05 GMT+08:00

Configuring AI Model Check Rules for Security and Compliance of LLM Applications

The rapid advancement of generative AI has driven widespread adoption of Large Language Models (LLMs) in AI inference. However, it also brings emerging security challenges: inadequate input validation may expose sensitive data, adversarial prompt injection can induce policy-violating outputs, and biased training data risks perpetuating discriminatory outputs. To mitigate these issues, WAF offers AI model check rules. You can let WAF check prompts for injection and compliance risks and identify inappropriate or non-compliant outputs in responses. WAF can also use OCR technology to check images for inappropriate content. Together, these checks keep your model inputs and outputs secure, stable, available, and legally compliant.

Solution Overview

WAF AI model check detects external requests (inputs) and response data (outputs) of large models.

  • WAF AI model check verifies text prompts and examines image content to ensure that the inputs of large models are secure and legally compliant.
  • WAF AI model check performs Response Compliance checks on the response data of large models to ensure that the outputs of large models are secure and legally compliant.
Figure 1 AI Model Check

You can use AI model check in the following scenarios:
  • In-house large models built and deployed locally

    Some enterprises build and deploy their own large models to provide services for external systems. For those large models, the AI model check module helps mitigate threats from inappropriate text and images, so that enterprises can protect large model interfaces from being abused, protect their web frameworks, and keep services stable.

  • Applications built by calling third-party large model cloud services

    Some enterprises purchase API services from third-party large models to build their service applications. Malicious users may consume tokens to drive up the enterprise's costs, or enter malicious prompts or inappropriate images that cause the enterprise's accounts to be suspended or blocked. The AI model check module helps address these issues.

Constraints

The constraints on each function are as follows.

Edition

  • Cloud mode - CNAME access: This function is supported in the standard, professional, and enterprise editions.

    If you buy the standard, professional, or enterprise edition for cloud mode, the LLM Firewall Text Security edition is automatically matched to the standard, professional, or enterprise edition you buy. The LLM Firewall Text Security service and the WAF edition you buy have the same required duration.

    If you no longer need LLM Firewall Text Security, unsubscribe from it separately. After the unsubscription, it will become unavailable. You will receive a refund based on your resource usage details.

  • Cloud mode - load balancer access: This function is not supported.
  • Dedicated mode: This function is not supported.

Text Check

The content to be checked must be UTF-8 encoded. Otherwise, the protection may fail.

Request body format

The request body must be in JSON format. The response format can be JSON or data:+JSON. For example:
data: {"choices": [{"index": 0, "delta": {"content": "Content", "type": "text"}}], "created": 1743564162}

If the raw response from origin server of a large model is in JSON format, the termination response is also in JSON format. If the raw response is in data:+JSON format, the termination response is also in data:+JSON format.

Rule effective time

It takes several minutes for a new rule to take effect. After a rule takes effect, protection events triggered by the rule will be displayed on the Events page. For details, see Querying a Protection Event.
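To illustrate the two response formats described above (plain JSON and data:+JSON streaming), here is a minimal parsing sketch. The helper name and structure are illustrative only; they are not part of WAF or any SDK.

```python
import json

def parse_model_response_line(line: str):
    """Parse one response line from an LLM origin server.

    Handles the two formats described above: plain JSON and
    "data:"-prefixed JSON (SSE-style streaming).
    """
    line = line.strip()
    if line.startswith("data:"):
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # common stream-terminator sentinel
            return None
        return json.loads(payload)
    return json.loads(line)

# A streaming chunk in data:+JSON format, as in the example above:
chunk = ('data: {"choices": [{"index": 0, "delta": {"content": "Content",'
         ' "type": "text"}}], "created": 1743564162}')
obj = parse_model_response_line(chunk)
print(obj["choices"][0]["delta"]["content"])  # → Content
```

Note that a terminated response keeps whichever of these two framings the origin server used, so a client-side parser like this one continues to work when WAF substitutes the termination response.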

Prerequisites

Configure AI Model Check Rules

  1. Log in to the WAF console.
  2. Click in the upper left corner and select a region or project.
  3. (Optional) If you have enabled the enterprise project function, in the upper part of the navigation pane on the left, select your enterprise project from the Filter by enterprise project drop-down list. Then, WAF will display the related security data in the enterprise project on the page.
  4. In the navigation pane on the left, choose Policies.
  5. In the policy list, click the name of the target policy to go to the protection rule configuration page.

    You can also go to the Website Settings page, locate the target domain name, and click the number next to the protection policy in the Policy column to go to the protection rule configuration page.

  6. Locate the LLM Content Security configuration box and toggle on this protection.


  7. Add the AI check rule.

On the Text Check Rules tab, add a text check rule to check LLM prompts and responses.

  1. On the Text Check Rules tab, click Add Rule.
  2. In the Add Text Check Rule dialog box, configure the following parameters and click OK.

    Table 1 Parameters for an LLM text check rule

    Parameter

    Description

    Example Value

    Basic Information

    Rule Name

    Name of the LLM check rule. The name can contain a maximum of 128 characters. Only letters, digits, underscores (_), hyphens (-), colons (:), and periods (.) are allowed.

    waf-text

    Protocol Template

    WAF supports the following templates:
    • OpenAI Response: This template preset in WAF meets most application scenarios. You can also adjust the template based on actual service requirements.
      This template is configured with the following information by default:
      • Model Q&A Path: /v1/chat/completions.
      • Prompt Verification:
        • Injection detection is enabled by default.
        • Compliance check is enabled by default.
        • Prompt Index: $.messages[-1].content, which indicates that the value of content of the last element in the messages array is extracted.
        • Protective Action: Log only.
      • Response Compliance:
        • Response check is enabled by default.
        • Response Content Index: $.choices[-1].delta.content, indicating that the value of the content parameter of the last element in the choices array is extracted.
        • Protective Action: Anonymize
    • Custom: a blank template. It does not contain the default task configuration. You need to customize parameters based on service details.

    OpenAI Response

    Model Q&A Path

    Enter the URL path of the model Q&A API. The value can contain a maximum of 4,096 characters and cannot start or end with a space. The following special characters are not allowed: <>

    /v1/chat/completions

    Description (Optional)

    Description of the rule.

    -

    Prompt Verification

    Injection Detection

    Injection detection identifies attacks that target large models, such as reverse engineering and role-playing prompt injection.

    If you enable this function, WAF will block malicious inputs designed by attackers as prompts.

    Compliance Check

    Compliance check detects violent, discriminatory, illegal, and immoral content.

    If you enable this function, WAF can effectively filter out non-compliant information entered by users.

    Prompt Index

    An index is used to identify or locate a prompt in a specific data structure.

    You can configure prompt indexes to easily search for, access, and process information related to prompts. The index is the JSONPath of the request body and complies with the JSONPath syntax. All examples in Table 5 are supported.

    $.messages[-1].content

    Protective Action

    Protective action taken when a prompt matches the detection requirements. The options are as follows:

    • Log only: If a prompt matches the detection requirements, attack information is only logged.
    • Block: If a prompt matches the detection requirements, the request is blocked.

      If you set Protective Action to Block, set HTTP Return Code, Block Page Type, and Page Content for the block page.

      For details about how to configure the Block action, see Example 1: Prompt Verification.

    Block

    Response Compliance

    Response Compliance Check

    Response compliance check examines the response data of the model.

    If you enable this function, WAF can effectively filter out non-compliant outputs of the large models.

    Response Content Index

    You can configure a response content index to easily search for, access, and process information related to response content. This parameter is the JSONPath of the response body and complies with the JSONPath syntax. Examples 1, 2, 3, and 5 in Table 5 are supported.

    $.choices[-1].delta.content

    Protective Action

    Protective action taken when an output matches the detection requirements. The options are as follows:

    • Log only: If an output matches the detection requirements, attack information is only logged.
    • Anonymize: If an AI model output contains sensitive words, WAF identifies sensitive words based on the context and masks them as - in final responses.
    • Terminate response: If an AI model output contains sensitive words, WAF returns the content modified according to the termination response protocol, ends the on-going request, and ignores the subsequent responses from the origin server of the AI model.

    Anonymize

    Protocol for Terminating Response

    If you set Protective Action to Terminate response, you also need to set Protocol for Terminating Response, which defines the response content returned when the Terminate response action is taken.

    • By default, WAF presets the following protocol to terminate a response:
      [
          {
              "$.choices": [
                  {
                      "finish_reason": "content_filter",
                      "index": 0,
                      "delta": {
                          "content": "Sorry, I cannot answer this question right now. Let's try a different topic.",
                          "role": "assistant"
                      }
                  }
              ],
              "$.model": "$.model",
              "$.created": "$.created",
              "$.id": "$.id",
              "$.service_tier": "$.service_tier",
              "$.object": "$.object",
              "$.usage": "$.usage"
          },
          {
              "$": "[DONE]"
          }
      ]
    • You can customize the default response termination protocol template. The syntax of the protocol for terminating response must comply with the following rules:
      • The value must be in the form of a JSON array, with each element being a JSON object.
      • The object can be empty, that is, {}. If the object is empty, the blocked response data is copied. The key value syntax is the same as the response index syntax. If an invalid value is detected, the configuration of this object will be skipped.
      • The response termination protocol list accepts up to five objects, each supporting up to 10 indexes. Any data beyond these limits will be ignored. Each object requires unique indexes. Duplicate entries may produce incorrect results.
      • If the response content remains unchanged, you can configure a constant, for example, $.data.choices in Scenario 2: Protective Action is set to Terminate response.
      • If the response content differs but can be obtained from the blocked response, configure the object value following the response index syntax, for example, $.data.model in Scenario 2: Protective Action is set to Terminate response.
      • When multiple operations are performed on the same index, later operations will overwrite earlier ones. For example, if you assign a value to $.data.model and then assign a value to $.data, the value assigned to $.data.model will not take effect.
      • For array-type objects, you can assign values to their elements. For non-array objects, using an array index causes errors. For example, the raw request data is as follows:
        {"data":{"arr":["1","2","3"],"item":{"sub_item":1}}}

        So, $.data.arr.new_index or $.data.item[1].new_index is invalid.

        If you have to make such modifications, clear $.data.arr or $.data.item and then assign a value, as shown in the following:

        [
            {
                "$.data.arr": "{}",
                "$.data.arr.new_index": "new_data"
            }
        ]
        
      • Array objects do not allow value assignment to non-existent negative subscripts. For example, the raw request data is as follows:
        {"data":{"arr":["1","2","3"],"item":{"sub_item":1}}}
        

        So, $.data.arr[-4] is invalid. To insert 0 at the beginning of $.data.arr, assign values to $.data.arr in reverse order. Assigning values in the original order will change the original values, resulting in unexpected results.

        [
            {
                "$.data.arr[3]": "$.data.arr[2]",
                "$.data.arr[2]": "$.data.arr[1]",
                "$.data.arr[1]": "$.data.arr[0]",
                "$.data.arr[0]": "0"
            }
        ]
        

    For details about how to configure the response termination protocol, see Example 2: Response Compliance Check.

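The termination-protocol rules above (values copied from the blocked response, later writes overwriting earlier ones, and the reverse-order array shift) can be sketched as follows. This is a simplified illustration, not WAF's actual engine; the helper names are hypothetical, and it assumes values written in index syntax are resolved against the current state of the response being built, which is exactly why the shift must run in reverse order.

```python
import json
import re

# Matches one path step: ".name" or "[index]".
STEP = re.compile(r"\.(\w+)|\[(-?\d+)\]")

def steps(path):
    # "$.data.arr[2]" -> ["data", "arr", 2]
    return [m.group(1) if m.group(1) is not None else int(m.group(2))
            for m in STEP.finditer(path[1:])]

def get_path(doc, path):
    cur = doc
    for s in steps(path):
        cur = cur[s]
    return cur

def set_path(doc, path, value):
    *parents, last = steps(path)
    cur = doc
    for s in parents:
        cur = cur[s]
    if isinstance(cur, list) and last == len(cur):
        cur.append(value)  # allow growing an array by one element
    else:
        cur[last] = value

def apply_protocol_object(blocked, obj):
    # An empty object {} simply copies the blocked response data.
    out = json.loads(json.dumps(blocked))
    for path, value in obj.items():
        # Values in index syntax are resolved against the current
        # state; later writes overwrite earlier ones.
        if isinstance(value, str) and value.startswith("$."):
            value = get_path(out, value)
        set_path(out, path, value)
    return out

blocked = {"data": {"arr": ["1", "2", "3"], "item": {"sub_item": 1}}}
shift = {
    "$.data.arr[3]": "$.data.arr[2]",
    "$.data.arr[2]": "$.data.arr[1]",
    "$.data.arr[1]": "$.data.arr[0]",
    "$.data.arr[0]": "0",
}
result = apply_protocol_object(blocked, shift)
print(result["data"]["arr"])  # → ['0', '1', '2', '3']
```

Running the same four assignments in forward order would overwrite $.data.arr[0] before it is read, yielding ["0", "0", "0", "0"] instead of the intended insertion, which is the pitfall the reverse-order rule above warns about.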

After completing the preceding configurations, you can:

  • Check the rule status: In the protection rule list, check the rule you added. Rule Status is Enabled by default.
  • Disable the rule: If you do not want the rule to take effect, click Disable in the Operation column of the rule.
  • Delete or modify the rule: Click Delete or Modify in the Operation column of the rule.

Configuration Examples

You can take the steps below to verify that WAF checks the content of an LLM application whose model Q&A path is /v1/chat/completions.

DeepSeek is used in this example. Sensitive word: provide a debit card.

Syntax Supported by Indexes

A large model index is used to identify or locate prompt or response content in the JSONPath of the request or response body. It complies with the JSONPath syntax. If Injection Detection, Compliance Check, or Response Compliance Check is enabled, you need to configure indexes to facilitate searching for, accessing, and processing prompt or response content.

The following indexes need to be configured for AI model checks:
  • Prompt Index: If Injection Detection and Compliance Check are enabled, this index is used to locate the position of the prompt in the JSONPath of the request body. All syntax in the following table is supported.
  • Response Content Index: If Response Compliance Check is enabled, this index is used to locate the response content in the JSONPath of the response body. Syntax 1, 2, 3, and 5 in the following table are supported. Wildcard extraction is not supported.
  • Image index: If Image Check is enabled and the OpenAI request method is used for image upload, you need to configure an image index to locate the image content in the JSONPath of the request body. All syntaxes in the following table are supported.
Table 5 Index examples

The path supports a maximum depth of 10 levels.

1. Single object
   • JSON structure: {"prompt": {"role": "user","content": "..."}}
   • Path: $.prompt.content
   • Result: "..."
   • Description: Periods (.) are used to access subnodes layer by layer to locate the target field. Each subnode name in the path must be explicitly specified.

2. Obtaining the first element
   • JSON structure: {"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}
   • Path: $.prompt[0].content
   • Result: "A"
   • Description: The array index starts from 0. [0] indicates the first element.

3. Obtaining the last element
   • JSON structure: {"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}
   • Path: $.prompt[-1].content
   • Result: "B"
   • Description: The negative index -1 indicates the last element.

4. Obtaining elements using the wildcard character
   • JSON structure: {"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"},{"role": "user", "content": "C"}]}
   • Path: $.prompt[*].content
   • Result: "ABC"
   • Description: [*] matches all elements in an array. Recursive retrieval is not supported.

5. Obtaining the root node
   • JSON structure: {"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}
   • Path: $
   • Result: {"prompt":[{"role": "user","content": "A"},{"role":"assistant","content":"B"}]}
   • Description: The JSON content of the root node is obtained.
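The index syntaxes in Table 5 can be exercised with a small evaluator. The sketch below supports only the subset shown above ($.key chains, [n], [-1], [*], and bare $); it is an illustration, not WAF's actual JSONPath engine, and the function name is hypothetical.

```python
import json
import re

def extract_index(doc, path):
    """Evaluate the subset of JSONPath used by the WAF index fields.

    Supports $.key chains, [n], negative [-1], [*] (matches all array
    elements; results are concatenated as in example 4), and bare $
    for the root node.
    """
    if path == "$":
        return doc  # example 5: the whole root node
    results = [doc]
    for key, idx in re.findall(r"\.(\w+)|\[(-?\d+|\*)\]", path[1:]):
        nxt = []
        for cur in results:
            if key:
                nxt.append(cur[key])         # subnode access with "."
            elif idx == "*":
                nxt.extend(cur)              # wildcard fans out over the array
            else:
                nxt.append(cur[int(idx)])    # [n] or [-1]
        results = nxt
    return results[0] if len(results) == 1 else "".join(results)

doc = json.loads('{"prompt": [{"role": "user","content": "A"},'
                 ' {"role": "assistant","content": "B"},'
                 ' {"role": "user","content": "C"}]}')
print(extract_index(doc, "$.prompt[0].content"))   # → A
print(extract_index(doc, "$.prompt[-1].content"))  # → C
print(extract_index(doc, "$.prompt[*].content"))   # → ABC
```

As in the table, the wildcard fans out over every array element and never recurses into nested structures.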

Related Operations

  • Viewing protection logs: Querying a Protection Event.

    After detecting an attack, WAF reports the attack log to SecMaster. You can view and analyze the attack log on SecMaster. For more details, see Large Model Safety Workbench.

  • Unsubscribing from LLM Content Security
    1. In the navigation pane on the left, click Dashboard.
    2. In the Product Details card, click Details in the Cloud mode area.
    3. In the Cloud Mode Details panel, choose Advanced Functions, and click Unsubscribe.

    After the unsubscription, LLM content security will become unavailable. You will receive a refund based on your resource usage details.