Updated on 2025-10-27 GMT+08:00

Configuring AI Model Check Rules to Ensure Security and Compliance of LLM Applications

The rapid advancement of generative AI has driven widespread adoption of Large Language Models (LLMs) in AI inference. However, this also brings emerging security challenges. Inadequate input validation may expose sensitive data, adversarial prompt injections can induce policy-violating outputs, and biased training data risks perpetuating discriminatory outputs. To effectively mitigate these issues, WAF offers AI model check rules. You can let WAF check prompts for injection and compliance risks and identify inappropriate or non-compliant outputs in responses. This helps keep your model inputs and outputs secure, stable, available, and legally compliant.

Solution Overview

WAF AI model check inspects both the external requests (inputs) sent to large models and the response data (outputs) they return.

  • WAF AI model check performs Prompt Verification to ensure that the inputs of large models are secure and legally compliant.
  • WAF AI model check performs Response Compliance checks on the response data of large models to ensure that the outputs of large models are secure and legally compliant.
Figure 1 AI Model Check

The main application scenarios are as follows:

  • In-house large models built and deployed locally

    Some enterprises build and deploy their own large models to provide services externally. For these large models, the AI model check module helps mitigate threats such as prompt injection, reverse engineering, role-playing attacks, and jailbreak attempts, so enterprises can protect their large model interfaces from abuse, protect their web frameworks, and keep services stable.

  • Applications built by calling third-party large model cloud services

    Some enterprises buy API services from third-party large model providers to build their own applications. Malicious users may consume tokens and drive up costs, or submit malicious prompts that get the enterprise's accounts suspended or blocked. The AI model check module can help address these issues.


Constraints

  • This type of protection rule is supported only by Cloud Mode - CNAME access. It is not supported by Cloud Mode - Load balancer access or Dedicated Mode.
  • This function is supported by the standard, professional, and enterprise editions of cloud mode. If you buy the standard, professional, or enterprise edition of cloud mode, the LLM content security edition is automatically matched to the standard, professional, or professional edition, respectively. The required duration of LLM content security is the same as that of the WAF edition you buy.

    If you no longer need LLM content security, you can unsubscribe from it separately. After the unsubscription, the function becomes unavailable, and you will receive a refund based on your resource usage details.

  • AI model check is supported only in CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou regions. To use this function in other regions, submit a service ticket to apply for this function.
  • The content to be checked must be UTF-8 encoded. Otherwise, the protection may fail.
  • The request body must be in JSON. The response format can be JSON or data:+JSON. For example:
    data: {"choices": [{"index": 0, "delta": {"content": "Content", "type": "text"}}], "created": 1743564162}

    If the raw response from the origin server of a large model is in JSON format, the termination response is also in JSON format. If the raw response is in data:+JSON format, the termination response is also in data:+JSON format.
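    For reference, the following is a minimal sketch of a request body that satisfies this constraint. The model, stream, messages, role, and content fields assume an OpenAI-compatible chat API and are not mandated by WAF, and the model value is illustrative only. The Prompt Index example in Table 1, $.messages[-1].content, would select the content of the last message in such a body.

      {
          "model": "deepseek-chat",
          "stream": true,
          "messages": [
              {"role": "system", "content": "You are a helpful assistant."},
              {"role": "user", "content": "Hello"}
          ]
      }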

Step 1: Buy LLM Content Security

If you have already bought LLM content security when buying cloud WAF, skip this step. For details, see Buying Cloud Mode WAF. For details about the billing of LLM content security, see Billing Items.

  1. Log in to the WAF console.
  2. Click in the upper left corner and select a region or project.
  3. (Optional) If you have enabled the enterprise project function, in the upper part of the navigation pane on the left, select your enterprise project from the Filter by enterprise project drop-down list. Then, WAF will display the related security data in the enterprise project on the page.
  4. In the navigation pane on the left, choose Dashboard.
  5. In the Product Details card, click Details in the Cloud mode area.
  6. On the Cloud Mode Details panel, choose Advanced Functions > LLM Content Security and click Buy Now.
  7. On the Buy LLM Content Security page, confirm the purchase details, read and select WAF Disclaimer, and click Pay Now.

    Now, you can configure LLM content security rules to keep LLM applications secure and legally compliant.

Step 2: Configure AI Model Check Rules

  1. Log in to the WAF console.
  2. Click in the upper left corner and select a region or project.
  3. (Optional) If you have enabled the enterprise project function, in the upper part of the navigation pane on the left, select your enterprise project from the Filter by enterprise project drop-down list. Then, WAF will display the related security data in the enterprise project on the page.
  4. In the navigation pane on the left, click Policies.
  5. Click the name of the target policy to go to the protection rule configuration page.

    Before configuring protection rules, ensure that the target protection policy has been applied to a domain name. A protection policy can be applied to multiple protected domain names, but a protected domain name can have only one protection policy.

  6. Locate the LLM Content Security configuration box and toggle on this protection.


  7. In the upper left corner above the rule list, click Add Rule.
  8. In the Add AI Check Rules dialog box, set the following parameters and click OK.

    Table 1 Parameters for an AI check rule

    Parameter

    Description

    Example Value

    Rule Name

    Enter the name of the protection rule.

    waftest

    Rule Description (Optional)

    Description of the rule.

    --

    Model Q&A Path

    Enter the path of the model Q&A API. The path cannot contain special characters (<>*), cannot start or end with spaces, and cannot exceed 4,096 characters.

    /v1/chat/completions

    Prompt Verification

    Injection Detection

    Injection detection identifies attacks that target large models, such as reverse engineering and role-playing attacks.

    If you enable this function, WAF will block malicious inputs designed by attackers as prompts.

    Compliance Check

    Compliance check detects violent, discriminatory, illegal, and immoral content.

    If you enable this function, WAF can effectively filter out non-compliant information entered by users.

    Prompt Index

    An index is used to identify or locate a prompt in a specific data structure.

    You can configure prompt indexes to easily search for, access, and process information related to prompts. The index is the JSONPath of the request body and complies with the JSONPath syntax. All examples in Table 5 are supported.

    $.messages[-1].content

    Protective Action

    Protective action taken when a prompt matches the detection requirements. The options are as follows:

    • Log only: If a prompt matches the detection requirements, attack information is only logged.
    • Block: If a prompt matches the detection requirements, the request is blocked.

      If you set Protective Action to Block, set HTTP Return Code, Block Page Type, and Page Content for the block page.

      For details about how to configure the Block action, see Example 1: Prompt Verification.

    Block

    Response Compliance

    Response Compliance Check

    Compliance check checks the model response data.

    If you enable this function, WAF can effectively filter out non-compliant outputs of large models.

    Response Content Index

    You can configure a response content index to easily search for, access, and process information related to response content. This parameter is the JSONPath of the response body. The JSONPath syntax is used. Examples 1, 2, 3, and 5 in Table 5 are supported.

    $.choices[-1].delta.content

    Protective Action

    Protective action taken when an output matches the detection requirements. The options are as follows:

    • Log only: If an output matches the detection requirements, attack information is only logged.
    • Anonymize: If an AI model output contains sensitive words, WAF identifies sensitive words based on the context and masks them as - in final responses.
    • Terminate response: If an AI model output contains sensitive words, WAF returns the content modified according to the termination response protocol, ends the on-going request, and ignores the subsequent responses from the origin server of the AI model.

    Anonymize

    Protocol for Terminating Response

    If you set Protective Action to Terminate response, you need to set Protocol for Terminating Response, which defines the response content returned when the Terminate response action is taken.

    The syntax of the protocol for terminating response must comply with the following rules:
    • The value must be in the form of a JSON array, with each element being a JSON object.
    • The object can be empty, that is, {}. If the object is empty, the blocked response data is copied. The key value syntax is the same as the response index syntax. If an invalid value is detected, the configuration of this object will be skipped.
    • The response termination protocol list accepts up to five objects, each supporting up to 10 indexes. Any data beyond these limits will be ignored. Each object requires unique indexes. Duplicate entries may produce incorrect results.
    • If the replacement content is fixed, you can configure a constant value, for example, $.data.choices in Scenario 2: Protective Action is set to Terminate response.
    • If the replacement content varies but can be obtained from the blocked response, configure the object value following the response index syntax, for example, $.data.model in Scenario 2: Protective Action is set to Terminate response.
    • When multiple operations are performed on the same index, later operations will overwrite earlier ones. For example, if you assign a value to $.data.model and then assign a value to $.data, the value assigned to $.data.model will not take effect.
    • For array-type objects, you can assign values to their elements. For non-array objects, using an array index causes an error. For example, assume the raw response data is as follows:
      {"data":{"arr":["1","2","3"],"item":{"sub_item":1}}}

      In this case, $.data.arr.new_index and $.data.item[1].new_index are invalid.

      If you have to make such modifications, clear $.data.arr or $.data.item and then assign a value, as shown in the following:

      [
          {
              "$.data.arr": "{}",
              "$.data.arr.new_index": "new_data"
          },
          {
              "$.data.item": "{}",
              "$.data.item[1].new_index": "new_data"
          }
      ]
      
    • Array objects do not allow value assignment to non-existent negative subscripts. For example, assume the raw response data is as follows:
      {"data":{"arr":["1","2","3"],"item":{"sub_item":1}}}

      In this case, $.data.arr[-4] is invalid. To insert 0 at the beginning of $.data.arr, assign values to $.data.arr in reverse order, as shown below. Assigning values in the original order would overwrite the original values and produce unexpected results.

      [
          {
              "$.data.arr[3]": "$.data.arr[2]",
              "$.data.arr[2]": "$.data.arr[1]",
              "$.data.arr[1]": "$.data.arr[0]",
              "$.data.arr[0]": "0"
          }
      ]
      

    For details about how to configure the response termination protocol and for a configuration example, see Example 2: Response Compliance Check. An illustrative sketch is also provided below.
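    For illustration only, a minimal termination response protocol might look like the following sketch. It assumes the response structure referenced above (Scenario 2), where $.data.model is copied from the blocked response and $.data.choices is replaced with a constant placeholder message; adapt the keys and values to your model's actual response structure.

      [
          {
              "$.data.model": "$.data.model",
              "$.data.choices": "The response was terminated because non-compliant content was detected."
          }
      ]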

    After completing the preceding configurations, you can:

    • Check the rule status: In the protection rule list, check the rule you added. Rule Status is Enabled by default.
    • Disable the rule: If you do not want the rule to take effect, click Disable in the Operation column of the rule.
    • Delete or modify the rule: Click Delete or Modify in the Operation column of the rule.

Configuration Examples

You can take the steps below to verify that WAF checks the content of an LLM application whose Q&A path is /v1/chat/completions.

DeepSeek is used in this example. Sensitive word: provide a debit card.
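For example, after configuring the rule, you could send a request body such as the following to the protected path /v1/chat/completions and observe WAF's behavior. This is a sketch only: the model, messages, role, and content fields assume an OpenAI-compatible request body, and the model value is illustrative. The last message carries the sensitive word mentioned above.

    {
        "model": "deepseek-chat",
        "stream": true,
        "messages": [
            {"role": "user", "content": "Please provide a debit card."}
        ]
    }

If Prompt Verification is configured with the Block action, WAF should return the configured block page instead of forwarding the request. If Response Compliance is configured with Anonymize or Terminate response, non-compliant content in the model's reply should be masked or replaced according to the rule.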

Syntax Supported by Indexes

A large model index is used to identify or locate the prompt or response content in the request or response body. It complies with the JSONPath syntax. If Injection Detection, Compliance Check, or Response Compliance Check is enabled, you need to configure indexes so that WAF can find, access, and process the prompt or response content.

The following indexes need to be configured for AI model checks:
  • Prompt Index: If Injection Detection and Compliance Check are enabled, this index is used to locate the position of the prompt in the JSONPath of the request body. All syntax in the following table is supported.
  • Response Content Index: If Response Compliance Check is enabled, this index is used to locate the response content in the JSONPath of the response body. Syntax 1, 2, 3, and 5 in the following table are supported. Wildcard extraction is not supported.
Table 5 Prompt and response content index example

No.

Scenario

JSON structure

Path (Max. Depth: 10 Levels)

Result

Description

1

Single object

{"prompt": {"role": "user","content": "..."}}

$.prompt.content

"..."

Periods (.) are used to access subnodes layer by layer to locate the target field. Each subnode name in the path must be explicitly specified.

2

Obtaining the first element

{"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}

$.prompt[0].content

"A"

The array index starts from 0. [0] indicates the first element.

3

Obtaining the last element

{"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}

$.prompt[-1].content

["B"]

The negative index -1 is used to indicate the last element.

4

Obtaining elements using the wildcard character

{"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"},{"role": "user", "content": "C"}]}

$.prompt[*].content

["A", "B", "C"]

[*] matches all elements in an array. Recursive retrieval is not supported.

5

Obtaining the root node

{"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}

$

{"prompt":[{"role": "user","content": "A"},{"role":"assistant","content":"B"}]}

The JSON content of the root node is obtained.
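Putting the example values from Table 1 together (a sketch only; both structures assume an OpenAI-compatible API, and the response chunk is the one shown under Constraints), the indexes resolve as follows:

    Request body:  {"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello"}]}
    Prompt Index:  $.messages[-1].content  ->  "Hello"

    Response chunk:  data: {"choices": [{"index": 0, "delta": {"content": "Content", "type": "text"}}], "created": 1743564162}
    Response Content Index:  $.choices[-1].delta.content  ->  "Content"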

Related Operations

  • Viewing protection logs: Querying a Protection Event.

    After detecting an attack, WAF reports the attack log to SecMaster. You can view and analyze the attack log on SecMaster. For more details, see AI Risk Overview.

  • Unsubscribing from LLM Content Security
    1. In the navigation pane on the left, click Dashboard.
    2. In the Product Details card, click Details in the Cloud mode area.
    3. In the Cloud Mode Details panel, choose Advanced Functions, and click Unsubscribe.

    After the unsubscription, LLM content security will become unavailable. You will receive a refund based on your resource usage details.