Updated on 2025-10-27 GMT+08:00

Configuring AI Model Check Rules to Ensure Security and Compliance of LLM Applications

The rapid advancement of generative AI has driven widespread adoption of Large Language Models (LLMs) in AI inference. However, this also brings emerging security challenges. Inadequate input validation may expose sensitive data, adversarial prompt injections can induce policy-violating outputs, and biased training data risks perpetuating discriminatory outputs. To effectively mitigate these issues, WAF offers AI model check rules. You can let WAF check prompts for injection and compliance risks and identify inappropriate or non-compliant outputs in responses. This helps keep your model inputs and outputs secure, stable, available, and legally compliant.

Solution Overview

WAF AI model check inspects both the external requests (inputs) sent to large models and the response data (outputs) they return.

  • WAF AI model check performs Prompt Verification to ensure that the inputs of large models are secure and legally compliant.
  • WAF AI model check performs Response Compliance checks on the response data of large models to ensure that the outputs of large models are secure and legally compliant.
Figure 1 AI Model Check

The main application scenarios are as follows:

  • In-house large models built and deployed locally

    Some enterprises build and deploy their own large models to provide services externally. For these large models, the AI model check module helps mitigate threats such as prompt injection, reverse engineering, role-playing attacks, and jailbreak attempts, so enterprises can protect their large model interfaces from abuse, protect their web frameworks, and keep services stable.

  • Applications built by calling third-party large model cloud services

    Some enterprises buy API services from third-party large model providers to build their own applications. Malicious users may consume tokens and drive up costs, or submit malicious prompts that get the enterprise's accounts suspended or blocked. The AI model check module can help address these issues.


Constraints

  • This type of protection rule is supported only by Cloud Mode - CNAME access. It is not supported by Cloud Mode - Load balancer access or Dedicated Mode.
  • This function is supported by the standard, professional, and enterprise editions of cloud mode. If you buy the standard, professional, or enterprise edition of cloud mode, the LLM content security edition is automatically matched to the standard, professional, or professional edition, respectively. The required duration of LLM content security is the same as that of the WAF edition you buy.

    If you no longer need LLM content security, you can unsubscribe from it separately. After the unsubscription, the function becomes unavailable, and you will receive a refund based on your resource usage details.

  • AI model check is supported only in CN North-Beijing4, CN East-Shanghai1, and CN South-Guangzhou regions. To use this function in other regions, submit a service ticket to apply for this function.
  • The content to be checked must be UTF-8 encoded. Otherwise, the protection may fail.
  • The request body must be in JSON. The response format can be JSON or data:+JSON. For example:
    data: {"choices": [{"index": 0, "delta": {"content": "Content", "type": "text"}}], "created": 1743564162}

    If the raw response from the origin server of a large model is in JSON format, the termination response is also in JSON format. If the raw response is in data:+JSON format, the termination response is also in data:+JSON format.
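    For reference, the following is a minimal sketch of a request body that satisfies this constraint. The model, stream, messages, role, and content fields assume an OpenAI-compatible chat API and are not mandated by WAF, and the model value is illustrative only. The Prompt Index example in Table 1, $.messages[-1].content, would select the content of the last message in such a body.

      {
          "model": "deepseek-chat",
          "stream": true,
          "messages": [
              {"role": "system", "content": "You are a helpful assistant."},
              {"role": "user", "content": "Hello"}
          ]
      }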

Step 1: Buy LLM Content Security

If you have already bought LLM content security when buying cloud WAF, skip this step. For details, see Buying Cloud Mode WAF. For details about the billing of LLM content security, see Billing Items.

  1. Log in to the WAF console.
  2. Click in the upper left corner and select a region or project.
  3. (Optional) If you have enabled the enterprise project function, in the upper part of the navigation pane on the left, select your enterprise project from the Filter by enterprise project drop-down list. Then, WAF will display the related security data in the enterprise project on the page.
  4. In the navigation pane on the left, choose Dashboard.
  5. In the Product Details card, click Details in the Cloud mode area.
  6. On the Cloud Mode Details panel, choose Advanced Functions > LLM Content Security and click Buy Now.
  7. On the Buy LLM Content Security page, confirm the purchase details, read and select WAF Disclaimer, and click Pay Now.

    Now, you can configure LLM content security rules to keep LLM applications secure and legally compliant.

Step 2: Configure AI Model Check Rules

  1. Log in to the WAF console.
  2. Click in the upper left corner and select a region or project.
  3. (Optional) If you have enabled the enterprise project function, in the upper part of the navigation pane on the left, select your enterprise project from the Filter by enterprise project drop-down list. Then, WAF will display the related security data in the enterprise project on the page.
  4. In the navigation pane on the left, click Policies.
  5. Click the name of the target policy to go to the protection rule configuration page.

    Before configuring protection rules, ensure that the target protection policy has been applied to a domain name. A protection policy can be applied to multiple protected domain names, but a protected domain name can have only one protection policy.

  6. Locate the LLM Content Security configuration box and toggle on this protection.


  7. In the upper left corner above the rule list, click Add Rule.
  8. In the Add AI Check Rules dialog box, set the following parameters and click OK.

    Table 1 Parameters for an AI check rule

    Parameter

    Description

    Example Value

    Rule Name

    Enter the name of the protection rule.

    waftest

    Rule Description (Optional)

    Description of the rule.

    --

    Model Q&A Path

    Enter the path of the model Q&A API. The path cannot contain special characters (<>*), cannot start or end with spaces, and cannot exceed 4,096 characters.

    /v1/chat/completions

    Prompt Verification

    Injection Detection

    Injection detection identifies attacks that target large models, such as reverse engineering and role-playing attacks.

    If you enable this function, WAF will block malicious inputs designed by attackers as prompts.

    Compliance Check

    Compliance check detects violent, discriminatory, illegal, and immoral content.

    If you enable this function, WAF can effectively filter out non-compliant information entered by users.

    Prompt Index

    An index is used to identify or locate a prompt in a specific data structure.

    You can configure prompt indexes to easily search for, access, and process information related to prompts. The index is the JSONPath of the request body and complies with the JSONPath syntax. All examples in Table 5 are supported.

    $.messages[-1].content

    Protective Action

    Protective action taken when a prompt matches the detection requirements. The options are as follows:

    • Log only: If a prompt matches the detection requirements, attack information is only logged.
    • Block: If a prompt matches the detection requirements, the request is blocked.

      If you set Protective Action to Block, set HTTP Return Code, Block Page Type, and Page Content for the block page.

      For details about how to configure the Block action, see Example 1: Prompt Verification.

    Block

    Response Compliance

    Response Compliance Check

    Compliance check checks the model response data.

    If you enable this function, WAF can effectively filter out non-compliant outputs of large models.

    Response Content Index

    You can configure a response content index to easily search for, access, and process information related to response content. This parameter is the JSONPath of the response body. The JSONPath syntax is used. Examples 1, 2, 3, and 5 in Table 5 are supported.

    $.choices[-1].delta.content

    Protective Action

    Protective action taken when an output matches the detection requirements. The options are as follows:

    • Log only: If an output matches the detection requirements, attack information is only logged.
    • Anonymize: If an AI model output contains sensitive words, WAF identifies sensitive words based on the context and masks them as - in final responses.
    • Terminate response: If an AI model output contains sensitive words, WAF returns the content modified according to the termination response protocol, ends the on-going request, and ignores the subsequent responses from the origin server of the AI model.

    Anonymize

    Protocol for Terminating Response

    If you set Protective Action to Terminate response, you need to set Protocol for Terminating Response, which defines the response content returned when the Terminate response action is taken.

    The syntax of the protocol for terminating response must comply with the following rules:
    • The value must be in the form of a JSON array, with each element being a JSON object.
    • The object can be empty, that is, {}. If the object is empty, the blocked response data is copied. The key value syntax is the same as the response index syntax. If an invalid value is detected, the configuration of this object will be skipped.
    • The response termination protocol list accepts up to five objects, each supporting up to 10 indexes. Any data beyond these limits will be ignored. Each object requires unique indexes. Duplicate entries may produce incorrect results.
    • If the replacement content is fixed, you can configure a constant value, for example, $.data.choices in Scenario 2: Protective Action is set to Terminate response.
    • If the replacement content varies but can be obtained from the blocked response, configure the object value following the response index syntax, for example, $.data.model in Scenario 2: Protective Action is set to Terminate response.
    • When multiple operations are performed on the same index, later operations will overwrite earlier ones. For example, if you assign a value to $.data.model and then assign a value to $.data, the value assigned to $.data.model will not take effect.
    • For array-type objects, you can assign values to their elements. For non-array objects, using an array index causes an error. For example, assume the raw response data is as follows:
      {"data":{"arr":["1","2","3"],"item":{"sub_item":1}}}

      In this case, $.data.arr.new_index and $.data.item[1].new_index are invalid.

      If you have to make such modifications, clear $.data.arr or $.data.item and then assign a value, as shown in the following:

      [
          {
              "$.data.arr": "{}",
              "$.data.arr.new_index": "new_data"
          },
          {
              "$.data.item": "{}",
              "$.data.item[1].new_index": "new_data"
          }
      ]
      
    • Array objects do not allow value assignment to non-existent negative subscripts. For example, assume the raw response data is as follows:
      {"data":{"arr":["1","2","3"],"item":{"sub_item":1}}}

      In this case, $.data.arr[-4] is invalid. To insert 0 at the beginning of $.data.arr, assign values to $.data.arr in reverse order, as shown below. Assigning values in the original order would overwrite the original values and produce unexpected results.

      [
          {
              "$.data.arr[3]": "$.data.arr[2]",
              "$.data.arr[2]": "$.data.arr[1]",
              "$.data.arr[1]": "$.data.arr[0]",
              "$.data.arr[0]": "0"
          }
      ]
      

    For details about how to configure the response termination protocol and for a configuration example, see Example 2: Response Compliance Check. An illustrative sketch is also provided below.
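    For illustration only, a minimal termination response protocol might look like the following sketch. It assumes the response structure referenced above (Scenario 2), where $.data.model is copied from the blocked response and $.data.choices is replaced with a constant placeholder message; adapt the keys and values to your model's actual response structure.

      [
          {
              "$.data.model": "$.data.model",
              "$.data.choices": "The response was terminated because non-compliant content was detected."
          }
      ]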

    After completing the preceding configurations, you can:

    • Check the rule status: In the protection rule list, check the rule you added. Rule Status is Enabled by default.
    • Disable the rule: If you do not want the rule to take effect, click Disable in the Operation column of the rule.
    • Delete or modify the rule: Click Delete or Modify in the Operation column of the rule.

Configuration Examples

You can take the steps below to verify that WAF checks the content of an LLM application whose Q&A path is /v1/chat/completions.

DeepSeek is used in this example. Sensitive word: provide a debit card.
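For example, after configuring the rule, you could send a request body such as the following to the protected path /v1/chat/completions and observe WAF's behavior. This is a sketch only: the model, messages, role, and content fields assume an OpenAI-compatible request body, and the model value is illustrative. The last message carries the sensitive word mentioned above.

    {
        "model": "deepseek-chat",
        "stream": true,
        "messages": [
            {"role": "user", "content": "Please provide a debit card."}
        ]
    }

If Prompt Verification is configured with the Block action, WAF should return the configured block page instead of forwarding the request. If Response Compliance is configured with Anonymize or Terminate response, non-compliant content in the model's reply should be masked or replaced according to the rule.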

Syntax Supported by Indexes

A large model index is used to identify or locate the prompt or response content in the request or response body. It complies with the JSONPath syntax. If Injection Detection, Compliance Check, or Response Compliance Check is enabled, you need to configure indexes so that WAF can find, access, and process the prompt or response content.

The following indexes need to be configured for AI model checks:
  • Prompt Index: If Injection Detection and Compliance Check are enabled, this index is used to locate the position of the prompt in the JSONPath of the request body. All syntax in the following table is supported.
  • Response Content Index: If Response Compliance Check is enabled, this index is used to locate the response content in the JSONPath of the response body. Syntax 1, 2, 3, and 5 in the following table are supported. Wildcard extraction is not supported.
Table 5 Prompt and response content index example

No.

Scenario

JSON structure

Path (Max. Depth: 10 Levels)

Result

Description

1

Single object

{"prompt": {"role": "user","content": "..."}}

$.prompt.content

"..."

Periods (.) are used to access subnodes layer by layer to locate the target field. Each subnode name in the path must be explicitly specified.

2

Obtaining the first element

{"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}

$.prompt[0].content

"A"

The array index starts from 0. [0] indicates the first element.

3

Obtaining the last element

{"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}

$.prompt[-1].content

["B"]

The negative index -1 is used to indicate the last element.

4

Obtaining elements using the wildcard character

{"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"},{"role": "user", "content": "C"}]}

$.prompt[*].content

["A", "B", "C"]

[*] matches all elements in an array. Recursive retrieval is not supported.

5

Obtaining the root node

{"prompt": [{"role": "user","content": "A"}, {"role":"assistant", "content": "B"}]}

$

{"prompt":[{"role": "user","content": "A"},{"role":"assistant","content":"B"}]}

The JSON content of the root node is obtained.
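Putting the example values from Table 1 together (a sketch only; both structures assume an OpenAI-compatible API, and the response chunk is the one shown under Constraints), the indexes resolve as follows:

    Request body:  {"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello"}]}
    Prompt Index:  $.messages[-1].content  ->  "Hello"

    Response chunk:  data: {"choices": [{"index": 0, "delta": {"content": "Content", "type": "text"}}], "created": 1743564162}
    Response Content Index:  $.choices[-1].delta.content  ->  "Content"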

Related Operations

  • Viewing protection logs: Querying a Protection Event.

    After detecting an attack, WAF reports the attack log to SecMaster. You can view and analyze the attack log on SecMaster. For more details, see AI Risk Overview.

  • Unsubscribing from LLM Content Security
    1. In the navigation pane on the left, click Dashboard.
    2. In the Product Details card, click Details in the Cloud mode area.
    3. In the Cloud Mode Details panel, choose Advanced Functions, and click Unsubscribe.

    After the unsubscription, LLM content security will become unavailable. You will receive a refund based on your resource usage details.