Updated on 2025-12-04 GMT+08:00

Anonymizing Log Data Using DSL Processing Functions

Anonymizing data reduces the risk that sensitive information is exposed or leaked while logs are processed, transmitted, and used, which helps protect user rights and interests. This section describes common anonymization scenarios and methods in LTS data processing, with an example for each.

Introduction

Sensitive data in logs typically includes mobile numbers, bank card numbers, email addresses, IP addresses, access key IDs (AKs), ID card numbers, websites, order numbers, and other key strings. Common anonymization methods in LTS data processing are regular expression replacement (key function: regex_replace), Base64 encoding (key function: base64_encoding), MD5 hashing (key function: md5_encoding), and character mapping (key function: str_translate). For details, see Regular Expression Functions and Encoding and Decoding Functions.

Scenario 1: Anonymizing Mobile Numbers

For logs containing mobile numbers that should not be exposed, you can use regular expressions and the regex_replace function to anonymize them. Example:

  • Raw log
    {
        "iphone":"13900001234"
    }
  • Processing rule
    e_set(
        "sec_iphone",
        regex_replace(v("iphone"), r"(\d{0,3})\d{4}(\d{4})", replace=r"\1****\2"),
    )
  • Processing result
    {
        "sec_iphone": "139****1234",
        "iphone": "13900001234"
    }
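
The rule can be checked locally before it is deployed. The following is a minimal Python sketch, assuming that regex_replace behaves like Python's re.sub for this pattern (an assumption, not something this page states). mask_phone is a hypothetical helper name.

```python
import re

# Same pattern and replacement as the processing rule above.
PHONE_PATTERN = r"(\d{0,3})\d{4}(\d{4})"


def mask_phone(value: str) -> str:
    """Keep up to three leading digits and the last four; mask the four in between."""
    return re.sub(PHONE_PATTERN, r"\1****\2", value)


print(mask_phone("13900001234"))  # 139****1234
print(mask_phone("12345678"))     # ****5678
```

Because the first group accepts zero to three digits, the pattern also rewrites any other run of eight or more digits, so it is safest to apply the rule only to fields that hold mobile numbers.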

Scenario 2: Anonymizing Bank Card Information

Use regular expressions and the regex_replace function to anonymize bank card or credit card information in logs.

  • Raw log
    {
        "content":"bank number is 491648411333978312 and credit card number is 4916484113339780"
    }
  • Processing rule
    e_set(
        "bank_number",
        regex_replace(
            v("content"), r"([1-9]{1})(\d{14}|\d{13}|\d{11})(\d{4})", replace=r"****\3"
        ),
    )
  • Processing result
    {
    	"bank_number": "bank number is ****8312 and credit card number is ****9780",
    	"content": "bank number is 491648411333978312 and credit card number is 4916484113339780"
    }
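
The alternation in the pattern selects the card length: one leading digit, then 14, 13, or 11 digits, then the last four, for 19, 18, or 16 digits in total. A minimal Python sketch, assuming regex_replace follows Python re.sub semantics (an assumption) and using the hypothetical helper name mask_cards, shows how both numbers in the raw log are handled:

```python
import re

# Same pattern and replacement as the processing rule above.
CARD_PATTERN = r"([1-9]{1})(\d{14}|\d{13}|\d{11})(\d{4})"


def mask_cards(text: str) -> str:
    """Replace each matched card number with **** plus its last four digits."""
    return re.sub(CARD_PATTERN, r"****\3", text)


content = (
    "bank number is 491648411333978312 "
    "and credit card number is 4916484113339780"
)
print(mask_cards(content))
# bank number is ****8312 and credit card number is ****9780
```

The 18-digit number is matched through the 13-digit branch and the 16-digit number through the 11-digit branch; only group 3, the last four digits, is kept.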

Scenario 3: Anonymizing Email Addresses

Use regular expressions and the regex_replace function to anonymize email addresses contained in logs.

  • Raw log
    {
        "content":"email is username@example.com"
    }
  • Processing rule
    e_set(
        "email_encrypt",
        regex_replace(
            v("content"),
            r"[A-Za-z\d]+([-_.][A-Za-z\d]+)*(@([A-Za-z\d]+[-.])+[A-Za-z\d]{2,4})",
            replace=r"****\2",
        ),
    )
  • Processing result
    {
    	"content": "email is username@example.com",
    	"email_encrypt": "email is ****@example.com"
    }
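
The replacement keeps group 2, which is the @ sign plus the domain, and masks the local part. A minimal Python sketch, assuming regex_replace follows Python re.sub semantics (an assumption), shows the rule applied to a line with more than one address. mask_emails and the second sample line are hypothetical.

```python
import re

# Same pattern and replacement as the processing rule above.
EMAIL_PATTERN = (
    r"[A-Za-z\d]+([-_.][A-Za-z\d]+)*(@([A-Za-z\d]+[-.])+[A-Za-z\d]{2,4})"
)


def mask_emails(text: str) -> str:
    """Mask the local part of every address and keep the domain."""
    return re.sub(EMAIL_PATTERN, r"****\2", text)


print(mask_emails("email is username@example.com"))
# email is ****@example.com
print(mask_emails("from a_b@example.com to c-d@test.example.org"))
# from ****@example.com to ****@test.example.org
```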

Scenario 4: Anonymizing AKs

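Use regular expressions and the regex_replace function to anonymize AKs in logs. In the example, <testAccessKey ID> and <testAccessKey Secret> are placeholders for an actual AK ID and secret, and the processing result shows the output for actual values.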

  • Raw log
    {
        "content":"ak id is <testAccessKey ID> and ak key is <testAccessKey Secret>"
    }
  • Processing rule
    e_set(
        "akid_encrypt",
        regex_replace(
            v("content"),
            r"([a-zA-Z0-9]{4})(([a-zA-Z0-9]{26})|([a-zA-Z0-9]{12}))",
            replace=r"\1****",
        ),
    )
  • Processing result
    {
    	"akid_encrypt": "ak id is jdhc**** and ak key is Jkde****",
    	"content": "ak id is <testAccessKey ID> and ak key is <testAccessKey Secret>"
    }
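
The pattern matches runs of 16 or 30 alphanumeric characters (4 kept, then 12 or 26 masked). The following Python sketch illustrates this with made-up values, assuming regex_replace follows Python re.sub semantics (an assumption). mask_keys, ak_id, and ak_secret are hypothetical names, and the strings are not real credentials.

```python
import re

# Same pattern and replacement as the processing rule above.
AK_PATTERN = r"([a-zA-Z0-9]{4})(([a-zA-Z0-9]{26})|([a-zA-Z0-9]{12}))"


def mask_keys(text: str) -> str:
    """Keep the first four characters of each match and mask the rest."""
    return re.sub(AK_PATTERN, r"\1****", text)


# Made-up values for illustration only.
ak_id = "jdhc" + "A1" * 6       # 16 characters
ak_secret = "Jkde" + "B2" * 13  # 30 characters

print(mask_keys(f"ak id is {ak_id} and ak key is {ak_secret}"))
# ak id is jdhc**** and ak key is Jkde****

# A 20-character run is only partly covered: the first 16 characters match,
# so the last four stay visible.
print(mask_keys("ABCD" + "x" * 12 + "WXYZ"))  # ABCD****WXYZ
```

If the keys in your logs have other lengths, the quantifiers in the pattern would need to be adjusted accordingly.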

Scenario 5: Anonymizing IP Addresses

Use the regex_replace function and regular expressions to capture and anonymize IP addresses.

  • Raw log
    {
        "content":"ip is 192.0.2.10"
    }
  • Processing rule
    e_set(
        "ip_encrypt",
        regex_replace(
            v("content"),
            r"((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])",
            replace=r"****",
        ),
    )
  • Processing result
    {
        "content": "ip is 192.0.2.10",
        "ip_encrypt": "ip is ****"
    }
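
The IPv4 pattern repeats one octet expression four times. The Python sketch below builds the same pattern from a single OCTET constant to make that structure visible, assuming regex_replace follows Python re.sub semantics (an assumption). OCTET, IP_PATTERN, and mask_ips are hypothetical names.

```python
import re

# One octet: 250-255, 200-249, 100-199, or 0-99.
OCTET = r"(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"

# Three "octet." groups followed by a final octet; this expands to the
# same pattern string as the processing rule above.
IP_PATTERN = rf"({OCTET}\.){{3}}{OCTET}"


def mask_ips(text: str) -> str:
    """Replace every IPv4 address with ****."""
    return re.sub(IP_PATTERN, "****", text)


print(mask_ips("ip is 192.0.2.10"))                # ip is ****
print(mask_ips("from 10.0.0.1 to 255.255.255.0"))  # from **** to ****
```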

Scenario 6: Anonymizing ID Card Information

Use the regex_replace function and regular expressions to capture and anonymize ID card numbers in logs.

  • Raw log
    {
        "content":"Id card is 111222190002309999"
    }
  • Processing rule
    e_set(
        "id_encrypt", regex_replace(v("content"), r"\b\d{17}(\d|X)\b", replace=r"\1****")
    )
  • Processing result
    {
    	"id_encrypt": "Id card is 9****",
    	"content": "Id card is 111222190002309999"
    }
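
In this rule the only capture group is the final character (a digit or X), so \1 keeps that character and the 17 digits before it are dropped. A minimal Python sketch, assuming regex_replace follows Python re.sub semantics (an assumption) and using the hypothetical helper name mask_id, makes this visible:

```python
import re

# Same pattern and replacement as the processing rule above.
ID_PATTERN = r"\b\d{17}(\d|X)\b"


def mask_id(text: str) -> str:
    """Replace an 18-character ID number with its last character plus ****."""
    return re.sub(ID_PATTERN, r"\1****", text)


print(mask_id("Id card is 111222190002309999"))  # Id card is 9****
print(mask_id("Id card is 11122219000230999X"))  # Id card is X****
```

The \b anchors mean that a digit run longer than 18 characters is left unchanged rather than partly masked.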

Scenario 7: Anonymizing Websites

Use the base64_encoding function to anonymize websites in logs. Because Base64 is reversible, the base64_decoding function can convert the encoded data back to plaintext.

  • Raw log
    {
        "content":"https://www.huaweicloud.com/"
    }
  • Processing rule
    e_set("base64_url",base64_encoding(v("content")))
  • Processing result
    {
    	"base64_url": "aHR0cHM6Ly93d3cuaHVhd2VpY2xvdWQuY29tLw==",
    	"content": "https://www.huaweicloud.com/"
    }

    To decode base64_url, use the following DSL syntax rule: base64_decoding(v("base64_url"))
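
The round trip can be reproduced outside LTS. The following Python sketch assumes that base64_encoding applies standard Base64 to the UTF-8 bytes of the field value (an assumption, although the output matches the processing result above). encode_url and decode_url are hypothetical helper names.

```python
import base64


def encode_url(url: str) -> str:
    """Encode the UTF-8 bytes of a URL as standard Base64 text."""
    return base64.b64encode(url.encode("utf-8")).decode("ascii")


def decode_url(encoded: str) -> str:
    """Reverse encode_url and return the original URL."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_url("https://www.huaweicloud.com/")
print(encoded)              # aHR0cHM6Ly93d3cuaHVhd2VpY2xvdWQuY29tLw==
print(decode_url(encoded))  # https://www.huaweicloud.com/
```

Anyone who can read the encoded field can decode it in the same way, so Base64 hides a value from casual viewing but does not protect it.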

Scenario 8: Anonymizing Order Numbers

Use the md5_encoding function to anonymize order numbers in logs. Unlike Base64, MD5 is a one-way hash, so the result cannot be decoded back to the original value.

  • Raw log
    {
        "orderId": "20210101123456"
    }
  • Processing rule
    e_set("md5_orderId",md5_encoding(v("orderId")))
  • Processing result
    {
        "orderId": "20210101123456",
        "md5_orderId": "9c0ab8e4d9f4eb6fbd5c508bbca05951"
    }
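
The properties that make a hash usable as an anonymized identifier can be shown with a short Python sketch. It assumes that md5_encoding returns the lowercase hexadecimal MD5 digest of the UTF-8 bytes of the value (an assumption, not something this page states). md5_hex is a hypothetical helper name.

```python
import hashlib


def md5_hex(value: str) -> str:
    """Return the 32-character hexadecimal MD5 digest of a string."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


digest = md5_hex("20210101123456")
print(digest)

# The same input always gives the same digest, so hashed order numbers
# can still be grouped and compared without revealing the originals.
print(digest == md5_hex("20210101123456"))  # True
print(digest == md5_hex("20210101123457"))  # False
```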

Scenario 9: Anonymizing Strings

To prevent key strings in logs from being exposed, you can use the str_translate function to define mapping rules and anonymize key characters or strings.

  • Raw log
    {
        "content": "message level is info_"
    }
  • Processing rule
    e_set("data_translate", str_translate(v("content"),"aeiou","12345"))
  • Processing result
    {
    	"data_translate": "m2ss1g2 l2v2l 3s 3nf4_",
    	"content": "message level is info_"
    }
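
The mapping is positional: each character in the second argument is replaced by the character at the same position in the third. A minimal Python sketch of the same idea, using str.maketrans as a stand-in for str_translate (an assumption about equivalent behavior; translate is a hypothetical helper name):

```python
def translate(text: str, source: str, target: str) -> str:
    """Replace each character in source with the character at the same index in target."""
    table = str.maketrans(source, target)
    return text.translate(table)


masked = translate("message level is info_", "aeiou", "12345")
print(masked)  # m2ss1g2 l2v2l 3s 3nf4_

# Swapping the arguments reverses the mapping, provided the original text
# contained none of the target characters.
print(translate(masked, "12345", "aeiou"))  # message level is info_
```

Because a fixed character mapping can be reversed by anyone who knows or guesses the rule, it obscures strings rather than protecting them.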