Updated on 2025-09-07 GMT+08:00

Field Value Extraction Functions

This section describes field value extraction functions, including their syntax, parameters, and usage examples.

Function List

Type

Function

Description

Regular expression extraction

e_regex

Extracts the value of a field based on the regular expression and assigns the value to other fields. This function can be used together with other functions.

JSON extraction

e_json

Performs JSON operations on JSON objects in specified fields, including JSON expansion, JMES extraction, and JMES extraction and then expansion. This function can be used together with other functions.

Delimiter extraction

e_csv, e_psv, and e_tsv

Extracts multiple fields from a specified field using user-defined delimiters and predefined field names.

  • e_csv: The default delimiter is a comma (,).
  • e_psv: The default delimiter is a vertical bar (|).
  • e_tsv: The default delimiter is a tab (\t).

This function can be used together with other functions.

KV mode extraction

e_kv

Extracts key-value pairs from one or more source fields, using a quote character to delimit values. This function can be used together with other functions.

e_kv_delimit

Extracts key-value pairs from the source field using delimiters.

e_regex

This function extracts the value of a field based on the regular expression and assigns the value to other fields.

  • Function format
    e_regex(key,regular_expression,fields_info,mode="fill-auto",pack_json=None)
  • Parameter description

    Parameter

    Type

    Mandatory

    Description

    key

    Any

    Yes

    Source field name. If the field does not exist, no operation is performed. For details about how to set special field names, see section "Event Type."

    Regular expression

    String

    Yes

    Regular expression for extracting fields. Capture group and non-capture group regular expressions are supported.

    Non-capture groups, which use the ?: prefix, are required in some cases. Example: \w+@\w+\.\w+(?:\.cn)? For details about non-capture groups, see "Non-capture Group."

    fields_info

    String/ List/ Dict

    No

    Name or names of the target fields that receive the matched values. This parameter is mandatory when the regular expression does not contain named capture groups.

    mode

    String

    No

    Field overwrite mode. The default value is fill-auto. For details about the field values and meanings, see "Field Extraction Check and Overwriting Mode."

    pack_json

    String

    No

    Pack all matching results of the regular expression into the field specified by pack_json. The default value is None, indicating that the matching results are not packed.
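
To illustrate the difference between capture and non-capture groups noted for the regular expression parameter, the following sketch uses Python's standard re module, whose syntax the patterns in this section follow. It does not call e_regex, and the sample address is made up for illustration.

```python
import re

text = "contact: user@example.com.cn"  # made-up sample value

# Capture group: findall returns only the captured part of each match.
captured = re.findall(r"\w+@\w+\.\w+(\.cn)?", text)

# Non-capture group (?:...): the group still takes part in matching,
# but nothing is captured, so findall returns the whole match.
whole = re.findall(r"\w+@\w+\.\w+(?:\.cn)?", text)

print(captured)  # ['.cn']
print(whole)     # ['user@example.com.cn']
```

A non-capture group is therefore the safer choice when a group is needed only for grouping or optional matching and its content should not become a field value.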

  • Returned result

    Logs with new field values.

  • Function example
    1. Example 1: Extract values that meet the expression from a field.
      • Test data
        {
         "msg": "192.168.0.1 http://... 127.0.0.0"
        }
      • Processing rule
        # Extract the first IP address from the msg field.
        e_regex("msg",r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}","ip")
      • Processing result
        msg: 192.168.0.1 http://... 127.0.0.0 
        ip: 192.168.0.1
    2. Example 2: Extract multiple values that meet the regular expression from a field.
      • Test data
        {
         "msg": "192.168.0.1 http://... 127.0.0.0"
        }
      • Processing rule
        # Extract the two IP addresses in the msg field and assign them to server_ip and client_ip, respectively.
        e_regex("msg",r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}",["server_ip","client_ip"])
      • Processing result
        msg: 192.168.0.1 http://... 127.0.0.0 
        server_ip: 192.168.0.1 
        client_ip: 127.0.0.0
    3. Example 3: Extract values that meet the expression through the capture group.
      • Test data
        {
         "content": "start sys version: deficience, err: 2"
        }
      • Processing rule
        # Use a regular expression to capture the version and error values in content.
        e_regex("content",r"start sys version: (\w+),\s*err: (\d+)",["version","error"])
      • Processing result
        content: start sys version: deficience, err: 2
        error: 2
        version: deficience
    4. Example 4: Extract field values through the named capture group.
      • Test data
        {
         "content": "start sys version: deficience, err: 2"
        }
      • Processing rule
        e_regex("content",r"start sys version: (?P<version>\w+),\s*err: (?P<error>\d+)")
      • Processing result
        content:  start sys version: deficience, err: 2
        error:  2
        version:  deficience
    5. Example 5: Use a regular expression to capture values from the dict field, and dynamically generate the target field name and value from the captured groups.
      • Test data
        {
         "dict": "verify:123"
        }
      • Processing rule
        e_regex("dict",r"(\w+):(\d+)",{r"k_\1": r"v_\2"})
      • Processing result
        dict: verify:123
        k_verify: v_123
    6. Example 6: Extract values that match the expression from the field, package them, and assign them to the name field.
      • Test data
        {
         "msg": "192.168.0.1 http://... 127.0.0.0"
        }
      • Processing rule
        e_regex("msg", r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", "ip", pack_json="name")
      • Processing result
        msg:192.168.0.1 http://... 127.0.0.0
        name:{"ip": "192.168.0.1"}
    7. Example 7: Use regular expressions to extract values from the dict field, dynamically name the field and its value, and pack and assign it to the name field.
      • Test data
        {
         "dict": "x:123, y:456, z:789"
        }
      • Processing rule
        e_regex("dict", r"(\w+):(\d+)", {r"k_\1": r"v_\2"}, pack_json="name")
      • Processing result
        dict:x:123, y:456, z:789
        name:{"k_x": "v_123", "k_y": "v_456", "k_z": "v_789"}
    8. Example 8: Extract the values that match the expression using a capture group and assign them to the name field.
      • Test data
        {
         "content": "start sys version: deficience, err: 2"
        }
      • Processing rule
        e_regex( "content", r"start sys version: (\w+),\s*err: (\d+)", ["version", "error"],pack_json="name")
      • Processing result
        content:start sys version: deficience, err: 2
        name:{"version": "deficience", "error": "2"}
  • More

    This function can be used together with other functions.
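
The matching behavior shown in Examples 1 to 4 can be approximated locally with Python's re module. The sketch below is only an emulation for experimenting with patterns: regex_extract is a hypothetical helper, not part of the service, and it ignores the mode and pack_json parameters as well as dict-style fields_info.

```python
import re

def regex_extract(event, key, pattern, fields=None):
    """Hypothetical stand-in for e_regex; returns a new dict with extracted fields."""
    result = dict(event)
    if key not in event:
        return result  # missing source field: no operation
    regex = re.compile(pattern)
    value = event[key]
    if fields is None:
        # Named capture groups supply the target field names.
        match = regex.search(value)
        if match:
            result.update(match.groupdict())
    elif isinstance(fields, str):
        # A single target field receives the first match.
        match = regex.search(value)
        if match:
            result[fields] = match.group(0)
    elif regex.groups:
        # Capture groups of the first match map to the field list in order.
        match = regex.search(value)
        if match:
            result.update(zip(fields, match.groups()))
    else:
        # No capture groups: successive matches map to the field list in order.
        matches = [m.group(0) for m in regex.finditer(value)]
        result.update(zip(fields, matches))
    return result

ip = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
log = {"msg": "192.168.0.1 http://... 127.0.0.0"}
print(regex_extract(log, "msg", ip, "ip")["ip"])  # 192.168.0.1
print(regex_extract(log, "msg", ip, ["server_ip", "client_ip"]))
```

Testing a pattern this way before adding it to a processing rule makes it easier to see whether the field list lines up with the matches or capture groups.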

e_json

This function performs JSON operations on JSON objects in specified fields, including JSON expansion, JMES extraction, and JMES extraction and then expansion.

  • Function format
    e_json(key, expand=None, depth=100, prefix="__", suffix="__", fmt="simple", sep=".",
           expand_array=true, fmt_array="{parent}_{index}",
           include_node=r"[\u4e00-\u9fa5\u0800-\u4e00a-zA-Z][\w\-\.]*",
           exclude_node="", include_path="", exclude_path="",
           jmes="", output="", jmes_ignore_none=true, mode='fill-auto')
  • Parameter description

    Parameter

    Type

    Mandatory

    Description

    key

    String

    Yes

    Source field name. If the field does not exist, no operation is performed.

    expand

    Boolean

    No

    Whether to expand the field.

    • If the jmes parameter is not set, the default value true is used, indicating that the field is expanded.
    • If the jmes parameter is set, the default value false is used, indicating that the field is not expanded.

    depth

    Number

    No

    Field expansion depth. The value ranges from 1 to 2,000. The value 1 indicates that only the first layer is expanded. The default value is 100.

    prefix

    String

    No

    Prefix added to the field name during expansion.

    suffix

    String

    No

    Suffix added to the field name during expansion.

    fmt

    String

    No

    Formatting mode. Options:

    • simple (default value): Use the node name as the field name. The display format is {prefix}{current}{suffix}.
    • full: The complete path is used as the field name. The display format is {parent_list_str}{sep}{prefix}{current}{suffix}. The delimiter is specified by the sep parameter. The default value is a period (.).
    • parent: The parent node and current node are combined as the field name. The display format is {parent}{sep}{prefix}{current}{suffix}. The delimiter is specified by the sep parameter. The default value is a period (.).
    • root: The root node and current node are combined as the field name. The display format is {parent_list[0]}{sep}{prefix}{current}{suffix}. The delimiter is specified by the sep parameter. The default value is a period (.).

    sep

    String

    No

    Delimiter placed between parent and child node names. This parameter takes effect only when fmt is set to full, parent, or root. The default value is a period (.).

    expand_array

    Boolean

    No

    Whether to expand an array. The default value is true, indicating that the array is expanded.

    fmt_array

    String

    No

    Format for expanding an array. The format is {parent_rlist[0]}_{index}. You can also use a maximum of five placeholders to customize a format string: parent_list, current, sep, prefix, and suffix.

    include_node

    String/ Number

    No

    List of allowed nodes, indicating the node names included during filtering. By default, only nodes that contain only Chinese characters, digits, letters, underscores (_), periods (.), and hyphens (-) are automatically expanded.

    exclude_node

    String

    No

    List of restricted nodes, indicating the node names excluded during filtering.

    include_path

    String

    No

    List of allowed node paths, indicating the node paths included during filtering.

    exclude_path

    String

    No

    List of restricted node paths, indicating the node paths excluded during filtering.

    jmes

    String

    No

    Converts the field value to a JSON object and extracts a specific value using JMES.

    output

    String

    No

    Name of the output field that stores the value extracted using JMES.

    jmes_ignore_none

    Boolean

    No

    Whether to omit the output field when the JMES expression extracts no value. The default value is true, indicating that the field is omitted. If this parameter is set to false, an empty string is output.

    mode

    String

    No

    Field overwrite mode. The default value is fill-auto.

    • JSON expansion and filtering
      • If a node allowlist (include_node) is set, only nodes whose names match it appear in the result. Example of a node allowlist regular expression: e_json("json_data_field", ...., include_node=r'key\d+')
      • If a node blocklist (exclude_node) is set, nodes whose names match it are omitted from the result. Example of a node blocklist regular expression: e_json("json_data_field", ...., exclude_node=r'key\d+')
      • Node path filtering: The include_path and exclude_path regular expressions are matched against the node path from its beginning. Path components are separated by periods (.).
    • JMES filtering

      Use JMES to select and calculate.

      • Select the element attribute list under a specific JSON path: e_json(..., jmes="cve.vendors[*].product",output="product")
      • Concatenate element attributes under a specific JSON path with commas (,): e_json(..., jmes="join(',', cve.vendors[*].name)",output="vendors")
      • Calculate the maximum attribute value of elements under a specific JSON path: e_json(..., jmes="max(words[*].score)",output="hot_word")
      • Return an empty string if a specific path does not exist or is empty: e_json(..., jmes="max(words[*].score)",output="hot_word", jmes_ignore_none=false)
    • The following shows how to use parent_list and parent_rlist.

      Test data:

      {
       "data": { "k1": 100,"k2": {"k3": 200,"k4": {"k5": 300}}}
      }

      parent_list arranges the parent nodes from left to right.

      e_json("data", fmt='{parent_list[0]}-{parent_list[1]}#{current}')

      Obtained logs:

      data:{ "k1": 100,"k2": {"k3": 200,"k4": {"k5": 300}}}
      data-k2#k3:200
      data-k2#k5:300

      parent_rlist arranges the parent nodes from right to left.

      e_json("data", fmt='{parent_rlist[0]}-{parent_rlist[1]}#{current}')

      Obtained logs:

      data:{ "k1": 100,"k2": {"k3": 200,"k4": {"k5": 300}}}
      k2-data#k3:200
      k4-k2#k5:300
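
The ordering of parent_list and parent_rlist above can be reproduced with a short Python sketch. The expand helper is hypothetical, and skipping leaves that have too few ancestors for the format string is an assumption made to match the logs shown above (k1 has only one ancestor and does not appear in them).

```python
def expand(value, parents, fmt, out):
    """Recursively build field names for leaf nodes from their ancestor chain."""
    for name, child in value.items():
        if isinstance(child, dict):
            expand(child, parents + [name], fmt, out)
            continue
        try:
            field = fmt.format(
                parent_list=parents,          # ancestors ordered root first
                parent_rlist=parents[::-1],   # ancestors ordered nearest first
                current=name,
            )
        except IndexError:
            continue  # assumption: skip leaves with too few ancestors
        out[field] = child
    return out

data = {"k1": 100, "k2": {"k3": 200, "k4": {"k5": 300}}}
left = expand(data, ["data"], "{parent_list[0]}-{parent_list[1]}#{current}", {})
right = expand(data, ["data"], "{parent_rlist[0]}-{parent_rlist[1]}#{current}", {})
print(left)   # {'data-k2#k3': 200, 'data-k2#k5': 300}
print(right)  # {'k2-data#k3': 200, 'k4-k2#k5': 300}
```
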
  • Returned result

    Logs with new field values.

  • Function example
    1. Example 1: Expand fields.
      • Test data
        {
         "data": {"k1": 100, "k2": 200}
        }
      • Processing rule
        e_json("data",depth=1)
      • Processing result
        data: {"k1": 100, "k2": 200}
        k1: 100
        k2: 200
    2. Example 2: Add prefixes and suffixes to field names.
      • Test data
        {
         "data": {"k1": 100, "k2": 200}
        }
      • Processing rule
        e_json("data", prefix="data_", suffix="_end")
      • Processing result
        data: {"k1": 100, "k2": 200}
        data_k1_end: 100
        data_k2_end: 200
    3. Example 3: Expand fields in different formats.
      • Test data
        {
         "data": {"k1": 100, "k2": {"k3": 200, "k4": {"k5": 300} } }
        }
      • fmt=full format
        e_json("data", fmt='full')
        data: {"k1": 100, "k2": {"k3": 200, "k4": {"k5": 300} } } 
        data.k1: 100 
        data.k2.k3: 200 
        data.k2.k4.k5: 300
      • fmt=parent format
        e_json("data", fmt='parent')
        data: {"k1": 100, "k2": {"k3": 200, "k4": {"k5": 300} } } 
        data.k1: 100 
        k2.k3: 200 
        k4.k5: 300
      • fmt=root format
        e_json("data", fmt='root')
        data: {"k1": 100, "k2": {"k3": 200, "k4": {"k5": 300} } } 
        data.k1: 100 
        data.k3: 200 
        data.k5: 300
    4. Example 4: Extract JSON using the specified delimiter, field name prefix, and field name suffix
      • Test data
        {
         "data": {"k1": 100, "k2": {"k3": 200, "k4": {"k5": 300} } }
        }
      • Processing rule
        e_json("data", fmt='parent', sep="@", prefix="__", suffix="__")
      • Processing result
        data: {"k1": 100, "k2": {"k3": 200, "k4": {"k5": 300} } } 
        data@__k1__: 100
        k2@__k3__: 200
        k4@__k5__: 300
    5. Example 5: Specify the fmt_array parameter and extract JSON in array mode.
      • Test data
        {
         "people": [{"name": "xm", "gender": "boy"}, {"name": "xz", "gender": "boy"}, {"name": "xt", "gender": "girl"}]
        }
      • Processing rule
        e_json("people", fmt='parent', fmt_array="{parent_rlist[0]}-{index}")
      • Processing result
        people: [{"name": "xm", "gender": "boy"}, {"name": "xz", "gender": "boy"}, {"name": "xt", "gender": "girl"}]
        people-0.name: xm 
        people-0.gender: boy 
        people-1.name: xz 
        people-1.gender: boy 
        people-2.name: xt 
        people-2.gender: girl
    6. Example 6: Use JMES to extract JSON objects.
      • Test data
        {
         "data": { "people": [{"first": "James", "last": "d"},{"first": "Jacob", "last": "e"}],"foo": {"bar": "baz"}}
        }
      • Processing rule
        e_json("data", jmes='foo', output='jmes_output0')
        e_json("data", jmes='foo.bar', output='jmes_output1')
        e_json("data", jmes='people[0].last', output='jmes_output2')
        e_json("data", jmes='people[*].first', output='jmes_output3')
      • Processing result
        data: { "people": [{"first": "James", "last": "d"},{"first": "Jacob", "last": "e"}],"foo": {"bar": "baz"}}
        jmes_output0: {"bar": "baz"}
        jmes_output1: baz 
        jmes_output2: d 
        jmes_output3: ["James", "Jacob"]
  • More

    This function can be used together with other functions.
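
The field names produced by the four fmt modes in Examples 3 and 4 can be reproduced with a small Python sketch. The flatten helper below is hypothetical and handles nested objects only; array expansion, depth limits, and node filtering are left out.

```python
def flatten(value, parents, fmt="simple", sep=".", prefix="", suffix="", out=None):
    """Hypothetical flattener that mimics the fmt modes for nested objects."""
    out = {} if out is None else out
    for name, child in value.items():
        if isinstance(child, dict):
            flatten(child, parents + [name], fmt, sep, prefix, suffix, out)
            continue
        current = f"{prefix}{name}{suffix}"
        if fmt == "full":        # complete path
            field = sep.join(parents + [current])
        elif fmt == "parent":    # nearest parent plus current node
            field = parents[-1] + sep + current
        elif fmt == "root":      # root node plus current node
            field = parents[0] + sep + current
        else:                    # simple: current node only
            field = current
        out[field] = child
    return out

data = {"k1": 100, "k2": {"k3": 200, "k4": {"k5": 300}}}
for mode in ("simple", "full", "parent", "root"):
    print(mode, flatten(data, ["data"], fmt=mode))
print(flatten(data, ["data"], fmt="parent", sep="@", prefix="__", suffix="__"))
```

Because simple and root drop part of the path, leaf nodes that share a name under different parents would map to the same field name in this sketch; full avoids that at the cost of longer names.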

e_csv, e_psv, and e_tsv

These functions extract multiple fields from a specified field using user-defined delimiters and predefined field names.

  • e_csv: The default delimiter is a comma (,).
  • e_psv: The default delimiter is a vertical bar (|).
  • e_tsv: The default delimiter is a tab (\t).
  • Function format
    e_csv(Source field name, Target field list, sep=",", quote='"', restrict=false, mode="fill-auto")
    e_psv(Source field name, Target field list, sep="|", quote='"', restrict=false, mode="fill-auto")
    e_tsv(Source field name, Target field list, sep="\t", quote='"', restrict=false, mode="fill-auto")
  • Parameter description

    Parameter

    Type

    Mandatory

    Description

    Source field name

    Any

    Yes

    Source field name. If the field does not exist, no operation is performed.

    Target field list

    Any

    Yes

    Field name corresponding to each value after delimiter separation. The value can be a string list, for example, ["error", "message", "result"].

    If the field name does not contain commas (,), you can use commas (,) as delimiters, for example, "error, message, result".

    sep

    String

    No

    Delimiter, which can only be a single character.

    quote

    String

    No

    Quote character used to wrap values. This parameter is required when the value contains delimiters.

    restrict

    Boolean

    No

    Whether to use the strict mode. The default value is false, indicating the non-strict mode. When the number of delimited values differs from the number of target fields:

    • In strict mode, no operation is performed.
    • In non-strict mode, values are assigned to the first several fields that can be paired.

    mode

    String

    No

    Field overwrite mode. The default value is fill-auto.

  • Returned result

    Logs with new field values.

  • Function example

    The following example uses e_csv. The e_psv and e_tsv functions work similarly.

    • Test data
      {
       "content": "192.168.0.100,10/Jun/2019:11:32:16 +0800,example.aadoc.com,GET /zf/11874.html HTTP/1.1,200,0.077,6404,192.168.0.100:8001,200,0.060,https://image.developer.aadoc.com/s?q=%E8%9B%8B%E8%8A%B1%E9%BE%99%E9%A1%BB%E9%9D%A2%E7%9A%84%E5%81%9A%E6%B3%95&from=wy878378&uc_param_str=dnntnwvepffrgibijbprsvdsei,-,Mozilla/5.0 (Linux; Android 9; HWI-AL00 Build/HUAWEIHWI-AL00) AppleWebKit/537.36,-,-"
      }
    • Processing rule
      e_csv("content", "remote_addr, time_local,host,request,status,request_time,body_bytes_sent,upstream_addr,upstream_status, upstream_response_time,http_referer,http_x_forwarded_for,http_user_agent,session_id,guid")
    • Processing result
      content:  192.168.0.100,10/Jun/2019:11:32:16 +0800,example.aadoc.com,GET /zf/11874.html HTTP/1.1,200,0.077,6404,192.168.0.100:8001,200,0.060,https://image.developer.aadoc.com/s?q=%E8%9B%8B%E8%8A%B1%E9%BE%99%E9%A1%BB%E9%9D%A2%E7%9A%84%E5%81%9A%E6%B3%95&from=wy878378&uc_param_str=dnntnwvepffrgibijbprsvdsei,-,Mozilla/5.0 (Linux; Android 9; HWI-AL00 Build/HUAWEIHWI-AL00) AppleWebKit/537.36,-,-
      body_bytes_sent:  6404
      guid:  -
      host:  example.aadoc.com
      http_referer:  https://image.developer.aadoc.com/s?q=%E8%9B%8B%E8%8A%B1%E9%BE%99%E9%A1%BB%E9%9D%A2%E7%9A%84%E5%81%9A%E6%B3%95&from=wy878378&uc_param_str=dnntnwvepffrgibijbprsvdsei
      http_user_agent:  Mozilla/5.0 (Linux; Android 9; HWI-AL00 Build/HUAWEIHWI-AL00) AppleWebKit/537.36 
      http_x_forwarded_for:  -
      remote_addr:  192.168.0.100 
      request:  GET /zf/11874.html HTTP/1.1 
      request_time:  0.077
      session_id:  -
      status:  200
      time_local:  10/Jun/2019:11:32:16 +0800 
      topic:  syslog-forwarder 
      upstream_addr:  192.168.0.100:8001
      upstream_response_time:  0.060
      upstream_status:  200
  • More

    This function can be used together with other functions.
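
The effect of the quote and restrict parameters can be tried out locally with Python's standard csv module. The csv_extract helper below is a hypothetical emulation, not the service implementation; the sample row is made up, and the mode parameter is ignored.

```python
import csv

def csv_extract(event, key, fields, sep=",", quote='"', restrict=False):
    """Hypothetical stand-in for e_csv built on the standard csv module."""
    result = dict(event)
    if key not in event:
        return result  # missing source field: no operation
    if isinstance(fields, str):
        # A comma-separated string of names is also accepted.
        fields = [name.strip() for name in fields.split(",")]
    values = next(csv.reader([event[key]], delimiter=sep, quotechar=quote), [])
    if restrict and len(values) != len(fields):
        return result  # strict mode: counts differ, so nothing is extracted
    result.update(zip(fields, values))  # non-strict: pair what can be paired
    return result

row = {"content": '200,"GET /a,b HTTP/1.1",0.077'}  # made-up sample
loose = csv_extract(row, "content", "status, request, request_time")
strict = csv_extract(row, "content", ["status", "request"], restrict=True)
print(loose["request"])     # GET /a,b HTTP/1.1
print("status" in strict)   # False
```

The quoted second value keeps its embedded comma, which is the case the quote parameter exists for. Passing sep="|" or sep="\t" to the same helper gives the e_psv and e_tsv variants.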

e_kv

This function extracts key-value pairs from one or more source fields, using a quote character to delimit values.

  • Function format
    e_kv(source field or source field list, sep="=", quote='"', escape=false, prefix="", suffix="", mode="fill-auto")
  • Parameter description

    Parameter

    Type

    Mandatory

    Description

    Source field or source field list

    String or string list

    Yes

    Field name or a list of multiple field names.

    sep

    String

    No

    Regular expression string that separates a key from its value. The default value is =. It is not limited to a single character.

    Note: Non-capturing groups can be used, but capturing groups cannot be used.

    quote

    String

    No

    Quotation mark, which is used to enclose values. The default value is ".

    Note: The values of the extracted dynamic key-value pairs need to be enclosed by quote, for example, a="abc" and b="xyz". If the extraction object does not contain, only the values of the following character sets are extracted: Chinese characters, letters, digits, underscores (_), hyphens (-), periods (.), percent signs (%), and tildes (~). For example, if a=Chinese ab12_-.%~|abc b=123, a: Chinese ab12_-.%~ and b: 123 can be extracted.

    escape

    Boolean

    No

    Whether to process escape characters in quoted values. The default value is false, meaning that escape characters are not processed. For example, for key="abc\"xyz", the value abc\ is extracted by default. If escape is set to true, abc"xyz is extracted.

    prefix

    String

    No

    Prefix added to the extracted field name.

    suffix

    String

    No

    Suffix added to the extracted field name.

    mode

    String

    No

    Field overwrite mode. The default value is fill-auto.

  • Returned result

    Logs with new field values.

  • Function example
    1. Example 1: Use the default delimiter = to extract key-value pairs.
      • Test data
        {
         "http_refer": "https://video.developer.aadoc.com/s?q=asd&a=1&b=2"
        }

        If the test data is request_uri: a1=1&a2=&a3=3, the value of a2 is empty. The e_kv() function cannot extract a2. You can use the e_regex() function to extract it, for example, e_regex("request_uri",r'(\w+)=([^=&]*)',{r"\1":r"\2"},mode="overwrite").

      • Processing rule
        e_kv("http_refer")
      • Processing result
        http_refer: https://video.developer.aadoc.com/s?q=asd&a=1&b=2
        q: asd 
        a: 1
        b: 2
    2. Example 2: Add prefixes and suffixes to field names.
      • Test data
        {
         "http_refer": "https://video.developer.aadoc.com/s?q=asd&a=1&b=2"
        }
      • Processing rule
        e_kv(
            "http_refer",
            sep="=",
            quote='"',
            escape=false,
            prefix="data_",
            suffix="_end",
            mode="fill-auto",
        )
      • Processing result
        http_refer: https://video.developer.aadoc.com/s?q=asd&a=1&b=2
        data_q_end: asd 
        data_a_end: 1
        data_b_end: 2
    3. Example 3: Extract key-value pairs from the content2 field and use the escape parameter to handle the escaped quotation mark inside a value.
      • Test data
        {
         "content2": "k1:\"v1\\\"abc\", k2:\"v2\", k3: \"v3\""
        }
      • Processing rule
        e_kv("content2", sep=":", escape=true)
      • Processing result
        content2:  k1:"v1\"abc", k2:"v2", k3: "v3"
        k1: v1"abc 
        k2: v2 
        k3: v3
  • More

    This function can be used together with other functions.
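
The interplay of sep, quote, and escape can be explored with a regex-based Python sketch. The kv_extract helper is hypothetical and only approximates the documented behavior. Allowing whitespace after the separator is an assumption inferred from Example 3, and the mode parameter is ignored.

```python
import re

def kv_extract(event, key, sep="=", quote='"', escape=False, prefix="", suffix=""):
    """Hypothetical stand-in for e_kv; returns a new dict with extracted pairs."""
    result = dict(event)
    if key not in event:
        return result  # missing source field: no operation
    q = re.escape(quote)
    if escape:
        # A backslash-escaped character may appear inside the quoted value.
        quoted = rf"{q}((?:\\.|[^{q}\\])*){q}"
    else:
        quoted = rf"{q}([^{q}]*){q}"
    # Unquoted values: word characters (Unicode-aware, so Chinese is covered),
    # hyphens, periods, percent signs, and tildes.
    unquoted = r"([\w\-.%~]+)"
    # Assumption: optional whitespace is allowed after the separator.
    pattern = re.compile(rf"(\w+)(?:{sep})\s*(?:{quoted}|{unquoted})")
    for m in pattern.finditer(event[key]):
        name, in_quotes, bare = m.groups()
        value = bare if in_quotes is None else in_quotes
        if escape and in_quotes is not None:
            value = re.sub(r"\\(.)", r"\1", value)  # drop the escaping backslash
        result[f"{prefix}{name}{suffix}"] = value
    return result

event = {"content2": 'k1:"v1\\"abc", k2:"v2", k3: "v3"'}
print(kv_extract(event, "content2", sep=":", escape=True)["k1"])  # v1"abc
print(kv_extract(event, "content2", sep=":")["k1"])               # v1\
```

In this sketch a key followed by an empty unquoted value, such as a2= in a1=1&a2=&a3=3, produces no field, which mirrors the limitation noted in Example 1.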

e_kv_delimit

This function extracts key-value pairs from the source field using delimiters.

  • Function format
    e_kv_delimit(Source field or source field list, pair_sep=r"\s", kv_sep="=", prefix="", suffix="", mode="fill-auto")
  • Parameter description

    Parameter

    Type

    Mandatory

    Description

    Source field or source field list

    String or string list

    Yes

    Field name or a list of multiple field names.

    pair_sep

    String

    No

    Regular expression character set used to separate key-value pairs from one another. The default value is \s. Other examples: \s\w and abc\s.

    Note: If you need to use a string to separate fields, you are advised to use str_replace or regex_replace to convert the string into a character as the delimiter, and then use the e_kv_delimit function to separate the fields.

    kv_sep

    String

    No

    Regular expression string used to separate a key from its value. The default value is =. It is not limited to a single character.

    Non-capturing groups can be used, but capturing groups cannot be used.

    prefix

    String

    No

    Prefix added to the extracted field name.

    suffix

    String

    No

    Suffix added to the extracted field name.

    mode

    String

    No

    Field overwrite mode. The default value is fill-auto.

  • Returned result

    Logs with new field values.

  • Function example
    1. Example 1: Use the default delimiter = to extract key-value pairs.
      • Test data
        {
         "data": "i=c1 k1=v1 k2=v2 k3=v3"
        }

        If the test data is request_uri: a1=1&a2=&a3=3, the value of a2 is empty. The e_kv_delimit() function cannot extract a2. You can use the e_regex() function to extract the value, for example, e_regex("request_uri",r'(\w+)=([^=&]*)',{r"\1":r"\2"}, mode="overwrite").

      • Processing rule
        e_kv_delimit("data")
      • Processing result
        data: i=c1 k1=v1 k2=v2 k3=v3 
        i: c1 
        k2: v2 
        k1: v1 
        k3: v3
    2. Example 2: Use delimiters &? to extract key-value pairs.
      • Test data
        {
         "data": "k1=v1&k2=v2?k3=v3"
        }
      • Processing rule
        e_kv_delimit("data",pair_sep=r"&?")
      • Processing result
        data: k1=v1&k2=v2?k3=v3
        k2: v2 
        k1: v1 
        k3: v3
    3. Example 3: Use regular expressions to extract key-value pairs.
      • Test data
        {
         "data": "k1=v1 k2:v2 k3=v3"
        }
      • Processing rule
        e_kv_delimit("data", kv_sep=r"(?:=|:)")
      • Processing result
        data: k1=v1 k2:v2 k3=v3 
        k2: v2 
        k1: v1 
        k3: v3
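
The two-level splitting that e_kv_delimit performs (first into pairs, then into key and value) can be sketched in Python as follows. The kv_delimit helper is hypothetical; it treats pair_sep as a character set and kv_sep as a regular expression, as described in the parameter table, and ignores the mode parameter. The last lines show the e_regex-style pattern from the note in Example 1 applied with re.findall to capture the empty value.

```python
import re

def kv_delimit(event, key, pair_sep=r"\s", kv_sep="=", prefix="", suffix=""):
    """Hypothetical stand-in for e_kv_delimit; returns a new dict."""
    result = dict(event)
    if key not in event:
        return result  # missing source field: no operation
    splitter = re.compile(f"[{pair_sep}]+")         # pair_sep is a character set
    pair_re = re.compile(f"(.+?)(?:{kv_sep})(.+)")  # key, separator, non-empty value
    for pair in splitter.split(event[key]):
        m = pair_re.fullmatch(pair)
        if m:
            result[f"{prefix}{m.group(1)}{suffix}"] = m.group(2)
    return result

print(kv_delimit({"data": "k1=v1&k2=v2?k3=v3"}, "data", pair_sep=r"&?"))
print(kv_delimit({"data": "k1=v1 k2:v2 k3=v3"}, "data", kv_sep=r"(?:=|:)"))

# Empty values are skipped by the helper but captured by the regex workaround.
uri = "a1=1&a2=&a3=3"
print(kv_delimit({"request_uri": uri}, "request_uri", pair_sep="&"))
print(dict(re.findall(r"(\w+)=([^=&]*)", uri)))  # {'a1': '1', 'a2': '', 'a3': '3'}
```
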