Updated on 2024-03-05 GMT+08:00

Configuring DIS Agent

The DIS Agent configuration file is in YAML format. Each parameter and its value must be separated by a colon (:) followed by a space.

You can obtain the agent.yml file template from the dis-agent package. The following is an example. Table 1 describes the configuration parameters in the file.

---
# cloud region id
region: myregion
## Storing the AK and SK used for authentication in plaintext poses a serious security risk.
## You are advised to use the /bin/dis-encrypt.sh script to encrypt the AK and SK before storing them.
# your AK (obtain it from 'My Credential')
ak: YOU_AK
# your SK (obtain it from 'My Credential')
sk: YOU_SK
# your key (used to encrypt your AK or SK)
encrypt.key: abc
# your project ID (obtain it from 'My Credential')
projectId: YOU_PROJECTID
# the DIS endpoint
endpoint: https://dis.myregion.cloud.com
# configure each flow to monitor files
flows:
  ### DIS Stream
  - DISStream: YOU_DIS_STREAM_1
    ## Only a single specified directory is supported. The file name can use * as a wildcard, e.g. * matches all files, and test*.log matches test1.log, test-12.log, and so on.
    filePattern: /tmp/*.log
    ## where to start reading: 'START_OF_FILE' or 'END_OF_FILE'
    initialPosition: START_OF_FILE
    ## maximum upload interval (ms)
    maxBufferAgeMillis: 5000

  ### To monitor other files, add more flows that follow the configuration above
  ### another DIS stream monitoring configuration; remove the leading # to use this feature
  #- DISStream: YOU_DIS_STREAM_2
  #  filePattern: /opt/*.log
  #  initialPosition: START_OF_FILE
  #  maxBufferAgeMillis: 5000

  ### OBS Stream: uploads the matching files to OBS and sends the file names to DIS; remove the leading # to use this feature
  #- OBSStream: YOU_DIS_STREAM_3
  #  filePattern: /opt/*.log
  #  initialPosition: START_OF_FILE
  #  ## bucket name
  #  OBSBucket: YOU_OBS_BUCKET_NAME
  #  ## OBS endpoint
  #  OBSEndpoint: https://obs.myregion.cloud.com
  #  ## directory (separated by /) under the bucket where the files are stored; created automatically if it does not exist
  #  dumpDirectory: example/dis/

After the configuration is complete, delete the unused examples from flows in agent.yml or comment them out with #. For example, if only one DISStream is configured, delete or comment out the CustomFileStream and other DISStream modules that follow it.
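As an illustration of the cleanup described above, a trimmed agent.yml that keeps a single DISStream flow might look like the following sketch. It reuses the placeholder values from the template (myregion, YOU_AK, YOU_SK, abc, YOU_PROJECTID, YOU_DIS_STREAM_1), all of which you would replace with your own values.

```yaml
---
region: myregion
ak: YOU_AK
sk: YOU_SK
encrypt.key: abc
projectId: YOU_PROJECTID
endpoint: https://dis.myregion.cloud.com
flows:
  # The only remaining flow; the commented-out DISStream and OBSStream
  # examples from the template have been deleted.
  - DISStream: YOU_DIS_STREAM_1
    filePattern: /tmp/*.log
    initialPosition: START_OF_FILE
    maxBufferAgeMillis: 5000
```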

Configuring DIS Agent on a Linux Server

  1. Start PuTTY and log in to the Linux server on which the DIS Agent is installed.
  2. Run the cd /opt/dis-agent-X.X.X/ command to go to the dis-agent-X.X.X directory.
  3. Run the vim conf/agent.yml command to open the DIS Agent configuration file agent.yml. Modify parameter values in the file to meet specific requirements. Table 1 describes the configuration parameters in the file.

    Table 1 Parameters in the agent.yml file

    | Parameter | Mandatory | Description | Default Value |
    |---|---|---|---|
    | region | Yes | Region where DIS is deployed. NOTE: For details about how to obtain the region where DIS is deployed, see Regions and Endpoints. | - |
    | AK | Yes | User's AK. NOTE: You can encrypt the AK or use a plaintext AK. For details about how to encrypt the AK, see the AK/SK encryption note. For details about how to obtain an AK, see Checking Authentication Information. | - |
    | SK | Yes | User's SK. NOTE: You can encrypt the SK or use a plaintext SK. For details about how to encrypt the SK, see the AK/SK encryption note. For details about how to obtain an SK, see Checking Authentication Information. | - |
    | encrypt.key | No | Key used for encryption. NOTE: If you want to use an encrypted AK or SK, you must configure this parameter in the agent.yml file. Ensure that you use the configured key to encrypt the AK/SK. Otherwise, the AK/SK cannot be decrypted. | - |
    | projectId | Yes | Project ID specific to your region. For details about how to obtain a project ID, see Checking Authentication Information. | - |
    | endpoint | Yes | DIS gateway address. Format: https://DIS endpoint. NOTE: For details about how to obtain the DIS endpoint, see Regions and Endpoints. | - |
    | body.serialize.type | No | Format of the DIS data package to be uploaded (non-original data format). json: The DIS data packet is encapsulated in the JSON format. protobuf: The DIS data packet is encapsulated in the binary format. After being encapsulated, the volume of the data packet is reduced by 1/3. This format is recommended when a massive amount of data is generated. | json |
    | body.compress.enabled | No | Specifies whether to enable data compression. | false |
    | body.compress.type | No | Data compression format selected when compression is enabled. Currently, the following compression formats are supported: lz4 (a compression algorithm with a fast compression speed and high compression efficiency) and zstd (a new lossless compression algorithm with a fast compression speed and high compression ratio). | lz4 |
    | PROXY_HOST | No | Proxy IP address. This parameter is mandatory when requests are sent through the proxy server. | - |
    | PROXY_PORT | No | Proxy port. | 80 |
    | PROXY_PROTOCOL | No | Proxy protocol. http and https are supported. | http |
    | PROXY_USERNAME | No | Proxy username. | - |
    | PROXY_PASSWORD | No | Proxy password. | - |
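The optional parameters in Table 1 could be combined as in the sketch below, which switches the packet format to protobuf, enables zstd compression, and sends requests through a proxy. It assumes that these parameters are written as top-level keys in agent.yml next to region and endpoint, in the same way as encrypt.key in the template. The proxy address 192.168.0.10 and port 3128 are hypothetical values chosen only for illustration.

```yaml
# Sketch only: optional global settings, assumed to sit at the top level of agent.yml.
body.serialize.type: protobuf   # default: json
body.compress.enabled: true     # default: false
body.compress.type: zstd        # default: lz4

# Hypothetical proxy; replace the host and port with your own.
PROXY_HOST: 192.168.0.10
PROXY_PORT: 3128                # default: 80
PROXY_PROTOCOL: http            # http or https
```

PROXY_USERNAME and PROXY_PASSWORD are left out of the sketch.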

    [flows]

    The [flows] section presents information about the files that will be uploaded to DIS.

    The following upload mode is supported:

    DISStream: DIS Agent monitors text files continuously, collects incremental data in real time, parses the data by delimiter, and uploads it to DIS streams (source data type: BLOB, JSON, and CSV). Table 2 describes configuration parameters.

    The agent.yml file provides example parameter settings.

    To encrypt the AK/SK, perform the following steps:

    Download and install the DIS Agent by referring to Installing DIS Agent and use the script in the bin directory of the DIS Agent package to encrypt the AK/SK. The detailed operations are as follows (Windows):

    1. Go to the bin directory of DIS Agent, right-click in the directory, and choose Git Bash Here. Run the script, for example, ./dis-encrypt.sh {key} {ak}, to obtain the encrypted AK.
    2. Encrypt the SK in the same way. Then configure the encrypted AK, the encrypted SK, and the key in the agent.yml file.
      Figure 1 Encryption example
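After both values are encrypted, the authentication part of agent.yml might look like the sketch below. ENCRYPTED_AK_OUTPUT, ENCRYPTED_SK_OUTPUT, and YOUR_KEY are hypothetical placeholders: the first two stand for the strings printed by dis-encrypt.sh, and the last one stands for the {key} value passed to the script.

```yaml
# Placeholders only; paste the actual script output in place of the ENCRYPTED_* values.
ak: ENCRYPTED_AK_OUTPUT
sk: ENCRYPTED_SK_OUTPUT
# Must be the same key that was passed to dis-encrypt.sh; otherwise the AK/SK cannot be decrypted.
encrypt.key: YOUR_KEY
```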
    Table 2 DISStream configuration parameters

    | Parameter | Mandatory | Description | Default Value |
    |---|---|---|---|
    | DISStream | Yes | Name of the DIS stream. The agent parses the content of files matching filePattern by delimiter and uploads the resulting records to this stream. | - |
    | filePattern | Yes | File monitoring path. Only files in a single directory can be monitored, and subdirectories are not searched unless directoryRecursionEnabled is set to true. To monitor multiple directories, configure multiple DIS streams in flows. File names can be matched with an asterisk (*). Examples: /tmp/*.log matches all files whose names end with .log in the /tmp directory; /tmp/access-*.log matches all files whose names start with access- and end with .log in the /tmp directory; in Windows, an example path is D:\logs\*.log. | - |
    | directoryRecursionEnabled | No | Specifies whether to search subdirectories. Possible values: false (subdirectories are not searched, and only files in the root directory are matched); true (all subdirectories are searched recursively; for example, if filePattern is set to /tmp/*.log, then /tmp/one.log, /tmp/child/two.log, and /tmp/child/child/three.log can be matched). | false |
    | initialPosition | No | Position from which monitoring of a file starts. Possible values: END_OF_FILE (when monitoring starts, the existing content of files that match filePattern is not parsed; only newly added files or newly added file content is parsed by delimiter and uploaded to DIS); START_OF_FILE (all files that match filePattern are parsed by delimiter and uploaded to DIS in order of modification time, from the earliest modified to the latest modified). | START_OF_FILE |
    | maxBufferAgeMillis | No | Maximum time that data can wait in the buffer before being uploaded to DIS. Unit: ms. If the buffer fills up before this time elapses, the data is uploaded immediately. If the buffer is not full, the data is uploaded once this time elapses. | 5000 |
    | maxBufferSizeRecords | No | Maximum number of records that the agent buffers before sending them to DIS. When the number of records in the queue reaches this value, the data is uploaded to DIS immediately. | 500 |
    | partitionKeyOption | No | Method for generating the partition key. Each record carries a partition key. Records with the same partition key are allocated to the same partition. Possible values: RANDOM_INT (the partition key is a random numeric string; records with such a key are evenly distributed across the partitions); FILE_NAME (the partition key is the file name string; records with such a key are distributed to one specific partition); FILE_NAME,RANDOM_INT (the partition key is a combination of the file name string and a random numeric string, separated by a comma; records with such a key carry the file name and are evenly distributed across all partitions). | RANDOM_INT |
    | recordDelimiter | No | Delimiter used to separate records. Value range: any character, enclosed in double quotation marks. The value cannot be empty, that is, this parameter cannot be set to "". NOTE: If the value is a special character, escape it with a backslash (\). For example, if the value is a quotation mark ("), set this parameter to \". If the value is a backslash (\), set this parameter to \\. If the value is a control character, for example, STX, set this parameter to \u0002. | "\n" |
    | isRemainRecordDelimiter | No | Specifies whether the delimiter is kept in the records to be uploaded. Possible values: true (the delimiter is contained in the uploaded records); false (the delimiter is not contained in the uploaded records). | false |
    | isFileAppendable | No | Specifies whether content may be appended to the file. Possible values: true (content may be appended to the file. Agent monitors the file continuously. When content is added, Agent parses it by recordDelimiter and uploads the records. In this case, ensure that the file ends with recordDelimiter. Otherwise, Agent considers that the content has not been fully written and waits for recordDelimiter to be written); false (no content will be appended to the file. If the last row of the file does not end with recordDelimiter, Agent still uploads the remaining content as the last record. After the upload is complete, Agent deletes or renames the file based on the deletePolicy and fileSuffix settings). | true |
    | maxFileCheckingMillis | No | Maximum time for checking file changes. If the file size, modification time, and file ID do not change within this period, the file is considered complete and the upload starts. Set this parameter based on the actual file change frequency to prevent an incomplete file from being uploaded. If the file is changed after being uploaded, it will be fully uploaded again. Unit: ms. NOTE: This parameter is available only when isFileAppendable is set to false. | 5000 |
    | deletePolicy | No | Policy for deleting a file after its content is uploaded. Possible values: never (the file is not deleted after its content is uploaded); immediate (the file is deleted after its content is uploaded). NOTE: This parameter is available only when isFileAppendable is set to false. | never |
    | fileSuffix | No | Suffix appended to the file name after the file content is uploaded. If the original file name is x.txt and fileSuffix is set to .COMPLETED, the file is renamed x.txt.COMPLETED after the upload. NOTE: This parameter is available only when isFileAppendable is set to false and deletePolicy is set to never. | .COMPLETED |
    | sendingThreadSize | No | Number of sender threads. By default, there is only one sender thread. NOTICE: If multiple threads are used, data may not be sent in order, and some data may be lost if the program stops abnormally and restarts. | 1 |
    | fileEncoding | No | File encoding format. Possible values: UTF8, GBK, GB2312, and ISO-8859-1. | UTF8 |
    | resultLogLevel | No | Level at which the result of each call to the DIS data sending API is logged. Possible values: OFF (call results are not logged); INFO (each call result is logged at the INFO level); WARN (each call result is logged at the WARN level); ERROR (each call result is logged at the ERROR level). | INFO |
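To show how the parameters in Table 2 fit together, the sketch below defines two flows: one that follows log files that keep growing, and one that uploads completed files and then renames them. The stream names, directories, and chosen values are hypothetical. Only the parameter names and their permitted values are taken from Table 2.

```yaml
flows:
  # Flow 1: continuously growing log files, read from the current end of each file.
  - DISStream: YOU_DIS_STREAM_1        # hypothetical stream name
    filePattern: /tmp/access-*.log
    initialPosition: END_OF_FILE
    isFileAppendable: true             # default; each record must end with recordDelimiter
    recordDelimiter: "\n"
    maxBufferAgeMillis: 5000           # upload after at most 5000 ms ...
    maxBufferSizeRecords: 500          # ... or as soon as 500 records are buffered
    partitionKeyOption: RANDOM_INT

  # Flow 2: files that are written once and then left unchanged.
  - DISStream: YOU_DIS_STREAM_2        # hypothetical stream name
    filePattern: /opt/export/*.csv     # hypothetical directory
    isFileAppendable: false
    maxFileCheckingMillis: 10000       # treat a file as complete after 10000 ms without changes
    deletePolicy: never
    fileSuffix: .COMPLETED             # x.csv becomes x.csv.COMPLETED after the upload
    partitionKeyOption: FILE_NAME,RANDOM_INT
    fileEncoding: UTF8
```

In the second flow, maxFileCheckingMillis, deletePolicy, and fileSuffix are set only because isFileAppendable is false; according to Table 2 they are not available when the file is appendable.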

Configuring DIS Agent on a Windows Server

  1. Use a file manager to open the directory (for example, C:\dis-agent-X.X.X) where the installation package is decompressed.
  2. Open the agent.yml file using an editor and modify parameter values in the file to meet specific requirements.

    The agent.yml file is in the Linux format. You are advised to edit it with a general-purpose text editor.

    About log files:

    In the installation path of DIS Agent, the logs directory stores the log files generated while DIS Agent is running. The dis-agent.log file records the running status of DIS Agent, and the dated log files, such as dis-agent-2022-10-28.log, record file uploads. One log file is generated every day.

    You can also customize the storage path of log files in the log4j2.xml file, which is in the conf folder of the DIS Agent installation path.

    Figure 2 log4j2