Configuring DIS Agent
The DIS Agent configuration file is in YAML format. Each parameter and its value must be separated by a colon (:) followed by a space.
You can obtain the agent.yml file template from the dis-agent package. The following is an example. Table 1 describes the configuration parameters in the file.
---
# cloud region id
region: myregion
## The plaintext storage of the AK and SK used for authentication has great security risks
## You are advised to use the /bin/dis-encrypt.sh script to encrypt the AK and SK before storing them
# you ak (get from 'My Credential')
ak: YOU_AK
# you sk (get from 'My Credential')
sk: YOU_SK
# you_key(encry you ak or sk)
encrypt.key: abc
# you project id (get from 'My Credential')
projectId: YOU_PROJECTID
# the dis endpoint
endpoint: https://dis.myregion.cloud.com
# config each flow to monitor file.
flows:
  ### DIS Stream
  - DISStream: YOU_DIS_STREAM_1
    ## only support specified directory, filename can use * to match some files. eg. * means match all file, test*.log means match test1.log or test-12.log and so on.
    filePattern: /tmp/*.log
    ## from where to start: 'START_OF_FILE' or 'END_OF_FILE'
    initialPosition: START_OF_FILE
    ## upload max interval(ms)
    maxBufferAgeMillis: 5000
  ### If there are other monitor files, continue to follow the above configuration
  ### another dis stream monitor config, uncomment # if you want to use this feature
  #- DISStream: YOU_DIS_STREAM_2
  #  filePattern: /opt/*.log
  #  initialPosition: START_OF_FILE
  #  maxBufferAgeMillis: 5000
  ### OBS Stream: Upload the matching file to OBS and send the file name to DIS, uncomment # if you want to use this feature
  #- OBSStream: YOU_DIS_STREAM_3
  #  filePattern: /opt/*.log
  #  initialPosition: START_OF_FILE
  #  ## bucket name
  #  OBSBucket: YOU_OBS_BUCKET_NAME
  #  ## OBS endpoint
  #  OBSEndpoint: https://obs.myregion.cloud.com
  #  ## the directory(using / separated) where the files are stored under the bucket, automatically created if it does not exist
  #  dumpDirectory: example/dis/
After the configuration is complete, delete the unused examples from flows in agent.yml or comment them out with #. For example, if only one DISStream is configured, delete or comment out the other DISStream entries and any other stream modules (for example, CustomFileStream or OBSStream).
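As a rough sketch of that cleanup, assuming only one DIS stream is needed, a trimmed agent.yml might look like the following. All values are the placeholders from the template above and must be replaced with your own. The key names and nesting are copied from the template, not from any other source.

```yaml
---
region: myregion
# Plaintext AK/SK shown only for brevity; the template advises encrypting them.
ak: YOU_AK
sk: YOU_SK
projectId: YOU_PROJECTID
endpoint: https://dis.myregion.cloud.com
flows:
  # The only remaining flow; the DISStream_2 and OBSStream examples were removed.
  - DISStream: YOU_DIS_STREAM_1
    filePattern: /tmp/*.log
    initialPosition: START_OF_FILE
    maxBufferAgeMillis: 5000
```

Note the colon followed by a space after every key, and the two-space indentation under flows.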
Configuring DIS Agent on a Linux Server
- Start PuTTY and log in to the Linux server on which the DIS Agent is installed.
- Run the cd /opt/dis-agent-X.X.X/ command to go to the dis-agent-X.X.X directory.
- Run the vim conf/agent.yml command to open the DIS Agent configuration file agent.yml. Modify parameter values in the file to meet specific requirements. Table 1 describes the configuration parameters in the file.
Table 1 Parameters in the agent.yml file
Parameter
Mandatory
Description
Default Value
region
Yes
Region where DIS is deployed.
NOTE:For details about how to obtain the region where DIS is deployed, see Regions and Endpoints.
-
AK
Yes
User's AK.
NOTE:You can encrypt the AK or use a plaintext AK. For details about how to encrypt the AK, see the AK/SK encryption note.
For details about how to obtain an AK, see Checking Authentication Information.
-
SK
Yes
User's SK.
NOTE:You can encrypt the SK or use a plaintext SK. For details about how to encrypt the SK, see the AK/SK encryption note.
For details about how to obtain an SK, see Checking Authentication Information.
-
encrypt.key
No
Key used for encryption
NOTE:If you want to use an encrypted AK or SK, you must configure this parameter in the agent.yml file. Ensure that you use the configured key to encrypt the AK/SK. Otherwise, the AK/SK cannot be decrypted.
-
projectId
Yes
Project ID specific to your region.
For details about how to obtain a project ID, see Checking Authentication Information.
-
endpoint
Yes
DIS gateway address.
Format: https://DIS endpoint
NOTE:For details about how to obtain the DIS endpoint, see Regions and Endpoints.
-
body.serialize.type
No
Format of the DIS data packet to be uploaded (not the format of the original data).
- json: The DIS data packet is encapsulated in JSON format.
- protobuf: The DIS data packet is encapsulated in binary format, which reduces the packet size by 1/3. This format is recommended when a massive amount of data is generated.
json
body.compress.enabled
No
Specifies whether to enable data compression.
false
body.compress.type
No
Data compression format selected when compression is enabled. Currently, the following compression formats are supported:
lz4: a compression algorithm with a fast compression speed and high compression efficiency
zstd: a new lossless compression algorithm with a fast compression speed and high compression ratio
lz4
PROXY_HOST
No
Proxy IP address. This parameter is mandatory when requests are sent through the proxy server.
-
PROXY_PORT
No
Proxy port.
80
PROXY_PROTOCOL
No
Proxy protocol. http and https are supported.
http
PROXY_USERNAME
No
Proxy username.
-
PROXY_PASSWORD
No
Proxy password.
-
[flows]
The [flows] section presents information about the files that will be uploaded to DIS.
The following upload mode is supported:
DISStream: DIS Agent monitors text files continuously, collects incremental data in real time, parses the data by delimiter, and uploads it to DIS streams (source data type: BLOB, JSON, and CSV). Table 2 describes configuration parameters.
The agent.yml file provides example parameter settings.
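The optional global parameters in Table 1 could be combined as in the sketch below. The template does not show these keys, so their placement at the top level of agent.yml (alongside region and endpoint, in the same way as encrypt.key) is an assumption. The proxy address and port are hypothetical values chosen for illustration.

```yaml
# Assumed to be top-level keys in agent.yml, next to region and endpoint.
body.serialize.type: protobuf     # default is json
body.compress.enabled: true       # default is false
body.compress.type: zstd          # used only when compression is enabled; default is lz4

# Needed only when requests go through a proxy server.
PROXY_HOST: 192.0.2.10            # hypothetical proxy IP address
PROXY_PORT: 8080                  # hypothetical; default is 80
PROXY_PROTOCOL: http              # http or https
```

Parameters that are left out keep the default values listed in Table 1.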
To encrypt the AK/SK, perform the following steps:
Download and install the DIS Agent by referring to Installing DIS Agent and use the script in the bin directory of the DIS Agent package to encrypt the AK/SK. The detailed operations are as follows (Windows):
- Go to the bin directory of DIS Agent, right-click and choose Git Bash Here, and run the script, for example, ./dis-encrypt.sh {key} {ak}, to obtain the encrypted AK. Then configure the encrypted AK in the agent.yml file.
- Encrypt the SK in the same way. Then configure the encrypted AK/SK and key in the agent.yml file.
Figure 1 Encryption example
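After both values are encrypted, the credential part of agent.yml might look like the sketch below. ENCRYPTED_AK_OUTPUT and ENCRYPTED_SK_OUTPUT are hypothetical placeholders for the strings printed by dis-encrypt.sh, and abc is the sample key from the template.

```yaml
# Must be the same {key} that was passed to dis-encrypt.sh;
# otherwise the AK/SK cannot be decrypted.
encrypt.key: abc
ak: ENCRYPTED_AK_OUTPUT   # placeholder: output of ./dis-encrypt.sh {key} {ak}
sk: ENCRYPTED_SK_OUTPUT   # placeholder: output of the same script run with the SK
```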
Table 2 DISStream configuration parameters
Parameter
Mandatory
Description
Default Value
DISStream
Yes
Name of the DIS stream.
The content of files that match filePattern is parsed by delimiter and uploaded to this stream.
-
filePattern
Yes
File monitoring path. Files in only one directory can be monitored. Directories cannot be monitored recursively.
To monitor multiple directories, configure multiple DIS streams in flows. File names can be matched with an asterisk (*):
- /tmp/*.log: Matches all files whose names end with .log in the /tmp directory.
- /tmp/access-*.log: Matches all files whose names start with access- and end with .log in the /tmp directory.
- In Windows, the example path is D:\logs\*.log.
-
directoryRecursionEnabled
No
Specifies whether to search subdirectories recursively. Possible values:
- false: Subdirectories are not searched. Only files directly in the directory given by filePattern are matched.
- true: All subdirectories are searched recursively. For example, if filePattern is set to /tmp/*.log, then /tmp/one.log, /tmp/child/two.log, and /tmp/child/child/three.log can all be matched.
false
initialPosition
No
Position in a file from which monitoring starts. Possible values:
- END_OF_FILE: When monitoring starts, the existing content of files that match filePattern is not parsed. Only newly added files or newly appended content is parsed by delimiter and uploaded to DIS.
- START_OF_FILE: All files that match filePattern are parsed by delimiter and uploaded to DIS in order of modification time, from earliest to latest.
START_OF_FILE
maxBufferAgeMillis
No
Maximum time that data is buffered before it is uploaded to DIS.
Unit: ms
- If the buffer fills with data waiting to be uploaded, the data is uploaded to DIS immediately.
- If the buffer is not full, the data is uploaded to DIS once the specified period has elapsed.
5000
maxBufferSizeRecords
No
Maximum number of records that the agent buffers before sending them to DIS. When the number of records in the queue reaches this value, the data is uploaded to DIS immediately.
500
partitionKeyOption
No
Method for generating the partition key. Each record carries a partition key. Records with the same partition key are allocated to the same partition. Possible values:
- RANDOM_INT: The partition key is a random numeric string. Records with such keys are evenly distributed across partitions.
- FILE_NAME: The partition key is the file name. Records with such a key are sent to one specific partition.
- FILE_NAME,RANDOM_INT: The partition key is the file name and a random numeric string, separated by a comma (,). Records with such keys carry the file name and are evenly distributed across all partitions.
RANDOM_INT
recordDelimiter
No
Delimiter used to separate records.
Value range: any character that is enclosed in double quotation marks.
The value cannot be empty. That is, this parameter cannot be set to "".
NOTE:If the value is a special character, use a backslash (\) to escape. For example, if the value is a quotation mark ("), set this parameter to \". If the value is a backslash (\), set this parameter to \\.
If the value is a control character, for example, STX, set this parameter to \u0002.
"\n"
isRemainRecordDelimiter
No
Specifies whether the delimiter is retained in the records to be uploaded. Possible values:
- true: The delimiter is retained in the records to be uploaded.
- false: The delimiter is removed from the records to be uploaded.
false
isFileAppendable
No
Specifies whether content can be appended to the file. Possible values:
- true: Content may be appended to the file. Agent continuously monitors the file. When content is added, Agent parses it by recordDelimiter and uploads the records. In this case, ensure that the appended content ends with recordDelimiter. Otherwise, Agent considers the record incomplete and waits for recordDelimiter to be written.
- false: No content will be appended to the file. If the last row of the file does not end with recordDelimiter, Agent still uploads it as the last record. After the upload is complete, Agent deletes or renames the file based on the deletePolicy and fileSuffix settings.
true
maxFileCheckingMillis
No
Maximum time for checking file changes. If the file size, modification time, and file ID do not change within this period, the file is considered complete and its upload starts.
Set this parameter based on the actual file change frequency to prevent an incomplete file from being uploaded.
If the file is changed after being uploaded, it will be fully uploaded again.
Unit: ms
NOTE:This parameter is available only when isFileAppendable is set to false.
5000
deletePolicy
No
Policy for deleting a file after the file content is uploaded. Possible values:
- never: The file is not deleted after the file content is uploaded.
- immediate: The file is deleted after the file content is uploaded.
NOTE:This parameter is available only when isFileAppendable is set to false.
never
fileSuffix
No
Suffix appended to the file name after the file content is uploaded.
For example, if the original file name is x.txt and fileSuffix is set to .COMPLETED, the file is renamed x.txt.COMPLETED after the upload.
NOTE:This parameter is available only when isFileAppendable is set to false and deletePolicy is set to never.
.COMPLETED
sendingThreadSize
No
The number of sender threads. By default, there is only one sender thread.
NOTICE:If multiple threads are used, the following problems may occur:
- Data may not be sent in order.
- Some data is lost after the program stops abnormally and restarts.
1
fileEncoding
No
File encoding format. Possible values: UTF8, GBK, GB2312, and ISO-8859-1.
UTF8
resultLogLevel
No
Log level used to record the result of each call to the DIS data sending API.
- OFF: API call results are not logged.
- INFO: Each API call result is logged at the INFO level.
- WARN: Each API call result is logged at the WARN level.
- ERROR: Each API call result is logged at the ERROR level.
INFO
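To show how the Table 2 parameters fit together, here is a sketch of a flows section with two entries. It assumes every Table 2 parameter is set per flow at the same indentation as filePattern, as the template does for initialPosition and maxBufferAgeMillis. The stream names are the template placeholders, and the /data/export path is hypothetical.

```yaml
flows:
  # Flow 1: files that are written once and never appended to.
  - DISStream: YOU_DIS_STREAM_1
    filePattern: /data/export/*.csv     # hypothetical directory
    directoryRecursionEnabled: true     # also match files in subdirectories
    isFileAppendable: false
    maxFileCheckingMillis: 10000        # wait 10 s without changes before uploading
    deletePolicy: never                 # keep the file after upload
    fileSuffix: .COMPLETED              # applies because deletePolicy is never

  # Flow 2: log files that keep growing; only new content is uploaded.
  - DISStream: YOU_DIS_STREAM_2
    filePattern: /tmp/access-*.log
    initialPosition: END_OF_FILE
    maxBufferAgeMillis: 5000
    maxBufferSizeRecords: 500
    partitionKeyOption: FILE_NAME,RANDOM_INT
    recordDelimiter: "\n"
    isRemainRecordDelimiter: false
    fileEncoding: UTF8
    resultLogLevel: WARN
```

In the first flow, maxFileCheckingMillis, deletePolicy, and fileSuffix take effect only because isFileAppendable is false. The second flow leaves isFileAppendable at its default of true, so those three parameters would be ignored there.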
Configuring DIS Agent on a Windows Server
- Use a file manager to open the directory (for example, C:\dis-agent-X.X.X) where the installation package is decompressed.
- Open the agent.yml file using an editor and modify parameter values in the file to meet specific requirements.
The agent.yml file is in the Linux format. You are advised to use a general-purpose text editor to edit the file.
About log files:
In the installation path of DIS Agent, the logs directory stores the log files generated while DIS Agent is running. The dis-agent.log file records the running status of DIS Agent, and the dated log files, such as dis-agent-2022-10-28.log, record file uploads. One dated log file is generated each day.
You can also customize the storage path of log files in the log4j2.xml file in the conf folder in the DIS Agent installation path.
Figure 2 log4j2