Updated on 2024-05-30 GMT+08:00

Ingesting Self-built K8s Application Logs to LTS

LTS can ingest self-built Kubernetes (K8s) application logs.

Prerequisites

  • Ensure that Helm v3 has been installed in the Kubernetes cluster.
  • Ensure that kubectl has been configured for the Kubernetes cluster.
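Both prerequisites can be verified from any machine with access to the cluster. The following is a minimal sketch (the `check_tool` helper is illustrative, not part of LTS):

```shell
# Verify the prerequisites above: Helm v3 and a configured kubectl.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

check_tool helm     # if found, "helm version --short" should report v3.x
check_tool kubectl  # if found, "kubectl cluster-info" should reach the cluster
```

If both tools are found, confirm that `helm version --short` reports a version starting with v3 and that `kubectl cluster-info` returns the cluster endpoints.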

Procedure for Creating a Single Ingestion Configuration

Perform the following operations to configure self-built K8s application log ingestion:

  1. Log in to the LTS console.
  2. In the left navigation pane, choose Log Ingestion. Click Self-built K8s - Application Logs on the Access Wizard tab page. Or, click Ingest Log on the Ingestion Rule tab page and then choose Self-built K8s - Application Logs.
  3. Alternatively, choose Log Management in the left navigation pane. Click the name of the target log stream to go to the log details page. Click in the upper right corner. On the displayed page, click the Collection Configuration tab and click Create. In the displayed dialog box, click CCE (Cloud Container Engine).
  4. Select a log stream.

    Choose between Fixed log stream and Custom log stream to suit your requirements. Fixed log stream is recommended.

    Fixed log stream

    Logs will be collected to a fixed log stream. The default log streams of a CCE cluster are stdout-{ClusterID} for standard output/errors, hostfile-{ClusterID} for node files, event-{ClusterID} for Kubernetes events, and containerfile-{ClusterID} for container files. Log streams are automatically named with the cluster ID. For example, if the cluster ID is Cluster01, the standard output/error log stream is stdout-Cluster01.

    The log streams that can be created for a CCE cluster are stdout-{ClusterID} for standard output/errors, hostfile-{ClusterID} for node files, event-{ClusterID} for Kubernetes events, and containerfile-{ClusterID} for container files. Once one of them has been created in a log group, it will not be created again in the same log group or any other log group.

    1. Select Fixed log stream for Collect.
    2. Enter the cluster name and ID.
    3. Select a log group.

      If there is no such group, the system displays the following message: This log group does not exist and will be automatically created to start collecting logs.

    4. Click Next: Check Dependencies.

    Custom log stream

    1. Select Custom log stream.
    2. Enter the cluster name and ID.
    3. Select a log group from the Log Group drop-down list. If there are no desired log groups, click Create Log Group to create one.
    4. Select a log stream from the Log Stream drop-down list. If there are no desired log streams, click Create Log Stream to create one.
    5. Click Next: Check Dependencies.
      Figure 1 Custom log stream

  5. Check dependencies.

    1. The system automatically checks whether the following are met:
      • There is a host group with the custom identifier k8s-log-ClusterID.
      • There is a log group named k8s-log-ClusterID. The log retention period and description of a log group can be modified.
      • The recommended log stream exists. The log retention period and description of a log stream can be modified. If Fixed log stream is selected, this item is checked.
      You need to meet all the requirements before moving on. If not, click Auto Correct.
      • Auto Correct: completes the preceding settings in one click.
      • Check Again: Recheck dependencies.
      • If Custom log stream is selected, the check item There is a log group named k8s-log-ClusterID is optional. Use the switch to turn on or off the check item.
    2. Click Next: Install ICAgent.

  6. Install the log collection component.

    In the Kubernetes cluster, perform the following steps on any host:
    1. Obtain the ICAgent installation package.
      • Download the ICAgent installation package, replacing {regionId} and {obsDomainName} with your actual values.
        wget https://icagent-{regionId}.{obsDomainName}/ICAgent_linux/icagentK8s-5.5.1.2.tar.gz
      • Decompress the ICAgent installation package.
        tar -xzvf icagentK8s-5.5.1.2.tar.gz
      • Go to the ICAgent directory.
        cd icagentK8s
      • Generate the installation command:

        Select the region of ingested logs.

        Select the project ID of the ingesting account.

        For Kubernetes Cluster, select Intra-Region.

    2. Install ICAgent.
      1. Copy the ICAgent installation command.

        To prevent your AK/SK from being disclosed, select Turn off command history so that the AK/SK will not be stored in the command history.

        Figure 2 Installing ICAgent

        The generated installation command is as follows (replace x.x.x.x with the actual IP address displayed on the page):

        set +o history; bash icagent_log_install.sh 2a473356cca5487f8373be891bffc1cf test-xx123456 region0_id {input_your_ak} {input_your_sk} x.x.x.x podlb

        To enter the AK/SK, either:

        1. Copy the command, replace {input_your_ak} and {input_your_sk} with your AK and SK (remove the braces {}), and run it, or

        2. Run the copied command and enter the AK and SK when "Enter the AK" and "Enter the SK" are displayed.

      2. Use a remote login tool (such as PuTTY) to log in to the target host as the root user and run the copied command.

        If the message "ICAgent install success" is displayed, the installation is successful. Then choose Host Management in the navigation pane to check the ICAgent status.

    3. Click ICAgent Already Installed.

  7. (Optional) Select a host group.

    1. Select one or more host groups from which you want to collect logs. If there are no desired host groups, click Create above the host group list to create one. For details, see Creating a Host Group (Custom Identifier).
      • The host group to which the cluster belongs is selected by default. You can also select host groups as required.
      • You can skip this step and configure host groups after the ingestion configuration is complete. There are two options to do this:
        • On the LTS console, choose Host Management > Host Groups and associate host groups with ingestion configurations.
        • On the LTS console, choose Log Ingestion in the navigation pane on the left and click an ingestion configuration. On the displayed page, add one or more host groups for association.
    2. Click Next: Configurations.

  8. Configure the collection.

    1. Specify collection rules. For details, see Configuring the Collection.
    2. Click Next: Log Structuring.

  9. (Optional) Configure log structuring.

    1. Click Skip or perform structuring configurations. For details, see Cloud Structuring Parsing.

      If the selected log stream has been structured, exercise caution when deleting it.

      • If you have enabled ICAgent structuring parsing configuration, you do not need to configure cloud structuring parsing. For details, see Configuring ICAgent Collection.
      • ICAgent structuring parsing configuration is available only to whitelisted users. To use this function, submit a service ticket.
    2. Click Next: Index Settings.

  10. (Optional) Configure indexes.

    Click Skip and Submit or configure the index. For details, see Index Settings.

  11. Click Submit. The configured ingestion rule will be displayed on the Ingestion Rule tab page.

    • Click the name of the ingestion rule to view its details.
    • Click Edit in the Operation column to modify the ingestion rule.
    • Click Configure Tag in the Operation column to add a tag.
    • Click Copy in the Operation column to copy the ingestion rule.
    • Click Delete in the Operation column to delete the ingestion rule.
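The ICAgent installation in step 6 can be collected into a single script. This is a sketch under assumptions: REGION_ID and OBS_DOMAIN stand in for the values shown on the console page, and DRY_RUN defaults to printing each command instead of running it, so the script can be reviewed safely before real use.

```shell
# Sketch of the ICAgent installation from step 6. REGION_ID and OBS_DOMAIN are
# placeholders for the values shown on the console page; DRY_RUN=echo prints
# each command instead of executing it.
DRY_RUN="${DRY_RUN:-echo}"
REGION_ID="your-region-id"     # placeholder: region of the ingested logs
OBS_DOMAIN="your-obs-domain"   # placeholder: OBS domain name for your region
PKG="icagentK8s-5.5.1.2.tar.gz"

# Keep the AK/SK out of the shell history, as advised in step 6.
set +o history 2>/dev/null || true
$DRY_RUN wget "https://icagent-${REGION_ID}.${OBS_DOMAIN}/ICAgent_linux/${PKG}"
$DRY_RUN tar -xzvf "$PKG"
$DRY_RUN cd icagentK8s
# Paste the installation command generated on the console here; enter the AK
# and SK when prompted rather than embedding them in the command line.
$DRY_RUN bash icagent_log_install.sh
```

To execute the commands for real, run the script with DRY_RUN set to an empty string on the target host as the root user.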

Configuring the Collection

When you configure log ingestion for self-built Kubernetes clusters, the collection configuration details are as follows.

Figure 3 Configuring the collection
  1. Basic Settings: Enter a name containing 1 to 64 characters. Only letters, digits, hyphens (-), underscores (_), and periods (.) are allowed. The name cannot start with a period or underscore, or end with a period.
  2. Data Source: Select a data source type and configure it.
    • Container standard output: Collects stderr and stdout logs of a specified container in the cluster.
      • The standard output of the matched container is collected to the specified log stream and is no longer reported to AOM.
      • The standard output of a container on a host can be collected to only one log stream.
    • Container file: Collects file logs of a specified container in the cluster.
    • Node file: Collects files of a specified node in the cluster.

      You cannot add the same host path to more than one log stream.

    • Kubernetes event: Collects event logs in the Kubernetes cluster.

      Kubernetes events of a Kubernetes cluster can be ingested to only one log stream.

    Table 1 Collection configuration parameters

    Type

    Description

    Container standard output

    Collects stderr and stdout logs of a specified container in the cluster. Either Container Standard Output (stdout) or Container Standard Error (stderr) must be enabled.

    Container file

    • Collection Paths: LTS collects logs from the specified paths.
      NOTE:
      • If a container mount path has been configured for the CCE cluster workload, the paths added for this field are invalid. The collection paths take effect only after the mount path is deleted.
      • You cannot add the same host path to more than one log stream.
    • Add Custom Wrapping Rule: ICAgent determines whether a file is wrapped based on the file name rule. If your wrapping rule does not comply with the built-in rules, you can add a custom wrap rule to prevent log loss during repeated collection and wrapping.

      The built-in rules are {basename}{connector}{wrapping identifier}.{suffix} and {basename}.{suffix}{connector}{wrapping identifier}, where the connector is a hyphen (-), period (.), or underscore (_), the wrapping identifier is a non-letter symbol, and the suffix consists of letters.

      A custom wrapping rule consists of {basename} and the feature regular expression of the wrapped file. Example: If your log file name is /opt/test.out.log, and the wrapped file names are test.2024-01-01.0.out.log and test.2024-01-01.1.out.log, the collection path is /opt/*.log and the wrapping rule is {basename}\.[-0-9\.].out.log.

    • Set Collection Filters: Blacklisted directories or files will not be collected. If you specify a directory, all files in the directory are filtered out.

    Node file

    • Collection Paths: LTS collects logs from the specified paths.
      NOTE:

      You cannot add the same host path to more than one log stream.

    • Add Custom Wrapping Rule: ICAgent determines whether a file is wrapped based on the file name rule. If your wrapping rule does not comply with the built-in rules, you can add a custom wrap rule to prevent log loss during repeated collection and wrapping.

      The built-in rules are {basename}{connector}{wrapping identifier}.{suffix} and {basename}.{suffix}{connector}{wrapping identifier}, where the connector is a hyphen (-), period (.), or underscore (_), the wrapping identifier is a non-letter symbol, and the suffix consists of letters.

      A custom wrapping rule consists of {basename} and the feature regular expression of the wrapped file. Example: If your log file name is /opt/test.out.log, and the wrapped file names are test.2024-01-01.0.out.log and test.2024-01-01.1.out.log, the collection path is /opt/*.log and the wrapping rule is {basename}\.[-0-9\.].out.log.

    • Set Collection Filters: Blacklisted directories or files will not be collected. If you specify a directory, all files in the directory are filtered out.

    Kubernetes event

    No configuration is required. This option requires ICAgent 5.12.130 or later.

  3. Kubernetes Matching Rules: Set these parameters only when the data source type is set to Container standard output or Container file.

    After entering a regular expression matching rule, click the verification button to check that the expression is valid.

    Table 2 Kubernetes matching rules

    Parameter

    Description

    Namespace Name Regular Expression

    Specifies the container whose logs are to be collected based on the namespace name. Regular expression matching is supported.
    NOTE:

    LTS will collect logs of the namespaces with names matching this expression. To collect logs of all namespaces, leave this field empty.

    Pod Name Regular Expression

    Specifies the container whose logs are to be collected based on the pod name. Regular expression matching is supported.

    NOTE:

    LTS will collect logs of the pods with names matching this expression. To collect logs of all pods, leave this field empty.

    Container Name Regular Expression

    Specifies the container whose logs are to be collected based on the container name (the Kubernetes container name is defined in spec.containers). Regular expression matching is supported.
    NOTE:

    LTS will collect logs of the containers with names matching this expression. To collect logs of all containers, leave this field empty.

    Container Label Whitelist

    Specifies the containers whose logs are to be collected. If you want to set a container label whitelist, Label Key is mandatory and Label Value is optional.
    NOTE:

    If Label Value is empty, LTS matches all containers whose labels contain the specified Label Key. If Label Value is not empty, only containers whose label with that Label Key has a matching Label Value are collected. Label Key requires a full match, while Label Value supports regular expression matching. Multiple whitelist entries are ORed: a container is matched as long as it meets any one of them.

    Container Label Blacklist

    Specifies the containers whose logs are not to be collected. If you want to set a container label blacklist, Label Key is mandatory and Label Value is optional.
    NOTE:

    If Label Value is empty, LTS excludes all containers whose labels contain the specified Label Key. If Label Value is not empty, only containers whose label with that Label Key has a matching Label Value are excluded. Label Key requires a full match, while Label Value supports regular expression matching. Multiple blacklist entries are ORed: a container is excluded as long as it meets any one of them.

    Container Label

    After the Container Label is set, LTS adds related fields to logs.

    NOTE:

    LTS adds the specified fields to the log when each Label Key has a corresponding Label Value. For example, if you enter app as the key and app_alias as the value, when the container label contains app=lts, {app_alias: lts} will be added to the log.

    Environment Variable Whitelist

    Specifies the containers whose logs are to be collected. If you want to set an environment variable whitelist, Label Key is mandatory and Label Value is optional.
    NOTE:

    If Environment Variable Value is empty, LTS matches all containers whose environment variables contain the specified Environment Variable Key. If the value is not empty, only containers whose variable with that key has a matching value are collected. The key requires a full match, while the value supports regular expression matching. Multiple whitelist entries are ORed: a container is matched as long as it meets any one of the key-value pairs.

    Environment Variable Blacklist

    Specifies the containers whose logs are not to be collected. If you want to set an environment variable blacklist, Label Key is mandatory and Label Value is optional.
    NOTE:

    If Environment Variable Value is empty, LTS excludes all containers whose environment variables contain the specified Environment Variable Key. If the value is not empty, only containers whose variable with that key has a matching value are excluded. The key requires a full match, while the value supports regular expression matching. Multiple blacklist entries are ORed: a container is excluded as long as it meets any one of the key-value pairs.

    Environment Variable Label

    After the environment variable label is set, LTS adds related fields to logs.
    NOTE:

    LTS adds the specified fields to the log when each Environment Variable Key has a corresponding Environment Variable Value. For example, if you enter "app" as the key and "app_alias" as the value, when the Kubernetes environment variable contains "app=lts", "{app_alias: lts}" will be added to the log.
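The whitelist and blacklist semantics described in Table 2 (full match on the key, regular-expression match on a non-empty value, OR across entries) can be sketched as follows. The `matches_entry` helper is hypothetical, for illustration only, and is not part of LTS or ICAgent:

```shell
# Hypothetical sketch of one whitelist/blacklist entry from Table 2: the key
# requires a full match, a non-empty value is treated as a regular expression,
# and multiple entries are ORed together.
matches_entry() {  # usage: matches_entry LABEL_KEY LABEL_VALUE ENTRY_KEY ENTRY_VALUE
  [ "$1" = "$3" ] || return 1          # key: full match required
  [ -z "$4" ] && return 0              # empty entry value matches any value
  printf '%s' "$2" | grep -Eq "$4"     # value: regular-expression match
}

matches_entry app lts app ""    && echo "matched: key only"
matches_entry app lts app "^lt" && echo "matched: key plus value regex"
matches_entry app lts env ""    || echo "not matched: key differs"
```

With a whitelist, a container is collected if any entry matches; with a blacklist, it is excluded if any entry matches.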

  4. Perform other configurations.
    Table 3 Other configurations

    Parameter

    Description

    Max Directory Depth

    The maximum directory depth is 5 levels.

    ICAgent does not collect log files in directories deeper than this value. If the collection path contains fuzzy matching (wildcard) strings, set this parameter to an appropriate depth to avoid wasting ICAgent resources.

    Split Logs

    LTS supports log splitting.

    If this option is enabled, a single-line log larger than 500 KB will be split into multiple lines for collection. For example, a 600 KB log line will be split into two lines: the first of 500 KB and the second of 100 KB.

    If this option is disabled, a log larger than 500 KB will be truncated.

    Collect Binary Files

    LTS supports binary file collection.

    Run the file -i File_name command to view the file type. charset=binary indicates that a log file is a binary file.

    If this option is enabled, binary log files will be collected, but only UTF-8 strings are supported. Other strings will be garbled on the LTS console.

    If this option is disabled, binary log files will not be collected.

    Log File Code

    Only the UTF-8 encoding is supported for log files.

    Collection Policy

    Select Incremental or All.

    • Incremental: When collecting a new file, ICAgent reads the file from the end.
    • All: When collecting a new file, ICAgent reads the file from the beginning.
  5. Configure the log format and log time.
    Table 4 Log collection settings

    Parameter

    Description

    Log Format

    • Single-line: Each log line is displayed as a single log event.
    • Multi-line: Multiple lines of exception log events can be displayed as a single log event. This is helpful when you check logs to locate problems.

    Log Time

    System time: the log collection time, used by default and displayed at the beginning of each log event.

    NOTE:
    • Log collection time is the time when logs are collected and sent by ICAgent to LTS.
    • Log printing time is the time when logs are printed. ICAgent collects and sends logs to LTS with an interval of 1 second.
    • Restriction on log collection time: Logs are collected within 24 hours before and after the system time.

    Time wildcard: You can set a time wildcard so that ICAgent will look for the log printing time as the beginning of a log event.

    • If the time format in a log event is 2019-01-01 23:59:59.011, the time wildcard should be set to YYYY-MM-DD hh:mm:ss.SSS.
    • If the time format in a log event is 19-1-1 23:59:59.011, the time wildcard should be set to YY-M-D hh:mm:ss.SSS.
    NOTE:

    If a log event does not contain year information, ICAgent regards it as printed in the current year.

    Example:

    YY   - year (19)     
    YYYY - year (2019)  
    M    - month (1)     
    MM   - month (01)    
    D    - day (1)       
    DD   - day (01)        
    hh   - hours (23)     
    mm   - minutes (59)   
    ss   - seconds (59) 
    SSS  - millisecond (999)
    hpm     - hours (03PM)
    h:mmpm    - hours:minutes (03:04PM)
    h:mm:sspm  - hours:minutes:seconds (03:04:05PM)       
    hh:mm:ss ZZZZ (16:05:06 +0100)       
    hh:mm:ss ZZZ  (16:05:06 CET)       
    hh:mm:ss ZZ   (16:05:06 +01:00)

    Log Segmentation

    This parameter needs to be specified if the Log Format is set to Multi-line. By generation time indicates that a time wildcard is used to detect log boundaries, whereas By regular expression indicates that a regular expression is used.

    Regular Expression

    You can set a regular expression to look for a specific pattern to indicate the beginning of a log event. This parameter needs to be specified when you select Multi-line for Log Format and By regular expression for Log Segmentation.

    The time wildcard and regular expression will look for the specified pattern right from the beginning of each log line. If no match is found, the system time, which may be different from the time in the log event, is used. In general cases, you are advised to select Single-line for Log Format and System time for Log Time.
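As an illustration of By regular expression segmentation, the pattern below (a hypothetical example, not an LTS default) treats lines that begin with a timestamp as the start of a new log event, so continuation lines such as stack traces attach to the preceding event:

```shell
# Hypothetical line-start pattern for multi-line segmentation: a new log event
# begins at a line starting with "YYYY-MM-DD hh:mm:ss", e.g. "2019-01-01 23:59:59.011".
START_RE='^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'

is_event_start() {
  printf '%s' "$1" | grep -Eq "$START_RE"
}

is_event_start '2019-01-01 23:59:59.011 ERROR request failed'        && echo "new event"
is_event_start '    at com.example.Handler.process(Handler.java:42)' || echo "continuation line"
```

A pattern like this, entered in the Regular Expression field, would group each timestamped line and the indented lines that follow it into a single log event.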

Creating Multiple Ingestion Configurations

You can create ingestion tasks in batches on the Ingestion Rule tab page.

  1. Click Batch Ingestion to go to the configuration details page. For details, see Table 5.

    Structuring parsing configuration is available only to whitelisted users. For details, see Configuring ICAgent Collection. To use this function, submit a service ticket.

    Table 5 Adding configurations in batches

    Type

    Operation

    Description

    Basic Settings

    Ingestion Type

    Select Self-built K8s - Application Logs.

    Configurations to Add

    Enter the number of ingestion configurations in the text box and click Add.

    A maximum of 100 ingestion configurations can be added at a time, including the one that exists under Ingestion Settings by default.

    Ingestion Settings

    Configuration List

    1. The ingestion configurations are displayed on the left. You can add up to 99 more configurations.
    2. The ingestion configuration details are displayed on the right. For details, see Procedure for Creating a Single Ingestion Configuration.
    3. After an ingestion configuration is complete, you can click Apply to Other Configurations to copy the configuration to other configurations.

  2. Click Check Parameters. After the check is successful, click Submit.
  3. The added ingestion configurations will be displayed in the lower part of the Ingestion Rule tab page after the batch creation is successful.
  4. (Optional) Perform the following operations on an ingestion configuration:

    • Select multiple existing ingestion configurations and click Modify. On the displayed page, select an ingestion type to modify the corresponding ingestion configurations.
    • Select multiple existing ingestion configurations and click Open or Close. If you toggle off the switch in the Status column of an ingestion configuration, logs will not be collected for this configuration.
    • Select multiple existing ingestion configurations and click Delete.