Updated on 2025-08-27 GMT+08:00

Cloud Native Log Collection

Introduction

The Cloud Native Log Collection add-on (log-agent) is developed based on Fluent Bit and OpenTelemetry for collecting logs and Kubernetes events. After the add-on is installed, the fluent-bit container component is automatically created for log collection. This add-on supports CRD-based log collection policies. It collects and forwards standard output logs, container file logs, and Kubernetes events in a cluster based on configured policies. It also reports all abnormal Kubernetes events and some normal Kubernetes events to AOM. For details about how to collect logs, see Collecting Logs.

The fluent-bit container component is provided by Huawei Cloud for free.

Log Collection Reliability

The log system's main purpose is to record data from every stage of a service component's life cycle, including startup, initialization, exit, runtime details, and exceptions. It is used mainly in O&M scenarios, such as checking component status and analyzing fault causes.

Standard streams (stdout and stderr) and local log files use non-persistent storage, so data integrity may be compromised by the following risks:

  • Log rotation and compression potentially deleting old files
  • Temporary storage volumes being cleared when Kubernetes pods end
  • Automatic OS cleanup triggered by limited node storage space

While the Cloud Native Log Collection add-on employs techniques like multi-level buffering, priority queues, and resumable uploads to enhance log collection reliability, logs could still be lost in the following situations:

  • The service log throughput surpasses the collector's processing capacity.
  • The service pod is abruptly terminated and reclaimed by CCE.
  • The log collector pod experiences exceptions.

The following are recommended best practices for cloud native log management:

  • Use dedicated, highly reliable streams to record critical service data (for example, financial transactions) and store the data in persistent storage.
  • Avoid storing sensitive information like customer details, payment credentials, and session tokens in logs.
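
One way to keep sensitive values out of logs is to mask them in the application before they are written, so the collector never sees them. The following is a minimal sketch using Python's standard logging module. The patterns, the RedactingFilter class, and the redact helper are illustrative assumptions, not part of the add-on, and must be adapted to the formats your services actually emit.

```python
import logging
import re

# Hypothetical patterns (assumptions for illustration only).
_SENSITIVE_PATTERNS = [
    # key=value pairs such as token=..., session_id=..., password=...
    (re.compile(r"\b(token|session_id|password)=([^\s&]+)", re.IGNORECASE),
     r"\1=[REDACTED]"),
    # Long digit runs that could be payment card numbers.
    (re.compile(r"\b\d{13,19}\b"), "[REDACTED_NUMBER]"),
]


def redact(text: str) -> str:
    """Return text with every match of the sensitive patterns masked."""
    for pattern, replacement in _SENSITIVE_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


class RedactingFilter(logging.Filter):
    """Mask sensitive values before a record reaches any handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so values passed as args are masked too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("payments")
    logger.addFilter(RedactingFilter())
    logger.info("login ok, token=%s", "abc123")
```

Attaching the filter to the logger rather than to a handler means the value is masked before it reaches stdout or a log file, which are the sources the add-on collects from.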

Constraints

The constraints on using the log-agent add-on are as follows:
  • A maximum of 50 log collection rules can be configured for each cluster.
  • log-agent cannot collect .gz, .tar, or .zip log files.
  • If the container runtime is containerd, multi-line collection is not supported for container standard output logs.
  • In each cluster, up to 10,000 single-line logs can be collected per second, and up to 2,000 multi-line logs can be collected per second.
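
The limits above can be checked against a planned workload before any collection rules are created. The following is a rough sketch: the numeric limits and file suffixes come from the constraints listed above, while the CollectionPlan structure and the check_plan function are hypothetical helpers, not an API of the add-on.

```python
from dataclasses import dataclass
from typing import List

# Limits quoted from the constraints listed above.
MAX_RULES_PER_CLUSTER = 50
MAX_SINGLE_LINE_LOGS_PER_SEC = 10_000
MAX_MULTI_LINE_LOGS_PER_SEC = 2_000
UNSUPPORTED_SUFFIXES = (".gz", ".tar", ".zip")


@dataclass
class CollectionPlan:
    """Hypothetical summary of what one cluster is expected to collect."""
    rule_count: int
    log_paths: List[str]
    single_line_logs_per_sec: int
    multi_line_logs_per_sec: int


def check_plan(plan: CollectionPlan) -> List[str]:
    """Return a human-readable problem for each constraint the plan exceeds."""
    problems = []
    if plan.rule_count > MAX_RULES_PER_CLUSTER:
        problems.append(
            f"{plan.rule_count} rules exceed the limit of {MAX_RULES_PER_CLUSTER}")
    for path in plan.log_paths:
        if path.lower().endswith(UNSUPPORTED_SUFFIXES):
            problems.append(f"{path} is an archive type that cannot be collected")
    if plan.single_line_logs_per_sec > MAX_SINGLE_LINE_LOGS_PER_SEC:
        problems.append("single-line log rate exceeds 10,000 logs per second")
    if plan.multi_line_logs_per_sec > MAX_MULTI_LINE_LOGS_PER_SEC:
        problems.append("multi-line log rate exceeds 2,000 logs per second")
    return problems


if __name__ == "__main__":
    plan = CollectionPlan(12, ["/var/log/app/app.log"], 4_000, 300)
    print(check_plan(plan) or "plan fits within the documented limits")
```

A plan that sits close to the per-second limits is still at risk of the log loss described under Log Collection Reliability, so it is worth leaving headroom rather than treating the limits as targets.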

Permissions

The fluent-bit component reads and collects the standard output logs and container file logs based on the collection configuration.

The following permissions are required for running the fluent-bit component:

  • CAP_DAC_OVERRIDE: bypasses the discretionary access control (DAC) restrictions on files.
  • CAP_FOWNER: bypasses permission checks that normally require the process user ID to match the file owner ID.
  • CAP_DAC_READ_SEARCH: bypasses the DAC restrictions on reading files and on reading and searching directories.
  • CAP_SYS_PTRACE: allows any process to be traced.
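
To confirm that a running process actually holds the four Linux capabilities listed above, its effective capability mask can be decoded from the CapEff field of /proc/<pid>/status. The following is a minimal sketch for a Linux node. The bit positions are those defined in the Linux kernel's capability header, and the function names are hypothetical helpers introduced for illustration.

```python
from typing import List

# Bit positions as defined in the Linux kernel header linux/capability.h.
CAPABILITY_BITS = {
    "CAP_DAC_OVERRIDE": 1,
    "CAP_DAC_READ_SEARCH": 2,
    "CAP_FOWNER": 3,
    "CAP_SYS_PTRACE": 19,
}


def missing_capabilities(cap_mask_hex: str) -> List[str]:
    """Return the capabilities from CAPABILITY_BITS that are absent in the mask."""
    mask = int(cap_mask_hex, 16)
    return [name for name, bit in CAPABILITY_BITS.items()
            if not mask & (1 << bit)]


def effective_capability_mask(pid: str = "self") -> str:
    """Read the hexadecimal CapEff value of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError(f"CapEff not found for pid {pid}")
```

For example, missing_capabilities(effective_capability_mask("1234")) returns an empty list when the process with the hypothetical PID 1234 holds all four capabilities.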

Installing the Add-on

  1. Log in to the CCE console and click the cluster name to access the cluster console.
  2. In the navigation pane, choose Add-ons. On the right of the displayed page, find the Cloud Native Log Collection add-on and click Install.
  3. On the Install Add-on page, configure the specifications.

    Table 1 Add-on specifications

    | Parameter  | Description |
    |------------|-------------|
    | Pods       | Number of pods that will be created to match the selected add-on specifications. If you select Custom, you can adjust the number of pods as required. |
    | Containers | The add-on contains the following container components, whose specifications can be adjusted as required: log-operator (parses and updates log rules); otel-collector (forwards logs collected by fluent-bit to LTS). |

  4. Click Install.

Components

Table 2 log-agent components

| Component      | Description | Resource Type |
|----------------|-------------|---------------|
| fluent-bit     | A lightweight log collector and forwarder for collecting logs. | Pod |
| log-operator   | Used to generate internal configuration files | Deployment |
| otel-collector | Used to collect logs from applications and services and report the logs to LTS | Deployment |