Updated on 2024-07-18 GMT+08:00

Data Collection Overview

SecMaster is also a log analysis system. Its data sources include basic security log data on the cloud and security log data of your products. You can connect security logs of your own products to SecMaster for log analysis, or transfer security logs on the cloud to your own storage and other related services.

Data collection is the process during which logs of many types are collected through Logstash. After data is collected, historical data analysis and comparison, data association analysis, and unknown threat discovery can be quickly implemented.

Figure 1 Data Collection

Data Collection Principles

The basic principle of data collection is as follows: SecMaster uses a component controller (isap-agent) installed on your ECSs to manage the collection component Logstash, and Logstash transfers security data within your organization or between you and SecMaster.

Figure 2 Functional architecture of the collection system

Description

  • Collector: custom Logstash. A collector node is a custom combination of Logstash and the component controller (isap-agent).
  • Node: If you install the SecMaster component controller (isap-agent) on an ECS and use IAM to authorize SecMaster to manage the ECS, the ECS is called a node. You need to deliver the data collection engine Logstash to managed nodes on the Components page.
  • Component: A component is a custom Logstash that works as a data aggregation engine to receive and send security log data.
  • Connector: A connector is a basic element for Logstash. It defines the way Logstash receives source data and the standards it follows during the process. Each connector has a source end and a destination end. Source ends and destination ends are used for data inputs and outputs, respectively. The SecMaster pipeline is used for log data transmission between SecMaster and your devices.
  • Parser: A parser is a basic element for configuring custom Logstash. Parsers mainly work as filters in Logstash. SecMaster preconfigures various types of filters and provides them as parsers. In just a few clicks on the SecMaster console, you can use parsers to generate native scripts that set complex filters for Logstash. This way, you can convert raw logs into the format you need.
  • Collection channel: A collection channel is equivalent to a Logstash pipeline. Multiple pipelines can be configured in Logstash. Each pipeline consists of the input, filter, and output parts. Pipelines work independently and do not affect each other. You can deploy a pipeline for multiple nodes. A pipeline is considered one collection channel no matter how many nodes it is configured for.
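The relationship between connectors, parsers, and collection channels can be sketched in a few lines of Python. This is only a conceptual model of the input, filter, and output stages described above, not SecMaster's or Logstash's actual API. The class name CollectionChannel, the parser parse_key_value, the node names, and the sample log line are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]  # one log record


@dataclass
class CollectionChannel:
    """Conceptual stand-in for one Logstash pipeline: input -> filter -> output."""
    name: str
    source: Callable[[], Iterable[Event]]   # source connector (input)
    parser: Callable[[Event], Event]        # parser (filter)
    destination: Callable[[Event], None]    # destination connector (output)
    nodes: List[str] = field(default_factory=list)  # nodes the channel is deployed to

    def run_once(self) -> int:
        """Pull every available event through the three stages; return the count."""
        count = 0
        for event in self.source():
            self.destination(self.parser(event))
            count += 1
        return count


def parse_key_value(event: Event) -> Event:
    """Hypothetical parser: split a raw 'k=v k=v' message into separate fields."""
    fields = dict(pair.split("=", 1) for pair in event["message"].split())
    return {**event, **fields}


collected: List[Event] = []
channel = CollectionChannel(
    name="firewall-logs",
    source=lambda: [{"message": "src=10.0.0.5 action=deny"}],
    parser=parse_key_value,
    destination=collected.append,
    nodes=["node-a", "node-b"],  # still one channel, however many nodes it runs on
)
channel.run_once()
print(collected[0]["action"])  # deny
```

Each channel object holds its own input, filter, and output, which mirrors the statement that pipelines work independently of each other.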

Limitations and Constraints

  • Currently, the data collection component controller can run only on Linux ECSs that use the x86_64 or Arm64 architecture.
  • Only IAM users can be used to install the component controller and check details on the console. The IAM user needs only the minimum permissions. For details, see Preparations.

Collector Specifications

The following table describes the specifications of the ECSs that are selected as nodes in collection management.

Table 1 Collector Specifications

CPU Cores | Memory | System Disk | Data Disk | Referenced Processing Capability
4 vCPUs | 8 GB | 50 GB | 100 GB | 2000 EPS @ 1 KB; 4000 EPS @ 500 B
8 vCPUs | 16 GB | 50 GB | 100 GB | 5000 EPS @ 1 KB; 10000 EPS @ 500 B
16 vCPUs | 32 GB | 50 GB | 100 GB | 10000 EPS @ 1 KB; 20000 EPS @ 500 B
32 vCPUs | 64 GB | 50 GB | 100 GB | 20000 EPS @ 1 KB; 40000 EPS @ 500 B
64 vCPUs | 128 GB | 50 GB | 100 GB | 40000 EPS @ 1 KB; 80000 EPS @ 500 B

NOTE:

The ECS must have at least two vCPUs and 4 GB of memory. A disk of at least 100 GB must be attached as the directory disk.

The log volume a collector can process usually increases in proportion to the server specifications. You are advised to choose the ECS specifications for your expected log volume based on the table. If a single collector is under heavy pressure, you can deploy multiple collectors and manage them in a unified manner through collection channels. This distributes the log forwarding load across collectors.

Before installing the component controller, you are advised to mount a disk and use the disk partitioning script to allocate the disk. To ensure the installation and running of Logstash, the directory partition must have more than 100 GB of free space.
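The sizing guidance above can be turned into a rough calculation. The following Python sketch copies the reference figures from Table 1 and picks the smallest specification that covers a target event rate, or estimates how many collectors of a given specification are needed. The function names are hypothetical. It assumes that events larger than 500 B should be sized with the 1 KB column and that load is spread evenly across collectors; the table gives reference values only, so treat the result as a starting point.

```python
import math

# (vCPUs, memory in GB, EPS at 1 KB events, EPS at 500 B events), copied from Table 1.
SPECS = [
    (4, 8, 2000, 4000),
    (8, 16, 5000, 10000),
    (16, 32, 10000, 20000),
    (32, 64, 20000, 40000),
    (64, 128, 40000, 80000),
]


def reference_eps(spec, event_bytes):
    """Return the reference EPS for a spec.

    Assumption: events of up to 500 B use the 500 B column, larger ones the 1 KB column.
    """
    _, _, eps_1kb, eps_500b = spec
    return eps_500b if event_bytes <= 500 else eps_1kb


def smallest_spec(required_eps, event_bytes):
    """Return the smallest spec whose reference EPS covers the requirement, or None."""
    for spec in SPECS:
        if reference_eps(spec, event_bytes) >= required_eps:
            return spec
    return None


def collectors_needed(required_eps, event_bytes, spec):
    """Estimate how many collectors of one spec are needed, assuming even load."""
    return math.ceil(required_eps / reference_eps(spec, event_bytes))


print(smallest_spec(7000, 1000))                   # (16, 32, 10000, 20000)
print(collectors_needed(100000, 1000, SPECS[-1]))  # 3
```

When smallest_spec returns None, no single specification in the table covers the load, which is the case where deploying multiple collectors applies.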

Log Source Limit

You can add as many log sources as you need to the collectors, as long as your cloud resources can accommodate those logs. You can scale cloud resources anytime to meet your needs.

Data Collection Process

Figure 3 Data collection process
Table 2 Description of the data collection process

No. | Step | Description
1 | Managing Nodes | Select or purchase an ECS and install the component controller on the ECS to complete node management.
2 | Installing Components | Install the data collection engine Logstash on the Components tab to complete component installation.
3 | Configuring Connectors | Configure the source and destination connectors. Select a connector as required and set parameters.
4 | (Optional) Configuring a Parser | Configure codeless parsers on the console based on your needs.
5 | Configuring a Collection Channel | Configure a collection channel, associate it with a node, and deliver the Logstash pipeline configuration to complete the data collection configuration.
6 | Verifying the Collection Result | After the collection channel is configured, check whether data is collected. If logs are sent to the SecMaster pipeline, you can query the result on the SecMaster Security Analysis page.

Data Collection Configuration Removal Process

Figure 4 Data collection configuration removal process
Table 3 Description of the data collection configuration removal process

No. | Step | Description
1 | Deleting a collection channel | On the Collection Channels page, stop and delete the Logstash pipeline configuration. Note: All collection channels on related nodes must be stopped and deleted first.
2 | (Optional) Deleting a parser | If a parser is configured, delete it on the Parsers tab.
3 | (Optional) Deleting a data connection | If a data connection is added, delete the source and destination connectors on the Connections tab.
4 | Removing a component | Delete the collection engine Logstash installed on the node and remove the component.
5 | Deregistering a node | Remove the component controller to complete node deregistration. Note: Deregistering a node does not delete the ECS and endpoint resources. If the data collection function is no longer used, you need to manually release the resources. For details, see and .