ALM-38012 Number of Broker Partitions Exceeds the Threshold
Alarm Description
The system checks the number of partitions on each Broker instance of the Kafka service every 30 seconds. You can view the number of partitions on the Broker instance dashboard page. This alarm is generated when the number of partitions on a Broker instance exceeds the threshold. You can choose O&M > Alarm > Thresholds > Kafka and change the threshold. This alarm is cleared when the number of partitions is less than or equal to the threshold.
This alarm applies only to MRS 3.5.0 or later.
Alarm Attributes
Alarm ID | Alarm Severity | Auto Cleared |
---|---|---|
38012 | Critical (default threshold: 6000); Major (default threshold: 3000) | Yes |
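The threshold rule can be illustrated with a minimal sketch. It uses the default thresholds from the table above and the rule from the alarm description (the alarm is generated when the count exceeds the threshold and cleared when the count is less than or equal to it). The function name `classify_partition_count` is hypothetical, and the assumption that the higher severity takes precedence is not stated in this document; this is not how Manager itself is implemented.

```python
from typing import Optional

# Default thresholds from the Alarm Attributes table. Both can be changed
# under O&M > Alarm > Thresholds > Kafka.
MAJOR_THRESHOLD = 3000
CRITICAL_THRESHOLD = 6000


def classify_partition_count(
    partitions: int,
    major: int = MAJOR_THRESHOLD,
    critical: int = CRITICAL_THRESHOLD,
) -> Optional[str]:
    """Return the alarm severity for one Broker instance, or None if no alarm applies.

    Hypothetical helper for illustration only.
    """
    # The alarm applies only when the count strictly exceeds a threshold.
    # Assumption: the higher severity wins when both thresholds are exceeded.
    if partitions > critical:
        return "Critical"
    if partitions > major:
        return "Major"
    return None


if __name__ == "__main__":
    for count in (2500, 3000, 3001, 6000, 6001):
        print(count, classify_partition_count(count))
```

A count equal to a threshold does not trigger that severity, which matches the clearing condition in the alarm description.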
Alarm Parameters
Type | Parameter | Description |
---|---|---|
Location Information | Source | Specifies the cluster for which the alarm was generated. |
Location Information | ServiceName | Specifies the service for which the alarm was generated. |
Location Information | RoleName | Specifies the role for which the alarm was generated. |
Location Information | HostName | Specifies the host for which the alarm was generated. |
Impact on the System
The number of partitions on a Broker instance exceeds the threshold. Too many partitions increase the Broker load and cause bottlenecks in memory, disk I/O, and CPU resources. As a result, requests are answered slowly or even time out.
Possible Causes
- Broker partitions are unevenly distributed, or the Kafka cluster usage exceeds the specifications.
- There are many useless topics.
Handling Procedure
Check whether partitions are evenly distributed on Broker.
1. Log in to FusionInsight Manager and choose O&M > Alarm > Alarms. In the Location field of the alarm details, view the service instance and host for which the alarm was generated.
2. Choose Cluster > Services > Kafka > Chart, select Partition in the Chart Category area, enlarge the Number of Partitions-All Instances chart using the zoom icon in its upper right corner, and click Distribution to check whether partitions are evenly distributed across the Broker instances.
Figure 1 Example of uneven partition distribution on Broker
3. If the partitions are evenly distributed, the Kafka cluster usage exceeds the specifications. In this case, add Broker instances and then go to 5.
On FusionInsight Manager, choose Cluster > Services > Kafka > Instances, click Add Instance, and add Broker instances as prompted.
4. If the distribution is uneven, click the rightmost bar of the distribution chart. If only the Broker node for which the alarm was generated has too many partitions, perform data balancing.
5. Wait 5 minutes and check whether the alarm is automatically cleared.
- If yes, no further action is required.
- If no, go to 6.
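The decision made in this check (balanced distribution means add Brokers, skewed distribution means rebalance) can be sketched offline once the per-Broker partition counts are known. The sketch below is illustrative only: the function names, the sample Broker names and counts, and the 20% tolerance are all assumptions, because this document defines no numeric criterion for "evenly distributed".

```python
from statistics import mean
from typing import Dict, List


def find_overloaded_brokers(counts: Dict[str, int], tolerance: float = 0.2) -> List[str]:
    """Return Brokers whose partition count exceeds the mean by more than `tolerance`.

    `tolerance` is an arbitrary assumption (20% by default), not a documented value.
    """
    if not counts:
        return []
    limit = mean(counts.values()) * (1 + tolerance)
    return sorted(broker for broker, n in counts.items() if n > limit)


def brokers_needed(total_partitions: int, threshold: int) -> int:
    """Smallest Broker count that keeps the per-Broker average at or below `threshold`."""
    # Integer ceiling division avoids floating-point rounding.
    return (total_partitions + threshold - 1) // threshold


if __name__ == "__main__":
    # Hypothetical counts, as they might be read from the distribution chart.
    skewed = {"broker-1": 3400, "broker-2": 1200, "broker-3": 1100}
    print(find_overloaded_brokers(skewed))    # candidates for data balancing
    balanced = {"broker-1": 3100, "broker-2": 3050, "broker-3": 3150}
    print(find_overloaded_brokers(balanced))  # empty list: consider adding Brokers
    print(brokers_needed(sum(balanced.values()), 3000))
```

An empty result from `find_overloaded_brokers` corresponds to the balanced case, where adding Broker instances is the suggested action; a non-empty result corresponds to the case where data balancing is the suggested action.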
Check whether there are many useless topics.
6. Check whether useless topics exist in the cluster.
- If yes, delete them as follows. Deleting topics is a high-risk operation. Before deleting a topic, ensure that it is no longer used.
  a. Log in to Manager as a user who has the permission to access the Kafka web UI.
  b. Choose Cluster > Services > Kafka. On the right of KafkaManager Web UI, click the URL to access the Kafka web UI.
  c. Choose Topics.
  d. In the Operation column of the target topic, click Action and select Delete.
  e. In the dialog box that is displayed, click OK.
  The default built-in topics cannot be deleted.
- If no, go to 8.
7. Wait 5 minutes and check whether the alarm is automatically cleared.
- If yes, no further action is required.
- If no, go to 8.
Collect fault information.
8. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
9. Expand the Service drop-down list and select Kafka for the target cluster.
10. Click the edit icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
11. Contact O&M engineers and provide the collected logs.
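The log collection window described above is simple arithmetic on the alarm generation time. A minimal sketch follows; the function name `log_collection_window` and the sample timestamp are hypothetical, and the values are still entered manually in the Start Date and End Date fields.

```python
from datetime import datetime, timedelta
from typing import Tuple


def log_collection_window(
    alarm_time: datetime, margin_minutes: int = 10
) -> Tuple[datetime, datetime]:
    """Return (start, end) covering `margin_minutes` before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Hypothetical alarm generation time, copied from the alarm details.
    start, end = log_collection_window(datetime(2024, 1, 1, 12, 0, 0))
    print("Start Date:", start.isoformat(sep=" "))
    print("End Date:  ", end.isoformat(sep=" "))
```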
Alarm Clearance
This alarm is automatically cleared after the fault is rectified.
Related Information
None.