ALM-38007 Status of Kafka Default User Is Abnormal

Alarm Description

The system checks the default user of Kafka every 60 seconds. This alarm is generated when the system detects that the user status is abnormal.

Trigger Count is set to 1. This alarm is cleared when the user status becomes normal.

Alarm Attributes

Alarm ID	Alarm Severity	Alarm Type	Service Type	Auto Cleared
38007	Critical	Quality of service	Kafka	Yes

Alarm Parameters

Type	Parameter	Description
Location Information	Source	Specifies the cluster for which the alarm is generated.
	ServiceName	Specifies the service for which the alarm is generated.
	RoleName	Specifies the role for which the alarm is generated.
	HostName	Specifies the host name for which the alarm is generated.
Additional Information	Trigger Condition	Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.

Impact on the System

If the status of the Kafka default user is abnormal, metadata synchronization between Brokers and the interaction between Kafka and ZooKeeper are affected. This in turn affects production and consumption as well as topic creation and deletion.

Possible Causes

The Sssd service is abnormal.
Some Broker instances stop running.

Handling Procedure

Check whether the Sssd service is abnormal.

1. On the FusionInsight Manager portal, choose O&M > Alarm > Alarms > Status of Kafka Default User Is Abnormal > Location to check the host name of the instance for which the alarm is generated.
2. Log in to the host identified in the alarm information.
3. Run the id -Gn kafka command and check whether "No such user" is displayed in the command output.
- If yes, record the host name of the node and go to 4.
- If no, go to 6.
4. On the FusionInsight Manager home page, choose O&M > Alarm > Alarms. Check whether the Sssd Service Exception alarm is reported. If it is, handle that alarm based on its alarm information.
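The check in step 3 can be sketched as a small script. This is only an illustration: the function name check_default_user is hypothetical and not part of any product tooling, and the user name kafka is taken from the command above. The match is case-insensitive because the exact wording of the error can differ between id implementations.

```shell
#!/usr/bin/env bash
# Sketch of step 3: decide whether the OS can resolve the Kafka default user.
# Assumption: the default user is named "kafka", as in the id -Gn kafka command.

# check_default_user is a hypothetical helper name.
# Prints "present" when the user's groups resolve, "missing" when id reports
# that there is no such user, and "unknown" for any other failure.
check_default_user() {
  local user="$1" output
  # id writes its error to stderr, so capture both streams.
  # LC_ALL=C keeps the error message in English for the match below.
  if output="$(LC_ALL=C id -Gn "$user" 2>&1)"; then
    echo "present"
    return 0
  fi
  if printf '%s\n' "$output" | grep -qi "no such user"; then
    echo "missing"
    return 1
  fi
  echo "unknown"
  return 2
}

if status="$(check_default_user kafka)"; then
  echo "kafka user resolves on $(uname -n): $status"
else
  echo "kafka user problem on $(uname -n): $status"
fi
```

A result of "missing" corresponds to the "If yes" branch of step 3, and "present" corresponds to the "If no" branch.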

Check the running status of the Broker instance.

5. On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > Kafka > Instance. The Kafka instance page is displayed.
6. Check whether any Broker instance is stopped.
- If yes, go to 7.
- If no, go to 8.
7. Select all stopped Broker instances and click Start Instance.
8. Check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 9.

Collect fault information.

9. On FusionInsight Manager, choose O&M > Log > Download.
10. In the Service area, select Kafka in the required cluster.
11. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
12. Contact the O&M engineers and send the collected logs.
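The Start Date and End Date in step 11 can be worked out with a short script, sketched below. It assumes GNU date (the -d option), the function name log_window is hypothetical, and the alarm time shown is a made-up example value rather than one taken from a real alarm.

```shell
#!/usr/bin/env bash
# Sketch: compute the log collection window around an alarm generation time.
# Assumption: GNU date is available (the -d option is not portable to all systems).

# log_window is a hypothetical helper name.
# Prints two lines: the start time (10 minutes before the alarm)
# and the end time (10 minutes after the alarm).
log_window() {
  local alarm_time="$1" margin_seconds=600 epoch
  # Convert to epoch seconds first so the arithmetic is unambiguous.
  epoch="$(date -d "$alarm_time" +%s)" || return 1
  date -d "@$((epoch - margin_seconds))" '+%Y-%m-%d %H:%M:%S'
  date -d "@$((epoch + margin_seconds))" '+%Y-%m-%d %H:%M:%S'
}

# Example alarm generation time (illustrative only).
log_window "2024-01-01 12:00:00"
```

For the example value, the script prints 2024-01-01 11:50:00 as the start and 2024-01-01 12:10:00 as the end.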