ALM-12186 CGroup Task Usage Exceeds the Threshold
Alarm Description
The system checks the CGroup task usage of user omm every 5 minutes. This alarm is generated when the CGroup task usage exceeds 90%. This alarm is cleared when the CGroup task usage is less than or equal to 90%.
CGroup task usage = Number of used CGroup tasks/Maximum number of CGroup tasks
You can run the systemctl status user-$(id -u).slice | grep limit | awk -F ' ' '{print $2}' command as user omm to obtain the number of used CGroup tasks of this user and run the echo $(systemctl status user-$(id -u).slice | grep limit | awk -F ' ' '{print $4}') | sed -e 's/)//g' command to obtain the maximum number of CGroup tasks allowed for this user.
Alarm Attributes
Alarm ID |
Alarm Severity |
Auto Cleared |
---|---|---|
12186 |
Major |
Yes |
Alarm Parameters
Parameter |
Description |
---|---|
Source |
Specifies the cluster or system for which the alarm is generated. |
ServiceName |
Specifies the service for which the alarm is generated. |
RoleName |
Specifies the role for which the alarm is generated. |
HostName |
Specifies the host for which the alarm is generated. |
Impact on the System
- Failed to switch to user omm.
- Failed to create new omm processes.
- A faulty service or process cannot be restarted.
Possible Causes
The CGroup task usage exceeds 90%.
Handling Procedure
Check the maximum number of threads that can be concurrently opened by user omm is properly set.
- Log in to FusionInsight Manager and choose O&M > Alarm > Alarms. On the page that is displayed, click in the row containing the alarm, and view the name of the host for which the alarm is generated in Location. Click the host name to view its IP address.
- Log in to the host for which the alarm is generated as user omm.
- Run the following command to obtain the maximum number of threads that can be concurrently opened by user omm and check whether this number is greater than or equal to 60000:
systemctl status user-$(id -u).slice | grep limit
- Switch to user root and run the following command to change the value for user omm to 60000:
systemctl set-property user-2000.slice TasksMax=60000
- Change the value of UserTasksMax in the /etc/systemd/logind.conf file to 60000. (If the parameter is commented out, uncomment it.) Save the file, wait 5 minutes, and check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 6.
Collect fault information.
- On FusionInsight Manager of the cluster, choose O&M. In the navigation pane on the left, choose Log > Download.
- Expand the Service drop-down list, select OmmServer and NodeAgent for the target cluster, and click OK.
- Click in the upper right corner, and set Start Date and End Date for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click Download.
- Contact O&M personnel and provide the collected logs.
Alarm Clearance
This alarm is automatically cleared after the fault is rectified.
Related Information
None.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot