ALM-43617 Number of Waiting Queues for Real-Time Data Import to GraphBase Exceeds the Threshold
Alarm Description
Every 30 seconds, the system checks whether the number of waiting queues for real-time data import to GraphBase exceeds the threshold. This alarm is generated when the number of waiting queues exceeds the threshold.
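The check described above can be sketched as follows. This is only an illustration, not the product's implementation: the function names are hypothetical, the default of 100 is the value given under Possible Causes, and treating "exceeds" as strictly greater than the threshold is an assumption.

```python
DEFAULT_THRESHOLD = 100      # default threshold stated in this document
CHECK_INTERVAL_SECONDS = 30  # check period stated in this document


def exceeds_threshold(waiting_queues: int, threshold: int = DEFAULT_THRESHOLD) -> bool:
    """Return True when the waiting-queue count is above the threshold.

    Assumption: "exceeds" means strictly greater than the threshold.
    """
    return waiting_queues > threshold


def evaluate_samples(samples, threshold: int = DEFAULT_THRESHOLD):
    """Map each periodic sample to an alarm state, "raised" or "cleared"."""
    return ["raised" if exceeds_threshold(s, threshold) else "cleared" for s in samples]
```

For example, `evaluate_samples([80, 100, 101, 90])` yields `["cleared", "cleared", "raised", "cleared"]`, which mirrors the automatic clearing behavior once the count drops back to or below the threshold.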
Alarm Attributes
Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
---|---|---|---|---|
43617 | Minor | Quality of service | GraphBase | Yes |
Alarm Parameters
Type | Parameter | Description |
---|---|---|
Location Information | Source | Specifies the cluster for which the alarm is generated. |
Location Information | ServiceName | Specifies the service for which the alarm is generated. |
Location Information | RoleName | Specifies the role for which the alarm is generated. |
Location Information | HostName | Specifies the host for which the alarm is generated. |
Additional Information | TaskId | Specifies the ID of a Yarn job. |
Impact on the System
- Real-time data import is blocked and takes longer to complete.
- The real-time data import may be suspended.
Possible Causes
- The imported data size is too large and the configuration for data import is improper.
- The threshold is too low. The default value is 100.
Handling Procedure
Check whether the number of waiting queues for real-time data import exceeds the threshold.
- On the FusionInsight Manager homepage, choose Cluster > Name of the desired cluster > Service > Yarn > ResourceManager(Active). On the Yarn web UI, check the GraphBase-related SparkStreaming Yarn jobs.
By default, the admin user does not have the permissions to manage other components. If the page cannot be opened or the displayed content is incomplete when you access the native UI of a component due to insufficient permissions, you can manually create a user with the permissions to manage that component.
- Click ApplicationMaster on the Yarn job page to go to the Spark job page.
- Click Streaming to view the status of the suspended queues.
- Stop the client from submitting other real-time import Yarn tasks. After a period of time, check whether the alarm is automatically cleared. If the alarm is cleared, the blocked queue has recovered.
- If the queues are still suspended, download fault logs to analyze the cause. On FusionInsight Manager, choose O&M > Alarm > Threshold Configuration > Name of the desired cluster > GraphBase > Threshold to set the threshold of graphStreaming Real-time import waiting queue.
- Wait for one minute and check whether the alarm is cleared.
- If yes, no further action is required.
- If no, collect the fault information as described in the following steps.
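The first step of the procedure above (locating the GraphBase-related SparkStreaming jobs on Yarn) could also be approximated with the standard Yarn ResourceManager REST API. The sketch below only builds the query URL and filters an already parsed JSON response. The helper names, the ResourceManager address, and the job-name keyword are hypothetical, and how the request is authenticated on a secured cluster is outside this sketch.

```python
from urllib.parse import urlencode


def build_apps_url(rm_address: str) -> str:
    """Build the ResourceManager REST URL that lists running Spark applications."""
    query = urlencode({"states": "RUNNING", "applicationTypes": "SPARK"})
    return f"{rm_address.rstrip('/')}/ws/v1/cluster/apps?{query}"


def find_import_jobs(apps_response: dict, keyword: str) -> list:
    """Return id, name, and tracking URL of apps whose name contains the keyword.

    `keyword` is an assumption: use whatever naming your GraphBase
    real-time import jobs actually follow.
    """
    # The "apps" value may be null when no application matches the query.
    apps = (apps_response.get("apps") or {}).get("app") or []
    return [
        {"id": a.get("id"), "name": a.get("name"), "trackingUrl": a.get("trackingUrl")}
        for a in apps
        if keyword.lower() in (a.get("name") or "").lower()
    ]
```

The `trackingUrl` of a matching job points to the same ApplicationMaster page that the procedure opens manually, where the Streaming tab shows the queue status.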
Collect the fault information.
- On the FusionInsight Manager homepage, choose O&M > Log > Download.
- Expand the Service drop-down list, and select GraphBase for the target cluster.
- Click the icon in the upper right corner, and set Start Date and End Date for log collection to 30 minutes before and 30 minutes after the alarm generation time, respectively. Then, click Download.
- Contact O&M engineers and provide the collected logs.
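The log collection window used in the steps above can be computed as follows. This is a minimal sketch; the function name is hypothetical and the alarm generation time is assumed to be available as a `datetime` value.

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 30):
    """Return (start, end) covering margin_minutes before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Example: an alarm generated at 00:10 on 1 January 2024.
start, end = log_window(datetime(2024, 1, 1, 0, 10))
print(start, end)
```

The resulting values are what you would enter as Start Date and End Date on the log download page.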
Alarm Clearance
After the fault is rectified, the system automatically clears this alarm. No manual operation is required.
Related Information
None.