ALM-12110 Failed to get ECS temporary AK/SK
Alarm Description
The meta component calls the ECS API every 5 minutes to obtain AK/SK information and caches the result. Before the AK/SK expires, the component calls the API again to update it. This alarm is generated when the component fails to call the API three consecutive times.
This alarm is cleared when the meta component successfully calls the ECS API.
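The trigger and clear conditions described above can be sketched as a small counter. This is an illustration only, not the meta component's actual code: the function name `record_call_result` and the variables are hypothetical, and the threshold of 3 comes from the description above.

```shell
# Hypothetical sketch of the alarm logic; not the meta component's real code.
FAILURE_THRESHOLD=3        # three consecutive failures raise the alarm
consecutive_failures=0
alarm_active=0

# Record the outcome of one ECS API call: "ok" or "fail".
record_call_result() {
    if [ "$1" = "ok" ]; then
        # Any successful call resets the counter and clears the alarm.
        consecutive_failures=0
        alarm_active=0
    else
        consecutive_failures=$((consecutive_failures + 1))
        if [ "$consecutive_failures" -ge "$FAILURE_THRESHOLD" ]; then
            alarm_active=1
        fi
    fi
}
```

In this sketch, two failures followed by a success never raise the alarm, because the success resets the counter before the threshold is reached.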
Alarm Attributes
Alarm ID | Alarm Severity | Auto Cleared |
---|---|---|
12110 | Major | Yes |
Alarm Parameters
Parameter | Description |
---|---|
Source | Specifies the cluster for which the alarm was generated. |
ServiceName | Specifies the service for which the alarm was generated. |
RoleName | Specifies the role for which the alarm was generated. |
HostName | Specifies the host for which the alarm was generated. |
Impact on the System
The cluster cannot obtain the latest temporary AK/SK. For a storage-compute decoupled system, OBS files may fail to be accessed. As a result, upper-layer component services cannot process data.
Possible Causes
- The meta role of the MRS cluster is abnormal.
- The cluster was bound to an agency and accessed OBS, but was later unbound from the agency.
Handling Procedure
Check the status of the meta role.
1. On FusionInsight Manager of the cluster, choose O&M > Alarm > Alarms. On the page that is displayed, click the icon in the row containing the alarm, and determine the IP address of the host for which the alarm is generated.
2. Choose Cluster > Services > meta. On the page that is displayed, click the Instances tab, and check whether the meta role on the host for which the alarm is generated is normal.
3. Select the abnormal role, click More, and select Restart Instance to restart the abnormal meta role.
   Note: Services may be affected or interrupted during the restart. You are advised to perform the restart during off-peak hours.
4. After the role is restarted, check whether the alarm is cleared.
   - If yes, no further action is required.
   - If no, go to Step 5.
5. Log in as user root to the host whose IP address was obtained in Step 1 and check whether the /var/log/Bigdata/meta/mrs-meta.log file contains error information. If yes, rectify the fault based on the log information.
   cat /var/log/Bigdata/meta/mrs-meta.log
6. Check whether the alarm is cleared.
   - If yes, no further action is required.
   - If no, go to Step 7.
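Because mrs-meta.log can be long, it may help to filter it rather than print the whole file during the log check above. The following is a minimal sketch: the helper name `recent_errors` is hypothetical, and the keywords `error` and `exception` are an assumption about how failures appear in the log, not a documented log format.

```shell
# Hypothetical helper: print the last N lines of a log file that mention
# "error" or "exception" (case-insensitive). The keywords are an assumption.
recent_errors() {
    log_file="$1"
    max_lines="${2:-50}"    # default to the 50 most recent matches
    grep -iE 'error|exception' "$log_file" | tail -n "$max_lines"
}

# Example invocation on the host that reported the alarm:
# recent_errors /var/log/Bigdata/meta/mrs-meta.log 50
```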
Rebind an IAM agency to the cluster.
7. Log in to the MRS management console.
8. In the navigation pane on the left, choose Active Clusters. On the page that is displayed, click the cluster name to go to its overview page. Then, in the O&M management area, check whether the cluster is bound to an IAM agency.
9. Click Select Agency. On the page that is displayed, rebind to the cluster an IAM agency that has permissions to access OBS. A few minutes later, check whether the alarm is cleared.
   - If yes, no further action is required.
   - If no, go to Step 10.
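After rebinding the agency, one way to confirm from a cluster node that temporary credentials can be obtained again is to query the ECS metadata service directly. The sketch below rests on assumptions: the endpoint URL and the `access` and `secret` response fields are taken from the general ECS metadata interface and are not confirmed by this page as what the meta component uses. The function names are hypothetical.

```shell
# Assumption: the ECS metadata endpoint for temporary AK/SK. Override
# SECURITYKEY_URL if your environment uses a different address.
SECURITYKEY_URL="${SECURITYKEY_URL:-http://169.254.169.254/openstack/latest/securitykey}"

# Hypothetical helper: read a response body on stdin and succeed only if it
# contains both an "access" and a "secret" field (assumed field names).
has_temporary_credential() {
    body=$(cat)
    printf '%s' "$body" | grep -q '"access"' &&
        printf '%s' "$body" | grep -q '"secret"'
}

# Hypothetical helper: query the endpoint and report the result.
check_securitykey() {
    if curl -sf --max-time 5 "$SECURITYKEY_URL" | has_temporary_credential; then
        echo "temporary AK/SK available"
    else
        echo "temporary AK/SK not available"
        return 1
    fi
}
```

Run `check_securitykey` on a cluster node. A failure here while the meta role is healthy would suggest that the agency binding, rather than the role, is the cause.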
Collect fault information.
10. On FusionInsight Manager of the active cluster, choose O&M. In the navigation pane on the left, choose Log > Download.
11. Expand the Service drop-down list, select meta for the target cluster, and click OK.
12. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
13. Contact O&M personnel and provide the collected logs.
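The Start Date and End Date for log collection can be computed from the alarm generation time instead of being worked out by hand. This is a minimal sketch that assumes GNU `date` (for the `-d` option) is available; the function name `log_window` is hypothetical.

```shell
# Hypothetical helper: print the log collection window, 10 minutes before
# and 10 minutes after the alarm generation time. Requires GNU date.
log_window() {
    alarm_time="$1"         # for example "2024-05-01 12:00:00"
    margin_seconds=600      # 10 minutes
    alarm_epoch=$(date -d "$alarm_time" +%s) || return 1
    echo "Start Date: $(date -d "@$((alarm_epoch - margin_seconds))" '+%Y-%m-%d %H:%M:%S')"
    echo "End Date: $(date -d "@$((alarm_epoch + margin_seconds))" '+%Y-%m-%d %H:%M:%S')"
}
```

Pass the alarm generation time shown on the Alarms page, in the host's local time zone, and enter the two printed values in the download dialog.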