ALM-12110 Failed to get ECS temporary AK/SK

Alarm Description

Meta calls the ECS API every 5 minutes to obtain the AK/SK information and caches it. Before the AK/SK expires, Meta calls the API again to update it. This alarm is generated when Meta fails to call the API three consecutive times.

This alarm is cleared when Meta successfully calls the ECS API.
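
The trigger and clear conditions described above can be sketched as a small state tracker. This is an illustrative model only, not Meta's actual implementation: the class name AkSkAlarmTracker and its fields are hypothetical, and only the threshold of three consecutive failures and the clear-on-success rule come from the alarm description.

```python
from dataclasses import dataclass

# From the alarm description: three consecutive failed API calls raise the alarm.
FAILURE_THRESHOLD = 3


@dataclass
class AkSkAlarmTracker:
    """Hypothetical model of the alarm state; not the real Meta code."""

    consecutive_failures: int = 0
    alarm_active: bool = False

    def record(self, call_succeeded: bool) -> bool:
        """Record one ECS API call result and return whether the alarm is active."""
        if call_succeeded:
            # One successful call resets the counter and clears the alarm.
            self.consecutive_failures = 0
            self.alarm_active = False
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= FAILURE_THRESHOLD:
                self.alarm_active = True
        return self.alarm_active


tracker = AkSkAlarmTracker()
# Two failures, a success, three failures, then a success.
states = [tracker.record(ok) for ok in [False, False, True, False, False, False, True]]
print(states)
# prints: [False, False, False, False, False, True, False]
```

In this model, two failures followed by a success never raise the alarm, because the success resets the counter before the threshold is reached.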

Alarm Attributes

Alarm ID	Alarm Severity	Alarm Type	Service Type	Auto Cleared
12110	Major	Quality of service	FusionInsight Manager	Yes

Alarm Parameters

Type	Parameter	Description
Location Information	Source	Specifies the cluster for which the alarm is generated.
	ServiceName	Specifies the service for which the alarm is generated.
	RoleName	Specifies the role for which the alarm is generated.
	HostName	Specifies the host for which the alarm is generated.
Additional Information	Detail	Provides additional details about the alarm.

Impact on the System

The cluster cannot obtain the latest temporary AK/SK. In the storage and compute separation scenario, access to OBS may fail. As a result, components cannot process services properly.

Possible Causes

The meta role of the MRS cluster is abnormal.
The cluster was bound to an agency and accessed OBS, but was later unbound from the agency, so it is currently not bound to any agency.

Handling Procedure

Check the status of the meta role.

1. On FusionInsight Manager of the cluster, choose O&M > Alarm > Alarms. On the page that is displayed, click in the row containing the alarm, and determine the IP address of the host for which the alarm is generated.
2. On FusionInsight Manager of the cluster, choose Cluster > Services > meta. On the page that is displayed, click the Instance tab, and check whether the meta role on the host for which the alarm is generated is normal.
- If yes, go to 5.
- If no, go to 3.
3. Select the abnormal role, click More, and select Restart Instance to restart the abnormal meta role.
4. Check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 5.
5. Log in to the host obtained in 1 and check whether the /var/log/Bigdata/meta/mrs-meta.log file contains error information. If yes, rectify the fault based on the log information.
6. Check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 7.
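
The log check on the alarm host can be sketched with a short script. This is a minimal example under stated assumptions: it assumes that error entries in mrs-meta.log contain the keyword ERROR, which the procedure does not specify, and the helper names find_error_lines and scan_log are hypothetical. Only the log path comes from the procedure.

```python
from pathlib import Path

# Log path given in the handling procedure.
META_LOG = Path("/var/log/Bigdata/meta/mrs-meta.log")


def find_error_lines(lines, keyword="ERROR"):
    """Return (line_number, text) pairs for lines that contain the keyword.

    Assumption: error entries are marked with the keyword "ERROR".
    """
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if keyword in line
    ]


def scan_log(path=META_LOG, keyword="ERROR", last=20):
    """Return up to `last` most recent matching lines, or [] if the file is missing."""
    if not path.is_file():
        return []
    with path.open(encoding="utf-8", errors="replace") as handle:
        return find_error_lines(handle, keyword)[-last:]


if __name__ == "__main__":
    for number, text in scan_log():
        print(f"{number}: {text}")
```

Run the script on the host identified in the alarm. If the log uses a different marker for errors, pass it as the keyword argument.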

Rebind the cluster to an agency.

7. Log in to the MRS management console.
8. In the navigation pane on the left, choose Clusters > Active Clusters. On the page that is displayed, click the cluster name to go to its overview page. Then, check whether the cluster is bound to an agency in the O&M management area.
- If yes, go to 10.
- If no, go to 9.
9. Click Manage Agency. On the page that is displayed, rebind the cluster to an agency. Then check whether the alarm is cleared a few minutes later.
- If yes, no further action is required.
- If no, go to 10.

Collect fault information.

10. On FusionInsight Manager of the active cluster, choose O&M. In the navigation pane on the left, choose Log > Download.
11. Expand the Service drop-down list, select meta for the target cluster, and click OK.
12. Click in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
13. Contact O&M engineers and provide the collected logs.
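
The Start Date and End Date for log collection can be computed from the alarm generation time as in the following sketch. The function name log_collection_window and the sample alarm time are hypothetical. Only the 10-minute margin comes from the procedure.

```python
from datetime import datetime, timedelta

# Margin stated in the log collection step.
WINDOW = timedelta(minutes=10)


def log_collection_window(alarm_time: datetime, margin: timedelta = WINDOW):
    """Return (start, end) covering `margin` before and after the alarm time."""
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time, used only for illustration.
alarm_time = datetime(2024, 1, 1, 12, 0, 0)
start, end = log_collection_window(alarm_time)
print(start.isoformat(sep=" "), "to", end.isoformat(sep=" "))
# prints: 2024-01-01 11:50:00 to 2024-01-01 12:10:00
```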