Updated on 2024-01-17 GMT+08:00

Managing Components and Monitoring Hosts

You can manage the following status and metrics of all components (including role instances) and hosts on the MRS console:

  • Status information: includes operation, health, configuration, and role instance status.
  • Indicator information: includes key monitoring indicators for each component.
  • Export monitoring metrics. (This function is not supported in MRS 3.x or later.)
  • For MRS 3.x or later, see Procedure.
  • You can set the interval for automatically refreshing the page, or click the refresh icon to refresh the page immediately.
  • The automatic refresh setting supports the following values:
    • Refresh every 30 seconds
    • Refresh every 60 seconds
    • Stop refreshing

Prerequisites

You have synchronized IAM users. (On the Dashboard page, click Synchronize on the right side of IAM User Sync to synchronize IAM users.)

Procedure

Manage components.

For details about how to perform operations on MRS Manager, see Managing Service Monitoring.

  1. On the MRS cluster details page, click Components.

    • Table 1 describes the service operating status.
      Table 1 Service operating status

      Status           Description
      ---------------  ------------------------------------------------------------------------
      Started          The service is started.
      Stopped          The service is stopped.
      Failed to start  Failed to start the role instance.
      Failed to stop   Failed to stop the service.
      Unknown          Indicates initial service status after the background system restarts.

    • Table 2 describes the service health status.
      Table 2 Service health status

      Status             Description
      -----------------  ------------------------------------------------------------------------
      Good               Indicates that all role instances in the service are running properly.
      Faulty             Indicates that the running status of at least one role instance is Faulty or the status of the service on which the current service depends is abnormal.
      Unknown            Indicates that all role instances in the service are in the Unknown state.
      Restoring          Indicates that the background system is restarting the service.
      Partially Healthy  Indicates that the status of the service on which the service depends is abnormal, and APIs related to the abnormal service cannot be invoked by external systems.

    • Table 3 describes the service configuration status.
      Table 3 Service configuration status

      Status                 Description
      ---------------------  ------------------------------------------------------------------------
      Synchronized           The latest configuration has taken effect.
      Configuration expired  The latest configuration has not taken effect after the parameter modification. Related services need to be restarted.
      Configuration failed   A communication error occurred, or data could not be read or written during parameter configuration. Use Synchronize Configuration to rectify the fault.
      Configuring            Parameters are being configured.
      Unknown                The configuration status cannot be obtained.

  2. Click a service in the list to view its status and metric information.
  3. Customize and view monitoring graphs.

    1. In the Charts area, click Customize to customize service monitoring metrics.
    2. In the Period area, select a time period and click View to view the monitoring data within that period.
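
The health rules in Table 2 can be read as a small decision procedure. The following Python sketch is illustrative only: the function name, the string status values, and the dependency_abnormal flag are assumptions made for this example and are not part of any MRS API. It models only the Good, Faulty, and Unknown rows. Restoring and Partially Healthy depend on system state that is not modeled here.

```python
def service_health(instance_statuses, dependency_abnormal=False):
    """Derive a service health status from role instance statuses.

    Hypothetical helper that mirrors three rows of Table 2; it is not
    an MRS API.
    """
    statuses = list(instance_statuses)
    # Faulty: at least one role instance is Faulty, or a service that
    # this service depends on is abnormal.
    if dependency_abnormal or "Faulty" in statuses:
        return "Faulty"
    # Unknown: all role instances are in the Unknown state.
    if statuses and all(s == "Unknown" for s in statuses):
        return "Unknown"
    # Good: all role instances are running properly.
    if statuses and all(s == "Good" for s in statuses):
        return "Good"
    # Mixed or empty input is not covered by Table 2; this sketch
    # reports Unknown for it (an assumption).
    return "Unknown"


print(service_health(["Good", "Good"]))
print(service_health(["Good", "Faulty"]))
print(service_health(["Good"], dependency_abnormal=True))
```

In this sketch a single Faulty instance, or an abnormal dependency, is enough to mark the whole service Faulty, which is why that check runs before the others.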

Manage role instances.

For versions earlier than MRS 3.x, see Managing Role Instances.

  1. On the MRS cluster details page, click Components. In the component list, click the specified service name.

    Figure 1 Components tab page

  2. Click Instances to view the role status.

    Figure 2 Instances tab page

    The role instance list contains the Role, Host Name, Management IP Address, Service IP Address, Rack, Running Status, and Configuration Status of each instance.

    • Table 4 shows the running status of a role instance.
      Table 4 Role instance running status

      Status           Description
      ---------------  ------------------------------------------------------------------------
      Good             Indicates that the instance is running properly.
      Bad              Indicates that the instance cannot run properly.
      Decommissioned   Indicates that the instance is out of service.
      Not started      Indicates that the instance is stopped.
      Unknown          Indicates that the initial status of the instance cannot be detected.
      Starting         Indicates that the instance is being started.
      Stopping         Indicates that the instance is being stopped.
      Restoring        Indicates that an exception may have occurred in the instance and the instance is being automatically rectified.
      Decommissioning  Indicates that the instance is being decommissioned.
      Recommissioning  Indicates that the instance is being recommissioned.
      Failed to start  Indicates that the instance failed to start.
      Failed to stop   Indicates that the instance failed to stop.

    • Table 5 shows the configuration status of a role instance.
      Table 5 Role instance configuration status

      Status                 Description
      ---------------------  ------------------------------------------------------------------------
      Synchronized           The latest configuration takes effect.
      Configuration expired  The latest configuration does not take effect after the parameter modification. Related services need to be restarted.
      Configuration failed   The communication is incorrect or data cannot be read or written during the parameter configuration. Use Synchronize Configuration to rectify the fault.
      Configuring            Parameters are being configured.
      Unknown                Current configuration status cannot be obtained.

    By default, the Role column is sorted in ascending order. You can click the sorting icon next to Role, Host Name, Management IP Address, Service IP Address, Rack, Running Status, or Configuration Status to change the sorting mode.

    You can filter the Role column to display all instances of the same role.

    You can set search criteria in the role search area by clicking Advanced Search, and click Search to view specified role information. You can click Reset to reset the search criteria. Fuzzy search is supported.

  3. Click the target role instance to view its status and metric information.
  4. Customize and view monitoring graphs.

    1. In the Charts area, click Customize to customize service monitoring metrics.
    2. In the Period area, select a time period and click View to view the monitoring data within that period.
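
The sorting, role filtering, and fuzzy search described for the role instance list can be sketched in Python as follows. This is only an illustration of the list behavior, not console code: the RoleInstance class, the helper functions, and the sample host names are hypothetical, and fuzzy search is assumed here to be a case-insensitive substring match.

```python
from dataclasses import dataclass


@dataclass
class RoleInstance:
    # A subset of the columns shown in the role instance list.
    role: str
    host_name: str
    running_status: str


def sort_instances(instances, column="role", descending=False):
    """Sort by one column; ascending by Role is the default."""
    return sorted(instances, key=lambda i: getattr(i, column), reverse=descending)


def filter_by_role(instances, role):
    """Keep only the instances of one role."""
    return [i for i in instances if i.role == role]


def fuzzy_search(instances, text):
    """Case-insensitive substring match on role or host name (assumed semantics)."""
    needle = text.lower()
    return [
        i for i in instances
        if needle in i.role.lower() or needle in i.host_name.lower()
    ]


# Hypothetical sample data.
instances = [
    RoleInstance("NameNode", "node-master1", "Good"),
    RoleInstance("DataNode", "node-core2", "Good"),
    RoleInstance("DataNode", "node-core1", "Bad"),
]

for inst in sort_instances(instances):
    print(inst.role, inst.host_name, inst.running_status)
```

Calling sort_instances(instances, "host_name", descending=True) corresponds to clicking the sorting icon next to Host Name to reverse the order.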

Manage hosts.

For versions earlier than MRS 3.x, see Managing Hosts.

  1. On the MRS cluster details page, click the Nodes tab and expand a node group to view the host status.

    The host list of a group contains the Node Name/Resource ID, IP, Status, Specifications, Disks, and AZ.

    • Table 6 shows the host operating status.
      Table 6 Host operating status

      Status    Description
      --------  ------------------------------------------------------------------------
      Normal    The host and service roles on the host are running properly.
      Isolated  The host is isolated, and the service roles on the host stop running.

    • Table 7 describes the host health status.
      Table 7 Host health status

      Status   Description
      -------  ------------------------------------------------------------------------
      Good     The host can properly send heartbeats.
      Bad      The host fails to send heartbeats due to timeout.
      Unknown  The host initial status is unknown during the operation of adding or deleting a host.

    By default, data is sorted in ascending order by node name. You can click the sorting icon to change the order.

  2. Click the target node in the list to view its status and metric information.