Updated on 2024-09-23 GMT+08:00

Checking the Running Status of an MRS Cluster Component

After creating an MRS cluster, you can check the status of each service component and its role instances on the MRS console or on Manager. This helps you detect performance issues with a component.

Prerequisites

  • IAM users have been synchronized. To synchronize them, click Synchronize next to IAM User Sync on the Dashboard page of the cluster details.
  • You have logged in to MRS Manager. For how to log in, see Accessing MRS Manager.

Viewing Component Status on the Console

  1. Log in to the MRS console.
  2. On the Active Clusters page, select a running cluster and click its name to go to the cluster details page.
  3. On the MRS cluster details page, click Components to view the service operation status, health status, and configuration status.

    Figure 1 Checking component status
    Table 1 Component status description

    Status Type | Status | Description
    ------------|--------|------------
    Operating Status | Started | The service is started.
    Operating Status | Stopped | The service is stopped.
    Operating Status | Failed to start | Failed to start the role instance.
    Operating Status | Failed to stop | Failed to stop the role instance.
    Operating Status | Unknown | Initial service status after the background system restarts.
    Health Status | Good | All role instances in the service are running properly.
    Health Status | Faulty | The running status of at least one role instance is Faulty, or the status of a service on which the current service depends is abnormal. If the running status of a service is Faulty, an alarm is generated. Rectify the fault based on the alarm information.
    Health Status | Unknown | All role instances in the service are in the Unknown state.
    Health Status | Restoring | The background system is restarting the service.
    Health Status | Partially Healthy | The status of a service on which this service depends is abnormal, and external systems cannot call the APIs related to the abnormal service. HBase, Hive, Spark, and Loader can be in the Partially Healthy state: if YARN is installed but abnormal, HBase is Partially Healthy; if HBase is installed but abnormal, Hive, Spark, and Loader are Partially Healthy.
    Configuration Status | Synchronized | The latest configuration has taken effect.
    Configuration Status | Configuration expired | The latest configuration has not taken effect after a parameter modification. Restart the related services.
    Configuration Status | Configuration failed | A communication or read/write exception occurred during parameter configuration. Use the synchronization configuration function to rectify the fault.
    Configuration Status | Configuring | Parameters are being configured.
    Configuration Status | Unknown | The current configuration status cannot be obtained.

  4. Click a component name to go to the component details page and view the detailed running information about the component.

    Figure 2 Viewing cluster component details

  5. Click Instances to view the detailed running information about each role instance in the service.

    • The list of role instances shows all the instances for each role in the cluster, including their running and configuration status, hosts, and IP addresses.
    • You can click an instance name to go to the instance details page and view the basic information, configuration file, instance logs, and monitoring metric graphs of the instance.
    Figure 3 Checking the status of cluster component instances
    Table 2 Instance status

    Status Type | Status | Description
    ------------|--------|------------
    Running Status | Good | The instance is running properly.
    Running Status | Bad | The instance cannot run properly.
    Running Status | Decommissioned | The instance is out of service.
    Running Status | Not started | The instance is stopped.
    Running Status | Unknown | The initial status of the instance cannot be detected.
    Running Status | Starting | The instance is being started.
    Running Status | Stopping | The instance is being stopped.
    Running Status | Restoring | An exception may have occurred in the instance, and the instance is being automatically recovered.
    Running Status | Decommissioning | The instance is being decommissioned.
    Running Status | Recommissioning | The instance is being recommissioned.
    Running Status | Failed to start | The instance failed to start.
    Running Status | Failed to stop | The instance failed to stop.
    Configuration Status | Synchronized | The latest configuration has taken effect.
    Configuration Status | Configuration expired | The latest configuration has not taken effect after a parameter modification. Restart the related services.
    Configuration Status | Configuration failed | A communication or read/write exception occurred during parameter configuration. Use the synchronization configuration function to rectify the fault.
    Configuration Status | Configuring | Parameters are being configured.
    Configuration Status | Unknown | The current configuration status cannot be obtained.
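
The Partially Healthy dependency rules listed in Table 1 can be written as a small lookup. The following Python sketch is illustrative only and is not part of MRS: the names PARTIAL_HEALTH_RULES and partially_healthy_services are hypothetical, and the choices not to propagate the state transitively and to exclude services that are themselves abnormal are assumptions.

```python
# Dependency rules taken from Table 1: if the key service is installed but
# abnormal, the listed services are reported as Partially Healthy.
PARTIAL_HEALTH_RULES = {
    "YARN": ["HBase"],
    "HBase": ["Hive", "Spark", "Loader"],
}


def partially_healthy_services(installed, abnormal):
    """Return the installed services expected to show Partially Healthy.

    installed: names of the services installed in the cluster.
    abnormal:  names of the services whose status is abnormal.
    """
    installed = set(installed)
    abnormal = set(abnormal)
    affected = set()
    for dependency, dependents in PARTIAL_HEALTH_RULES.items():
        if dependency in installed and dependency in abnormal:
            affected.update(name for name in dependents if name in installed)
    # Assumption: a service that is itself abnormal is not listed here.
    return sorted(affected - abnormal)


if __name__ == "__main__":
    cluster = ["YARN", "HBase", "Hive", "Spark", "Loader"]
    print(partially_healthy_services(cluster, ["HBase"]))
```

Keeping the rules in one dictionary makes it easy to compare the sketch against the table if the documented dependencies change.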

Checking the Component Status on Manager

  1. Log in to Manager and choose Cluster > Services to access the component management page.

    The service list displays the running status, configuration status, role type, and number of instances of each component.

    Figure 4 Checking the status of cluster components

    On Manager of MRS 2.x or earlier, click Services to open the component management page.

    Table 3 Manager component status description

    Status Type | Status | Description
    ------------|--------|------------
    Running Status | Normal | The component is running properly.
    Running Status | Faulty | The component cannot work properly.
    Running Status | Partially Healthy | Some enhanced functions of the component cannot work properly.
    Running Status | Not started | The component is stopped.
    Running Status | Unknown | The initial status of the component cannot be detected.
    Running Status | Starting | The component is being started.
    Running Status | Stopping | The component is being stopped.
    Running Status | Failed to start | The component failed to start.
    Running Status | Failed to stop | The component failed to stop.
    Configuration Status | Synchronized | The latest configuration has taken effect.
    Configuration Status | Expired | The latest configuration has not taken effect after a parameter modification. Restart the related services.
    Configuration Status | Failed | A communication or read/write exception occurred during parameter configuration. Use the synchronization configuration function to rectify the fault.
    Configuration Status | Configuring (Manager 2.x or earlier) / Synchronizing (Manager 3.x and later) | Parameters are being configured.
    Configuration Status | Unknown | The current configuration status cannot be obtained.

  2. Click a component name to view its details.
  3. Click Instances to view the detailed running information about each role instance in the service.

    Figure 5 Checking the status of cluster component instances
    • The list of role instances shows all the instances for each role in the cluster, including their running and configuration status, hosts, and IP addresses.
    • You can click an instance name to go to the instance details page and view the basic information, configuration file, instance logs, and monitoring metric graphs of the instance.
    Table 4 Manager instance status description (3.x and later versions)

    Status Type | Status | Description
    ------------|--------|------------
    Running Status | Normal | The instance is running properly.
    Running Status | Faulty | The instance cannot run properly.
    Running Status | Decommissioned | The instance is out of service.
    Running Status | Not started | The instance is stopped.
    Running Status | Unknown | The initial status of the instance cannot be detected.
    Running Status | Starting | The instance is being started.
    Running Status | Stopping | The instance is being stopped.
    Running Status | Restoring | An exception may have occurred in the instance, and the instance is being automatically recovered.
    Running Status | Decommissioning | The instance is being decommissioned.
    Running Status | Recommissioning | The instance is being recommissioned.
    Running Status | Failed to start | The instance failed to start.
    Running Status | Failed to stop | The instance failed to stop.
    Configuration Status | Synchronized | The latest configuration has taken effect.
    Configuration Status | Expired | The latest configuration has not taken effect after a parameter modification. Restart the related services.
    Configuration Status | Failed | A communication or read/write exception occurred during parameter configuration. Use the synchronization configuration function to rectify the fault.
    Configuration Status | Synchronizing | Parameters are being configured.
    Configuration Status | Unknown | The current configuration status cannot be obtained.

    Table 5 Manager instance status description (2.x and earlier versions)

    Status Type | Status | Description
    ------------|--------|------------
    Operating Status | Started | The role instance has been started.
    Operating Status | Stopped | The role instance has been stopped.
    Operating Status | Failed to start | Failed to start the role instance.
    Operating Status | Failed to stop | Failed to stop the role instance.
    Operating Status | Decommissioning | The role instance is being decommissioned.
    Operating Status | Decommissioned | The role instance has been decommissioned.
    Operating Status | Recommissioning | The role instance is being recommissioned.
    Operating Status | Unknown | Initial role instance status after the background system restarts.
    Health Status | Good | The role instance is running properly.
    Health Status | Restoring | The background system is restarting the role instance.
    Health Status | Bad | The role instance is abnormal. For example, a port cannot be accessed because the PID does not exist.
    Health Status | Unknown | The host where the role instance resides is not connected to the background system.
    Health Status | Partially Healthy | Part of the role instance's functionality is not running properly.
    Configuration Status | Synchronized | The latest configuration has taken effect.
    Configuration Status | Expired | The latest configuration has not taken effect after a parameter modification. Restart the related services.
    Configuration Status | Failed | A communication error occurred, or data could not be read or written, during parameter configuration. Click Synchronize Configuration to rectify the fault.
    Configuration Status | Configuring | Parameters are being configured.
    Configuration Status | Unknown | The current configuration status cannot be obtained.
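
Each configuration status in the tables above implies a follow-up action. The following Python sketch summarizes that mapping under stated assumptions: it is not an MRS API, the names ACTIONS and recommended_action are hypothetical, and the status strings are assumed to match the labels shown on the console and Manager.

```python
# Hypothetical helper: maps a configuration status shown on the console or
# Manager to the follow-up action described in the status tables.
ACTIONS = {
    "synchronized": "No action needed: the latest configuration has taken effect.",
    "expired": "Restart the related services.",
    "failed": "Synchronize the configuration to rectify the fault.",
    "configuring": "Wait: parameters are being configured.",
    "synchronizing": "Wait: parameters are being configured.",
    "unknown": "The configuration status cannot be obtained.",
}


def recommended_action(status: str) -> str:
    """Return the follow-up action for a configuration status label."""
    key = status.strip().lower()
    # The console shows "Configuration expired" and "Configuration failed",
    # while Manager shows "Expired" and "Failed"; treat them as equivalent.
    prefix = "configuration "
    if key.startswith(prefix):
        key = key[len(prefix):]
    if key not in ACTIONS:
        raise ValueError(f"unrecognized configuration status: {status!r}")
    return ACTIONS[key]


if __name__ == "__main__":
    for label in ("Synchronized", "Configuration expired", "Failed", "Synchronizing"):
        print(f"{label}: {recommended_action(label)}")
```

Raising an error for an unrecognized label, rather than returning a default, keeps a misspelled or newly introduced status from being silently treated as healthy.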