Cluster O&M
Cluster resources provided by MRS are fully owned by users, who can therefore use various methods to keep their clusters running.
Manager
MRS manages and analyzes massive amounts of data and quickly mines valuable information from structured and unstructured data. Because open-source components have complex structures and are time-consuming and labor-intensive to install, configure, and manage, Manager provides a unified, enterprise-level platform for managing big data clusters, with the following features:
- Cluster monitoring: enables you to quickly learn the running status of hosts and services.
- Graphical metric monitoring and customization: enable you to quickly obtain key information about the system.
- Service property configuration: allows you to configure service properties based on the performance requirements of your services.
- Cluster, service, and role instance operations: allow you to start or stop services and clusters with just a few clicks.
For more information about MRS Manager, see Introduction to MRS Manager.
Alarm Management
MRS can monitor big data clusters in real time and identify system health status based on alarms and events. In addition, MRS allows you to customize monitoring and alarm thresholds to focus on the health status of each metric. When monitoring data reaches the alarm threshold, the system triggers an alarm.
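The threshold rule described above can be illustrated with a minimal Python sketch. It is not MRS Manager code: the `ThresholdRule` class, the `triggered_alarms` function, and the metric names are hypothetical, and the sketch assumes an upper-bound threshold where an alarm fires once the sampled value reaches the configured limit.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """Hypothetical alarm rule; not an MRS Manager data structure."""
    metric: str
    threshold: float  # assumed to be an upper bound


def triggered_alarms(samples, rules):
    """Return the metrics whose sampled value reaches its threshold."""
    alarms = []
    for rule in rules:
        value = samples.get(rule.metric)
        # "Reaches the threshold" is modeled as greater than or equal to.
        if value is not None and value >= rule.threshold:
            alarms.append(rule.metric)
    return alarms


# Illustrative metric names and values only.
rules = [
    ThresholdRule("disk_usage_percent", 80.0),
    ThresholdRule("cpu_usage_percent", 90.0),
]
samples = {"disk_usage_percent": 85.5, "cpu_usage_percent": 42.0}
print(triggered_alarms(samples, rules))  # ['disk_usage_percent']
```

Lowering a threshold in `rules` makes the corresponding metric alarm earlier, which mirrors how customized thresholds let you focus on the metrics you care about.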
MRS can connect to Huawei Cloud Simple Message Notification (SMN) to send alarm information to users by SMS message or email. For details, see Cluster Status Notification.
Audit
All operations performed on MRS Manager are recorded so that they can be used for post-event tracing, fault locating, and determining accountability in security events.
For more information about MRS cluster audit logs, see Viewing MRS Cluster Audit Logs.
Patch Management
MRS supports cluster patching and releases patches for open-source big data components in a timely manner. On the MRS cluster management page, you can view patch release information for running clusters, including detailed descriptions of the resolved issues and their impacts. You can decide whether to install a patch based on the running status of your services. One-click patch installation requires no manual intervention. Because patches are installed in rolling mode, installation does not interrupt services, which ensures long-term cluster availability.
MRS can display the detailed patch installation process, and supports patch uninstallation and rollback.
For more information about how to install a patch for MRS clusters, see Patching an MRS Cluster.
Cluster Health Checks
MRS automatically inspects the system running environment so that you can check and audit system health in one click. This helps keep the system running properly and lowers operation and maintenance costs. After viewing the inspection results, you can export reports for archiving and fault analysis.
For more information about MRS cluster health check, see Performing a Health Check for an MRS Cluster.
Performing a Rolling Restart of Services
To apply configuration changes to a big data component, you must restart it. However, a normal restart restarts all services or instances at the same time, which may interrupt services.
To minimize or eliminate the impact on services during a component restart, you can perform rolling restarts to restart components or instances in batches. For instances in active/standby mode, the standby instance is restarted first, followed by the active instance.
A rolling restart takes longer than a normal restart.
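The batching and standby-first ordering described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MRS implementation: the `Instance` class, the role labels, the instance names, and the `batch_size` parameter are all hypothetical, and placing instances without an active/standby role after the active instances is an assumption made for the example.

```python
from dataclasses import dataclass

# Assumed role labels; standby instances are restarted before active ones.
RESTART_ORDER = ("standby", "active", "none")


@dataclass
class Instance:
    name: str
    role: str = "none"  # "standby", "active", or "none" (no HA role)


def rolling_restart_plan(instances, batch_size):
    """Group instances into restart batches, one role group at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for inst in instances:
        if inst.role not in RESTART_ORDER:
            raise ValueError(f"unknown role: {inst.role}")
    plan = []
    for role in RESTART_ORDER:
        names = [inst.name for inst in instances if inst.role == role]
        # Batching each role group separately keeps a standby instance
        # and its active peer out of the same batch.
        for start in range(0, len(names), batch_size):
            plan.append(names[start:start + batch_size])
    return plan


# Hypothetical instance names, for illustration only.
instances = [
    Instance("NameNode-1", "active"),
    Instance("NameNode-2", "standby"),
    Instance("DataNode-1"),
    Instance("DataNode-2"),
    Instance("DataNode-3"),
]
plan = rolling_restart_plan(instances, batch_size=2)
for number, batch in enumerate(plan, start=1):
    print(f"batch {number}: {batch}")
```

With these inputs the plan has four batches where a normal restart would be a single step, which also shows why a rolling restart takes longer.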
For details about how to restart MRS cluster components, see Restarting an MRS Cluster Component.
O&M Support
Cluster resources provided by MRS belong to users. As a result, O&M personnel generally cannot access a cluster directly, even when their support is needed to troubleshoot it. To improve communication efficiency during fault locating, MRS provides the following two methods:
- Log sharing: You can initiate log sharing on the MRS management console to share a specified log scope with O&M personnel, so that O&M personnel can locate faults without accessing the cluster.
- O&M authorization: If a problem occurs when you use an MRS cluster, you can initiate O&M authorization on the MRS management console. O&M personnel can help you quickly locate the problem, and you can revoke the authorization at any time.
For more information about MRS remote cluster O&M support, see Configuring Remote O&M for an MRS Cluster.