MapReduce Service
- What's New
- Function Overview
- Service Overview
- Infographics
- What Is MRS?
- Advantages
- Application Scenarios
- MRS Cluster Version Overview
- List of MRS Component Versions
- Components
- Functions
- Constraints
- Technical Support
- Billing
- Permissions Management
- Related Services
- Quota Description
- Common Concepts
- Billing
- Getting Started
- Creating and Using a Hadoop Cluster for Offline Analysis
- Creating and Using a Kafka Cluster for Stream Processing
- Creating and Using an HBase Cluster for Offline Query
- Creating and Using a ClickHouse Cluster for Columnar Store
- Creating and Using an MRS Cluster Requiring Security Authentication
- Best Practices for Beginners
- User Guide
- Preparations
- MRS Cluster Planning
- Buying MRS Clusters
- Installing an MRS Cluster Client
- Submitting an MRS Job
- Managing Clusters
- Overview
- Introduction to MRS Manager
- Accessing MRS FusionInsight Manager
- Managing an MRS Cluster
- Viewing Basic Information About an MRS Cluster
- Checking the Running Status of an MRS Cluster
- Starting and Stopping an MRS Cluster
- Restarting an MRS Cluster
- Exporting MRS Cluster Configuration Parameters
- Synchronizing the MRS Cluster Configuration
- Changing a Pay-per-Use MRS Cluster to Yearly/Monthly Billing
- Deleting an MRS Cluster
- Changing the VPC Subnet of an MRS Cluster
- Replacing the NTP Server for an MRS Cluster
- Modifying the OMS Service Configuration
- Modifying the MRS Manager Routing Table
- Managing MRS Cluster Components
- Checking the Running Status of an MRS Cluster Component
- Starting and Stopping an MRS Cluster Component
- Restarting an MRS Cluster Component
- Adding and Deleting an MRS Cluster Component
- Modifying the Configuration Parameters of an MRS Cluster Component
- Viewing the Modified Component Configuration Parameters of an MRS Cluster
- Synchronizing MRS Component Configuration Parameters
- Adding Custom MRS Component Parameters
- Managing MRS Role Instances
- Managing MRS Role Instance Groups
- Modifying MRS Role Instance Parameters
- Performing an Active/Standby Switchover for MRS Role Instances
- Decommissioning and Recommissioning an MRS Role Instance
- Enabling and Disabling Ranger Authentication for an MRS Component
- Accessing Web Pages of Open Source Components Managed in MRS Clusters
- Managing MRS Cluster Nodes
- Checking the Running Status of an MRS Cluster Node
- Starting and Stopping All Roles on an MRS Cluster Node
- Isolating an MRS Cluster Node
- Modifying the Rack Information of an MRS Cluster Node
- Scaling Up Master Node Specifications in an MRS Cluster
- Adding a Tag to an MRS Cluster/Node
- Configuring Bootstrap Actions for an MRS Cluster Node
- Managing the MRS Cluster Client
- Managing MRS Cluster Jobs
- Managing MRS Cluster Tenants
- Introduction to MRS Multi-Tenancy
- Using MRS Multi-Tenancy
- Configuring MRS Tenants
- Managing MRS Tenant Resources
- Managing the MRS Tenant Resource Directory
- Managing MRS Tenant Resource Pools
- Clearing the MRS Tenant Queue Configuration
- Restoring MRS Tenant Data After YARN Is Reinstalled
- Deleting an MRS Tenant
- Managing Global User Policies When Using Superior Scheduler
- Clearing a Tenant's Non-Associated Queues Using the Capacity Scheduler
- Switching the MRS Tenant Resource Scheduler
- Managing MRS Cluster Users
- Managing MRS Cluster Metadata
- Managing Static Service Resources in an MRS Cluster
- Managing SQL Inspection Rules for an MRS Cluster
- MRS Cluster O&M
- Cluster O&M
- Logging In to an MRS Cluster
- Viewing MRS Cluster Monitoring Metrics
- Checking MRS Cluster Health
- Adjusting the Capacity of an MRS Cluster
- MRS Cluster Data Backup and Restoration
- Backing Up and Restoring MRS Cluster Data
- Enabling MRS Inter-Cluster Replication
- Creating an MRS Cluster Data Backup Task
- Creating an MRS Cluster Data Restoration Task
- Backing Up MRS Cluster Component Data
- Backing Up Manager Data (MRS 2.x and Earlier)
- Backing Up Manager Data (MRS 3.x and Later Versions)
- Backing Up CDL Service Data
- Backing Up ClickHouse Metadata
- Backing Up ClickHouse Service Data
- Backing Up DBService Data
- Backing Up Flink Metadata
- Backing Up HBase Metadata
- Backing Up HBase Service Data
- Backing Up HDFS NameNode Data
- Backing Up HDFS Service Data
- Backing Up Hive Service Data
- Backing Up IoTDB Metadata
- Backing Up IoTDB Service Data
- Backing Up Kafka Metadata
- Restoring MRS Cluster Component Data
- Restoring Manager Data (MRS 2.x and Earlier)
- Restoring Manager Data (MRS 3.x and Later Versions)
- Restoring CDL Service Data
- Restoring ClickHouse Metadata
- Restoring ClickHouse Service Data
- Restoring DBService Metadata
- Restoring Flink Metadata
- Restoring HBase Metadata
- Restoring HBase Service Data
- Restoring HDFS NameNode Metadata
- Restoring HDFS Service Data
- Restoring Hive Service Data
- Restoring IoTDB Metadata
- Restoring IoTDB Service Data
- Restoring Kafka Metadata
- Managing MRS Cluster Backup and Restoration Tasks
- Using HDFS Snapshots to Quickly Restore Component Service Data
- MRS Cluster Patching
- MRS Cluster Patch Description
- MRS 3.2.0-LTS.1 Patch Description
- MRS 3.0.5.1 Patch Description
- MRS 2.1.0.11 Patch Description
- MRS 2.1.0.10 Patch Description
- MRS 2.1.0.9 Patch Description
- MRS 2.1.0.8 Patch Description
- MRS 2.1.0.7 Patch Description
- MRS 2.1.0.6 Patch Description
- MRS 2.1.0.3 Patch Description
- MRS 2.1.0.2 Patch Description
- MRS 2.1.0.1 Patch Description
- MRS 2.0.6.1 Patch Description
- MRS 2.0.1.3 Patch Description
- MRS 2.0.1.2 Patch Description
- MRS 2.0.1.1 Patch Description
- MRS 1.9.3.3 Patch Description
- MRS 1.9.3.1 Patch Description
- MRS 1.9.2.2 Patch Description
- MRS 1.9.0.8, 1.9.0.9, and 1.9.0.10 Patch Description
- MRS 1.9.0.7 Patch Description
- MRS 1.9.0.6 Patch Description
- MRS 1.9.0.5 Patch Description
- MRS 1.8.10.1 Patch Description
- Viewing Logs of an MRS Cluster
- Overview of MRS Cluster Logs
- Viewing MRS Operation Logs
- Viewing MRS Cluster History
- Viewing MRS Cluster Audit Logs
- Viewing Role Instance Logs of MRS Components
- Searching for MRS Cluster Logs Online
- Downloading MRS Cluster Logs
- Collecting MRS Cluster Service Stack Information
- Configuring Default Log Level and Archive File Size for MRS Components
- Configuring the Number of Local Backups of MRS Cluster Audit Logs
- Configuring Dumping for MRS Cluster Audit Logs
- MRS Cluster Security Configuration
- Cluster Mutual Trust Management
- Replacing MRS Cluster Certificates
- MRS Cluster Security Hardening
- MRS Cluster Security Hardening Policies
- Configuring Hadoop Data Encryption During Transmission
- Configuring Kafka Data Encryption During Transmission
- Configuring HDFS Data Encryption During Transmission
- Configuring Spark Data Encryption During Transmission
- Configuring ZooKeeper Data Encryption During Transmission
- Encrypting Data Transmission Between the Controller and Agent
- Configuring a Trusted IP Address to Access LDAP
- HFile and WAL Encryption
- Configuring the IP Address Whitelist for Modifying Data in an HBase Read-Only Cluster
- Configuring LDAP Output Audit Logs
- Updating Encryption Keys of an MRS Cluster
- Updating the SSH Key of User omm on MRS Cluster Nodes
- Enabling and Disabling Permission Verification on MRS Cluster Components
- Allowing External Users to Access MRS Clusters in Normal Mode
- Changing the Timeout Duration of the Manager Page
- Configuring Secure Communication Authorization for an MRS Cluster
- Changing the Passwords for System Users of an MRS Cluster
- Changing or Resetting the Password for User admin of an MRS Cluster
- Changing the Passwords for OS Users of an MRS Cluster Node
- Changing the Password for the Kerberos Administrator of an MRS Cluster
- Changing the Passwords for Manager Users of an MRS Cluster
- Changing the Password for a Regular LDAP User of an MRS Cluster
- Changing the LDAP Administrator Password for an MRS Cluster
- Changing the Passwords for MRS Cluster Component Running Users
- Changing the Passwords for Database Users of an MRS Cluster
- Changing the Password for the OMS Database Administrator
- Changing the Password for an OMS Database Access User
- Changing the Passwords for Database Users of MRS Cluster Components
- Resetting the MRS Component Database User Password
- Resetting the Password for User omm in DBService
- Changing the Password for User compdbuser of the DBService Database
- Viewing and Configuring MRS Alarm Events
- Viewing MRS Cluster Events
- Viewing Alarms of an MRS Cluster
- Configuring Alarm Thresholds for an MRS Cluster
- Configuring Alarm Masking for an MRS Cluster
- Connecting an MRS Cluster to SNMP to Report Alarms
- Connecting an MRS Cluster to the Syslog Server to Report Alarms
- Periodically Backing Up Alarm and Audit Information
- Enabling the MRS Cluster Maintenance Mode to Disable Alarm Reporting
- Configuring Notifications for MRS Cluster Alarms and Events
- MRS Cluster Alarm Handling Reference
- ALM-12001 Audit Log Dumping Failure
- ALM-12004 OLdap Resource Abnormal
- ALM-12005 OKerberos Resource Abnormal
- ALM-12006 Node Fault
- ALM-12007 Process Fault
- ALM-12010 Manager Heartbeat Interruption Between the Active and Standby Nodes
- ALM-12011 Manager Data Synchronization Exception Between the Active and Standby Nodes
- ALM-12012 NTP Service Is Abnormal
- ALM-12014 Partition Lost
- ALM-12015 Partition Filesystem Readonly
- ALM-12016 CPU Usage Exceeds the Threshold
- ALM-12017 Insufficient Disk Capacity
- ALM-12018 Memory Usage Exceeds the Threshold
- ALM-12027 Host PID Usage Exceeds the Threshold
- ALM-12028 Number of Processes in the D State and Z State on a Host Exceeds the Threshold
- ALM-12033 Slow Disk Fault
- ALM-12034 Periodical Backup Failure
- ALM-12035 Unknown Data Status After Recovery Task Failure
- ALM-12037 NTP Server Abnormal
- ALM-12038 Monitoring Indicator Dumping Failure
- ALM-12039 Active/Standby OMS Databases Not Synchronized
- ALM-12040 Insufficient System Entropy
- ALM-12041 Incorrect Permission on Key Files
- ALM-12042 Incorrect Configuration of Key Files
- ALM-12045 Read Packet Dropped Rate Exceeds the Threshold
- ALM-12046 Write Packet Dropped Rate Exceeds the Threshold
- ALM-12047 Read Packet Error Rate Exceeds the Threshold
- ALM-12048 Write Packet Error Rate Exceeds the Threshold
- ALM-12049 Network Read Throughput Rate Exceeds the Threshold
- ALM-12050 Network Write Throughput Rate Exceeds the Threshold
- ALM-12051 Disk Inode Usage Exceeds the Threshold
- ALM-12052 TCP Temporary Port Usage Exceeds the Threshold
- ALM-12053 Host File Handle Usage Exceeds the Threshold
- ALM-12054 Invalid Certificate File
- ALM-12055 Certificate File Is About to Expire
- ALM-12057 Metadata Not Configured with the Task to Periodically Back Up Data to a Third-Party Server
- ALM-12061 Process Usage Exceeds the Threshold
- ALM-12062 OMS Parameter Configurations Mismatch with the Cluster Scale
- ALM-12063 Unavailable Disk
- ALM-12064 Host Random Port Range Conflicts with Cluster Used Port
- ALM-12066 Trust Relationships Between Nodes Become Invalid
- ALM-12067 Tomcat Resource Is Abnormal
- ALM-12068 ACS Resource Exception
- ALM-12069 AOS Resource Exception
- ALM-12070 Controller Resource Is Abnormal
- ALM-12071 Httpd Resource Is Abnormal
- ALM-12072 FloatIP Resource Is Abnormal
- ALM-12073 CEP Resource Is Abnormal
- ALM-12074 FMS Resource Is Abnormal
- ALM-12075 PMS Resource Is Abnormal
- ALM-12076 GaussDB Resource Is Abnormal
- ALM-12077 User omm Expired
- ALM-12078 Password of User omm Expired
- ALM-12079 User omm Is About to Expire
- ALM-12080 Password of User omm Is About to Expire
- ALM-12081 User ommdba Expired
- ALM-12082 User ommdba Is About to Expire
- ALM-12083 Password of User ommdba Is About to Expire
- ALM-12084 Password of User ommdba Expired
- ALM-12085 Service Audit Log Dump Failure
- ALM-12087 System Is in the Upgrade Observation Period
- ALM-12089 Inter-Node Network Is Abnormal
- ALM-12091 Abnormal Disaster Resources
- ALM-12099 core dump Occurred
- ALM-12100 AD Service Connection Failed
- ALM-12101 AZ Unhealthy
- ALM-12102 AZ HA Component Is Not Deployed Based on DR Requirements
- ALM-12103 Executor Resource Exception
- ALM-12104 Abnormal Knox Resources
- ALM-12110 Failed to Get ECS Temporary AK/SK
- ALM-12172 Failed to Report Metrics to Cloud Eye
- ALM-12180 Suspended Disk I/O
- ALM-12186 CGroup Task Usage Exceeds the Threshold
- ALM-12187 Failed to Expand Disk Partition Capacity
- ALM-12188 diskmgt Disk Monitoring Unavailable
- ALM-12190 Number of Knox Connections Exceeds the Threshold
- ALM-13000 ZooKeeper Service Unavailable
- ALM-13001 Available ZooKeeper Connections Are Insufficient
- ALM-13002 ZooKeeper Direct Memory Usage Exceeds the Threshold
- ALM-13003 GC Duration of the ZooKeeper Process Exceeds the Threshold
- ALM-13004 ZooKeeper Heap Memory Usage Exceeds the Threshold
- ALM-13005 Failed to Set the Quota of Top Directories of ZooKeeper Components
- ALM-13006 Znode Number or Capacity Exceeds the Threshold
- ALM-13007 Available ZooKeeper Client Connections Are Insufficient
- ALM-13008 ZooKeeper Znode Usage Exceeds the Threshold
- ALM-13009 ZooKeeper Znode Capacity Usage Exceeds the Threshold
- ALM-13010 Znode Usage of a Directory with Quota Configured Exceeds the Threshold
- ALM-14000 HDFS Service Unavailable
- ALM-14001 HDFS Disk Usage Exceeds the Threshold
- ALM-14002 DataNode Disk Usage Exceeds the Threshold
- ALM-14003 Number of Lost HDFS Blocks Exceeds the Threshold
- ALM-14006 Number of HDFS Files Exceeds the Threshold
- ALM-14007 NameNode Heap Memory Usage Exceeds the Threshold
- ALM-14008 DataNode Heap Memory Usage Exceeds the Threshold
- ALM-14009 Number of Dead DataNodes Exceeds the Threshold
- ALM-14010 NameService Service Is Abnormal
- ALM-14011 DataNode Data Directory Is Not Configured Properly
- ALM-14012 JournalNode Is Out of Synchronization
- ALM-14013 Failed to Update the NameNode FsImage File
- ALM-14014 NameNode GC Time Exceeds the Threshold
- ALM-14015 DataNode GC Time Exceeds the Threshold
- ALM-14016 DataNode Direct Memory Usage Exceeds the Threshold
- ALM-14017 NameNode Direct Memory Usage Exceeds the Threshold
- ALM-14018 NameNode Non-heap Memory Usage Exceeds the Threshold
- ALM-14019 DataNode Non-heap Memory Usage Exceeds the Threshold
- ALM-14020 Number of Entries in the HDFS Directory Exceeds the Threshold
- ALM-14021 NameNode Average RPC Processing Time Exceeds the Threshold
- ALM-14022 NameNode Average RPC Queuing Time Exceeds the Threshold
- ALM-14023 Percentage of Total Reserved Disk Space for Replicas Exceeds the Threshold
- ALM-14024 Tenant Space Usage Exceeds the Threshold
- ALM-14025 Tenant File Object Usage Exceeds the Threshold
- ALM-14026 Blocks on DataNode Exceed the Threshold
- ALM-14027 DataNode Disk Fault
- ALM-14028 Number of Blocks to Be Supplemented Exceeds the Threshold
- ALM-14029 Number of Blocks in a Replica Exceeds the Threshold
- ALM-14030 HDFS Allows Write of Single-Replica Data
- ALM-14031 DataNode Process Is Abnormal
- ALM-14032 JournalNode Process Is Abnormal
- ALM-14033 ZKFC Process Is Abnormal
- ALM-14034 Router Process Is Abnormal
- ALM-14035 HttpFS Process Is Abnormal
- ALM-16000 Percentage of Sessions Connected to the HiveServer to Maximum Number Allowed Exceeds the Threshold
- ALM-16001 Hive Warehouse Space Usage Exceeds the Threshold
- ALM-16002 Hive SQL Execution Success Rate Is Lower Than the Threshold
- ALM-16003 Background Thread Usage Exceeds the Threshold
- ALM-16004 Hive Service Unavailable
- ALM-16005 The Heap Memory Usage of the Hive Process Exceeds the Threshold
- ALM-16006 The Direct Memory Usage of the Hive Process Exceeds the Threshold
- ALM-16007 Hive GC Time Exceeds the Threshold
- ALM-16008 Non-Heap Memory Usage of the Hive Process Exceeds the Threshold
- ALM-16009 Map Number Exceeds the Threshold
- ALM-16045 Hive Data Warehouse Is Deleted
- ALM-16046 Hive Data Warehouse Permission Is Modified
- ALM-16047 HiveServer Has Been Deregistered from ZooKeeper
- ALM-16048 Tez or Spark Library Path Does Not Exist
- ALM-17003 Oozie Service Unavailable
- ALM-17004 Oozie Heap Memory Usage Exceeds the Threshold
- ALM-17005 Oozie Non Heap Memory Usage Exceeds the Threshold
- ALM-17006 Oozie Direct Memory Usage Exceeds the Threshold
- ALM-17007 Garbage Collection (GC) Time of the Oozie Process Exceeds the Threshold
- ALM-17008 Abnormal Connection Between Oozie and ZooKeeper
- ALM-17009 Abnormal Connection Between Oozie and DBService
- ALM-17010 Abnormal Connection Between Oozie and HDFS
- ALM-17011 Abnormal Connection Between Oozie and Yarn
- ALM-18000 Yarn Service Unavailable
- ALM-18002 NodeManager Heartbeat Lost
- ALM-18003 NodeManager Unhealthy
- ALM-18008 Heap Memory Usage of ResourceManager Exceeds the Threshold
- ALM-18009 Heap Memory Usage of JobHistoryServer Exceeds the Threshold
- ALM-18010 ResourceManager GC Time Exceeds the Threshold
- ALM-18011 NodeManager GC Time Exceeds the Threshold
- ALM-18012 JobHistoryServer GC Time Exceeds the Threshold
- ALM-18013 ResourceManager Direct Memory Usage Exceeds the Threshold
- ALM-18014 NodeManager Direct Memory Usage Exceeds the Threshold
- ALM-18015 JobHistoryServer Direct Memory Usage Exceeds the Threshold
- ALM-18016 Non Heap Memory Usage of ResourceManager Exceeds the Threshold
- ALM-18017 Non Heap Memory Usage of NodeManager Exceeds the Threshold
- ALM-18018 NodeManager Heap Memory Usage Exceeds the Threshold
- ALM-18019 Non Heap Memory Usage of JobHistoryServer Exceeds the Threshold
- ALM-18020 Yarn Task Execution Timeout
- ALM-18021 Mapreduce Service Unavailable
- ALM-18022 Insufficient Yarn Queue Resources
- ALM-18023 Number of Pending Yarn Tasks Exceeds the Threshold
- ALM-18024 Pending Yarn Memory Usage Exceeds the Threshold
- ALM-18025 Number of Terminated Yarn Tasks Exceeds the Threshold
- ALM-18026 Number of Failed Yarn Tasks Exceeds the Threshold
- ALM-19000 HBase Service Unavailable
- ALM-19006 HBase Replication Sync Failed
- ALM-19007 HBase GC Time Exceeds the Threshold
- ALM-19008 Heap Memory Usage of the HBase Process Exceeds the Threshold
- ALM-19009 Direct Memory Usage of the HBase Process Exceeds the Threshold
- ALM-19011 RegionServer Region Number Exceeds the Threshold
- ALM-19012 HBase System Table Directory or File Lost
- ALM-19013 Duration of Regions in Transition State Exceeds the Threshold
- ALM-19014 Capacity Quota Usage on ZooKeeper Exceeds the Threshold Severely
- ALM-19015 Quantity Quota Usage on ZooKeeper Exceeds the Threshold
- ALM-19016 Quantity Quota Usage on ZooKeeper Exceeds the Threshold Severely
- ALM-19017 Capacity Quota Usage on ZooKeeper Exceeds the Threshold
- ALM-19018 HBase Compaction Queue Size Exceeds the Threshold
- ALM-19019 Number of HBase HFiles to Be Synchronized Exceeds the Threshold
- ALM-19020 Number of HBase WAL Files to Be Synchronized Exceeds the Threshold
- ALM-19021 Handler Usage of RegionServer Exceeds the Threshold
- ALM-19022 HBase Hotspot Detection Is Unavailable
- ALM-19023 Region Traffic Restriction for HBase
- ALM-19024 RPC Requests P99 Latency on RegionServer Exceeds the Threshold
- ALM-19025 Damaged StoreFile in HBase
- ALM-19026 Damaged WAL Files in HBase
- ALM-20002 Hue Service Unavailable
- ALM-23001 Loader Service Unavailable
- ALM-23003 Loader Task Execution Failure
- ALM-23004 Loader Heap Memory Usage Exceeds the Threshold
- ALM-23005 Loader Non-Heap Memory Usage Exceeds the Threshold
- ALM-23006 Loader Direct Memory Usage Exceeds the Threshold
- ALM-23007 Garbage Collection (GC) Time of the Loader Process Exceeds the Threshold
- ALM-24000 Flume Service Unavailable
- ALM-24001 Flume Agent Exception
- ALM-24003 Flume Client Connection Interrupted
- ALM-24004 Exception Occurs When Flume Reads Data
- ALM-24005 Exception Occurs When Flume Transmits Data
- ALM-24006 Heap Memory Usage of Flume Server Exceeds the Threshold
- ALM-24007 Flume Server Direct Memory Usage Exceeds the Threshold
- ALM-24008 Flume Server Non Heap Memory Usage Exceeds the Threshold
- ALM-24009 Flume Server Garbage Collection (GC) Time Exceeds the Threshold
- ALM-24010 Flume Certificate File Is Invalid or Damaged
- ALM-24011 Flume Certificate File Is About to Expire
- ALM-24012 Flume Certificate File Has Expired
- ALM-24013 Flume MonitorServer Certificate File Is Invalid or Damaged
- ALM-24014 Flume MonitorServer Certificate Is About to Expire
- ALM-24015 Flume MonitorServer Certificate File Has Expired
- ALM-25000 LdapServer Service Unavailable
- ALM-25004 Abnormal LdapServer Data Synchronization
- ALM-25005 nscd Service Exception
- ALM-25006 Sssd Service Exception
- ALM-25007 Number of SlapdServer Connections Exceeds the Threshold
- ALM-25008 SlapdServer CPU Usage Exceeds the Threshold
- ALM-25500 KrbServer Service Unavailable
- ALM-26051 Storm Service Unavailable
- ALM-26052 Number of Available Supervisors of the Storm Service Is Less Than the Threshold
- ALM-26053 Storm Slot Usage Exceeds the Threshold
- ALM-26054 Nimbus Heap Memory Usage Exceeds the Threshold
- ALM-27001 DBService Service Unavailable
- ALM-27003 DBService Heartbeat Interruption Between the Active and Standby Nodes
- ALM-27004 Data Inconsistency Between Active and Standby DBServices
- ALM-27005 Database Connections Usage Exceeds the Threshold
- ALM-27006 Disk Space Usage of the Data Directory Exceeds the Threshold
- ALM-27007 Database Enters the Read-Only Mode
- ALM-29000 Impala Service Unavailable
- ALM-29004 Impalad Process Memory Usage Exceeds the Threshold
- ALM-29005 Number of JDBC Connections to Impalad Exceeds the Threshold
- ALM-29006 Number of ODBC Connections to Impalad Exceeds the Threshold
- ALM-29010 Number of Queries Being Submitted by Impalad Exceeds the Threshold
- ALM-29011 Number of Queries Being Executed by Impalad Exceeds the Threshold
- ALM-29012 Number of Queries Being Waited by Impalad Exceeds the Threshold
- ALM-29013 Impalad FGC Time Exceeds the Threshold
- ALM-29014 Catalog FGC Time Exceeds the Threshold
- ALM-29015 Catalog Process Memory Usage Exceeds the Threshold
- ALM-29016 Impalad Instance in the Sub-healthy State
- ALM-29100 Kudu Service Unavailable
- ALM-29104 Tserver Process Memory Usage Exceeds the Threshold
- ALM-29106 Tserver Process CPU Usage Exceeds the Threshold
- ALM-29107 Tserver Process Memory Usage Exceeds the Threshold
- ALM-38000 Kafka Service Unavailable
- ALM-38001 Insufficient Kafka Disk Capacity
- ALM-38002 Kafka Heap Memory Usage Exceeds the Threshold
- ALM-38004 Kafka Direct Memory Usage Exceeds the Threshold
- ALM-38005 GC Duration of the Broker Process Exceeds the Threshold
- ALM-38006 Percentage of Kafka Partitions That Are Not Completely Synchronized Exceeds the Threshold
- ALM-38007 Status of Kafka Default User Is Abnormal
- ALM-38008 Abnormal Kafka Data Directory Status
- ALM-38009 Busy Broker Disk I/Os (Applicable to Versions Later Than MRS 3.1.0)
- ALM-38009 Kafka Topic Overload (Applicable to MRS 3.1.0 and Earlier Versions)
- ALM-38010 Topics with Single Replica
- ALM-38011 User Connection Usage on Broker Exceeds the Threshold
- ALM-43001 Spark2x Service Unavailable
- ALM-43006 Heap Memory Usage of the JobHistory2x Process Exceeds the Threshold
- ALM-43007 Non-Heap Memory Usage of the JobHistory2x Process Exceeds the Threshold
- ALM-43008 The Direct Memory Usage of the JobHistory2x Process Exceeds the Threshold
- ALM-43009 JobHistory2x Process GC Time Exceeds the Threshold
- ALM-43010 Heap Memory Usage of the JDBCServer2x Process Exceeds the Threshold
- ALM-43011 Non-Heap Memory Usage of the JDBCServer2x Process Exceeds the Threshold
- ALM-43012 Direct Heap Memory Usage of the JDBCServer2x Process Exceeds the Threshold
- ALM-43013 JDBCServer2x Process GC Time Exceeds the Threshold
- ALM-43017 JDBCServer2x Process Full GC Number Exceeds the Threshold
- ALM-43018 JobHistory2x Process Full GC Number Exceeds the Threshold
- ALM-43019 Heap Memory Usage of the IndexServer2x Process Exceeds the Threshold
- ALM-43020 Non-Heap Memory Usage of the IndexServer2x Process Exceeds the Threshold
- ALM-43021 Direct Memory Usage of the IndexServer2x Process Exceeds the Threshold
- ALM-43022 IndexServer2x Process GC Time Exceeds the Threshold
- ALM-43023 IndexServer2x Process Full GC Number Exceeds the Threshold
- ALM-44000 Presto Service Unavailable
- ALM-44004 Presto Coordinator Resource Group Queuing Tasks Exceed the Threshold
- ALM-44005 Presto Coordinator Process GC Time Exceeds the Threshold
- ALM-44006 Presto Worker Process GC Time Exceeds the Threshold
- ALM-45000 HetuEngine Service Unavailable
- ALM-45001 Faulty HetuEngine Compute Instances
- ALM-45003 HetuEngine QAS Disk Capacity Is Insufficient
- ALM-45175 Average Time for Calling OBS Metadata APIs Is Greater than the Threshold
- ALM-45176 Success Rate of Calling OBS Metadata APIs Is Lower than the Threshold
- ALM-45177 Success Rate of Calling OBS Data Read APIs Is Lower than the Threshold
- ALM-45178 Success Rate of Calling OBS Data Write APIs Is Lower Than the Threshold
- ALM-45179 Number of Failed OBS readFully API Calls Exceeds the Threshold
- ALM-45180 Number of Failed OBS read API Calls Exceeds the Threshold
- ALM-45181 Number of Failed OBS write API Calls Exceeds the Threshold
- ALM-45182 Number of Throttled OBS Operations Exceeds the Threshold
- ALM-45275 Ranger Service Unavailable
- ALM-45276 Abnormal RangerAdmin Status
- ALM-45277 RangerAdmin Heap Memory Usage Exceeds the Threshold
- ALM-45278 RangerAdmin Direct Memory Usage Exceeds the Threshold
- ALM-45279 RangerAdmin Non Heap Memory Usage Exceeds the Threshold
- ALM-45280 RangerAdmin GC Duration Exceeds the Threshold
- ALM-45281 UserSync Heap Memory Usage Exceeds the Threshold
- ALM-45282 UserSync Direct Memory Usage Exceeds the Threshold
- ALM-45283 UserSync Non Heap Memory Usage Exceeds the Threshold
- ALM-45284 UserSync Garbage Collection (GC) Time Exceeds the Threshold
- ALM-45285 TagSync Heap Memory Usage Exceeds the Threshold
- ALM-45286 TagSync Direct Memory Usage Exceeds the Threshold
- ALM-45287 TagSync Non Heap Memory Usage Exceeds the Threshold
- ALM-45288 TagSync Garbage Collection (GC) Time Exceeds the Threshold
- ALM-45289 PolicySync Heap Memory Usage Exceeds the Threshold
- ALM-45290 PolicySync Direct Memory Usage Exceeds the Threshold
- ALM-45291 PolicySync Non-Heap Memory Usage Exceeds the Threshold
- ALM-45292 PolicySync GC Duration Exceeds the Threshold
- ALM-45325 Presto Service Unavailable
- ALM-45326 Number of Presto Coordinator Threads Exceeds the Threshold
- ALM-45327 Presto Coordinator Process GC Time Exceeds the Threshold
- ALM-45328 Presto Worker Process GC Time Exceeds the Threshold
- ALM-45329 Presto Coordinator Resource Group Queuing Tasks Exceed the Threshold
- ALM-45330 Number of Presto Worker Threads Exceeds the Threshold
- ALM-45331 Number of Presto Worker1 Threads Exceeds the Threshold
- ALM-45332 Number of Presto Worker2 Threads Exceeds the Threshold
- ALM-45333 Number of Presto Worker3 Threads Exceeds the Threshold
- ALM-45334 Number of Presto Worker4 Threads Exceeds the Threshold
- ALM-45335 Presto Worker1 Process GC Time Exceeds the Threshold
- ALM-45336 Presto Worker2 Process GC Time Exceeds the Threshold
- ALM-45337 Presto Worker3 Process GC Time Exceeds the Threshold
- ALM-45338 Presto Worker4 Process GC Time Exceeds the Threshold
- ALM-45425 ClickHouse Service Unavailable
- ALM-45426 ClickHouse Service Quantity Quota Usage in ZooKeeper Exceeds the Threshold
- ALM-45427 ClickHouse Service Capacity Quota Usage in ZooKeeper Exceeds the Threshold
- ALM-45428 ClickHouse Disk I/O Exception
- ALM-45429 Table Metadata Synchronization Failed on the Added ClickHouse Node
- ALM-45430 Permission Metadata Synchronization Failed on the Added ClickHouse Node
- ALM-45431 Improper ClickHouse Instance Distribution for Topology Allocation
- ALM-45432 ClickHouse User Synchronization Process Fails
- ALM-45433 ClickHouse AZ Topology Exception
- ALM-45434 A Single Replica Exists in the ClickHouse Data Table
- ALM-45435 Inconsistent Metadata of ClickHouse Tables
- ALM-45436 Skewed ClickHouse Table Data
- ALM-45437 Excessive Parts in the ClickHouse Table
- ALM-45438 ClickHouse Disk Usage Exceeds 80%
- ALM-45439 ClickHouse Node Enters the Read-Only Mode
- ALM-45440 Inconsistency Between ClickHouse Replicas
- ALM-45441 ZooKeeper Disconnected
- ALM-45442 Too Many Concurrent SQL Statements
- ALM-45443 Slow SQL Queries in the Cluster
- ALM-45444 Abnormal ClickHouse Process
- ALM-45475 A Single Replica Exists in the Kudu Data Table
- ALM-45476 Number of Tablets of the KuduTServer Process Exceeds the Threshold
- ALM-45477 Failed to Restore Data After a Disk of Kudu Is Replaced
- ALM-45478 Kudu Data Balancing Failed
- ALM-45479 Number of Tablets of the Tserver Process Exceeds the Threshold
- ALM-45480 Tablet Leaders of a Tserver Process Are Unevenly Distributed
- ALM-45481 KuduTserver Has Full Disks
- ALM-45585 IoTDB Service Unavailable
- ALM-45586 IoTDBServer Heap Memory Usage Exceeds the Threshold
- ALM-45587 IoTDBServer GC Duration Exceeds the Threshold
- ALM-45588 IoTDBServer Direct Memory Usage Exceeds the Threshold
- ALM-45589 ConfigNode Heap Memory Usage Exceeds the Threshold
- ALM-45590 ConfigNode GC Duration Exceeds the Threshold
- ALM-45591 ConfigNode Direct Memory Usage Exceeds the Threshold
- ALM-45592 IoTDBServer RPC Execution Duration Exceeds the Threshold
- ALM-45593 IoTDBServer Flush Execution Duration Exceeds the Threshold
- ALM-45594 IoTDBServer Intra-Space Merge Duration Exceeds the Threshold
- ALM-45595 IoTDBServer Cross-Space Merge Duration Exceeds the Threshold
- ALM-45596 Procedure Execution Failed
- ALM-45615 CDL Service Unavailable
- ALM-45616 CDL Job Execution Exception
- ALM-45617 Data Queued in the CDL Replication Slot Exceeds the Threshold
- ALM-45635 FlinkServer Job Execution Failure
- ALM-45636 Flink Job Checkpoints Keep Failing
- ALM-45636 Number of Consecutive Checkpoint Failures of a Flink Job Exceeds the Threshold
- ALM-45637 FlinkServer Task Is Continuously Under Back Pressure
- ALM-45638 Number of Restarts After FlinkServer Job Failures Exceeds the Threshold
- ALM-45638 Number of Restarts After Flink Job Failures Exceeds the Threshold
- ALM-45639 Checkpointing of a Flink Job Times Out
- ALM-45640 FlinkServer Heartbeat Interruption Between the Active and Standby Nodes
- ALM-45641 Data Synchronization Exception Between the Active and Standby FlinkServer Nodes
- ALM-45642 RocksDB Continuously Triggers Write Traffic Limiting
- ALM-45643 MemTable Size of RocksDB Continuously Exceeds the Threshold
- ALM-45644 Number of SST Files at Level 0 of RocksDB Continuously Exceeds the Threshold
- ALM-45645 Pending Flush Size of RocksDB Continuously Exceeds the Threshold
- ALM-45646 Pending Compaction Size of RocksDB Continuously Exceeds the Threshold
- ALM-45647 Estimated Pending Compaction Size of RocksDB Continuously Exceeds the Threshold
- ALM-45648 RocksDB Frequently Encounters Write-Stopped
- ALM-45649 P95 Latency of RocksDB Get Requests Continuously Exceeds the Threshold
- ALM-45650 P95 Latency of RocksDB Write Requests Continuously Exceeds the Threshold
- ALM-45652 Flink Service Unavailable
- ALM-45653 Invalid Flink HA Certificate File
- ALM-45654 Flink HA Certificate Is About to Expire
- ALM-45655 Flink HA Certificate File Has Expired
- ALM-45736 Guardian Service Unavailable
- ALM-45737 TokenServer Heap Memory Usage Exceeds the Threshold
- ALM-45738 TokenServer Direct Memory Usage Exceeds the Threshold
- ALM-45739 TokenServer Non-Heap Memory Usage Exceeds the Threshold
- ALM-45740 TokenServer GC Duration Exceeds the Threshold
- ALM-45741 Failed to Call the ECS securitykey API
- ALM-45742 Failed to Call the ECS Metadata API
- ALM-45743 Failed to Call the IAM API
- ALM-50201 Doris Service Unavailable
- ALM-50202 FE CPU Usage Exceeds the Threshold
- ALM-50203 FE Memory Usage Exceeds the Threshold
- ALM-50205 BE CPU Usage Exceeds the Threshold
- ALM-50206 BE Memory Usage Exceeds the Threshold
- ALM-50207 Ratio of Connections to the FE MySQL Port to the Maximum Connections Allowed Exceeds the Threshold
- ALM-50208 Failures to Clear Historical Metadata Image Files Exceed the Threshold
- ALM-50209 Failures to Generate Metadata Image Files Exceed the Threshold
- ALM-50210 Maximum Compaction Score of All BE Nodes Exceeds the Threshold
- ALM-50211 FE Queue Length of BE Periodic Report Tasks Exceeds the Threshold
- ALM-50212 Accumulated Old-Generation GC Duration of the FE Process Exceeds the Threshold
- ALM-50213 Number of Tasks Queuing in the FE Thread Pool for Interacting with BE Exceeds the Threshold
- ALM-50214 Number of Tasks Queuing in the FE Thread Pool for Task Processing Exceeds the Threshold
- ALM-50215 Longest Duration of RPC Requests Received by Each FE Thrift Method Exceeds the Threshold
- ALM-50216 Memory Usage of the FE Node Exceeds the Threshold
- ALM-50217 Heap Memory Usage of the FE Node Exceeds the Threshold
- ALM-50219 Length of the Queue in the Thread Pool for Query Execution Exceeds the Threshold
- ALM-50220 Error Rate of TCP Packet Receiving Exceeds the Threshold
- ALM-50221 BE Data Disk Usage Exceeds the Threshold
- ALM-50222 Disk Status of a Specified Data Directory on BE Is Abnormal
- ALM-50223 Maximum Memory Required by BE Is Greater Than the Remaining Memory of the Machine
- ALM-50224 Failures of a Certain Task Type on BE Are Increasing
- ALM-50225 FE Instance Fault
- ALM-50226 BE Instance Fault
- ALM-50401 Number of JobServer Jobs Waiting to Be Executed Exceeds the Threshold
- ALM-50402 JobGateway Service Unavailable
- ALM-12001 Audit Log Dump Failure (For MRS 2.x or Earlier)
- ALM-12002 HA Resource Abnormal (For MRS 2.x or Earlier)
- ALM-12004 OLdap Resource Abnormal (For MRS 2.x or Earlier)
- ALM-12005 OKerberos Resource Abnormal (For MRS 2.x or Earlier)
- ALM-12006 Node Fault (For MRS 2.x or Earlier)
- ALM-12007 Process Fault (For MRS 2.x or Earlier)
- ALM-12010 Manager Heartbeat Interruption Between the Active and Standby Nodes (For MRS 2.x or Earlier)
- ALM-12011 Data Synchronization Exception Between the Active and Standby Manager Nodes (For MRS 2.x or Earlier)
- ALM-12012 NTP Service Abnormal (For MRS 2.x or Earlier)
- ALM-12014 Device Partition Lost (For MRS 2.x or Earlier)
- ALM-12015 Device Partition File System Read-Only (For MRS 2.x or Earlier)
- ALM-12016 CPU Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12017 Insufficient Disk Capacity (For MRS 2.x or Earlier)
- ALM-12018 Memory Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12027 Host PID Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12028 Number of Processes in the D State on the Host Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12031 User omm or Password Is About to Expire (For MRS 2.x or Earlier)
- ALM-12032 User ommdba or Password Is About to Expire (For MRS 2.x or Earlier)
- ALM-12033 Slow Disk Fault (For MRS 2.x or Earlier)
- ALM-12034 Periodic Backup Failure (For MRS 2.x or Earlier)
- ALM-12035 Unknown Data Status After Recovery Task Failure (For MRS 2.x or Earlier)
- ALM-12037 NTP Server Abnormal (For MRS 2.x or Earlier)
- ALM-12038 Monitoring Indicator Dump Failure (For MRS 2.x or Earlier)
- ALM-12039 GaussDB Data Is Not Synchronized (For MRS 2.x or Earlier)
- ALM-12040 Insufficient System Entropy (For MRS 2.x or Earlier)
- ALM-12041 Permission of Key Files Is Abnormal (For MRS 2.x or Earlier)
- ALM-12042 Key File Configurations Are Abnormal (For MRS 2.x or Earlier)
- ALM-12043 DNS Parsing Duration Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12045 Read Packet Dropped Rate Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12046 Write Packet Dropped Rate Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12047 Read Packet Error Rate Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12048 Write Packet Error Rate Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12049 Read Throughput Rate Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12050 Write Throughput Rate Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12051 Disk Inode Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12052 Usage of Temporary TCP Ports Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12053 File Handle Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-12054 Invalid Certificate File (For MRS 2.x or Earlier)
- ALM-12055 Certificate File Is About to Expire (For MRS 2.x or Earlier)
- ALM-12180 Disk Card I/O (For MRS 2.x or Earlier)
- ALM-12357 Failed to Export Audit Logs to OBS (For MRS 2.x or Earlier)
- ALM-13000 ZooKeeper Service Unavailable (For MRS 2.x or Earlier)
- ALM-13001 Available ZooKeeper Connections Are Insufficient (For MRS 2.x or Earlier)
- ALM-13002 ZooKeeper Memory Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14000 HDFS Service Unavailable (For MRS 2.x or Earlier)
- ALM-14001 HDFS Disk Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14002 DataNode Disk Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14003 Number of Lost HDFS Blocks Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14004 Number of Damaged HDFS Blocks Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14006 Number of HDFS Files Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14007 HDFS NameNode Memory Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14008 HDFS DataNode Memory Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14009 Number of Faulty DataNodes Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-14010 NameService Is Abnormal (For MRS 2.x or Earlier)
- ALM-14011 HDFS DataNode Data Directory Is Not Configured Properly (For MRS 2.x or Earlier)
- ALM-14012 HDFS Journalnode Data Is Not Synchronized (For MRS 2.x or Earlier)
- ALM-16000 Percentage of Sessions Connected to the HiveServer to the Maximum Number Allowed Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-16001 Hive Warehouse Space Usage Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-16002 Hive SQL Execution Success Rate Is Lower Than the Threshold (For MRS 2.x or Earlier)
- ALM-16004 Hive Service Unavailable (For MRS 2.x or Earlier)
- ALM-16005 Number of Failed Hive SQL Executions in the Last Period Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-18000 Yarn Service Unavailable (For MRS 2.x or Earlier)
- ALM-18002 NodeManager Heartbeat Lost (For MRS 2.x or Earlier)
- ALM-18003 NodeManager Unhealthy (For MRS 2.x or Earlier)
- ALM-18004 NodeManager Disk Usability Ratio Is Lower Than the Threshold (For MRS 2.x or Earlier)
- ALM-18006 MapReduce Job Execution Timeout (For MRS 2.x or Earlier)
- ALM-18008 Heap Memory Usage of Yarn ResourceManager Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-18009 Heap Memory Usage of MapReduce JobHistoryServer Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-18010 Number of Pending Yarn Tasks Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-18011 Memory of Pending Yarn Tasks Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-18012 Number of Terminated Yarn Tasks in the Last Period Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-18013 Number of Failed Yarn Tasks in the Last Period Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-19000 HBase Service Unavailable (For MRS 2.x or Earlier)
- ALM-19006 HBase Replication Sync Failed (For MRS 2.x or Earlier)
- ALM-19007 HBase Merge Queue Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-20002 Hue Service Unavailable (For MRS 2.x or Earlier)
- ALM-23001 Loader Service Unavailable (For MRS 2.x or Earlier)
- ALM-24000 Flume Service Unavailable (For MRS 2.x or Earlier)
- ALM-24001 Flume Agent Is Abnormal (For MRS 2.x or Earlier)
- ALM-24003 Flume Client Connection Interrupted (For MRS 2.x or Earlier)
- ALM-24004 Flume Fails to Read Data (For MRS 2.x or Earlier)
- ALM-24005 Data Transmission by Flume Is Abnormal (For MRS 2.x or Earlier)
- ALM-25000 LdapServer Service Unavailable (For MRS 2.x or Earlier)
- ALM-25004 Abnormal LdapServer Data Synchronization (For MRS 2.x or Earlier)
- ALM-25500 KrbServer Service Unavailable (For MRS 2.x or Earlier)
- ALM-26051 Storm Service Unavailable (For MRS 2.x or Earlier)
- ALM-26052 Number of Available Supervisors in Storm Is Lower Than the Threshold (For MRS 2.x or Earlier)
- ALM-26053 Slot Usage of Storm Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-26054 Heap Memory Usage of Storm Nimbus Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-27001 DBService Unavailable (For MRS 2.x or Earlier)
- ALM-27003 DBService Heartbeat Interruption Between the Active and Standby Nodes (For MRS 2.x or Earlier)
- ALM-27004 Data Inconsistency Between Active and Standby DBServices (For MRS 2.x or Earlier)
- ALM-28001 Spark Service Unavailable (For MRS 2.x or Earlier)
- ALM-38000 Kafka Service Unavailable (For MRS 2.x or Earlier)
- ALM-38001 Insufficient Kafka Disk Capacity (For MRS 2.x or Earlier)
- ALM-38002 Heap Memory Usage of Kafka Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-43001 Spark Service Unavailable (For MRS 2.x or Earlier)
- ALM-43006 Heap Memory Usage of the JobHistory Process Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-43007 Non-Heap Memory Usage of the JobHistory Process Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-43008 Direct Memory Usage of the JobHistory Process Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-43009 JobHistory GC Time Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-43010 Heap Memory Usage of the JDBCServer Process Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-43011 Non-Heap Memory Usage of the JDBCServer Process Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-43012 Direct Memory Usage of the JDBCServer Process Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-43013 JDBCServer GC Time Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-44004 Presto Coordinator Resource Group Queuing Tasks Exceed the Threshold (For MRS 2.x or Earlier)
- ALM-44005 Presto Coordinator Process GC Time Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-44006 Presto Worker Process GC Time Exceeds the Threshold (For MRS 2.x or Earlier)
- ALM-45325 Presto Service Unavailable (For MRS 2.x or Earlier)
- Configuring Remote O&M for an MRS Cluster
- Common Ports for MRS Cluster Services
-
Configuring Storage-Compute Decoupling for an MRS Cluster
- Configuration Process
-
Interconnecting an MRS Cluster with OBS Using an IAM Agency
- Interconnecting an MRS Cluster with OBS Using an IAM Agency
- Configuring the Policy for Clearing Recycle Bin Directories of MRS Cluster Components
-
Example for Interconnecting an MRS Cluster with OBS
- Interconnecting Flink with OBS Using an IAM Agency
- Interconnecting Flume with OBS Using an IAM Agency
- Interconnecting HDFS with OBS Using an IAM Agency
- Interconnecting Hive with OBS Using an IAM Agency
- Interconnecting Hudi with OBS Using an IAM Agency
- Interconnecting MapReduce with OBS Using an IAM Agency
- Interconnecting Presto with OBS Using an IAM Agency
- Interconnecting Spark with OBS Using an IAM Agency
- Interconnecting Sqoop with OBS Using an IAM Agency
- Configuring Fine-Grained OBS Access Permissions for MRS Cluster Users
-
Interconnecting an MRS Cluster with OBS Through Guardian
- Interconnecting Guardian with OBS
-
Example for Interconnecting an MRS Cluster with OBS
- Accessing OBS Using Flink Through Guardian
- Accessing OBS Using HDFS Through Guardian
- Accessing OBS Using HetuEngine Through Guardian
- Accessing OBS Using Hive Through Guardian
- Accessing OBS Using Hudi Through Guardian
- Accessing OBS Using MapReduce Through Guardian
- Accessing OBS Using Spark Through Guardian
- Accessing OBS Using YARN Through Guardian
-
FAQ About Decoupled Storage and Compute
- How Do I Read Encrypted OBS Data When Running an MRS Job?
- Example Application Development for Interconnecting HDFS with OBS
- How Do I Connect an MRS Cluster Client to OBS Using an AK/SK Pair?
- How Do I Access OBS Using an MRS Client Installed Outside a Cluster?
- Accessing an MRS Cluster's Manager (Version 2.x or Earlier)
- How Do I Handle Abnormal Status of Core Nodes in an MRS Cluster After Successful Expansion?
- Quickly Buying a Hadoop Analysis Cluster
- Quickly Buying a Kafka Streaming Cluster
-
Component Operation Guide (Normal)
- Using Alluxio
- Using CarbonData (for Versions Earlier Than MRS 3.x)
-
Using CarbonData (for MRS 3.x or Later)
- CarbonData Data Types
- CarbonData Table User Permissions
- Creating a CarbonData Table Using the Spark Client
- CarbonData Data Analytics
- CarbonData Performance Tuning
- Typical CarbonData Configuration Parameters
- CarbonData Syntax Reference
- CarbonData Troubleshooting
-
CarbonData FAQs
- Why Is Incorrect Output Displayed When I Perform Query with Filter on Decimal Data Type Values?
- How Do I Avoid Minor Compaction for Historical Data?
- How Do I Change the Default Group Name for CarbonData Data Loading?
- Why Does the INSERT INTO CARBON TABLE Command Fail?
- Why Is the Data Logged in Bad Records Different from the Original Input Data with Escape Characters?
- Why Does Data Load Performance Decrease Due to Bad Records?
- Why Is INSERT INTO/LOAD DATA Task Distribution Incorrect and Why Are Fewer Tasks Opened Than the Available Executors When the Number of Initial Executors Is Zero?
- Why Does CarbonData Require Additional Executors Even Though the Parallelism Is Greater Than the Number of Blocks to Be Processed?
- Why Does Data Loading Fail When Off-Heap Memory Is Used?
- Why Do I Fail to Create a Hive Table?
- How Do I Logically Split Data Across Different Namespaces?
- Why Does the Missing Privileges Exception Occur When the Database Is Dropped?
- Why Can't the UPDATE Command Be Executed in Spark Shell?
- How Do I Configure Unsafe Memory in CarbonData?
- Why Does CarbonData Become Abnormal After the Disk Space Quota of the HDFS Storage Directory Is Set?
- Why Does Data Query or Loading Fail and "org.apache.carbondata.core.memory.MemoryException: Not enough memory" Is Displayed?
- Why Do Files of a Carbon Table Exist in the Recycle Bin Even If the drop table Command Is Not Executed When Mis-deletion Prevention Is Enabled?
-
Using ClickHouse
- ClickHouse Overview
- ClickHouse User Permission Management
- Using the ClickHouse Client
- Creating a ClickHouse Table
- ClickHouse Data Import
- Enterprise-Class Enhancements of ClickHouse
- ClickHouse Performance Tuning
- ClickHouse O&M Management
-
Common ClickHouse SQL Syntax
- CREATE DATABASE: Creating a Database
- CREATE TABLE: Creating a Table
- INSERT INTO: Inserting Data into a Table
- SELECT: Querying Table Data
- ALTER TABLE: Modifying a Table Structure
- ALTER TABLE: Modifying Table Data
- DESC: Querying a Table Structure
- DROP: Deleting a Table
- SHOW: Displaying Information About Databases and Tables
-
ClickHouse FAQ
- What Should I Do If the Disk Status Displayed in the System.disks Table Is fault or abnormal?
- How Do I Migrate Data from Hive/HDFS to ClickHouse?
- An Error Is Reported in Logs When the Auxiliary ZooKeeper or Replica Data Is Used to Synchronize Table Data
- How Do I Grant the Select Permission at the Database Level to ClickHouse Users?
- Using DBService
-
Using Flink
- Flink Job Engine
- Flink User Permission Management
- Using the Flink Client
- Preparing for Creating a FlinkServer Job
- Creating a FlinkServer Job
- Managing FlinkServer Jobs
- Flink O&M Management
- Flink Performance Tuning
- Typical Commands of the Flink Client
- Common Issues About Flink
- Example of Issuing a Certificate
-
Using Flume
- Flume Log Collection Overview
- Flume Service Model Configuration
- Installing the Flume Client
- Quickly Using Flume to Collect Node Logs
-
Configuring a Non-Encrypted Flume Data Collection Task
- Generating Configuration Files for the Flume Server and Client
- Using Flume Server to Collect Static Logs from Local Host to Kafka
- Using Flume Server to Collect Static Logs from Local Host to HDFS
- Using Flume Server to Collect Dynamic Logs from Local Host to HDFS
- Using Flume Server to Collect Logs from Kafka to HDFS
- Using Flume Client to Collect Logs from Kafka to HDFS
- Using Cascaded Agents to Collect Static Logs from Local Host to HBase
- Configuring an Encrypted Flume Data Collection Task
- Enterprise-Class Enhancements of Flume
- Flume O&M Management
- Common Issues About Flume
-
Using HBase
- Creating HBase Roles
- Using the HBase Client
- Quickly Using HBase for Offline Data Analysis
- Migrating Data to HBase Using BulkLoad
- HBase Data Operations
- Enterprise-Class Enhancements of HBase
- HBase Performance Tuning
- HBase O&M Management
-
Common Issues About HBase
- Operation Failures Occur When BulkLoad Is Stopped on the Client
- How Do I Restore a Region in the RIT State for a Long Time?
- What Should I Do If HMaster Exits Due to Timeout When Waiting for the Namespace Table to Go Online?
- Why Does SocketTimeoutException Occur When a Client Queries HBase?
- What Should I Do If Error Message "java.lang.UnsatisfiedLinkError: Permission denied" Is Displayed When I Start the HBase Shell?
- When Will the "Dead Region Servers" Information Displayed on the HMaster Web UI Be Cleared After a RegionServer Is Stopped?
- What Can I Do If a Message Indicating Insufficient Permission Is Displayed When I Access HBase Phoenix?
- How Do I Restore an HBase Region in Overlap State?
- Phoenix BulkLoad Use Restrictions
- Why Is a Message Indicating Insufficient Permission Displayed When CTBase Connects to the Ranger Plug-ins?
-
HBase Troubleshooting
- The HBase Client Failed to Connect to the Server for a Long Time
- An Exception Occurs When HBase Deletes and Creates a Table Consecutively
- Other Services Are Unstable When Too Many HBase Connections Occupy the Network Ports
- HBase BulkLoad Tasks of 210,000 Map Tasks and 10,000 Reduce Tasks Failed to Be Executed
- Modified and Deleted Data Can Still Be Queried by the Scan Command
- Failed to Create Tables When the Region Is in the FAILED_OPEN State
- How Do I Delete the Residual Table Name on the ZooKeeper table-lock Node After a Table Creation Failure?
- HBase Becomes Faulty When I Set a Quota for the Directory Used by HBase in HDFS
- HMaster Failed to Be Started After the OfflineMetaRepair Tool Is Used to Rebuild Metadata
- FileNotFoundException Is Frequently Printed in HMaster Logs
- Data Is Successfully Imported Using HBase BulkLoad, but Different Results May Be Returned to the Same Query
- HBase Data Restoration Task Failed to Be Rolled Back
- RegionServer Failed to Be Started When GC Parameters Xms and Xmx of HBase RegionServer Are Set to 31 GB
- When LoadIncrementalHFiles Is Used to Import Data in Batches on Cluster Nodes, the Insufficient Permission Error Is Reported
- "import argparse" Is Reported When the Phoenix Sqlline Script Is Used
-
Using HDFS
- Overview of HDFS File System Directories
- HDFS User Permission Management
- Using the HDFS Client
- Using Hadoop
- Configuring the Recycle Bin Mechanism
- Configuring HDFS DataNode Data Balancing
- Configuring HDFS Disk Balancing
- Using HDFS Mover to Migrate Data
- Configuring the Label Policy (NodeLabel) for HDFS File Directories
- Configuring NameNode Memory Parameters
- Setting the Number Limit of HBase and HDFS Handles
- Configuring the Number of Files in a Single HDFS Directory
- Enterprise-Class Enhancements of HDFS
-
HDFS Performance Tuning
- Improving HDFS Write Performance
- Improving Read Performance Using HDFS Client Metadata Caching
- Improving the HDFS Client Connection Performance with Active NameNode Caching
- Optimization for Unstable HDFS Network
- Optimizing HDFS NameNode RPC QoS
- Optimizing HDFS DataNode RPC QoS
- Performing Concurrent Operations on HDFS Files
- Using the LZC Compression Algorithm to Store HDFS Files
-
HDFS O&M Management
- HDFS Common Configuration Parameters
- HDFS Log Overview
- Viewing the HDFS Capacity
- Changing the DataNode Storage Directory
- Adjusting Parameters Related to Damaged DataNode Disk Volumes
- Configuring the Maximum Lifetime of an HDFS Token
- Using DistCp to Copy HDFS Data Across Clusters
- Configuring the NFS Server to Store NameNode Metadata
-
Common Issues About HDFS
- What Should I Do If an Error Is Reported When I Run DistCp Commands?
- When Does a Balance Process in HDFS Shut Down and Fail to Be Executed Again?
- "This page can't be displayed" Is Displayed When Internet Explorer Fails to Access the Native HDFS UI
- What Should I Do If the HDFS Web UI Cannot Update the Information About the Damaged Data?
- What Should I Do If the HDFS Client Is Irresponsive When the NameNode Is Overloaded for a Long Time?
- Why Are There Two Standby NameNodes After the Active NameNode Is Restarted?
- Why Does DataNode Fail to Report Data Blocks?
- Can I Modify the DataNode Data Storage Directory?
- What Can I Do If the DataNode Capacity Is Incorrectly Calculated?
- Why Is Data in the Cache Lost When Small Files Are Stored?
- Why Is the Storage Type of File Copies DISK When the Tiered Storage Policy Is LAZY_PERSIST?
- Why Are Some Blocks Missing on the NameNode UI?
-
HDFS Troubleshooting
- Why Is "java.net.SocketException" Reported When Data Is Written to HDFS?
- It Takes a Long Time to Restart NameNode After a Large Number of Files Are Deleted
- NameNode Fails to Be Restarted Due to EditLog Discontinuity
- The Standby NameNode Fails to Be Started After It Is Powered Off During Metadata Storage
- DataNode Fails to Be Started When the Number of Disks Defined in dfs.datanode.data.dir Equals the Value of dfs.datanode.failed.volumes.tolerated
- "ArrayIndexOutOfBoundsException: 0" Occurs When HDFS Invokes getsplit of FileInputFormat
-
Using Hive
- Hive User Permission Management
- Using the Hive Client
- Using Hive for Data Analysis
- Configuring Hive Data Storage and Encryption
- Hive on HBase
- Using Hive to Read Data in a Relational Database
-
Enterprise-Class Enhancement of Hive
- Configuring Automatic Removal of Old Data in the Hive Directory to the Recycle Bin
- Configuring Hive to Insert Data to a Directory That Does Not Exist
- Forbidding Location Specification When Hive Internal Tables Are Created
- Creating a Foreign Table in a Directory (Read and Execute Permission Granted)
- Configuring HTTPS/HTTP-based REST APIs
- Configuring Hive Transform
- Switching the Hive Execution Engine to Tez
- Hive Load Balancing
- Configuring Access Control Permission for the Dynamic View of a Hive Single Table
- Allowing Users without ADMIN Permission to Create Temporary Functions
- Allowing Users with Select Permission to View the Table Structure
- Allowing Only the Hive Administrator to Create Databases and Tables in the Default Database
- Configuring Hive to Support More Than 32 Roles
- Creating User-Defined Hive Functions
- Configuring High Reliability for Hive Beeline
- Hive Performance Tuning
- Hive O&M Management
- Common Hive SQL Syntax
-
Common Issues About Hive
- How Do I Delete All Permanent Functions from HiveServer?
- Why Can't the DROP Operation Be Performed on a Backed-Up Hive Table?
- How Do I Perform Operations on Local Files with Hive User-Defined Functions?
- How Do I Forcibly Stop MapReduce Jobs Executed by Hive?
- What Are the Special Characters Not Supported by Hive in Complex Field Names?
- How Do I Monitor the Hive Table Size?
- How Do I Prevent Data Loss Caused by Misoperations of the insert overwrite Statement?
- How Do I Handle a Slow Hive on Spark Task When HBase Is Not Installed?
- What Should I Do If an Error Is Reported When the WHERE Condition Is Used to Query Tables with Excessive Partitions in Hive?
- Why Cannot I Connect to HiveServer When I Use IBM JDK to Access the Beeline Client?
- Does the Location of a Hive Table Support Cross-OBS and Cross-HDFS Paths?
- What Should I Do If the MapReduce Engine Cannot Query the Data Written by the Union Statement Running on Tez?
- Does Hive Support Concurrent Data Writing to the Same Table or Partition?
- Does Hive Support Vectorized Query?
- What Should I Do If the Task Fails When the HDFS Data Directory of the Hive Table Is Deleted by Mistake but the Metadata Still Exists?
- How Do I Disable the Logging Function of Hive?
- Why Is the OBS Quick Deletion Directory Not Applied After Being Added to the Custom Hive Configuration?
- Hive Configuration Problems
- Hive Troubleshooting
-
Using Hudi
- Hudi Table Overview
- Creating a Hudi Table Using Spark Shell
- Operating a Hudi Table Using hudi-cli.sh
- Hudi Write Operation
- Hudi Read Operation
- Data Management and Maintenance
- Typical Hudi Configuration Parameters
- Hudi Performance Tuning
-
Common Issues About Hudi
-
Data Write
- Parquet/Avro schema Is Reported When Updated Data Is Written
- UnsupportedOperationException Is Reported When Updated Data Is Written
- SchemaCompatabilityException Is Reported When Updated Data Is Written
- What Should I Do If Hudi Consumes Much Space in a Temporary Folder During Upsert?
- Hudi Fails to Write Decimal Data with Lower Precision
- Data Collection
- Hive Synchronization
- Using Hue (Versions Earlier Than MRS 3.x)
-
Using Hue (MRS 3.x or Later)
- Accessing the Hue Web UI
- Using Hue WebUI to Operate Hive Tables
- Creating a Hue Job
- Typical Application Scenarios of the Hue Web UI
- Typical Hue Configurations
- Hue Log Overview
-
Common Issues About Hue
- Why Do HQL Statements Fail to Execute in Hue Using Internet Explorer?
- Why Does the use database Statement Become Invalid in Hive?
- Why Do HDFS Files Fail to Be Accessed Through the Hue Web UI?
- Why Do Large Files Fail to Upload on the Hue Page?
- Why Can't the Hue Native Page Be Properly Displayed If the Hive Service Is Not Installed in a Cluster?
- What Should I Do If It Takes a Long Time to Access the Native Hue UI and the File Browser Reports "Read timed out"?
- Using Impala
-
Using Kafka
- Kafka Data Consumption
- Kafka User Permission Management
- Using the Kafka Client
- Quickly Using Kafka to Produce and Consume Data
- Creating a Kafka Topic
- Checking the Consumption Information of Kafka Topics
- Managing Kafka Topics
- Enterprise-Class Enhancements of Kafka
- Kafka Performance Tuning
- Kafka O&M Management
- Common Issues About Kafka
- Using KafkaManager
-
Using Loader
- Using Loader from Scratch
- How to Use Loader
- Common Loader Parameters
- Creating a Loader Role
- Loader Link Configuration
- Managing Loader Links (Versions Earlier Than MRS 3.x)
- Managing Loader Links (MRS 3.x and Later Versions)
- Source Link Configurations of Loader Jobs
- Destination Link Configurations of Loader Jobs
- Managing Loader Jobs
- Preparing a Driver for MySQL Database Link
-
Importing Data
- Overview
- Importing Data Using Loader
- Typical Scenario: Importing Data from an SFTP Server to HDFS or OBS
- Typical Scenario: Importing Data from an SFTP Server to HBase
- Typical Scenario: Importing Data from an SFTP Server to Hive
- Typical Scenario: Importing Data from an FTP Server to HBase
- Typical Scenario: Importing Data from a Relational Database to HDFS or OBS
- Typical Scenario: Importing Data from a Relational Database to HBase
- Typical Scenario: Importing Data from a Relational Database to Hive
- Typical Scenario: Importing Data from HDFS or OBS to HBase
- Typical Scenario: Importing Data from a Relational Database to ClickHouse
- Typical Scenario: Importing Data from HDFS to ClickHouse
-
Exporting Data
- Overview
- Using Loader to Export Data
- Typical Scenario: Exporting Data from HDFS or OBS to an SFTP Server
- Typical Scenario: Exporting Data from HBase to an SFTP Server
- Typical Scenario: Exporting Data from Hive to an SFTP Server
- Typical Scenario: Exporting Data from HDFS or OBS to a Relational Database
- Typical Scenario: Exporting Data from HBase to a Relational Database
- Typical Scenario: Exporting Data from Hive to a Relational Database
- Typical Scenario: Importing Data from HBase to HDFS or OBS
- Managing Jobs
- Operator Help
-
Client Tools
- Running a Loader Job by Using Commands
- loader-tool Usage Guide
- loader-tool Usage Example
- schedule-tool Usage Guide
- schedule-tool Usage Example
- Using loader-backup to Back Up Job Data
- Open Source sqoop-shell Tool Usage Guide
- Example for Using the Open-Source sqoop-shell Tool (SFTP-HDFS)
- Example for Using the Open-Source sqoop-shell Tool (Oracle-HBase)
- Loader Log Overview
- Example: Using Loader to Import Data from OBS to HDFS
- Common Issues About Loader
- Using Kudu
-
Using MapReduce
- Configuring the Distributed Cache to Execute MapReduce Jobs
- Configuring the MapReduce Shuffle Address
- Configuring the MapReduce Cluster Administrator List
- Submitting a MapReduce Task on Windows
- Configuring the Archiving and Clearing Mechanism for MapReduce Task Logs
-
MapReduce Performance Tuning
- MapReduce Optimization Configuration for Multiple CPU Cores
- Configuring the Baseline Parameters for MapReduce Jobs
- MapReduce Shuffle Tuning
- AM Optimization for Big MapReduce Tasks
- Configuring Speculative Execution for MapReduce Tasks
- Tuning MapReduce Tasks Using Slow Start
- Optimizing the Commit Phase of MapReduce Tasks
- Improving MapReduce Client Task Reliability
- MapReduce Log Overview
-
Common Issues About MapReduce
- How Do I Handle the Problem That a MapReduce Task Has No Progress for a Long Time?
- Why Is the Client Unavailable When a Task Is Running?
- What Should I Do If HDFS_DELEGATION_TOKEN Cannot Be Found in the Cache?
- How Do I Set the Task Priority When Submitting a MapReduce Task?
- Why Does Physical Memory Overflow Occur If a MapReduce Task Fails?
- What Should I Do If MapReduce Job Information Cannot Be Opened Through Tracking URL on the ResourceManager Web UI?
- Why Do MapReduce Tasks Fail in an Environment with Multiple NameServices?
- What Should I Do If the Partition-based Task Blacklist Is Abnormal?
- Using OpenTSDB
-
Using Oozie
- Using Oozie Client to Submit an Oozie Job
-
Using Hue to Submit an Oozie Job
- Creating a Workflow Using Hue
- Submitting an Oozie Hive2 Job Using Hue
- Submitting an Oozie HQL Script Using Hue
- Submitting an Oozie Spark2x Job Using Hue
- Submitting an Oozie Java Job Using Hue
- Submitting an Oozie Loader Job Using Hue
- Submitting an Oozie MapReduce Job Using Hue
- Submitting an Oozie Sub Workflow Job Using Hue
- Submitting an Oozie Shell Job Using Hue
- Submitting an Oozie HDFS Job Using Hue
- Submitting an Oozie Streaming Job Using Hue
- Submitting an Oozie Distcp Job Using Hue
- Submitting an Oozie SSH Job Using Hue
- Submitting a Coordinator Periodic Scheduling Job Using Hue
- Submitting a Bundle Batch Processing Job Using Hue
- Querying Oozie Job Results on the Hue Web UI
- Configuring Mutual Trust Between Oozie Nodes
- Enabling Oozie High Availability (HA)
- Oozie Log Overview
- Common Issues About Oozie
- Using Presto
- Using Ranger (MRS 1.9.2)
-
Using Ranger (MRS 3.x)
- Logging In to the Ranger Web UI
- Enabling Ranger Authentication for MRS Cluster Services
- Adding a Ranger Permission Policy
-
Configuration Examples for Ranger Permission Policy
- Adding a Ranger Access Permission Policy for HDFS
- Adding a Ranger Access Permission Policy for HBase
- Adding a Ranger Access Permission Policy for Hive
- Adding a Ranger Access Permission Policy for Impala
- Adding a Ranger Access Permission Policy for Yarn
- Adding a Ranger Access Permission Policy for Spark2x
- Adding a Ranger Access Permission Policy for Kafka
- Adding a Ranger Access Permission Policy for Storm
- Viewing Ranger Audit Information
- Configuring Ranger Security Zone
- Changing the Ranger Data Source to LDAP for a Normal Cluster
- Viewing Ranger User Permission Synchronization Information
- Ranger Log Overview
-
Common Issues About Ranger
- Why Does Ranger Startup Fail During Cluster Installation?
- How Do I Determine Whether the Ranger Authentication Is Used for a Service?
- Why Can't a New User Log In to Ranger After Changing the Password?
- When an HBase Policy Is Added or Modified on Ranger, Wildcard Characters Cannot Be Used to Search for Existing HBase Tables
- Why Can't I View the Created MRS User on the Ranger Management Page?
- What Should I Do If MRS Users Failed to Be Synchronized to the Ranger Web UI?
- Using Spark (for Versions Earlier Than MRS 3.x)
-
Using Spark2x (for MRS 3.x or Later)
- Spark User Permission Management
- Using the Spark Client
- Configuring Spark to Read HBase Data
- Configuring Spark Tasks Not to Obtain HBase Token Information
- Spark Core Enterprise-Class Enhancements
- Spark SQL Enterprise-Class Enhancements
- Spark Streaming Enterprise-Class Enhancements
-
Spark Core Performance Tuning
- Spark Core Data Serialization
- Spark Core Memory Tuning
- Configuring Spark Core Broadcasting Variables
- Configuring Heap Memory Parameters for Spark Executor
- Using the External Shuffle Service to Improve Spark Core Performance
- Configuring Spark Dynamic Resource Scheduling in YARN Mode
- Adjusting Spark Core Process Parameters
- Spark DAG Design Specifications
- Experience
-
Spark SQL Performance Tuning
- Optimizing the Spark SQL Join Operation
- Improving Spark SQL Calculation Performance Under Data Skew
- Optimizing Spark SQL Performance in the Small File Scenario
- Optimizing the Spark INSERT SELECT Statement
- Optimizing Memory when Data Is Inserted into Dynamic Partitioned Tables
- Optimizing Small Files
- Optimizing the Aggregate Algorithms
- Optimizing Datasource Tables
- Merging CBO
- SQL Optimization for Multi-level Nesting and Hybrid Join
- Spark Streaming Performance Tuning
-
Spark O&M Management
- Configuring Parameters Rapidly
- Common Parameters
- Spark2x Logs
- Changing Spark Log Levels
- Viewing Container Logs on the Web UI
- Obtaining Container Logs of a Running Spark Application
- Configuring Spark Event Log Rollback
- Configuring the Number of Lost Executors Displayed in WebUI
- Configuring Local Disk Cache for JobHistory
- Enhancing Stability in a Limited Memory Condition
- Configuring Environment Variables in Yarn-Client and Yarn-Cluster Modes
- Broadening Support for Hive Partition Pruning Predicate Pushdown
- Configuring the Column Statistics Histogram to Enhance the CBO Accuracy
- Using CarbonData for First Query
-
Common Issues About Spark2x
-
Spark Core
- How Do I View Aggregated Spark Application Logs?
- Why Is the Return Code of Driver Inconsistent with Application State Displayed on ResourceManager WebUI?
- Why Can't the Driver Process Exit?
- Why Does FetchFailedException Occur When the Network Connection Times Out?
- How Do I Configure the Event Queue Size If the Event Queue Overflows?
- What Can I Do If the getApplicationReport Exception Is Recorded in Logs During Spark Application Execution and the Application Does Not Exit for a Long Time?
- What Can I Do If "Connection to ip:port has been quiet for xxx ms while there are outstanding requests" Is Reported When Spark Executes an Application and the Application Ends?
- Why Do Executors Fail to Be Removed After the NodeManager Is Shut Down?
- What Can I Do If the Message "Password cannot be null if SASL is enabled" Is Displayed?
- "Failed to CREATE_FILE" Is Displayed When Data Is Inserted into the Dynamic Partitioned Table Again
- Why Do Tasks Fail When Hash Shuffle Is Used?
- What Can I Do If the Error Message "DNS query failed" Is Displayed When I Access the Aggregated Logs Page of Spark Applications?
- What Can I Do If Shuffle Fetch Fails Due to the "Timeout Waiting for Task" Exception?
- Why Does the Stage Retry due to the Crash of the Executor?
- Why Do the Executors Fail to Register Shuffle Services During the Shuffle of a Large Amount of Data?
- NodeManager OOM Occurs During Spark Application Execution
- Why Does the Realm Information Fail to Be Obtained When SparkBench is Run on HiBench for the Cluster in Security Mode?
-
Spark SQL and DataFrame
- What Do I Have to Note When Using Spark SQL ROLLUP and CUBE?
- Why Is Spark SQL Displayed as a Temporary Table in Different Databases?
- How Do I Assign a Parameter Value in a Spark Command?
- What Directory Permissions Do I Need to Create a Table Using SparkSQL?
- Why Do I Fail to Delete the UDF Using Another Service?
- Why Cannot I Query Newly Inserted Data in a Parquet Hive Table Using SparkSQL?
- How Do I Use Cache Table?
- Why Are Some Partitions Empty During Repartition?
- Why Does 16 Terabytes of Text Data Fail to Be Converted into 4 Terabytes of Parquet Data?
- How Do I Rectify the Exception Occurred When I Perform an Operation on the Table Named table?
- Why Is a Task Suspended When the ANALYZE TABLE Statement Is Executed and Resources Are Insufficient?
- Why Is a Job Run Before "Missing Privileges" Is Displayed When I Access a Parquet Table on Which I Do Not Have Permission?
- Why Do I Fail to Modify MetaData by Running the Hive Command?
- Why Is "RejectedExecutionException" Displayed When I Exit Spark SQL?
- What Do I Do If I Accidentally Kill the JDBCServer Process During a Health Check?
- Why Is No Result Found When 2016-6-30 Is Set in the Date Field as the Filter Condition?
- Why Does the --hivevar Option I Specified in the Command for Starting spark-beeline Fail to Take Effect?
- Why Is the "Code of method ... grows beyond 64 KB" Error Message Displayed When I Run Complex SQL Statements?
- Why Is Memory Insufficient if 10 Terabytes of TPCDS Test Suites Are Consecutively Run in Beeline/JDBCServer Mode?
- Why Can't Functions Be Used When Different JDBCServers Are Connected?
- Why Does an Exception Occur When I Drop Functions Created Using the Add Jar Statement?
- Why Does Spark2x Have No Access to DataSource Tables Created by Spark1.5?
- Why Cannot I Query Newly Inserted Data in an ORC Hive Table Using Spark SQL?
-
Spark Streaming
- The Same DAG Log Is Recorded Twice for a Streaming Task
- What Can I Do If Spark Streaming Tasks Are Blocked?
- What Should I Pay Attention to When Optimizing Spark Streaming Task Parameters?
- Why Does the Spark Streaming Application Fail to Be Submitted After the Token Validity Period Expires?
- Why Does the Spark Streaming Application Fail to Be Started from the Checkpoint When the Input Stream Has No Output Logic?
- Why Is the Input Size Corresponding to Batch Time on the Web UI Set to 0 Records When Kafka Is Restarted During Spark Streaming Running?
- Why Is the Job Information Obtained from the RESTful Interface of an Ended Spark Application Incorrect?
- Why Cannot I Switch from the Yarn Web UI to the Spark Web UI?
- What Can I Do If an Error Occurs when I Access the Application Page Because the Application Cached by HistoryServer Is Recycled?
- Why Isn't an Application Displayed When I Run an Application with an Empty Part File?
- Why Does Spark2x Fail to Export a Table with the Same Field Name?
- Why Does a JRE Fatal Error Occur After a Spark Application Runs Multiple Times?
- Native Spark2x UI Fails to Be Accessed or Is Incorrectly Displayed when Internet Explorer Is Used for Access
- How Does Spark2x Access External Cluster Components?
- Why Does the Foreign Table Query Fail When Multiple Foreign Tables Are Created in the Same Directory?
- Why Is the Native Page of an Application in Spark2x JobHistory Displayed Incorrectly?
- Why Do I Fail to Create a Table in the Specified Location on OBS After Logging In to spark-beeline?
- Spark Shuffle Exception Handling
-
Using Sqoop
- Using Sqoop from Scratch
- Adapting Sqoop 1.4.7 to MRS 3.x Clusters
- Common Sqoop Commands and Parameters
-
Common Issues About Sqoop
- What Should I Do If Class QueryProvider Is Unavailable?
- What Should I Do If Method getHiveClient Does Not Exist?
- What Do I Do If PostgreSQL or GaussDB Fails to Connect?
- What Should I Do If Data Fails to Be Synchronized to a Hive Table on OBS Using hive-table?
- What Should I Do If Data Fails to Be Synchronized to an ORC or Parquet Table Using hive-table?
- What Should I Do If Data Fails to Be Synchronized Using hive-table?
- What Should I Do If Data Fails to Be Synchronized to a Hive Parquet Table Using HCatalog?
- What Should I Do If the Data Type of Fields timestamp and data Is Incorrect During Data Synchronization Between Hive and MySQL?
-
Using Storm
- Using Storm from Scratch
- Using the Storm Client
- Submitting Storm Topologies on the Client
- Accessing the Storm Web UI
- Managing Storm Topologies
- Querying Storm Topology Logs
- Storm Common Parameters
- Configuring a Storm Service User Password Policy
- Migrating Storm Services to Flink
- Storm Log Introduction
- Performance Tuning
- Using Tez
-
Using YARN
- YARN User Permission Management
- Submitting a Task Using the Yarn Client
- Configuring Container Log Aggregation
- Enabling Yarn CGroups to Limit the Container CPU Usage
-
Enterprise-Class Enhancement of YARN
- Configuring the Yarn Permission Control
- Specifying the User Who Runs Yarn Tasks
- Configuring the Number of ApplicationMaster Retries
- Configuring the ApplicationMaster to Automatically Adjust the Allocated Memory
- Configuring ApplicationMaster Work Preserving
- Configuring the Access Channel Protocol
- Configuring the Additional Scheduler WebUI
- Configuring Resources for a NodeManager Role Instance
- Configuring Yarn Restart
- Yarn Performance Tuning
- YARN O&M Management
-
Common Issues About Yarn
- Why Is the Mounted Directory for a Container Not Cleared After the Job Completes When CGroups Is Used?
- Why Does the Job Fail with an HDFS_DELEGATION_TOKEN Expired Exception?
- Why Are Local Logs Not Deleted After YARN Is Restarted?
- Why Doesn't the Task Fail Even Though AppAttempts Restarts More Than Twice?
- Why Is an Application Moved Back to the Original Queue After ResourceManager Restarts?
- Why Doesn't Yarn Release the Blacklist Even When All Nodes Are Added to the Blacklist?
- Why Does the Switchover of ResourceManager Occur Continuously?
- Why Does a New Application Fail If a NodeManager Has Been in Unhealthy Status for 10 Minutes?
- Why Does an Error Occur When I Query the ApplicationID of a Completed or Non-existing Application Using the RESTful APIs?
- Why May a Single NodeManager Fault Cause MapReduce Task Failures in Superior Scheduling Mode?
- Why Are Applications Suspended After They Are Moved From Lost_and_Found Queue to Another Queue?
- How Do I Limit the Size of Application Diagnostic Messages Stored in the ZKstore?
- Why Does a MapReduce Job Fail to Run When a Non-ViewFS File System Is Configured as ViewFS?
- Why Do Reduce Tasks Fail to Run in Some OSs After the Native Task Feature Is Enabled?
-
Using ZooKeeper
- Using ZooKeeper from Scratch
- Configuring the ZooKeeper Permissions
- Common ZooKeeper Parameters
- ZooKeeper Log Overview
-
Common Issues About ZooKeeper
- Why Do ZooKeeper Servers Fail to Start After Many znodes Are Created?
- Why Does the ZooKeeper Server Display the java.io.IOException: Len Error Log?
- Why Don't Four-Letter Commands Work with the Linux netcat Command When Secure Netty Configurations Are Enabled on the ZooKeeper Server?
- How Do I Check Which ZooKeeper Instance Is a Leader?
- Why Can't the Client Connect to ZooKeeper Using the IBM JDK?
- What Should I Do When the ZooKeeper Client Fails to Refresh a TGT?
- Why Is the Message "Node does not exist" Displayed When a Large Number of Znodes Are Deleted Using the deleteall Command?
- Appendix
-
Component Operation Guide (LTS)
-
Using CarbonData
- CarbonData Data Types
- CarbonData Table User Permissions
- Creating a CarbonData Table Using the Spark Client
- CarbonData Data Analytics
- CarbonData Performance Tuning
- Typical CarbonData Configuration Parameters
-
CarbonData Syntax Reference
- CREATE TABLE
- CREATE TABLE As SELECT
- DROP TABLE
- SHOW TABLES
- ALTER TABLE COMPACTION
- TABLE RENAME
- ADD COLUMNS
- DROP COLUMNS
- CHANGE DATA TYPE
- REFRESH TABLE
- REGISTER INDEX TABLE
- LOAD DATA
- UPDATE CARBON TABLE
- DELETE RECORDS from CARBON TABLE
- INSERT INTO CARBON TABLE
- DELETE SEGMENT by ID
- DELETE SEGMENT by DATE
- SHOW SEGMENTS
- CREATE SECONDARY INDEX
- SHOW SECONDARY INDEXES
- DROP SECONDARY INDEX
- CLEAN FILES
- SET/RESET
- Concurrent CarbonData Table Operations
- CarbonData Segment API
- CarbonData Tablespace Index
-
Common Issues About CarbonData
- Why Is Incorrect Output Displayed When I Perform Query with Filter on Decimal Data Type Values?
- How Do I Avoid Minor Compaction for Historical Data?
- How Do I Change the Default Group Name for CarbonData Data Loading?
- Why Does the INSERT INTO CARBON TABLE Command Fail?
- Why Is the Data Logged in Bad Records Different from the Original Input Data with Escape Characters?
- Why Does Data Load Performance Decrease Due to Bad Records?
- Why Does Data Loading Fail When Off-Heap Memory Is Used?
- Why Do I Fail to Create a Hive Table?
- How Do I Logically Split Data Across Different Namespaces?
- Why Can't the UPDATE Command Be Executed in Spark Shell?
- How Do I Configure Unsafe Memory in CarbonData?
- Why Does CarbonData Become Abnormal After the Disk Space Quota of the HDFS Storage Directory Is Set?
- Why Do Files of a Carbon Table Exist in the Recycle Bin Even If the drop table Command Is Not Executed When Mis-deletion Prevention Is Enabled?
- How Do I Restore the Latest tablestatus File That Has Been Lost or Damaged When TableStatus Versioning Is Enabled?
-
CarbonData Troubleshooting
- Filter Result Is Not Consistent with Hive When a Big Double Type Value Is Used in the Filter
- Query Performance Deteriorated Due to Insufficient Executor Memory
- Data Query or Loading Failed, and "org.apache.carbondata.core.memory.MemoryException: Not enough memory" Was Reported
- Why Is the INSERT INTO/LOAD DATA Task Distribution Incorrect, with Fewer Tasks Opened Than Available Executors, When the Number of Initial Executors Is Zero?
- Why Does CarbonData Require Additional Executors Even Though the Parallelism Is Greater Than the Number of Blocks to Be Processed?
-
Using CDL
- Integrating CDL Data
- CDL User Permission Management
- Creating a Data Synchronization Job with CDL
- Preparing for Creating a CDL Job
-
Creating a CDL Job
- Creating a CDL Data Synchronization Job
- Creating a CDL Data Comparison Job
- Synchronizing Data from PgSQL to Kafka Using CDL
- Synchronizing Data from PgSQL to Hudi Using CDL
- Synchronizing Data from openGauss to Hudi Using CDL
- Synchronizing Data from Hudi to DWS Using CDL
- Synchronizing Data from Hudi to ClickHouse Using CDL
- Synchronizing openGauss Data to Hudi Using CDL (ThirdKafka)
- Synchronizing drs-oracle-json Database to Hudi Using CDL (ThirdKafka)
- Synchronizing drs-oracle-avro Database to Hudi Using CDL (ThirdKafka)
- CDL Job DDL Changes
- CDL Log Overview
- Common Issues About CDL
-
CDL Troubleshooting
- Error 403 Is Reported When a CDL Job Is Stopped
- Error 104 or 143 Is Reported After a CDL Job Runs for a Period of Time
- Why Is the Value of Task configured for the OGG Source Different from the Actual Number of Running Tasks When Data Is Synchronized from OGG to Hudi?
- Why Are There Too Many Topic Partitions Corresponding to the CDL Synchronization Task Names?
- What Should I Do If an Error Message Indicating That the Current User Does Not Have the Permission to Create Tables Is Displayed When a CDL Task Synchronizes Data to Hudi?
- Error Is Reported When the Job of Capturing Data From PgSQL to Hudi Is Started
-
Using ClickHouse
- ClickHouse Overview
- ClickHouse User Permission Management
- ClickHouse Client Practices
- ClickHouse Data Import
- Enterprise-Class Enhancements of ClickHouse
- ClickHouse Performance Tuning
-
ClickHouse O&M Management
- ClickHouse Log Overview
- Enabling the Read-Only Mode for ClickHouse Tables
- Migrating Data Between ClickHouseServer Nodes in a Cluster
- Migrating ClickHouse Data from One MRS Cluster to Another
- Backing Up and Restoring ClickHouse Data Using a Data File
- Configuring TTL for the ClickHouse System Table
- Configuring the Default ClickHouse User Password (MRS 3.1.2-LTS)
- Clearing the Passwords of Default ClickHouse Users
-
Common ClickHouse SQL Syntax
- CREATE DATABASE: Creating a Database
- CREATE TABLE: Creating a Table
- INSERT INTO: Inserting Data into a Table
- DELETE: Lightweight Deletion of Table Data
- SELECT: Querying Table Data
- ALTER TABLE: Modifying a Table Structure
- ALTER TABLE: Modifying Table Data
- DESC: Querying a Table Structure
- DROP: Deleting a Table
- SHOW: Displaying Information About Databases and Tables
- UPSERT: Writing Data
-
Common Issues About ClickHouse
- What Do I Do If the Disk Status Displayed in the System.disks Table Is fault or abnormal?
- How Do I Migrate Data from Hive/HDFS to ClickHouse?
- How Do I Migrate Data from OBS/S3 to ClickHouse?
- An Error Is Reported in Logs When the Auxiliary ZooKeeper or Replica Data Is Used to Synchronize Table Data
- How Do I Grant the Select Permission at the Database Level to ClickHouse Users?
- How Do I Quickly Restore ClickHouse When Concurrent Requests Are Stacked for a Long Time?
- Using DBService
-
Using Doris
- Overview of the Doris Data Model
- Managing Doris User Permissions
- Using the MySQL Client to Connect to Doris
- Getting Started with Doris
- Importing Doris Data
- Exporting Doris Data
- Enterprise-Class Enhancements of Doris
- Doris O&M Management
- Typical SQL Syntax of Doris
-
Common Issues About Doris
- What Should I Do If an Error Occasionally Occurs During Table Creation Due to the Configuration of the SSD and HDD Data Directories?
- What Should I Do If an RPC Timeout Error Is Reported When Stream Load Is Used?
- What Do I Do If the Error Message "plugin not enabled" Is Displayed When the MySQL Client Is Used to Connect to the Doris Database?
- How Do I Handle the FE Startup Failure?
- How Do I Handle the Startup Failure Due to Incorrect IP Address Matching for the BE Instance?
- What Should I Do If the Error Message "Read timed out" Is Displayed When the MySQL Client Connects to Doris?
- What Should I Do If an Error Is Reported When the BE Runs a Data Import or Query Task?
- What Should I Do If a Timeout Error Is Reported When Broker Load Imports Data?
- What Should I Do If an Error Message Is Displayed When Broker Load Is Used to Import Data?
- Doris Troubleshooting
-
Using Flink
- Flink Job Engine
- Flink User Permission Management
- Using the Flink Client
- Preparing for Creating a FlinkServer Job
-
Creating a FlinkServer Job
- Creating a FlinkServer Job and Writing Data to a ClickHouse Table
- Creating a FlinkServer Job to Interconnect with a GaussDB(DWS) Table
- Creating a FlinkServer Job to Write Data to an HBase Table
- Creating a FlinkServer Job to Write Data to HDFS
- Creating a FlinkServer Job to Write Data to a Hive Table
- Creating a FlinkServer Job to Write Data to a Hudi Table
- Creating a FlinkServer Job to Write Data to a Kafka Message Queue
- Managing FlinkServer Jobs
- Enterprise-Class Enhancements of Flink
- Flink O&M Management
- Flink Performance Tuning
- Typical Commands of the Flink Client
- Common Flink SQL Syntax
- Common Issues About Flink
- Flink Troubleshooting
-
Using Flume
- Flume Log Collection
- Flume Service Model Configuration
- Installing the Flume Client
- Quickly Using Flume to Collect Node Logs
-
Configuring a Non-Encrypted Flume Data Collection Task
- Generating Configuration Files for the Flume Server and Client
- Using Flume Server to Collect Static Logs from Local Host to Kafka
- Using Flume Server to Collect Static Logs from Local Host to HDFS
- Using Flume Server to Collect Dynamic Logs from Local Host to HDFS
- Using Flume Server to Collect Logs from Kafka to HDFS
- Using Flume Client to Collect Logs from Kafka to HDFS
- Using Cascaded Agents to Collect Static Logs from Local Host to HBase
- Configuring an Encrypted Flume Data Collection Task
- Enterprise-Class Enhancements of Flume
- Flume O&M Management
- Common Issues About Flume
- Using Guardian
-
Using HBase
- Creating an HBase Permission Role
- Using the HBase Client
- Using HBase for Offline Data Analysis
- Migrating Data to HBase Using BulkLoad
- HBase Data Operations
-
Enterprise-Class Enhancements of HBase
-
Configuring HBase Global Secondary Indexes for Faster Queries
- Introduction to HBase Global Secondary Indexes
- Creating an HBase Global Secondary Index
- Querying an HBase Global Secondary Index
- Changing Status of HBase Global Secondary Indexes
- Creating HBase Global Secondary Indexes in Batches
- Checking HBase Global Secondary Index Data Consistency
- Querying HBase Table Data with Global Secondary Indexes
- Configuring HBase Local Secondary Indexes for Faster Queries
- Improving HBase BulkLoad Data Migration
- Using the Spark BulkLoad Tool to Synchronize Data to HBase Tables
- Configuring Hot-Cold Data Separation in HBase
- Configuring RSGroup to Manage RegionServer Resources
- Checking Slow and Oversized HBase Requests
- Configuring HBase Table-Level Overload Control
- Enabling the HBase Multicast Function
-
HBase Performance Tuning
- Improving the Batch Loading Efficiency of HBase BulkLoad
- Improving HBase Continuous Put Performance
- Improving HBase Put and Scan Performance
- Improving HBase Real-Time Write Efficiency
- Improving HBase Real-Time Read Efficiency
- Accelerating HBase Compaction During Off-Peak Hours
- Tuning HBase JVM Parameters
- Optimization for HBase Overload
- Enabling CCSMap Functions
- Enabling Succinct Trie
- HBase O&M Management
-
Common Issues About HBase
- Operation Failures Occur When Stopping BulkLoad on the Client
- How Do I Restore a Region in the RIT State for a Long Time?
- Why Does HMaster Exit Due to Timeout When Waiting for the NameSpace Table to Go Online?
- Why Does SocketTimeoutException Occur When a Client Queries HBase?
- Why Is the "java.lang.UnsatisfiedLinkError: Permission denied" Exception Thrown While Starting the HBase Shell?
- When Are the RegionServers Listed Under "Dead Region Servers" on the HMaster WebUI Cleared?
- Insufficient Rights When Accessing Phoenix
- How Do I Fix Region Overlapping?
- Restrictions on Using the Phoenix BulkLoad Tool
- Why Is a Message Indicating Insufficient Permission Displayed When CTBase Connects to the Ranger Plug-ins?
- Introduction to HBase Global Secondary Index APIs
-
HBase Troubleshooting
- Why Does a Client Keep Failing to Connect to a Server for a Long Time?
- Why May a Table Creation Exception Occur When HBase Deletes or Creates the Same Table Consecutively?
- Why Do Other Services Become Unstable If HBase Sets Up a Large Number of Connections over the Network Port?
- Why Does the HBase BulkLoad Task Consisting of 210,000 Map Tasks and 10,000 Reduce Tasks Fail?
- Why Can Modified and Deleted Data Still Be Queried Using the Scan Command?
- What Should I Do If I Fail to Create Tables Due to the FAILED_OPEN State of Regions?
- How Do I Delete Residual Table Names in the table-lock Directory of ZooKeeper?
- Why Does HBase Become Faulty When I Set a Quota for the Directory Used by HBase in HDFS?
- HMaster Fails to Be Started After the OfflineMetaRepair Tool Is Used to Rebuild Metadata
- Why Are Messages Containing FileNotFoundException Frequently Displayed in the HMaster Logs?
- Why Are Different Query Results Returned After I Use the Same Query Criteria to Query Data Successfully Imported by HBase BulkLoad?
- HBase Fails to Recover a Task
- Why Does RegionServer Fail to Be Started When GC Parameters Xms and Xmx of HBase RegionServer Are Set to 31 GB?
- Why Does the LoadIncrementalHFiles Tool Fail to Be Executed and "Permission denied" Is Displayed?
- Why Is the Error Message "import argparse" Displayed When the Phoenix sqlline Script Is Used?
- How Do I View Regions in the CLOSED State in an ENABLED Table?
- How Can I Quickly Recover the Service When HBase Files Are Damaged Due to a Cluster Power-Off?
- How Do I Quickly Restore HBase After HDFS Enters the Safe Mode and the HBase Service Is Abnormal?
-
Using HDFS
- Overview of HDFS File System Directories
- HDFS User Permission Management
- Using the HDFS Client
- Using Hadoop
- Configuring the Recycle Bin Mechanism
- Configuring HDFS DataNode Data Balancing
- Configuring HDFS Disk Balancing
- Using HDFS Mover to Migrate Data
- Configuring the Label Policy (NodeLabel) for HDFS File Directories
- Configuring NameNode Memory Parameters
- Setting the Number Limit of HBase and HDFS Handles
- Configuring the Number of Files in a Single HDFS Directory
- Enterprise-Class Enhancements of HDFS
-
HDFS Performance Tuning
- Improving HDFS Write Performance
- Improving Read Performance By HDFS Client Metadata Caching
- Improving the HDFS Client Connection Performance with Active NameNode Caching
- Optimization for Unstable HDFS Network
- Optimizing HDFS NameNode RPC QoS
- Optimizing HDFS DataNode RPC QoS
- Performing Concurrent Operations on HDFS Files
- Using the LZC Compression Algorithm to Store HDFS Files
-
HDFS O&M Management
- HDFS Common Configuration Parameters
- HDFS Log Overview
- Planning HDFS Capacity
- Changing the DataNode Storage Directory
- Configuring the Damaged Disk Volume
- Configuring the Maximum Lifetime of an HDFS Token
- Using DistCp to Copy HDFS Data Across Clusters
- Configuring the NFS Server to Store NameNode Metadata
-
Common Issues About HDFS
- What Should I Do If an Error Is Reported When I Run DistCp Commands?
- When Does a Balance Process in HDFS Shut Down and Fail to Be Executed Again?
- "This page can't be displayed" Is Displayed When Internet Explorer Fails to Access the Native HDFS UI
- What Should I Do If the HDFS Web UI Cannot Update the Information About the Damaged Data?
- What Should I Do If the HDFS Client Is Irresponsive When the NameNode Is Overloaded for a Long Time?
- Why Are There Two Standby NameNodes After the Active NameNode Is Restarted?
- Why Does DataNode Fail to Report Data Blocks?
- Can I Modify the DataNode Data Storage Directory?
- What Can I Do If the DataNode Capacity Is Incorrectly Calculated?
- Why Is Data in the Cache Lost When Small Files Are Stored?
- Why Is the Storage Type of File Copies DISK When the Tiered Storage Policy Is LAZY_PERSIST?
- Why Are Some Blocks Missing on the NameNode UI?
-
HDFS Troubleshooting
- Why Is "java.net.SocketException" Reported When Data Is Written to HDFS?
- It Takes a Long Time to Restart NameNode After a Large Number of Files Are Deleted
- NameNode Fails to Be Restarted Due to EditLog Discontinuity
- The Standby NameNode Fails to Be Started After It Is Powered Off During Metadata Storage
- DataNode Fails to Be Started When the Number of Disks Defined in dfs.datanode.data.dir Equals the Value of dfs.datanode.failed.volumes.tolerated
- "ArrayIndexOutOfBoundsException: 0" Occurs When HDFS Invokes getsplit of FileInputFormat
- The Standby NameNode Fails to Be Started Because It Is Not Started for a Long Time
-
Using HetuEngine
- Overview of HetuEngine Interactive Query
- HetuEngine User Permission Management
- Quickly Using HetuEngine to Access Hive Data Source
- Creating a HetuEngine Compute Instance
-
Adding a HetuEngine Data Source
- Using HetuEngine to Access Data Sources Across Sources and Domains
- Adding a Hive Data Source
- Adding a Hudi Data Source
- Adding a ClickHouse Data Source
- Adding a GaussDB Data Source
- Adding an HBase Data Source
- Adding a Cross-Cluster HetuEngine Data Source
- Adding an IoTDB Data Source
- Adding a MySQL Data Source
-
Configuring HetuEngine Materialized Views
- Overview of HetuEngine Materialized Views
- SQL Examples of HetuEngine Materialized Views
- Rewriting of HetuEngine Materialized Views
- HetuEngine Materialized View Recommendation
- HetuEngine Materialized View Caching
- Validity Period and Data Update of HetuEngine Materialized Views
- HetuEngine Intelligent Materialized Views
- Automatic Tasks of HetuEngine Materialized Views
- HetuEngine SQL Diagnosis
- Developing and Deploying HetuEngine UDFs
- Managing a HetuEngine Data Source
-
Managing HetuEngine Compute Instances
- Configuring HetuEngine Resource Groups
- Configuring the Number of HetuEngine Worker Nodes
- Configuring a HetuEngine Maintenance Instance
- Configuring the Nodes on Which HetuEngine Coordinator Is Running
- Importing and Exporting HetuEngine Compute Instance Configurations
- Viewing the HetuEngine Instance Monitoring Page
- Viewing HetuEngine Coordinator and Worker Logs
- Configuring HetuEngine Query Fault Tolerance
-
HetuEngine Performance Tuning
- Adjusting YARN Resource Allocation
- Adjusting HetuEngine Cluster Node Resource Configurations
- Optimizing HetuEngine INSERT Statements
- Adjusting HetuEngine Metadata Caching
- Enabling Dynamic Filtering in HetuEngine
- Adjusting the Execution of Adaptive Queries in HetuEngine
- Adjusting Timeout for Hive Metadata Loading
- HetuEngine Log Overview
-
Common HetuEngine SQL Syntax
- HetuEngine Data Type
-
HetuEngine DDL SQL Syntax
- CREATE SCHEMA
- CREATE VIRTUAL SCHEMA
- CREATE TABLE
- CREATE TABLE AS
- CREATE TABLE LIKE
- CREATE VIEW
- CREATE FUNCTION
- CREATE MATERIALIZED VIEW
- ALTER MATERIALIZED VIEW STATUS
- ALTER MATERIALIZED VIEW
- ALTER TABLE
- ALTER VIEW
- ALTER SCHEMA
- DROP SCHEMA
- DROP TABLE
- DROP VIEW
- DROP FUNCTION
- DROP MATERIALIZED VIEW
- REFRESH MATERIALIZED VIEW
- TRUNCATE TABLE
- COMMENT
- VALUES
- SHOW Syntax Overview
- SHOW CATALOGS
- SHOW SCHEMAS (DATABASES)
- SHOW TABLES
- SHOW TBLPROPERTIES TABLE|VIEW
- SHOW TABLE/PARTITION EXTENDED
- SHOW STATS
- SHOW FUNCTIONS
- SHOW SESSION
- SHOW PARTITIONS
- SHOW COLUMNS
- SHOW CREATE TABLE
- SHOW VIEWS
- SHOW CREATE VIEW
- SHOW MATERIALIZED VIEWS
- SHOW CREATE MATERIALIZED VIEW
- HetuEngine DML SQL Syntax
- HetuEngine TCL SQL Syntax
- HetuEngine DQL SQL Syntax
-
HetuEngine SQL Functions and Operators
- Logical Operators
- Comparison Functions and Operators
- Condition Expression
- Lambda Expression
- Conversion Functions
- Mathematical Functions and Operators
- Bitwise Functions
- Decimal Functions and Operators
- String Functions and Operators
- Regular Expressions
- Binary Functions and Operators
- JSON Functions and Operators
- Date and Time Functions and Operators
- Aggregate Functions
- Window Functions
- Array Functions and Operators
- Map Functions and Operators
- URL Function
- Geospatial Function
- HyperLogLog Functions
- UUID Function
- Color Function
- Session Information
- Teradata Function
- Data Masking Functions
- IP Address Functions
- Quantile Digest Functions
- T-Digest Functions
- Set Digest Functions
- HetuEngine Auxiliary Command Syntax
- HetuEngine Reserved Keywords
- HetuEngine Implicit Data Type Conversion
- Data Preparation for the Sample Table
- HetuEngine Syntax Compatibility with Common Data Sources
- Common Issues About HetuEngine
- HetuEngine Troubleshooting
-
Using Hive
- Hive User Permission Management
- Using the Hive Client
- Using Hive for Data Analysis
- Configuring Hive Data Storage and Encryption
- Hive on HBase
- Using Hive to Read Data in a Relational Database
- Hive Supporting Reading Hudi Tables
-
Enterprise-Class Enhancements of Hive
- Storing Hive Table Partitions to OBS and HDFS
- Configuring Automatic Removal of Old Data in the Hive Directory to the Recycle Bin
- Configuring Hive to Insert Data to a Directory That Does Not Exist
- Forbidding Location Specification When Hive Internal Tables Are Created
- Creating a Foreign Table in a Directory (Read and Execute Permission Granted)
- Configuring HTTPS/HTTP-based REST APIs
- Configuring Hive Transform
- Switching the Hive Execution Engine to Tez
- Hive Load Balancing
- Configuring Access Control Permission for the Dynamic View of a Hive Single Table
- Allowing Users without ADMIN Permission to Create Temporary Functions
- Allowing Users with Select Permission to View the Table Structure
- Allowing Only the Hive Administrator to Create Databases and Tables in the Default Database
- Configuring Hive to Support More Than 32 Roles
- Creating User-Defined Hive Functions
- Configuring High Reliability for Hive Beeline
- Detecting Statements That Overwrite a Table with Its Own Data
- Configuring Hive Dynamic Data Masking
- Hive Performance Tuning
- Hive O&M Management
- Common Hive SQL Syntax
-
Common Issues About Hive
- How Do I Delete UDFs on Multiple HiveServers?
- Why Can't the DROP Operation Be Performed on a Backed-up Hive Table?
- How Do I Perform Operations on Local Files with Hive User-Defined Functions?
- How Do I Forcibly Stop MapReduce Jobs Executed by Hive?
- Which Special Characters Are Not Supported by Hive in Complex Field Names?
- How Do I Monitor the Hive Table Size?
- How Do I Prevent Data Loss Caused by Misoperations of the insert overwrite Statement?
- Why Does a Hive on Spark Task Freeze When HBase Is Not Installed?
- Error Reported When the WHERE Condition Is Used to Query Tables with Excessive Partitions in FusionInsight Hive
- Why Cannot I Connect to HiveServer When I Use IBM JDK to Access the Beeline Client?
- Description of Hive Table Location (Either an OBS or HDFS Path)
- Why Cannot Data Be Queried After Switching to the MapReduce Engine When Union Statements Were Executed Using the Tez Engine?
- Why Does Hive Not Support Concurrent Data Writing to the Same Table or Partition?
- Does Hive Support Vectorized Query?
- Why Does Metadata Still Exist When the HDFS Data Directory of the Hive Table Is Deleted by Mistake?
- How Do I Disable the Logging Function of Hive?
- Why Do Hive Tables in the OBS Directory Fail to Be Deleted?
- Why Does an OBS Quickly Deleted Directory Not Take Effect After Being Added to the Customized Hive Configuration?
- What Do I Do If Error Message "Not expecting to handle any events" Is Displayed When the Tez Engine Is Used to Execute Hive SQL Tasks?
- What Do I Do If Error Message "Client cannot authenticate via:[TOKEN, KERBEROS]" Is Displayed When the Tez Engine Is Used to Execute Hive SQL Tasks?
- Hive Troubleshooting
-
Using Hudi
- Hudi Table Overview
- Creating a Hudi Table Using Spark Shell
- Operating a Hudi Table Using hudi-cli.sh
- Hudi Write Operation
- Hudi Read Operation
- Hudi Data Management and Maintenance
- Hudi SQL Syntax Reference
- Hudi Schema Evolution
- Configuring Default Values for Hudi Data Columns
- Typical Hudi Configuration Parameters
- Hudi Performance Tuning
-
Common Issues About Hudi
- "Parquet/Avro schema" Is Reported When Updated Data Is Written
- UnsupportedOperationException Is Reported When Updated Data Is Written
- SchemaCompatabilityException Is Reported When Updated Data Is Written
- What Should I Do If Hudi Consumes Much Space in a Temporary Folder During Upsert?
- Hudi Fails to Write Decimal Data with Lower Precision
- Data in ro and rt Tables Cannot Be Synchronized to a MOR Table Recreated After Being Deleted Using Spark SQL
- IllegalArgumentException Is Reported When Kafka Is Used to Collect Data
- SQLException Is Reported During Hive Data Synchronization
- HoodieHiveSyncException Is Reported During Hive Data Synchronization
- SemanticException Is Reported During Hive Data Synchronization
-
Using Hue
- Accessing the Hue Web UI
- Creating a Hue Job
- Configuring HDFS Cold and Hot Data Migration
- Typical Hue Parameters
- Hue Log Overview
- Common Issues About Hue
-
Hue Troubleshooting
- Why Does the use database Statement Become Invalid in Hive?
- Why Do HDFS Files Fail to Be Accessed Through the Hue Web UI?
- Why Do Large Files Fail to Upload on the Hue Page?
- Why Cannot the Hue Native Page Be Properly Displayed If the Hive Service Is Not Installed in a Cluster?
- What Should I Do If It Takes a Long Time to Access the Native Hue UI and the File Browser Reports "Read timed out"?
- Using IoTDB
- Using JobGateway
- Using Kafka
-
Using Loader
- Overview of Importing and Exporting Loader Data
- Loader User Permission Management
- Uploading the MySQL Database Connection Driver
-
Creating a Loader Data Import Job
- Using Loader to Import Data to an MRS Cluster
- Using Loader to Import Data from an SFTP Server to HDFS or OBS
- Using Loader to Import Data from an SFTP Server to HBase
- Using Loader to Import Data from an SFTP Server to Hive
- Using Loader to Import Data from an FTP Server to HBase
- Using Loader to Import Data from a Relational Database to HDFS or OBS
- Using Loader to Import Data from a Relational Database to HBase
- Using Loader to Import Data from a Relational Database to Hive
- Using Loader to Import Data from HDFS or OBS to HBase
- Using Loader to Import Data from a Relational Database to ClickHouse
- Using Loader to Import Data from HDFS to ClickHouse
-
Creating a Loader Data Export Job
- Using Loader to Export Data from an MRS Cluster
- Using Loader to Export Data from HDFS or OBS to an SFTP Server
- Using Loader to Export Data from HBase to an SFTP Server
- Using Loader to Export Data from Hive to an SFTP Server
- Using Loader to Export Data from HDFS or OBS to a Relational Database
- Using Loader to Export Data from HDFS to MOTService
- Using Loader to Export Data from HBase to a Relational Database
- Using Loader to Export Data from Hive to a Relational Database
- Using Loader to Export Data from HBase to HDFS or OBS
- Using Loader to Export Data from HDFS to ClickHouse
- Managing Loader Jobs
- Loader O&M Management
- Loader Operator Help
-
Loader Client Tools
- Running a Loader Job by Using Commands
- loader-tool Usage Guide
- loader-tool Usage Example
- schedule-tool Usage Guide
- schedule-tool Usage Example
- Using loader-backup to Back Up Job Data
- Open Source sqoop-shell Tool Usage Guide
- Importing Data to HDFS Using sqoop-shell
-
Common Issues About Loader
- Data Cannot Be Saved When Loader Jobs Are Configured
- Differences Among Connectors Used When Importing Data from an Oracle Database to HDFS
- Why Is Data Not Imported to HDFS After All Data Types of SQL Server Are Selected?
- An Error Is Reported When a Large Amount of Data Is Written to HDFS
- Failed to Run Jobs Related to the sftp-connector Connector
-
Using MapReduce
- Configuring the Distributed Cache to Execute MapReduce Jobs
- Configuring the MapReduce Shuffle Address
- Configuring the MapReduce Cluster Administrator List
- Submitting a MapReduce Task on Windows
- Configuring the Archiving and Clearing Mechanism for MapReduce Task Logs
-
MapReduce Performance Tuning
- MapReduce Optimization Configuration for Multiple CPU Cores
- Configuring the Baseline Parameters for MapReduce Jobs
- MapReduce Shuffle Tuning
- AM Optimization for Big MapReduce Tasks
- Configuring Speculative Execution for MapReduce Tasks
- Tuning MapReduce Tasks Using Slow Start
- Optimizing the Commit Phase of MapReduce Tasks
- Improving MapReduce Client Task Reliability
- MapReduce Log Overview
-
Common Issues About MapReduce
- How Do I Handle the Problem That a MapReduce Task Makes No Progress for a Long Time?
- Why Is the Client Unavailable When a Task Is Running?
- What Should I Do If HDFS_DELEGATION_TOKEN Cannot Be Found in the Cache?
- How Do I Set the Task Priority When Submitting a MapReduce Task?
- Why Does Physical Memory Overflow Occur If a MapReduce Task Fails?
- What Should I Do If MapReduce Job Information Cannot Be Opened Through Tracking URL on the ResourceManager Web UI?
- Why Do MapReduce Tasks Fail in an Environment with Multiple NameServices?
- What Should I Do If the Partition-based Task Blacklist Is Abnormal?
-
Using Oozie
- Submitting a Job Using the Oozie Client
-
Using Hue to Submit an Oozie Job
- Creating a Workflow Using Hue
- Submitting an Oozie Hive2 Job Using Hue
- Submitting an Oozie HQL Script Using Hue
- Submitting an Oozie Spark2x Job Using Hue
- Submitting an Oozie Java Job Using Hue
- Submitting an Oozie Loader Job Using Hue
- Submitting an Oozie MapReduce Job Using Hue
- Submitting an Oozie Sub-workflow Job Using Hue
- Submitting an Oozie Shell Job Using Hue
- Submitting an Oozie HDFS Job Using Hue
- Submitting an Oozie Streaming Job Using Hue
- Submitting an Oozie DistCp Job Using Hue
- Submitting an Oozie SSH Job Using Hue
- Submitting a Coordinator Periodic Scheduling Job Using Hue
- Submitting a Bundle Batch Processing Job Using Hue
- Querying Oozie Job Results on the Hue Page
- Configuring Mutual Trust Between Oozie Nodes
- Enterprise-Class Enhancements of Oozie
- Oozie Log Overview
- Common Issues About Oozie
-
Using Ranger
- Enabling Ranger Authentication for MRS Cluster Services
- Logging In to the Ranger Web UI
- Adding a Ranger Permission Policy
-
Configuration Examples for Ranger Permission Policy
- Adding a Ranger Access Permission Policy for CDL
- Adding a Ranger Access Permission Policy for HDFS
- Adding a Ranger Access Permission Policy for HBase
- Adding a Ranger Access Permission Policy for Hive
- Adding a Ranger Access Permission Policy for Yarn
- Adding a Ranger Access Permission Policy for Spark2x
- Adding a Ranger Access Permission Policy for Kafka
- Adding a Ranger Access Permission Policy for HetuEngine
- Adding a Ranger Access Permission Policy for OBS
- Hive Tables Supporting Cascading Authorization
- Viewing Ranger Audit Information
- Configuring Ranger Security Zone
- Changing the Ranger Data Source to LDAP for a Normal Cluster
- Viewing Ranger User Permission Synchronization Information
- Ranger Performance Tuning
- Ranger Log Overview
-
Common Issues About Ranger
- How Do I Determine Whether the Ranger Authentication Is Used for a Service?
- Why Cannot a New User Log In to Ranger After Changing the Password?
- What Should I Do If I Cannot View the Created MRS User on the Ranger Management Page?
- What Should I Do If MRS Users Fail to Be Synchronized to the Ranger Web UI?
- Ranger Troubleshooting
-
Using Spark/Spark2x
- Spark Usage Instruction
- Spark User Permission Management
- Using the Spark Client
- Accessing the Spark Web UI
- Submitting a Spark Job as a Proxy User
- Configuring Spark to Read HBase Data
- Configuring Spark Tasks Not to Obtain HBase Token Information
-
Spark Core Enterprise-Class Enhancements
- Configuring Spark HA to Enhance Availability
- Configuring the Size of the Spark Event Queue
- Configuring the Compression Format of a Parquet Table
- Adapting to the Third-party JDK When Ranger Is Used
- Using the Spark Small File Combination Tool (MRS 3.3.0 or later)
- Using the Spark Small File Combination Tool (Versions Earlier Than MRS 3.3.0)
- Configuring Streaming Reading of Spark Driver Execution Results
- Enabling a Spark Executor to Execute Custom Code When Exiting
- Spark SQL Enterprise-Class Enhancements
- Spark Streaming Enterprise-Class Enhancements
-
Spark Core Performance Tuning
- Spark Core Data Serialization
- Spark Core Memory Tuning
- Setting Spark Core DOP
- Configuring Spark Core Broadcasting Variables
- Configuring Heap Memory Parameters for Spark Executor
- Using the External Shuffle Service to Improve Spark Core Performance
- Configuring Spark Dynamic Resource Scheduling in YARN Mode
- Adjusting Spark Core Process Parameters
- Spark DAG Design Specifications
- Experience Summary
-
Spark SQL Performance Tuning
- Optimizing the Spark SQL Join Operation
- Improving Spark SQL Calculation Performance Under Data Skew
- Optimizing Spark SQL Performance in the Small File Scenario
- Optimizing the Spark INSERT SELECT Statement
- Configuring Multiple Concurrent Clients to Connect to JDBCServer
- Configuring the Default Number of Data Blocks Divided by SparkSQL
- Optimizing Memory When Data Is Inserted into Spark Dynamic Partitioned Tables
- Optimizing Small Files
- Optimizing the Aggregate Algorithms
- Optimizing Datasource Tables
- Merging CBO
- SQL Optimization for Multi-level Nesting and Hybrid Join
- Spark Streaming Performance Tuning
- Spark on OBS Performance Tuning
-
Spark O&M Management
- Configuring Spark Parameters Rapidly
- Spark Common Configuration Parameters
- Spark Log Overview
- Obtaining Container Logs of a Running Spark Application
- Changing Spark Log Levels
- Viewing Container Logs on the Web UI
- Configuring the Number of Lost Executors Displayed on the Web UI
- Configuring Local Disk Cache for JobHistory
- Configuring Spark Event Log Rollback
- Enhancing Stability in a Limited Memory Condition
- Configuring Environment Variables in Yarn-Client and Yarn-Cluster Modes
- Broadening Support for Hive Partition Pruning Predicate Pushdown
- Configuring the Column Statistics Histogram for Higher CBO Accuracy
- Using CarbonData for First Query
-
Common Issues About Spark
-
Spark Core
- How Do I View Aggregated Spark Application Logs?
- Why Is the Return Code of the Driver Inconsistent with the Application State Displayed on the ResourceManager Web UI?
- Why Cannot the Driver Process Exit?
- Why Does FetchFailedException Occur When the Network Connection Times Out?
- How Do I Configure the Event Queue Size If the Event Queue Overflows?
- What Can I Do If the getApplicationReport Exception Is Recorded in Logs During Spark Application Execution and the Application Does Not Exit for a Long Time?
- What Can I Do If "Connection to ip:port has been quiet for xxx ms while there are outstanding requests" Is Reported When Spark Executes an Application and the Application Ends?
- Why Do Executors Fail to Be Removed After the NodeManager Is Shut Down?
- What Can I Do If the Message "Password cannot be null if SASL is enabled" Is Displayed?
- "Failed to CREATE_FILE" Is Displayed When Data Is Inserted into the Dynamic Partitioned Table Again
- Why Do Tasks Fail When Hash Shuffle Is Used?
- What Can I Do If the Error Message "DNS query failed" Is Displayed When I Access the Aggregated Logs Page of Spark Applications?
- What Can I Do If Shuffle Fetch Fails Due to the "Timeout Waiting for Task" Exception?
- Why Does the Stage Retry due to the Crash of the Executor?
- Why Do the Executors Fail to Register Shuffle Services During the Shuffle of a Large Amount of Data?
- NodeManager OOM Occurs During Spark Application Execution
-
Spark SQL and DataFrame
- What Do I Have to Note When Using Spark SQL ROLLUP and CUBE?
- Why Is Spark SQL Displayed as a Temporary Table in Different Databases?
- How Do I Assign a Parameter Value in a Spark Command?
- What Directory Permissions Do I Need to Create a Table Using SparkSQL?
- Why Do I Fail to Delete the UDF Using Another Service?
- Why Cannot I Query Newly Inserted Data in a Parquet Hive Table Using SparkSQL?
- How Do I Use Cache Table?
- Why Are Some Partitions Empty During Repartition?
- Why Does 16 Terabytes of Text Data Fail to Be Converted into 4 Terabytes of Parquet Data?
- How Do I Rectify the Exception That Occurs When I Perform an Operation on the Table Named table?
- Why Is a Task Suspended When the ANALYZE TABLE Statement Is Executed and Resources Are Insufficient?
- If I Access a parquet Table on Which I Do Not Have Permission, Why Is a Job Run Before "Missing Privileges" Is Displayed?
- Why Is "RejectedExecutionException" Displayed When I Exit Spark SQL?
- What Do I Do If I Accidentally Kill the JDBCServer Process During a Health Check?
- Why Is No Result Found When 2016-6-30 Is Set as the Filter Condition in the Date Field?
- Why Is the "Code of method ... grows beyond 64 KB" Error Message Displayed When I Run Complex SQL Statements?
- Why Is Memory Insufficient if 10 Terabytes of TPCDS Test Suites Are Consecutively Run in Beeline/JDBCServer Mode?
- Why Cannot Functions Be Used When Different JDBCServers Are Connected?
- Why Does an Exception Occur When I Drop Functions Created Using the Add Jar Statement?
- Why Does Spark2x Have No Access to DataSource Tables Created by Spark1.5?
- Why Cannot I Query Newly Inserted Data in an ORC Hive Table Using Spark SQL?
-
Spark Streaming
- Same DAG Log Is Recorded Twice for a Streaming Task
- What Can I Do If Spark Streaming Tasks Are Blocked?
- What Should I Pay Attention to When Optimizing Spark Streaming Task Parameters?
- Why Does the Spark Streaming Application Fail to Be Submitted After the Token Validity Period Expires?
- Why Does the Spark Streaming Application Fail to Be Started from the Checkpoint When the Input Stream Has No Output Logic?
- Why Is the Input Size Corresponding to Batch Time on the Web UI Set to 0 Records When Kafka Is Restarted During Spark Streaming Running?
- What Should I Do If the Recycle Bin Version I Set on the Spark Client Does Not Take Effect?
- How Do I Change the Log Level to INFO When Using Spark yarn-client?
-
Spark Troubleshooting
- Why Is the Job Information Obtained from the RESTful Interface of an Ended Spark Application Incorrect?
- Why Cannot I Switch from the Yarn Web UI to the Spark Web UI?
- What Can I Do If an Error Occurs when I Access the Application Page Because the Application Cached by HistoryServer Is Recycled?
- Apps Cannot Be Displayed on the JobHistory Page When an Empty Part File Is Loaded
- Why Does Spark Fail to Export a Table with the Same Field Name?
- Why Does a JRE Fatal Error Occur After a Spark Application Is Run Multiple Times?
- Native Spark2x UI Fails to Be Accessed or Is Incorrectly Displayed When Internet Explorer Is Used for Access
- How Does Spark2x Access External Cluster Components?
- Why Does the Foreign Table Query Fail When Multiple Foreign Tables Are Created in the Same Directory?
- Why Is the Native Page of an Application in Spark2x JobHistory Displayed Incorrectly?
- Why Do I Fail to Create a Table in the Specified Location on OBS After Logging In to spark-beeline?
- Spark Shuffle Exception Handling
- Why Cannot Common Users Log In to the Spark Client When There Are Multiple Service Scenarios in Spark?
- Why Does the Cluster Port Fail to Connect When a Client Outside the Cluster Is Installed or Used?
- How Do I Handle the Exception Occurred When I Query Datasource Avro Formats?
- What Should I Do If Statistics of Hudi or Hive Tables Created Using Spark SQLs Are Empty Before Data Is Inserted?
- Failed to Query Table Statistics by Partition Using Non-Standard Time Format When the Partition Column in the Table Creation Statement is timestamp
- How Do I Use Special Characters with TIMESTAMP and DATE?
- Using Sqoop
- Using Tez
-
Using YARN
- Yarn User Permission Management
- Submitting a Task Using the Yarn Client
- Configuring Container Log Aggregation
- Enabling Yarn CGroups to Limit the Container CPU Usage
- Configuring HA for TimelineServer
-
Enterprise-Class Enhancements of Yarn
- Configuring the Yarn Permission Control
- Specifying the User Who Runs Yarn Tasks
- Configuring the Number of ApplicationMaster Retries
- Configuring the ApplicationMaster to Automatically Adjust the Allocated Memory
- Configuring ApplicationMaster Work Preserving
- Configuring the Access Channel Protocol
- Configuring the Additional Scheduler WebUI
- Configuring Resources for a NodeManager Role Instance
- Configuring Yarn Restart
- Yarn Performance Tuning
- Yarn O&M Management
-
Common Issues About Yarn
- Why Is the Mounted Directory for a Container Not Cleared After the Job Is Complete When CGroups Is Used?
- Why Does a Job Fail with an HDFS_DELEGATION_TOKEN Expired Exception?
- Why Are Local Logs Not Deleted After YARN Is Restarted?
- Why Does a Task Not Fail Even Though AppAttempts Restarts More Than Two Times?
- Why Is an Application Moved Back to the Original Queue After ResourceManager Restarts?
- Why Does Yarn Not Release the Blacklist Even When All Nodes Are Added to the Blacklist?
- Why Does the Switchover of ResourceManager Occur Continuously?
- Why Does a New Application Fail If a NodeManager Has Been in Unhealthy Status for 10 Minutes?
- Why Does an Error Occur When I Query the ApplicationID of a Completed or Non-existing Application Using the RESTful APIs?
- Why May a Single NodeManager Fault Cause MapReduce Task Failures in the Superior Scheduling Mode?
- Why Are Applications Suspended After They Are Moved From Lost_and_Found Queue to Another Queue?
- How Do I Limit the Size of Application Diagnostic Messages Stored in the ZKstore?
- Why Does a MapReduce Job Fail to Run When a Non-ViewFS File System Is Configured as ViewFS?
- Why Do Reduce Tasks Fail to Run in Some OSs After the Native Task Feature is Enabled?
-
Using ZooKeeper
- Using ZooKeeper from Scratch
- Configuring the ZooKeeper Permissions
- ZooKeeper Common Configuration Parameters
- ZooKeeper Log Overview
-
Common Issues About ZooKeeper
- Why Do ZooKeeper Servers Fail to Start After Many znodes Are Created?
- Why Does the ZooKeeper Server Display the java.io.IOException: Len Error Log?
- Why Don't Four-Letter Commands Work with the Linux netcat Command When Secure Netty Configurations Are Enabled on the ZooKeeper Server?
- How Do I Check Which ZooKeeper Instance Is a Leader?
- Why Cannot the Client Connect to ZooKeeper Using the IBM JDK?
- What Should I Do When the ZooKeeper Client Fails to Refresh a TGT?
- Why Is the Message "Node does not exist" Displayed When a Large Number of Znodes Are Deleted Using the deleteall Command?
- Appendix
-
Using CarbonData
-
Best Practices
-
Data Analytics
- Using Spark2x to Analyze IoV Drivers' Driving Behavior
- Using Hive to Load HDFS Data and Analyze Book Scores
- Using Hive to Load OBS Data and Analyze Enterprise Employee Information
- Using Flink Jobs to Process OBS Data
- Consuming Kafka Data Using Spark Streaming Jobs
- Using Flume to Collect Log Files from a Specified Directory to HDFS
- Kafka-based WordCount Data Flow Statistics Case
- Data Migration
- Interconnection with Other Cloud Services
-
Interconnection with Ecosystem Components
- Using DBeaver to Access Phoenix
- Using DBeaver to Access MRS HetuEngine
- Using Tableau to Access MRS HetuEngine
- Using Yonghong BI to Access MRS HetuEngine
- Interconnecting Hive with External Self-Built Relational Databases
- Configuring Interconnection between MRS Hive and External LDAP
- Using Jupyter Notebook to Connect to MRS Spark
- MRS Cluster Management
-
Developer Guide
-
Developer Guide (LTS)
- Description
- Obtaining Sample Projects from Huawei Mirrors
- Using Open-source JAR File Conflict Lists
- Mapping Between Maven Repository JAR Versions and MRS Component Versions
- Security Authentication
- ClickHouse Development Guide (Security Mode)
- ClickHouse Development Guide (Normal Mode)
-
Flink Development Guide (Security Mode)
- Overview
- Environment Preparation
- Developing an Application
- Debugging the Application
-
More Information
- Introduction to Common APIs
- Overview of RESTful APIs
- Overview of Savepoints CLI
- Introduction to Flink Client CLI
-
FAQ
- Savepoints-related Problems
- What If the Chrome Browser Cannot Display the Title
- What If the Page Is Displayed Abnormally on Internet Explorer 10/11
- What If Checkpoint Is Executed Slowly in RocksDBStateBackend Mode When the Data Amount Is Large
- What If yarn-session Start Fails When blob.storage.directory Is Set to /home
- Why Does Non-static KafkaPartitioner Class Object Fail to Construct FlinkKafkaProducer010?
- When I Use a Newly Created Flink User to Submit Tasks, Why Does the Task Submission Fail with a Message Indicating Insufficient Permission on the ZooKeeper Directory?
- Why Cannot I Access the Apache Flink Dashboard?
- How Do I View the Debugging Information Printed Using System.out.println or Export the Debugging Information to a Specified File?
- Incorrect GLIBC Version
-
Flink Development Guide (Normal Mode)
- Overview
- Environment Preparation
- Developing an Application
- Debugging the Application
-
More Information
- Introduction to Common APIs
- Overview of RESTful APIs
- Overview of Savepoints CLI
- Introduction to Flink Client CLI
-
FAQ
- Savepoints-related Problems
- What If the Chrome Browser Cannot Display the Title
- What If the Page Is Displayed Abnormally on Internet Explorer 10/11
- What If Checkpoint Is Executed Slowly in RocksDBStateBackend Mode When the Data Amount Is Large
- What If yarn-session Start Fails When blob.storage.directory Is Set to /home
- Why Does Non-static KafkaPartitioner Class Object Fail to Construct FlinkKafkaProducer010?
- When I Use a Newly Created Flink User to Submit Tasks, Why Does the Task Submission Fail with a Message Indicating Insufficient Permission on the ZooKeeper Directory?
- Why Cannot I Access the Apache Flink Dashboard?
- How Do I View the Debugging Information Printed Using System.out.println or Export the Debugging Information to a Specified File?
- Incorrect GLIBC Version
-
HBase Development Guide (Security Mode)
- Overview
- Environment Preparation
-
Developing an Application
- Typical Scenario Description
- Development Idea
-
Example Code Description
- Configuring Log4j Log Output
- Creating Configuration
- Creating Connection
- Creating a Table
- Deleting a Table
- Modifying a Table
- Inserting Data
- Deleting Data
- Reading Data Using Get
- Reading Data Using Scan
- Filtering Data
- Creating a Secondary Index
- Deleting an Index
- Secondary Index-based Query
- Multi-Point Region Division
- Creating a Phoenix Table
- Writing Data to the PhoenixTable
- Reading the PhoenixTable
- Accessing Multiple ZooKeepers
- Querying Cluster Information Using REST
- Obtaining All Tables Using REST
- Operating Namespaces Using REST
- Operating Tables Using REST
- Accessing the ThriftServer Operation Table
- Accessing ThriftServer to Write Data
- Accessing ThriftServer to Read Data
- Using HBase Dual-Read
- Application Commissioning
-
More Information
- SQL Query
- HBase Dual-Read Configuration Items
- External Interfaces
- Phoenix Command Line
-
FAQs
- How Do I Rectify the Fault When an Exception Occurs While Running an HBase-developed Application and "org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory" Is Displayed in the Error Information?
- What Are the Application Scenarios of the bulkload and put Data-loading Modes?
- An Error Occurred When Building a JAR Package
-
HBase Development Guide (Normal Mode)
- Overview
- Environment Preparation
-
Developing an Application
- Typical Scenario Description
- Development Idea
-
Example Code Description
- Configuring Log4j Log Output
- Creating Configuration
- Creating Connection
- Creating a Table
- Deleting a Table
- Modifying a Table
- Inserting Data
- Deleting Data
- Reading Data Using Get
- Reading Data Using Scan
- Filtering Data
- Creating a Secondary Index
- Deleting an Index
- Secondary Index-based Query
- Multi-Point Region Division
- Creating a Phoenix Table
- Writing Data to the PhoenixTable
- Reading the PhoenixTable
- Accessing Multiple ZooKeepers
- Querying Cluster Information Using REST
- Obtaining All Tables Using REST
- Operating Namespaces Using REST
- Operating Tables Using REST
- Accessing the ThriftServer Operation Table
- Accessing ThriftServer to Write Data
- Accessing ThriftServer to Read Data
- Using HBase Dual-Read
- Application Commissioning
-
More Information
- SQL Query
- HBase Dual-Read Configuration Items
- External Interfaces
- Phoenix Command Line
-
FAQs
- How Do I Rectify the Fault When an Exception Occurs While Running an HBase-developed Application and "org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory" Is Displayed in the Error Information?
- What Are the Application Scenarios of the bulkload and put Data-loading Modes?
- An Error Occurred When Building a JAR Package
- HDFS Development Guide (Security Mode)
- HDFS Development Guide (Normal Mode)
- HetuEngine Development Guide (Security Mode)
- HetuEngine Development Guide (Normal Mode)
- Hive Development Guide (Security Mode)
- Hive Development Guide (Normal Mode)
-
Kafka Development Guide (Security Mode)
- Overview
- Environment Preparation
- Developing an Application
- Application Commissioning
- More Information
- Kafka Development Guide (Normal Mode)
- MapReduce Development Guide (Security Mode)
- MapReduce Development Guide (Normal Mode)
- Oozie Development Guide (Security Mode)
- Oozie Development Guide (Normal Mode)
-
Spark2x Development Guide (Security Mode)
- Overview
- Preparing for the Environment
-
Developing the Project
- Spark Core Project
- Spark SQL Project
- Accessing the Spark SQL Through JDBC
-
Spark on HBase
- Performing Operations on Data in Avro Format
- Performing Operations on the HBase Data Source
- Using the BulkPut Interface
- Using the BulkGet Interface
- Using the BulkDelete Interface
- Using the BulkLoad Interface
- Using the foreachPartition Interface
- Distributedly Scanning HBase Tables
- Using the mapPartition Interface
- Writing Data to HBase Tables In Batches Using SparkStreaming
- Reading Data from HBase and Writing It Back to HBase
- Reading Data from Hive and Writing It to HBase
- Streaming Connecting to Kafka0-10
- Structured Streaming Project
- Structured Streaming Stream-Stream Join
- Structured Streaming Status Operation
- Concurrent Access from Spark to HBase in Two Clusters
- Synchronizing HBase Data from Spark to CarbonData
- Using Spark to Perform Basic Hudi Operations
- Compiling User-defined Configuration Items for Hudi
- Commissioning the Application
-
More Information
- Common APIs
- Common CLIs
- JDBCServer Interface
- Structured Streaming Functions and Reliability
-
FAQ
- How to Add a User-Defined Library
- How Do I Automatically Load JAR Packages?
- Why Is the "Class Does Not Exist" Error Reported While the SparkStreamingKafka Project Is Running?
- Privilege Control Mechanism of SparkSQL UDF Feature
- Why Does Kafka Fail to Receive the Data Written Back by Spark Streaming?
- Why Is a Spark Core Application Suspended Instead of Exiting When Driver Memory Is Insufficient to Store Collected Intensive Data?
- Why Does the Name of a Spark Application Submitted in Yarn-Cluster Mode Not Take Effect?
- How Do I Perform Remote Debugging Using IDEA?
- How Do I Submit a Spark Application Using Java Commands?
- A Message Stating "Problem performing GSS wrap" Is Displayed When IBM JDK Is Used
- Application Fails When ApplicationManager Is Terminated During Data Processing in the Cluster Mode of Structured Streaming
- Restrictions on Restoring the Spark Application from the checkpoint
- Support for Third-party JAR Packages on x86 and TaiShan Platforms
- What Should I Do If a Large Number of Directories Whose Names Start with blockmgr- or spark- Exist in the /tmp Directory on the Client Installation Node?
- Error Code 139 Reported When Python Pipeline Runs in the ARM Environment
- What Should I Do If the Structured Streaming Task Submission Method Is Changed?
- Common JAR File Conflicts
-
Spark2x Development Guide (Normal Mode)
- Overview
- Preparing for the Environment
-
Developing the Project
- Spark Core Project
- Spark SQL Project
- Accessing the Spark SQL Through JDBC
-
Spark on HBase
- Performing Operations on Data in Avro Format
- Performing Operations on the HBase Data Source
- Using the BulkPut Interface
- Using the BulkGet Interface
- Using the BulkDelete Interface
- Using the BulkLoad Interface
- Using the foreachPartition Interface
- Distributedly Scanning HBase Tables
- Using the mapPartition Interface
- Writing Data to HBase Tables In Batches Using SparkStreaming
- Reading Data from HBase and Writing It Back to HBase
- Reading Data from Hive and Writing It to HBase
- Streaming Connecting to Kafka0-10
- Structured Streaming Project
- Structured Streaming Stream-Stream Join
- Structured Streaming Status Operation
- Synchronizing HBase Data from Spark to CarbonData
- Using Spark to Perform Basic Hudi Operations
- Compiling User-defined Configuration Items for Hudi
- Commissioning the Application
-
More Information
- Common APIs
- Common CLIs
- JDBCServer Interface
- Structured Streaming Functions and Reliability
-
FAQ
- How to Add a User-Defined Library
- How Do I Automatically Load JAR Packages?
- Why Is the "Class Does Not Exist" Error Reported While the SparkStreamingKafka Project Is Running?
- Why Does Kafka Fail to Receive the Data Written Back by Spark Streaming?
- Why Is a Spark Core Application Suspended Instead of Exiting When Driver Memory Is Insufficient to Store Collected Intensive Data?
- Why Does the Name of a Spark Application Submitted in Yarn-Cluster Mode Not Take Effect?
- How Do I Perform Remote Debugging Using IDEA?
- How Do I Submit a Spark Application Using Java Commands?
- A Message Stating "Problem performing GSS wrap" Is Displayed When IBM JDK Is Used
- Application Fails When ApplicationManager Is Terminated During Data Processing in the Cluster Mode of Structured Streaming
- Restrictions on Restoring the Spark Application from the checkpoint
- Support for Third-party JAR Packages on x86 and TaiShan Platforms
- What Should I Do If a Large Number of Directories Whose Names Start with blockmgr- or spark- Exist in the /tmp Directory on the Client Installation Node?
- Error Code 139 Reported When Python Pipeline Runs in the ARM Environment
- What Should I Do If the Structured Streaming Task Submission Method Is Changed?
- Common JAR File Conflicts
- YARN Development Guide (Security Mode)
- YARN Development Guide (Normal Mode)
- Development Specifications
-
Manager Management Development Guide
- Overview
- Environment Preparation
- Developing an Application
- Application Commissioning
-
More Information
- External Interfaces
-
FAQ
- JDK1.6 Fails to Connect to the FusionInsight System Using JDK1.8
- An Operation Fails and "authorize failed" Is Displayed in Logs
- An Operation Fails and "log4j:WARN No appenders could be found for logger(basicAuth.Main)" Is Displayed in Logs
- An Operation Fails and "illegal character in path at index 57" Is Displayed in Logs
- Run the curl Command to Access REST APIs
-
Developer Guide (Normal_3.x)
- Description
- Obtaining Sample Projects from Huawei Mirrors
- Using Open-source JAR File Conflict Lists
- Mapping Between Maven Repository JAR Versions and MRS Component Versions
- Security Authentication
- CQL Development Guide (Security Mode)
- CQL Development Guide (Normal Mode)
- ClickHouse Development Guide (Security Mode)
- ClickHouse Development Guide (Normal Mode)
-
Flink Development Guide (Security Mode)
- Overview
- Environment Preparation
- Developing an Application
- Debugging the Application
-
More Information
- Introduction to Common APIs
- Overview of RESTful APIs
- Overview of Savepoints CLI
- Introduction to Flink Client CLI
-
FAQ
- Savepoints-related Problems
- What If the Chrome Browser Cannot Display the Title
- What If the Page Is Displayed Abnormally on Internet Explorer 10/11
- What If Checkpoint Is Executed Slowly in RocksDBStateBackend Mode When the Data Amount Is Large
- What If yarn-session Start Fails When blob.storage.directory Is Set to /home
- Why Does a Non-static KafkaPartitioner Class Object Fail to Construct FlinkKafkaProducer010?
- When I Use a Newly Created Flink User to Submit Tasks, Why Does the Task Submission Fail with a Message Indicating Insufficient Permission on the ZooKeeper Directory?
- Why Cannot I Access the Apache Flink Dashboard?
- How Do I View the Debugging Information Printed Using System.out.println or Export the Debugging Information to a Specified File?
- Incorrect GLIBC Version
-
Flink Development Guide (Normal Mode)
- Overview
- Environment Preparation
- Developing an Application
- Debugging the Application
-
More Information
- Introduction to Common APIs
- Overview of RESTful APIs
- Overview of Savepoints CLI
- Introduction to Flink Client CLI
-
FAQ
- Savepoints-related Problems
- What If the Chrome Browser Cannot Display the Title
- What If the Page Is Displayed Abnormally on Internet Explorer 10/11
- What If Checkpoint Is Executed Slowly in RocksDBStateBackend Mode When the Data Amount Is Large
- What If yarn-session Start Fails When blob.storage.directory Is Set to /home
- Why Does a Non-static KafkaPartitioner Class Object Fail to Construct FlinkKafkaProducer010?
- When I Use a Newly Created Flink User to Submit Tasks, Why Does the Task Submission Fail with a Message Indicating Insufficient Permission on the ZooKeeper Directory?
- Why Cannot I Access the Apache Flink Dashboard?
- How Do I View the Debugging Information Printed Using System.out.println or Export the Debugging Information to a Specified File?
- Incorrect GLIBC Version
-
HBase Development Guide (Security Mode)
- Overview
- Environment Preparation
-
Developing an Application
- Typical Scenario Description
- Development Idea
-
Example Code Description
- Configuring Log4j Log Output
- Creating Configuration
- Creating Connection
- Creating a Table
- Deleting a Table
- Modifying a Table
- Inserting Data
- Deleting Data
- Reading Data Using Get
- Reading Data Using Scan
- Filtering Data
- Creating a Secondary Index
- Deleting an Index
- Secondary Index-based Query
- Multi-Point Region Division
- Creating a Phoenix Table
- Writing Data to the Phoenix Table
- Reading the Phoenix Table
- Accessing Multiple ZooKeepers
- Querying Cluster Information Using REST
- Obtaining All Tables Using REST
- Operating Namespaces Using REST
- Operating Tables Using REST
- Accessing the ThriftServer Operation Table
- Accessing ThriftServer to Write Data
- Accessing ThriftServer to Read Data
- Using HBase Dual-Read
- Application Commissioning
-
More Information
- SQL Query
- HBase Dual-Read Configuration Items
- External Interfaces
- HBase Access Configuration on Windows Using EIPs
- Phoenix Command Line
-
FAQs
- How to Rectify the Fault When an Exception Occurs During the Running of an HBase-developed Application and "org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory" Is Displayed in the Error Information?
- What Are the Application Scenarios of the Bulkload and put Data-loading Modes?
- An Error Occurred When Building a JAR Package
-
HBase Development Guide (Normal Mode)
- Overview
- Environment Preparation
-
Developing an Application
- Typical Scenario Description
- Development Idea
-
Example Code Description
- Configuring Log4j Log Output
- Creating Configuration
- Creating Connection
- Creating a Table
- Deleting a Table
- Modifying a Table
- Inserting Data
- Deleting Data
- Reading Data Using Get
- Reading Data Using Scan
- Filtering Data
- Creating a Secondary Index
- Deleting an Index
- Secondary Index-based Query
- Multi-Point Region Division
- Creating a Phoenix Table
- Writing Data to the Phoenix Table
- Reading the Phoenix Table
- Accessing Multiple ZooKeepers
- Querying Cluster Information Using REST
- Obtaining All Tables Using REST
- Operating Namespaces Using REST
- Operating Tables Using REST
- Accessing the ThriftServer Operation Table
- Accessing ThriftServer to Write Data
- Accessing ThriftServer to Read Data
- Using HBase Dual-Read
- Application Commissioning
-
More Information
- SQL Query
- HBase Dual-Read Configuration Items
- External Interfaces
- HBase Access Configuration on Windows Using EIPs
- Phoenix Command Line
-
FAQs
- How to Rectify the Fault When an Exception Occurs During the Running of an HBase-developed Application and "org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory" Is Displayed in the Error Information?
- What Are the Application Scenarios of the bulkload and put Data-loading Modes?
- An Error Occurred When Building a JAR Package
- HDFS Development Guide (Security Mode)
- HDFS Development Guide (Normal Mode)
- Hive Development Guide (Security Mode)
- Hive Development Guide (Normal Mode)
- Impala Development Guide (Security Mode)
- Impala Development Guide (Normal Mode)
-
Kafka Development Guide (Security Mode)
- Overview
- Environment Preparation
- Developing an Application
- Application Commissioning
- More Information
- Kafka Development Guide (Normal Mode)
- Kudu Development Guide (Security Mode)
- Kudu Development Guide (Normal Mode)
- MapReduce Development Guide (Security Mode)
- MapReduce Development Guide (Normal Mode)
- Oozie Development Guide (Security Mode)
- Oozie Development Guide (Normal Mode)
- Presto Development Guide (Security Mode)
- Presto Development Guide (Normal Mode)
-
Spark2x Development Guide (Security Mode)
- Overview
- Preparing for the Environment
-
Developing the Project
- Spark Core Project
- Spark SQL Project
- Accessing the Spark SQL Through JDBC
-
Spark on HBase
- Performing Operations on Data in Avro Format
- Performing Operations on the HBase Data Source
- Using the BulkPut Interface
- Using the BulkGet Interface
- Using the BulkDelete Interface
- Using the BulkLoad Interface
- Using the foreachPartition Interface
- Distributedly Scanning HBase Tables
- Using the mapPartition Interface
- Writing Data to HBase Tables in Batches Using SparkStreaming
- Reading Data from HBase and Writing It Back to HBase
- Reading Data from Hive and Writing It to HBase
- Streaming Connecting to Kafka0-10
- Structured Streaming Project
- Structured Streaming Stream-Stream Join
- Structured Streaming Status Operation
- Concurrent Access from Spark to HBase in Two Clusters
- Synchronizing HBase Data from Spark to CarbonData
- Using Spark to Perform Basic Hudi Operations
- Compiling User-defined Configuration Items for Hudi
- Commissioning the Application
-
More Information
- Common APIs
- Common CLIs
- JDBCServer Interface
- Structured Streaming Functions and Reliability
-
FAQ
- How to Add a User-Defined Library
- How to Automatically Load JAR Packages?
- Why the "Class Does not Exist" Error Is Reported While the SparkStreamingKafka Project Is Running?
- Privilege Control Mechanism of SparkSQL UDF Feature
- Why Does Kafka Fail to Receive the Data Written Back by Spark Streaming?
- Why a Spark Core Application Is Suspended Instead of Being Exited When Driver Memory Is Insufficient to Store Collected Intensive Data?
- Why the Name of the Spark Application Submitted in Yarn-Cluster Mode Does Not Take Effect?
- How to Perform Remote Debugging Using IDEA?
- How to Submit the Spark Application Using Java Commands?
- A Message Stating "Problem performing GSS wrap" Is Displayed When IBM JDK Is Used
- Application Fails When ApplicationManager Is Terminated During Data Processing in the Cluster Mode of Structured Streaming
- Restrictions on Restoring the Spark Application from the Checkpoint
- Support for Third-party JAR Packages on x86 and TaiShan Platforms
- What Should I Do If a Large Number of Directories Whose Names Start with blockmgr- or spark- Exist in the /tmp Directory on the Client Installation Node?
- Error Code 139 Reported When Python Pipeline Runs in the ARM Environment
- What Should I Do If the Structured Streaming Task Submission Method Is Changed?
- Common JAR File Conflicts
-
Spark2x Development Guide (Normal Mode)
- Overview
- Preparing for the Environment
-
Developing the Project
- Spark Core Project
- Spark SQL Project
- Accessing the Spark SQL Through JDBC
-
Spark on HBase
- Performing Operations on Data in Avro Format
- Performing Operations on the HBase Data Source
- Using the BulkPut Interface
- Using the BulkGet Interface
- Using the BulkDelete Interface
- Using the BulkLoad Interface
- Using the foreachPartition Interface
- Distributedly Scanning HBase Tables
- Using the mapPartition Interface
- Writing Data to HBase Tables in Batches Using SparkStreaming
- Reading Data from HBase and Writing It Back to HBase
- Reading Data from Hive and Writing It to HBase
- Streaming Connecting to Kafka0-10
- Structured Streaming Project
- Structured Streaming Stream-Stream Join
- Structured Streaming Status Operation
- Synchronizing HBase Data from Spark to CarbonData
- Using Spark to Perform Basic Hudi Operations
- Compiling User-defined Configuration Items for Hudi
- Commissioning the Application
-
More Information
- Common APIs
- Common CLIs
- JDBCServer Interface
- Structured Streaming Functions and Reliability
-
FAQ
- How to Add a User-Defined Library
- How to Automatically Load JAR Packages?
- Why the "Class Does not Exist" Error Is Reported While the SparkStreamingKafka Project Is Running?
- Why Does Kafka Fail to Receive the Data Written Back by Spark Streaming?
- Why a Spark Core Application Is Suspended Instead of Being Exited When Driver Memory Is Insufficient to Store Collected Intensive Data?
- Why the Name of the Spark Application Submitted in Yarn-Cluster Mode Does Not Take Effect?
- How to Perform Remote Debugging Using IDEA?
- How to Submit the Spark Application Using Java Commands?
- A Message Stating "Problem performing GSS wrap" Is Displayed When IBM JDK Is Used
- Application Fails When ApplicationManager Is Terminated During Data Processing in the Cluster Mode of Structured Streaming
- Restrictions on Restoring the Spark Application from the Checkpoint
- Support for Third-party JAR Packages on x86 and TaiShan Platforms
- What Should I Do If a Large Number of Directories Whose Names Start with blockmgr- or spark- Exist in the /tmp Directory on the Client Installation Node?
- Error Code 139 Reported When Python Pipeline Runs in the ARM Environment
- What Should I Do If the Structured Streaming Task Submission Method Is Changed?
- Common JAR File Conflicts
-
Storm Development Guide (Security Mode)
- Overview
- Environment Preparation
- Developing an Application
- Running an Application
- More Information
-
Storm Development Guide (Normal Mode)
- Overview
- Environment Preparation
- Developing an Application
- Running an Application
- More Information
- YARN Development Guide (Security Mode)
- YARN Development Guide (Normal Mode)
- Development Specifications
-
Manager Management Development Guide
- Overview
- Environment Preparation
- Developing an Application
- Application Commissioning
-
More Information
- External Interfaces
-
FAQ
- JDK1.6 Fails to Connect to the FusionInsight System Using JDK1.8
- An Operation Fails and "authorize failed" Is Displayed in Logs
- An Operation Fails and "log4j:WARN No appenders could be found for logger(basicAuth.Main)" Is Displayed in Logs
- An Operation Fails and "illegal character in path at index 57" Is Displayed in Logs
- Run the curl Command to Access REST APIs
-
Developer Guide (Normal_Earlier Than 3.x)
- Before You Start
- Method of Building an MRS Sample Project
-
HBase Application Development
- Overview
- Environment Preparation
-
Application Development
- Development Guidelines in Typical Scenarios
- Creating the Configuration Object
- Creating the Connection Object
- Creating a Table
- Deleting a Table
- Modifying a Table
- Inserting Data
- Deleting Data
- Reading Data Using Get
- Reading Data Using Scan
- Using a Filter
- Adding a Secondary Index
- Enabling/Disabling a Secondary Index
- Querying a List of Secondary Indexes
- Using a Secondary Index to Read Data
- Deleting a Secondary Index
- Writing Data into a MOB Table
- Reading MOB Data
- Multi-Point Region Splitting
- ACL Security Configuration
- Application Commissioning
- More Information
- HBase APIs
- FAQs
- Development Specifications
- Hive Application Development
- MapReduce Application Development
- HDFS Application Development
-
Spark Application Development
- Overview
-
Environment Preparation
- Environment Overview
- Preparing a Development User
- Preparing a Java Development Environment
- Preparing a Scala Development Environment
- Preparing a Python Development Environment
- Preparing an Operating Environment
- Downloading and Importing a Sample Project
- (Optional) Creating a Project
- Preparing the Authentication Mechanism Code
-
Application Development
- Spark Core Application
- Spark SQL Application
- Spark Streaming Application
- Application for Accessing Spark SQL Through JDBC
- Spark on HBase Application
- Reading Data from HBase and Writing Data Back to HBase
- Reading Data from Hive and Writing Data to HBase
- Using Streaming to Read Data from Kafka and Write Data to HBase
- Application for Connecting Spark Streaming to Kafka0-10
- Structured Streaming Application
- Application Commissioning
-
Application Tuning
-
Spark Core Tuning
- Data Serialization
- Memory Configuration Optimization
- Setting a Degree of Parallelism
- Using Broadcast Variables
- Using the External Shuffle Service to Improve Performance
- Configuring Dynamic Resource Scheduling in Yarn Mode
- Configuring Process Parameters
- Designing a Directed Acyclic Graph (DAG)
- Experience Summary
- SQL and DataFrame Tuning
- Spark Streaming Tuning
- Spark CBO Tuning
- Spark APIs
-
FAQs
- How Do I Add a Dependency Package with Customized Codes?
- How Do I Handle the Dependency Package That Is Automatically Loaded?
- Why the "Class Does not Exist" Error Is Reported While the SparkStreamingKafka Project Is Running?
- Why a Spark Core Application Is Suspended Instead of Being Exited When Driver Memory Is Insufficient to Store Collected Intensive Data?
- Why the Name of the Spark Application Submitted in Yarn-Cluster Mode Does Not Take Effect?
- How Do I Submit the Spark Application Using Java Commands?
- How Does the Permission Control Mechanism Work for the UDF Function in SparkSQL?
- Why Does Kafka Fail to Receive the Data Written Back by Spark Streaming?
- How Do I Perform Remote Debugging Using IDEA?
- A Message Stating "Problem performing GSS wrap" Is Displayed When IBM JDK Is Used
- Why Does the ApplicationManager Fail to Be Terminated When Data Is Being Processed in the Structured Streaming Cluster Mode?
- What Should I Do If FileNotFoundException Occurs When spark-submit Is Used to Submit a Job in Spark on Yarn Client Mode?
- What Should I Do If the "had a not serializable result" Error Is Reported When a Spark Task Reads HBase Data?
- How Do I Connect to Hive and HDFS of an MRS Cluster when the Spark Program Is Running on a Local Host?
- Development Specifications
- Storm Application Development
-
Kafka Application Development
- Overview
- Environment Preparation
-
Application Development
- Typical Application Scenario
- Old Producer API Usage Sample
- Old Consumer API Usage Sample
- Producer API Usage Sample
- Consumer API Usage Sample
- Multi-Thread Producer API Usage Sample
- Multi-Thread Consumer API Usage Sample
- SimpleConsumer API Usage Sample
- Description of the Sample Project Configuration File
- Application Commissioning
- Kafka APIs
- FAQs
- Development Specifications
- Presto Application Development
- OpenTSDB Application Development
-
Flink Application Development
- Overview
- Environment Preparation
- Application Development
- Application Commissioning
- Performance Tuning
- More Information
-
FAQs
- Savepoints FAQs
- What Should I Do If Running a Checkpoint Is Slow When RocksDBStateBackend Is Set for the Checkpoint and a Large Amount of Data Exists?
- What Should I Do If yarn-session Failed to Be Started When blob.storage.directory Is Set to /home?
- Why Does a Non-static KafkaPartitioner Class Object Fail to Construct FlinkKafkaProducer010?
- When I Use a Newly Created Flink User to Submit Tasks, Why Does the Task Submission Fail with a Message Indicating Insufficient Permission on the ZooKeeper Directory?
- Why Can't I Access the Flink Web Page?
- Impala Application Development
- Alluxio Application Development
- Appendix
-
Developer Guide (LTS)
-
API Reference
- Before You Start
- API Overview
- Selecting an API Type
- Calling APIs
- Application Cases
- API V2
- API V1.1
- Out-of-Date APIs
- Permissions Policies and Supported Actions
- Appendix
- SDK Reference
-
FAQs
-
MRS Basics
- What Is MRS Used For?
- What Types of Distributed Storage Does MRS Support?
- What Are Regions and AZs?
- Can I Change the Network Segment of Nodes in an MRS Cluster?
- Can I Downgrade the Specifications of an MRS Cluster Node?
- Are Hive Components of Different Versions Compatible with Each Other?
- What Are the Differences Between OBS and HDFS in Data Storage?
- What Are the Solutions for Processing 1 Billion Data Records?
- What Are the Advantages of the zstd Compression Ratio?
- Billing
-
Cluster Creation
- How Do I Create an MRS Cluster Using a Custom Security Group?
- What Should I Do If HDFS, Yarn, and MapReduce Components Are Unavailable When I Buy an MRS Cluster?
- What Should I Do If the ZooKeeper Component Is Unavailable When I Buy an MRS Cluster?
- What Should I Do If Invalid Authentication Is Reported When I Submit an Order for Purchasing an MRS Cluster?
-
Web Page Access
- How Do I Change the Session Timeout Duration for an Open Source Component Web UI?
- What Can I Do If the Dynamic Resource Plan Page in MRS Tenant Management Cannot Be Refreshed?
- What Do I Do If the Kafka Topic Monitoring Tab Is Not Displayed on Manager?
- What Can I Do If an Error Is Reported or Some Functions Are Unavailable When I Access the Web UIs of Components Such as HDFS, Hue, Yarn, Flink, and HetuEngine?
- How Do I Switch the Methods to Access MRS Manager?
- Why Cannot I Find the User Management Page on MRS Manager?
- What Can I Do If the Excel File Downloaded by Hue Cannot Be Opened?
-
Authentication and Permission
- What Is the User for Logging in to FusionInsight Manager?
- How Do I Query and Change the Password Validity Period of a User in a Cluster?
- Does an MRS Cluster Support Access Permission Control If Kerberos Authentication Is Not Enabled?
- How Do I Add Tenant Management Permission to Users in a Cluster?
- Does Hue Provide the Function of Configuring Account Permissions?
- Why Can't I Submit Jobs on the Console After My IAM Account Is Assigned with MRS Permissions?
- How Do I View the Hive Table Created by Another User?
- How Do I Prevent Kerberos Authentication Expiration?
- How Do I Enable or Disable Kerberos Authentication for an Existing MRS Cluster?
- What Are the Ports of the Kerberos Authentication Service?
-
Client Usage
- How Do I Disable SASL Authentication for ZooKeeper?
- What Can I Do If the Error Message "Permission denied" Is Displayed When kinit Is Executed on a Client Outside the MRS Cluster?
- What Should I Do If an Alarm Is Reported Indicating that the Memory Is Insufficient When I Execute a SQL Statement on the ClickHouse Client?
- How Do I Connect to Spark Shell from MRS?
- How Do I Connect to Spark Beeline from MRS?
- What Should I Do If the Connection to the ClickHouse Server Fails and Error Code 516 Is Reported?
-
Component Configurations
- Does MRS Support Running Hive on Kudu?
- Does an MRS Cluster Support Hive on Spark?
- Can I Change the IP address of DBService?
- What Access Protocols Does Kafka Support?
- What Python Versions Are Supported by Spark Tasks in an MRS Cluster?
- What Are the Restrictions on the Storm Log Size in an MRS 2.1.0 Cluster?
- How Do I Modify the HDFS fs.defaultFS of an Existing Cluster?
- Can MRS Run Multiple Flume Tasks at a Time?
- How Do I Change FlumeClient Logs to Standard Logs?
- Where Are the JAR Files and Environment Variables of Hadoop Stored?
- How Do I View HBase Logs?
- How Do I Set the TTL for an HBase Table?
- How Do I Change the Number of HDFS Replicas?
- How Do I Modify the HDFS Active/Standby Switchover Class?
- What Data Type in Hive Tables Is Recommended for the Number Type of DynamoDB?
- Can I Export the Query Result of Hive Data?
- What Should I Do If an Error Occurs When Hive Runs the beeline -e Command to Execute Multiple Statements?
- What Do I Do If "over max user connections" Is Displayed When Hue Connects to HiveServer?
- How Do I View MRS Hive Metadata?
- How Do I Reset Kafka Data?
- What Should I Do If the Error Message "Not Authorized to access group XXX" Is Displayed When Kafka Topics Are Consumed?
- What Compression Algorithms Does Kudu Support?
- How Do I View Kudu Logs?
- How Do I Handle the Kudu Service Exceptions Generated During Cluster Creation?
- How Do I Configure Other Data Sources on Presto?
- How Do I Update the Ranger Certificate in MRS 1.9.3?
- How Do I Specify a Log Path When Submitting a Task in an MRS Storm Cluster?
- How Do I Check the ResourceManager Configuration of Yarn?
- How Do I Modify the allow_drop_detached Parameter of ClickHouse?
- How Do I Add a Periodic Deletion Policy to Prevent Large ClickHouse System Table Logs?
-
Cluster Management
- How Do I View All Clusters?
- How Do I View MRS Operation Logs?
- How Do I View MRS Cluster Configuration Information?
- How Do I Add Components to an MRS Cluster?
- How Do I Cancel Message Notification for Cluster Alarms?
- Why Is the Resource Pool Memory Displayed in the MRS Cluster Smaller Than the Actual Cluster Memory?
- What Is the Python Version Installed for an MRS Cluster?
- How Do I Upload a Local File to a Node Inside a Cluster?
- What Can I Do If the Time Information of an MRS Cluster Node Is Incorrect?
- What Are the Differences and Relationships Between the MRS Management Console and MRS Manager?
- How Do I Unbind an EIP from FusionInsight Manager of an MRS Cluster?
- How Do I Stop the Firewall Service?
- How Do I Switch the Login Mode of a Node in an MRS Cluster?
- How Do I Access an MRS Cluster from a Node Outside the Cluster?
- In an MRS Streaming Cluster, Can the Kafka Topic Monitoring Function Send Alarm Notifications?
- Where Can I View the Running Resource Queues When ALM-18022 Insufficient Yarn Queue Resources Is Generated?
- How Do I Understand the Multi-Level Chart Statistics in the HBase Operation Requests Metric?
-
Node Management
- What Are the Operating Systems of Hosts in MRS Clusters of Different Versions?
- Do I Need to Shut Down a Master Node Before Upgrading It?
- Can I Change MRS Cluster Nodes on the MRS Console?
- How Do I Query the Startup Time of an MRS Node?
- What Do I Do If Trust Relationships Between Nodes Are Abnormal?
- Can Master Node Specifications Be Adjusted in an MRS Cluster?
- Can Sudo Logs of Nodes in an MRS Cluster Be Cleared?
- How Do I Partition Disks in an MRS Cluster?
- Does an MRS Cluster Support System Reinstallation?
- Can I Change the OS of an MRS Cluster?
- Component Management
-
Job Management
- What Types of Spark Jobs Can Be Submitted in a Cluster?
- What Should I Do If Error 408 Is Reported When an MRS Node Accesses OBS?
- How Do I Enable Different Service Programs to Use Different Yarn Queues?
- What Should I Do If a Job Fails to Be Submitted and the Error Is Related to OBS?
- Can I Run Multiple Spark Tasks at the Same Time After the Minimum Tenant Resources of an MRS Cluster Are Changed to 0?
- What Should I Do If Job Parameters Separated By Spaces Cannot Be Identified?
- What Are the Differences Between the Client Mode and Cluster Mode of Spark Jobs?
- How Do I View MRS Job Logs?
- What Can I Do If the System Displays a Message Indicating that the Current User Does Not Exist on Manager When I Submit a Job?
- What Can I Do If LauncherJob Fails to Be Executed and the Error Message "jobPropertiesMap is null" Is Displayed?
- What Should I Do If the Flink Job Status on the MRS Console Is Inconsistent with That on Yarn?
- What Can I Do If a SparkStreaming Job Fails After Running for Dozens of Hours and Error 403 Is Reported for OBS Access?
- What Should I Do If Error Message "java.io.IOException: Connection reset by peer" Is Displayed During the Execution of a Spark Job?
- What Should I Do If the Error Message "requestId=XXX" Is Displayed When a Spark Job Accesses OBS?
- What Should I Do If the Error Message "UnknownScannerExeception" Is Displayed for Spark Jobs?
- What Can I Do If DataArts Studio Occasionally Fails to Schedule Spark Jobs?
- What Should I Do If a Flink Job Fails to Execute and the Error Message "java.lang.NoSuchFieldError: SECURITY_SSL_ENCRYPT_ENABLED" Is Displayed?
- What Should I Do If Submitted Yarn Jobs Cannot Be Viewed on the Web UI?
- What Can I Do If launcher-job Is Terminated by Yarn When a Flink Task Is Submitted?
- What Should I Do If the Error Message "slot request timeout" Is Displayed When I Submit a Flink Job?
- FAQs About Importing and Exporting Data Using DistCP Jobs
- How Do I View SQL Statements of Hive Jobs on the Yarn Web UI?
- How Do I View Logs of a Specified Yarn Task?
- What Should I Do If a HiveSQL/HiveScript Job Fails to Be Submitted After Hive Is Added?
- Where Are the Execution Logs of Spark Jobs Stored?
- What Should I Do If an Alarm Indicating Insufficient Memory Is Reported During Spark Task Execution?
- What Can I Do If an Alarm Is Generated Because the NameNode Is Not Restarted on Time After the hdfs-site.xml File Is Modified?
- What Should I Do If It Takes a Long Time for Spark SQL to Access Hive Partitioned Tables Before a Job Starts?
- How Does the System Select the Queue When an MRS Cluster User Is Associated to Multiple Queues?
-
Performance Tuning
- How Do I Obtain the Hadoop Pressure Test Tool?
- How Do I Improve the Resource Utilization of Core Nodes in a Cluster?
- How Do I Configure the Knox Memory?
- How Do I Adjust the Memory Size of the manager-executor Process?
- What Should I Do If the spark.yarn.executor.memoryOverhead Setting Does Not Take Effect?
- How Do I Improve Presto Resource Usage?
-
Application Development
- How Do I Get My Data into OBS or HDFS?
- Can MRS Write Data to HBase Through an HBase External Table of Hive?
- Where Can I Download the Dependency Package (com.huawei.gaussc10) in the Hive Sample Project?
- Does MRS Support Python Code?
- Does OpenTSDB Support Python APIs?
- How Do I Obtain a Spark JAR File?
- How Do I Configure the node_id Parameter When Calling the API for Adjusting Cluster Nodes?
- How Do I Manage and Use Third-Party JAR Packages for MRS Cluster Components?
-
Peripheral Service Interconnection
- Does MRS Support Read and Write Operations on DLI Service Tables?
- Does OBS Support the ListObjectsV2 Protocol?
- Can a Crawler Service Be Deployed on Nodes in an MRS Cluster?
- Does MRS Support Secure Deletion?
- How Do I Use PySpark to Connect MRS Spark?
- Why Do Mapped Fields Not Exist in the Database After HBase Synchronizes Data to CSS?
- Can MRS Connect to an External KDC?
- What Can I Do If Jetty Is Incompatible with MRS when Open-Source Kylin 3.x Is Interconnected with MRS 1.9.3?
- What Should I Do If Data Failed to Be Exported from MRS to an OBS Encrypted Bucket?
- How Do I Interconnect MRS with LTS?
- How Do I Install HSS on MRS Cluster Nodes?
- How Do I Connect to HBase of MRS Through HappyBase?
- Can the Hive Driver Be Interconnected with DBCP2?
- Upgrade and Patching
-
Troubleshooting
- Account Passwords
- Account Permissions
-
Common Exceptions in Logging In to the Cluster Manager
- Failed to Access Manager of an MRS Cluster
-
Accessing the Web Pages
- Error "502 Bad Gateway" Is Reported During the Access to MRS Manager
- An Error Message Is Displayed Indicating That the VPC Request Is Incorrect During Access
- Error 503 Is Reported When Manager Is Accessed Through Direct Connect
- Error Message "You have no right to access this page." Is Displayed When Users log in to the Cluster Page
- Error Message "Invalid credentials" Is Displayed When a User Logs In to Manager
- Failed to Log In to the Manager After Timeout
- Failed to Log In to MRS Manager After the Python Upgrade
- Failed to Log In to MRS Manager After Changing the Domain Name
- Manager Page Is Blank After a Successful Login
- Cluster Login Fails Because Native Kerberos Is Installed on Cluster Nodes
- Using Google Chrome to Access MRS Manager on macOS
- How Do I Unlock a User Who Logs in to Manager?
- Why Does the Manager Page Freeze?
-
Common Exceptions in Accessing the MRS Web UI
- What Do I Do If an Error Is Reported or Some Functions Are Unavailable When I Access the Web UIs of HDFS, Hue, YARN, HetuEngine, and Flink?
- Error 500 Is Reported When a User Accesses the Component Web UI
- [HBase WebUI] Users Cannot Switch from the HBase WebUI to the RegionServer WebUI
- [HDFS WebUI] When Users Access the HDFS WebUI, an Error Message Is Displayed Indicating That the Number of Redirections Is Too Large
- [HDFS WebUI] Failed to Access the HDFS WebUI Using Internet Explorer
- [Hue Web UI] A "No Permission" Error Is Displayed When a User Logs In to the Hue Web UI
- [Hue Web UI] Failed to Access the Hue Web UI
- [Hue WebUI] The Error "Proxy Error" Is Reported When a User Accesses the Hue WebUI
- [Hue WebUI] Why Can't the Hue Native Page Be Properly Displayed If the Hive Service Is Not Installed in a Cluster?
- Hue (Active) Cannot Open Web Pages
- [Ranger WebUI] Why Cannot a New User Log In to Ranger After Changing the Password?
- [Tez WebUI] Error 404 Is Reported When Users Access the Tez WebUI
- [Spark WebUI] Why Cannot I Switch from the Yarn Web UI to the Spark Web UI?
- [Spark WebUI] What Can I Do If an Error Occurs when I Access the Application Page Because the Application Cached by HistoryServer Is Recycled?
- [Spark WebUI] Why Is the Native Page of an Application in Spark2x JobHistory Displayed Incorrectly?
- [Spark WebUI] The Spark2x WebUI Fails to Be Accessed Using Internet Explorer
- [Yarn Web UI] Failed to Access the Yarn Web UI
- APIs
-
Cluster Management
- Failed to Reduce Task Nodes
- OBS Certificate in a Cluster Expired
- Replacing a Disk in an MRS Cluster (Applicable to 2.x and Earlier)
- Replacing a Disk in an MRS Cluster (Applicable to 3.x)
- Failed to Execute an MRS Backup Task
- Inconsistency Between df and du Command Output on the Core Node
- Disassociating a Subnet from a Network ACL
- MRS Cluster Becomes Abnormal After the Hostname of a Node Is Changed
- Processes Are Terminated Unexpectedly
- Failed to Configure Cross-Cluster Mutual Trust for MRS
- Network Is Unreachable When Python Is Installed on an MRS Cluster Node Using Pip3
- Connecting the Open-Source confluent-kafka-go to an MRS Security Cluster
- Failed to Execute the Periodic Backup Task of an MRS Cluster
- Failed to Download the MRS Cluster Client
- An Error Is Reported When a Flink Job Is Submitted in an MRS Cluster with Kerberos Authentication Enabled
- An Error Is Reported When the Insert Command Is Executed on the Hive Beeline CLI
- Upgrading the OS to Fix Vulnerabilities for an MRS Cluster Node
- Failed to Migrate Data to MRS HDFS Using CDM
- Alarms Indicating Heartbeat Interruptions Between Nodes Are Frequently Generated in the MRS Cluster
- High Memory Usage of the PMS Process
- High Memory Usage of the Knox Process
- It Takes a Long Time to Access HBase from a Client Outside a Security Cluster
- Failed to Submit Jobs
- OS Disk Space Is Insufficient Due to Oversized HBase Log Files
- OS Disk Space Is Insufficient Due to Oversized HDFS Log Files
- An Exception Occurs During Specifications Upgrade of Nodes in an MRS Cluster
- Failed to Delete a New Tenant on FusionInsight Manager
- MRS Cluster Becomes Unavailable After the VPC Is Changed
- Failed to Submit Jobs on the MRS Console
- Error "symbol xxx not defined in file libcrypto.so.1.1" Is Displayed During HA Certificate Generation
- Some Instances Fail to Be Started After Core Nodes Are Added to the MRS Cluster
- Using Alluxio
- Using ClickHouse
- Using DBService
- DBServer Instance Is in Abnormal Status
- DBServer Instance Remains in the Restoring State
- Default Port 20050 or 20051 of DBService Is Occupied
- DBServer Instance Is Always in the Restoring State Because of Incorrect /tmp Directory Permissions
- Failed to Execute a DBService Backup Task
- Components Failed to Connect to DBService in Normal State
- DBServer Failed to Start
- DBService Backup Failed Because the Floating IP Address Is Unreachable
- DBService Failed to Start Due to the Loss of the DBService Configuration File
- Using Flink
- Error Message "Error While Parsing YAML Configuration File: Security.kerberos.login.keytab" Is Displayed When a Command Is Executed on the Flink Client
- Error Message "Error while parsing YAML configuration file: security.kerberos.login.principal:pippo" Is Displayed When a Command Is Executed on the Flink Client
- Error Message "Could Not Connect to the Leading JobManager" Is Displayed When a Command Is Executed on the Flink Client
- Failed to Create a Flink Cluster by Running yarn-session As Different Users
- Flink Service Program Fails to Read Files on the NFS Disk
- Failed to Customize the Flink Log4j Log Level
- Using Flume
- Using HBase
- Slow Response to HBase Connection
- Failed to Authenticate the HBase User
- RegionServer Failed to Start Because the Port Is Occupied
- HBase Failed to Start Due to Insufficient Node Memory
- HBase Service Unavailable Due to Poor HDFS Performance
- HBase Failed to Start Due to Inappropriate Parameter Settings
- RegionServer Failed to Start Due to Residual Processes
- HBase Failed to Start Due to a Quota Set on HDFS
- HBase Failed to Start Due to Corrupted Version Files
- High CPU Usage Caused by Zero-Loaded RegionServer
- HBase Failed to Start with "FileNotFoundException" in RegionServer Logs
- The Number of RegionServers Displayed on the Native Page Is Greater Than the Actual Number After HBase Is Started
- RegionServer Instance Is in the Restoring State
- HBase Failed to Start in a Newly Installed Cluster
- HBase Failed to Start Due to the Loss of the ACL Table Directory
- HBase Failed to Start After the Cluster Is Powered Off and On
- Failed to Import HBase Data Due to Oversized File Blocks
- Failed to Load Data to the Index Table After an HBase Table Is Created Using Phoenix
- Failed to Run the hbase shell Command on the MRS Cluster Client
- Disordered Information Display on the HBase Shell Client Console Due to Printing of the INFO Information
- HBase Failed to Start Due to Insufficient RegionServer Memory
- Failed to Start HRegionServer on the Node Newly Added to the Cluster
- Region in the RIT State for a Long Time Due to HBase File Loss
- Using HDFS
- HDFS NameNode Instances Become Standby After the RPC Port Is Changed
- An Error Is Reported When the HDFS Client Is Connected Through a Public IP Address
- Failed to Use Python to Remotely Connect to the Port of HDFS
- HDFS Capacity Reaches 100%, Causing Unavailable Upper-Layer Services Such as HBase and Spark
- Error Message "Permission denied" Is Displayed When HDFS and Yarn Are Started
- HDFS Users Can Create or Delete Files in Directories of Other Users
- A DataNode of HDFS Is Always in the Decommissioning State
- HDFS NameNode Failed to Start Due to Insufficient Memory
- A Large Number of Blocks Are Lost in HDFS due to the Time Change Using ntpdate
- CPU Usage of DataNodes Is Close to 100% Occasionally, Causing Node Loss
- Manually Performing Checkpoints When a NameNode Is Faulty for a Long Time
- Error "Failed to place enough replicas" Is Reported When HDFS Reads or Writes Files
- Maximum Number of File Handles Is Set to a Too Small Value, Causing File Reading and Writing Exceptions
- HDFS Client File Fails to Be Closed After Data Writing
- File Fails to Be Uploaded to HDFS Due to File Errors
- After dfs.blocksize Is Configured on the UI and Data Is Uploaded, the Block Size Does Not Change
- HDFS File Fails to Be Read, and Error Message "FileNotFoundException" Is Displayed
- Failed to Write Files to HDFS, and Error Message "item limit of xxx is exceeded" Is Displayed
- Adjusting the Log Level of the HDFS Shell Client
- HDFS File Read Fails, and Error Message "No common protection layer" Is Displayed
- Failed to Write Files Because the HDFS Directory Quota Is Insufficient
- Balancing Fails, and Error Message "Source and target differ in block-size" Is Displayed
- Failed to Query or Delete HDFS Files
- Uneven Data Distribution Due to Non-HDFS Data Residuals
- Uneven Data Distribution Due to HDFS Client Installation on the DataNode
- Unbalanced DataNode Disk Usages of a Node
- Locating Common Balance Problems
- HDFS Displays Insufficient Disk Space But 10% Disk Space Remains
- Error Message "error creating DomainSocket" Is Displayed When the HDFS Client Installed on the Core Node in a Normal Cluster Is Used
- HDFS Files Fail to Be Uploaded When the Client Is Installed on a Node Outside the Cluster
- Insufficient Number of Replicas Is Reported During High Concurrent HDFS Writes
- HDFS Client Failed to Delete Overlong Directories
- An Error Is Reported When a Node Outside the Cluster Accesses MRS HDFS
- "ALM-12027 Host PID Usage Exceeds the Threshold" Is Generated for a NameNode
- ALM-14012 JournalNode Is Out of Synchronization Is Generated in the Cluster
- Failed to Decommission a DataNode Due to HDFS Block Loss
- An Error Is Reported When DistCP Is Used to Copy an Empty Folder
- Using Hive
- Common Hive Logs
- Failed to Start Hive
- Error Message "Cannot modify xxx at runtime" Is Displayed When the set Command Is Executed in a Security Cluster
- Specifying a Queue When Submitting a Hive Task
- Setting the Map/Reduce Memory on the Client
- Specifying the Output File Compression Format When Importing a Hive Table
- Description of the Hive Table Is Too Long to Be Completely Displayed
- NULL Is Displayed When Data Is Inserted After the Partition Column Is Added to a Hive Table
- New User Created in the Cluster Does Not Have the Permission to Query Hive Data
- An Error Is Reported When SQL Is Executed to Submit a Task to a Specified Queue
- An Error Is Reported When the "load data inpath" Command Is Executed
- An Error Is Reported When the "load data local inpath" Command Is Executed
- An Error Is Reported When the create external table Command Is Executed
- An Error Is Reported When the dfs -put Command Is Executed on the Beeline Client
- Insufficient Permissions to Execute the set role admin Command
- An Error Is Reported When a UDF Is Created on the Beeline Client
- Hive Is Faulty
- Difference Between Hive Service Health Status and Hive Instance Health Status
- "authentication failed" Is Displayed During an Attempt to Connect to the Shell Client
- Failed to Access ZooKeeper from the Client
- "Invalid function" Is Displayed When a UDF Is Used
- Hive Service Status Is Unknown
- Health Status of a HiveServer or MetaStore Instance Is unknown
- Health Status of a HiveServer or MetaStore Instance Is Concerning
- Garbled Characters Returned Upon a Query If Text Files Are Compressed Using ARC4
- Hive Task Failed to Run on the Client but Successful on Yarn
- Error Message "Execution Error return code 2" Is Displayed When the SELECT Statement Is Executed
- Failed to Perform drop partition When There Are a Large Number of Partitions
- Failed to Start the Local Task When the Join Operation Is Performed
- WebHCat Fails to Be Started After the Hostname Is Changed
- An Error Is Reported When the Hive Sample Program Is Running After the Domain Name of a Cluster Is Changed
- Hive MetaStore Exception Occurs When the Number of DBService Connections Exceeds the Upper Limit
- Error Message "Failed to execute session hooks: over max connections" Is Displayed on the Beeline Client
- Error Message "OutOfMemoryError" Is Displayed on the Beeline Client
- Task Execution Fails Because the Input File Number Exceeds the Threshold
- Hive Task Execution Fails Because of Stack Memory Overflow
- Task Failed Due to Concurrent Writes to One Table or Partition
- Hive Task Failed Due to a Lack of HDFS Directory Permission
- Failed to Load Data to Hive Tables
- Failed to Run the Application Developed Based on the Hive JDBC Code Case
- HiveServer and HiveHCat Process Faults
- Error Message "ConnectionLoss for hiveserver2" Is Displayed When MRS Hive Connects to ZooKeeper
- An Error Is Reported When Hive Executes the insert into Statement
- Timeout Reported When Adding the Hive Table Field
- Failed to Restart Hive
- Failed to Delete a Table Due to Excessive Hive Partitions
- An Error Is Reported When msck repair table Is Executed on Hive
- Insufficient User Permission for Running the insert into Command on Hive
- Releasing Disk Space After Dropping a Table in Hive
- Abnormal Hive Query Due to Damaged Data in the JSON Table
- Connection Timed Out During SQL Statement Execution on the Hive Client
- WebHCat Failed to Start Due to Abnormal Health Status
- WebHCat Failed to Start Because the mapred-default.xml File Cannot Be Parsed
- An SQL Error Is Reported When the Number of MetaStore Dynamic Partitions Exceeds the Threshold
- Using Hue
- Using Impala
- Using Kafka
- An Error Is Reported When the Kafka Client Is Run to Obtain Topics
- Using Python3.x to Connect to Kafka in a Security Cluster
- Flume Normally Connects to Kafka but Fails to Send Messages
- Producer Fails to Send Data and Error Message "NullPointerException" Is Displayed
- Producer Fails to Send Data and Error Message "TOPIC_AUTHORIZATION_FAILED" Is Displayed
- Producer Occasionally Fails to Send Data and the Log Displays "Too many open files in system"
- Consumer Is Initialized Successfully, but the Specified Topic Message Cannot Be Obtained from Kafka
- Consumer Fails to Consume Data and Remains in the Waiting State
- SparkStreaming Fails to Consume Kafka Messages, and "Error getting partition metadata" Is Displayed
- Consumer Fails to Consume Data in a Newly Created Cluster, and Message "GROUP_COORDINATOR_NOT_AVAILABLE" Is Displayed
- SparkStreaming Fails to Consume Kafka Messages, and Message "Couldn't find leader offsets" Is Displayed
- Consumer Fails to Consume Data and Message "SchemaException: Error reading field" Is Displayed
- Kafka Consumer Loses Consumed Data
- Failed to Start Kafka Due to Account Lockout
- Kafka Broker Reports Abnormal Processes and the Log Shows "IllegalArgumentException"
- Kafka Topics Cannot Be Deleted
- Error "AdminOperationException" Is Displayed When a Kafka Topic Is Deleted
- When a Kafka Topic Fails to Be Created, "NoAuthException" Is Displayed
- Failed to Set an ACL for a Kafka Topic, and "NoAuthException" Is Displayed
- When a Kafka Topic Fails to Be Created, "NoNode for /brokers/ids" Is Displayed
- When a Kafka Topic Fails to Be Created, "replication factor larger than available brokers" Is Displayed
- Consumer Repeatedly Consumes Data
- Leader for the Created Kafka Topic Partition Is Displayed as none
- Safety Instructions on Using Kafka
- Obtaining Kafka Consumer Offset Information
- Adding or Deleting Configurations for a Topic
- Reading the Content of the __consumer_offsets Internal Topic
- Configuring Logs for Shell Commands on the Kafka Client
- Obtaining Topic Distribution Information
- Kafka HA Usage Description
- Failed to Manage a Kafka Cluster Using the Kafka Shell Command
- Kafka Producer Writes Oversized Records
- Kafka Consumer Reads Oversized Records
- High Usage of Multiple Disks on a Kafka Cluster Node
- Kafka Is Disconnected from the ZooKeeper Client
- Using Oozie
- Using Presto
- During sql-standard-with-group Configuration, a Schema Fails to Be Created and the Error Message "Access Denied" Is Displayed
- Coordinator Process of Presto Cannot Be Started
- When Presto Queries a Kudu Table, an Error Is Reported Indicating That the Table Cannot Be Found
- No Data is Found in the Hive Table Using Presto
- Error Message "The node may have crashed or be under too much load" Is Displayed During MRS Presto Query
- Accessing Presto from an MRS Cluster Through a Public Network
- Using Spark
- An Error Is Reported When the Split Size Is Changed for a Running Spark Application
- Incorrect Parameter Format Is Displayed When a Spark Task Is Submitted
- Spark, Hive, and Yarn Are Unavailable Due to Insufficient Disk Capacity
- A Spark Job Fails to Run Due to Incorrect JAR File Import
- Spark Job Suspended Due to Insufficient Memory or Lack of JAR Packages
- Error "ClassNotFoundException" Is Reported When a Spark Task Is Submitted
- Driver Displays a Message Indicating That the Running Memory Exceeds the Threshold When a Spark Task Is Submitted
- Error "Can't get the Kerberos realm" Is Reported When a Spark Task Is Submitted in Yarn-Cluster Mode
- Failed to Start spark-sql and spark-shell Due to JDK Version Mismatch
- ApplicationMaster Fails to Start Twice When a Spark Task Is Submitted in Yarn-client Mode
- Failed to Connect to ResourceManager When a Spark Task Is Submitted
- DataArts Studio Failed to Schedule Spark Jobs
- Job Status Is error After a Spark Job Is Submitted Through an API
- ALM-43006 Is Repeatedly Reported for the MRS Cluster
- Failed to Create or Delete a Table in Spark Beeline
- Failed to Connect to the Driver When a Spark Job Is Submitted on a Node Outside the Cluster
- Large Number of Shuffle Results Are Lost During Spark Task Execution
- Disk Space Is Insufficient Due to Long-Term Running of JDBCServer
- Failed to Load Data to a Hive Table Across File Systems by Running SQL Statements Using Spark Shell
- Spark Task Submission Failure
- Spark Task Execution Failure
- JDBCServer Connection Failure
- Failed to View Spark Task Logs
- Spark Streaming Task Submission Issues
- Authentication Fails When Spark Connects to Other Services
- Authentication Fails When Spark Connects to Kafka
- An Error Occurs When SparkSQL Reads the ORC Table
- Failed to Switch to the Log Page from stderr and stdout on the Native Spark Web UI
- An Error Is Reported When spark-beeline Is Used to Query a Hive View
- Using Sqoop
- Connecting Sqoop to MySQL
- Failed to Find the HBaseAdmin.<init> Method When Sqoop Reads Data from the MySQL Database to HBase
- An Error Is Reported When a Sqoop Task Is Created Using Hue to Import Data from HBase to HDFS
- A Data Format Error Is Reported When Data Is Exported from Hive to MySQL 8.0 Using Sqoop
- An Error Is Reported When the sqoop import Command Is Executed to Extract Data from PgSQL to Hive
- Failed to Use Sqoop to Read MySQL Data and Write Parquet Files to OBS
- An Error Is Reported When Database Data Is Migrated Using Sqoop
- Using Storm
- Invalid Hyperlink of Events on the Storm Web UI
- Failed to Submit the Storm Topology
- Failed to Submit the Storm Topology and Message "Failed to check principle for keytab" Is Displayed
- Worker Logs Are Empty After the Storm Topology Is Submitted
- Worker Runs Abnormally After the Storm Topology Is Submitted and Error "Failed to bind to XXX" Is Displayed
- "well-known file is not secure" Is Displayed When the jstack Command Is Used to Check the Process Stack
- Data Cannot Be Written to Bolts When the Storm-JDBC Plug-in Is Used to Develop Oracle Databases
- Internal Server Error Is Displayed When the User Queries Information on the Storm UI
- Using Ranger
- Using Yarn
- A Large Number of Jobs Occupying Resources After Yarn Is Started in a Cluster
- Error "GC overhead" Is Reported When Tasks Are Submitted Using the hadoop jar Command on the Client
- Disk Space of a Node Is Used Up Due to Oversized Aggregated Logs of Yarn
- Temporary Files Are Not Deleted When a MapReduce Job Is Abnormal
- Incorrect Port Information of the Yarn Client Causes Error "connection refused" After a Task Is Submitted
- "Could not access logs page!" Is Displayed When Job Logs Are Queried on the Yarn Web UI
- Error "ERROR 500" Is Displayed When Queue Information Is Queried on the Yarn Web UI
- Error "ERROR 500" Is Displayed When Job Logs Are Queried on the Yarn Web UI
- An Error Is Reported When a Yarn Client Command Is Used to Query Historical Jobs
- Number of Files in the TimelineServer Directory Reaches the Upper Limit
- Using ZooKeeper
- Storage-Compute Decoupling
- A User Without the Permission on the /tmp Directory Failed to Execute a Job for Accessing OBS
- When the Hadoop Client Is Used to Delete Data from OBS, It Does Not Have the Permission for the .Trash Directory
- An MRS Cluster Fails Authentication When Accessing OBS Because the NTP Time of Cluster Nodes Is Not Synchronized
MRS Cluster Becomes Unavailable After the VPC Is Changed
Updated on 2024-12-18 GMT+08:00
Symptom
In an MRS cluster, after the VPCs of all nodes are changed on the ECS console, the cluster status becomes abnormal.
All services are unavailable, and Hive Beeline reports an error.
Cause Analysis
MRS does not support VPC changes. After the VPC is changed, the internal IP addresses of the nodes change, but the configuration files and database still use the original IP addresses. As a result, cluster communication and other functions fail, and the cluster status becomes abnormal. To restore the cluster, change the VPC back and ensure that each node's IP address matches the one recorded in the hosts file.
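The mismatch described above can be spotted with a small check. The following is an illustrative sketch only (the helper name is hypothetical, not an MRS tool): it compares the IP a node currently has with the IP recorded for its hostname in an /etc/hosts-style file.

```shell
#!/bin/sh
# Illustrative check (not part of MRS): compare a node's current IP address
# with the IP recorded for its hostname in an /etc/hosts-style file.
# A mismatch is exactly the failure mode described above.
check_hosts_ip() {
  hosts_file="$1"; host_name="$2"; current_ip="$3"
  # IP recorded for the hostname in the hosts file (first match wins)
  recorded_ip=$(awk -v h="$host_name" '$2 == h {print $1; exit}' "$hosts_file")
  if [ "$recorded_ip" = "$current_ip" ]; then
    echo "OK: $host_name -> $current_ip matches the hosts file"
  else
    echo "MISMATCH: hosts file records '$recorded_ip', node has '$current_ip'"
    return 1
  fi
}

# On a real node you would feed in the live values, for example:
#   check_hosts_ip /etc/hosts "$(hostname)" "$(hostname -I | awk '{print $1}')"
```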
Procedure
- Log in to the Master1 node and run the ifconfig command to view the node's current IP address. Run the cat /etc/hosts command to check the IP address recorded in the hosts file before the VPC change.
- Log in to the MRS console and view the cluster ID and VPC on the Dashboard page of the cluster.
- Log in to the ECS console, select Name in the search box, and enter the MRS cluster ID to search for all nodes in the MRS cluster.
- In the Operation column of the MRS cluster node, click More and choose Manage Network > Change VPC.
NOTE:
- You need to change the VPC for each node.
- When changing the VPC, ensure that the VPC, subnet, and security group are the same as those in the initial cluster configuration.
- Set Private IP Address to Assign new and enter the IP address of the node queried in step 1.
- After the change is successful, click the node name, switch to the Network Interfaces tab, and enable Source/Destination Check again.
- Perform the following steps to rebind the virtual IP address to the master node of the cluster:
- Log in to the MRS console and access the MRS cluster. On the Dashboard page, click the button next to Access Manager, set Access Mode to Direct Connect, and record the floating IP address of the cluster. View and take note of the subnet in Default Subnet.
- Log in to the VPC console, choose Virtual Private Cloud > Subnets, and search for the subnet of the MRS cluster.
- Click the subnet name, click the IP Addresses tab, and search for the floating IP address of the MRS cluster.
- Click Bind to Server in the Operation column of the floating IP address. On the Bind to Server page, select the master node of the MRS cluster and complete the binding.
- Wait for the cluster to recover.
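Before waiting for recovery, it can help to confirm that every cluster hostname is still present in the hosts file. The sketch below is illustrative only (the helper name and example hostnames are hypothetical); it assumes you know the full list of node hostnames.

```shell
#!/bin/sh
# Illustrative post-restore check (not part of MRS): confirm every expected
# cluster hostname has an entry in an /etc/hosts-style file and print the
# recorded IP for each. Returns non-zero if any hostname is missing.
verify_hosts_entries() {
  hosts_file="$1"; shift
  failed=0
  for h in "$@"; do
    ip=$(awk -v h="$h" '$2 == h {print $1; exit}' "$hosts_file")
    if [ -z "$ip" ]; then
      echo "MISSING: no entry for $h in $hosts_file"
      failed=1
    else
      echo "FOUND: $h -> $ip"
    fi
  done
  return $failed
}

# Example (hostnames are illustrative):
#   verify_hosts_entries /etc/hosts node-master1 node-master2 node-core1
```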
Parent topic: Cluster Management