Updated on 2025-08-09 GMT+08:00

Restoring ClickHouse Metadata

Scenario

ClickHouse metadata needs to be restored in the following scenarios:

  • Data is modified or deleted unexpectedly and needs to be restored.
  • After a user performs major operations (such as upgrade and migration) on ClickHouse, an exception occurs or the expected result is not achieved.
  • The ClickHouse component is faulty and becomes unavailable.
  • Data is migrated to a new cluster.

Users can create a ClickHouse restoration task on FusionInsight Manager. Only manual restoration tasks are supported.

  • This function is supported only by MRS 3.1.0 or later.
  • Data can be restored only when the current system version is the same as the system version at the time of the backup.
  • To restore ClickHouse metadata when the service is running properly, you are advised to manually back up the latest ClickHouse metadata before restoration. Otherwise, the ClickHouse metadata that is generated after the data backup and before the data restoration will be lost.
  • ClickHouse metadata restoration and service data restoration cannot be performed at the same time. Otherwise, service data restoration fails. You are advised to restore service data after metadata restoration is complete.

Impact on the System

  • After the metadata is restored, the data generated after the data backup and before the data restoration is lost.
  • After the metadata is restored, the ClickHouse upper-layer applications need to be started.

Prerequisites

  • You have checked the path for storing ClickHouse metadata backup files.
  • If you need to restore data from a remote HDFS, a standby cluster has been created and the data has been backed up. For details, see Backing Up ClickHouse Metadata. If the active and standby clusters are deployed in security mode and they are not managed by the same FusionInsight Manager, mutual trust must be configured. For details, see Configuring Mutual Trust Between MRS Clusters. If the active and standby clusters are deployed in normal mode, no mutual trust is required.
  • In an active/standby cluster setup, when you restore data from the remote HDFS to the local host, the value of HADOOP_RPC_PROTECTION in ClickHouse must be the same as the value of hadoop.rpc.protection in HDFS.
  • If you need to restore backup data in another MRS ClickHouse cluster to this cluster, the following requirements must be met:
    1. The two clusters are of the same MRS version.
    2. The two clusters are in the same mode.
    3. The two clusters have the same ClickHouse topology, including the shards and their replicas (backups).
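
The prerequisites above can be expressed as a short pre-check. The following Python sketch is illustrative only: ClusterInfo, cross_cluster_restore_problems, and rpc_protection_matches are hypothetical names and are not part of FusionInsight Manager, the values must be collected manually from each cluster, and the topology is modeled simply as shard and replica counts, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterInfo:
    """Hypothetical summary of one MRS ClickHouse cluster (values gathered by hand)."""
    mrs_version: str      # for example "3.2.0"
    security_mode: bool   # True for security mode, False for normal mode
    shards: int           # number of ClickHouse shards (assumed topology model)
    replicas: int         # number of replicas per shard (assumed topology model)


def cross_cluster_restore_problems(source: ClusterInfo, target: ClusterInfo) -> List[str]:
    """Return the unmet requirements for restoring backup data from another cluster."""
    problems = []
    if source.mrs_version != target.mrs_version:
        problems.append("The two clusters are not of the same MRS version.")
    if source.security_mode != target.security_mode:
        problems.append("The two clusters are not in the same mode.")
    if (source.shards, source.replicas) != (target.shards, target.replicas):
        problems.append("The two clusters do not have the same ClickHouse topology.")
    return problems


def rpc_protection_matches(clickhouse_value: str, hdfs_value: str) -> bool:
    """Compare HADOOP_RPC_PROTECTION (ClickHouse) with hadoop.rpc.protection (HDFS)."""
    # Surrounding whitespace is ignored; the values themselves must be identical.
    return clickhouse_value.strip() == hdfs_value.strip()
```

An empty list from cross_cluster_restore_problems means that the three cross-cluster requirements listed above are met. It does not replace checking the backup path or the mutual trust configuration.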

Restoring ClickHouse Metadata

  1. On FusionInsight Manager, choose O&M > Backup and Restoration > Backup Management.
  2. In the Operation column of the specified task in the task list, choose More > View History.

    In the window that is displayed, select a record of a successful backup and click View in the Backup Path column to view its backup path information and find the following information:

    • Backup Object: indicates the backup data source.
    • Backup Path: indicates the full path where backup files are stored.

      Select the correct path, and manually copy the full path of backup files in Backup Path.

  3. On FusionInsight Manager, choose O&M > Backup and Restoration > Restoration Management.
  4. Click Create.
  5. Set Task Name to the name of the restoration task.
  6. From Recovery Object, select the cluster to operate on.
  7. In Restoration Configuration, select ClickHouse under Metadata and other data.
  8. Set Path Type of ClickHouse to a restoration directory type.

    Table 1 Path for data restoration

    Directory Type: LocalDir

    Description: Indicates that data is restored from the local disk of the active management node.

    If you select this option, you also need to configure the following parameter:

    • Source Path: Enter the name of the backup file to be restored. To obtain the file name, log in to the active OMS node, go to the backup path copied in Step 2, and record the name of the metadata package, for example, Backup task name_Data source_Task execution time.tar.gz.

    Directory Type: RemoteHDFS

    Description: Indicates that data is restored from the HDFS directory of the standby cluster.

    If you select this option for MRS 3.2.0 or later clusters, you also need to configure the following parameters:

    • Source NameService Name: indicates the NameService name of the backup data cluster, for example, hacluster. You can obtain it from the NameService Management page of HDFS of the standby cluster.
    • IP Mode: indicates the mode of the target IP address. The system automatically selects an IP address mode based on the cluster network type, for example, IPv4 or IPv6.
    • Source Active NameNode IP Address: indicates the service plane IP address of the active NameNode in the standby cluster.
    • Source Standby NameNode IP Address: indicates the service plane IP address of the standby NameNode in the standby cluster.
    • Source NameNode RPC Port: indicates the value of dfs.namenode.rpc.port in the HDFS basic configuration of the standby cluster.
    • Source Path: Enter the complete HDFS path for storing backup data of the standby cluster, that is, the backup path copied in Step 2, for example, Backup path/Backup task name_Data source_Task creation time.

    If you select this option for MRS 3.1.0 or 3.1.2 clusters, you also need to configure the following parameters:

    • Source NameService Name: indicates the NameService name of the backup data cluster, for example, hacluster. You can obtain it from the NameService Management page of HDFS of the standby cluster.
    • IP Mode: indicates the mode of the target IP address. The system automatically selects an IP address mode based on the cluster network type, for example, IPv4 or IPv6.
    • Source NameNode IP Address: indicates the service plane IP address of a NameNode in the standby cluster. It can be the IP address of the active or standby NameNode.
    • Source Path: indicates the full path of the HDFS directory for storing backup data of the standby cluster, for example, Backup path/Backup task name_Data source_Task creation time/Data source_Task execution time.tar.gz.

    Directory Type: OBS

    Description: Indicates that data is restored from OBS. This option is available for MRS 3.3.0-LTS.1 and later versions only.

    If you select this option, you also need to configure the following parameter:

    • Source Path: indicates the full OBS path of a backup file, for example, Backup path/Backup task name_Data source_Task creation time/Version_Data source_Task execution time.tar.gz.

  9. Click OK.
  10. In the restoration task list, locate the row that contains the created task, and click Start in the Operation column. In the displayed dialog box, click OK to start the restoration task.

    • After the restoration is successful, the progress bar is green.
    • After the restoration is successful, the restoration task cannot be executed again.
    • If the restoration task fails during the first execution, rectify the fault and click Retry to execute the task again.

  11. Choose Cluster > Services and start the ClickHouse service.
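
The parameters required in Step 8 depend on the path type and the MRS version. The following Python sketch summarizes that mapping as a lookup. It is illustrative only: parse_mrs_version and required_parameters are hypothetical helpers, not FusionInsight Manager interfaces. Two simplifications are assumed: only the leading major.minor.patch numbers of the version are compared (so the -LTS suffix is ignored), and any version from 3.1.0 up to but not including 3.2.0 is treated like 3.1.0 and 3.1.2.

```python
import re
from typing import List, Tuple


def parse_mrs_version(version: str) -> Tuple[int, ...]:
    """Return the leading major.minor.patch numbers of a version such as "3.3.0-LTS.1"."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized MRS version: {version!r}")
    return tuple(int(part) for part in match.groups())


def required_parameters(path_type: str, mrs_version: str) -> List[str]:
    """List the parameters to configure for a path type, as described in Table 1."""
    version = parse_mrs_version(mrs_version)
    if version < (3, 1, 0):
        raise ValueError("ClickHouse metadata restoration requires MRS 3.1.0 or later")

    if path_type == "LocalDir":
        return ["Source Path"]

    if path_type == "RemoteHDFS":
        if version >= (3, 2, 0):
            return [
                "Source NameService Name",
                "IP Mode",
                "Source Active NameNode IP Address",
                "Source Standby NameNode IP Address",
                "Source NameNode RPC Port",
                "Source Path",
            ]
        # Assumption: versions between 3.1.0 and 3.2.0 behave like 3.1.0 and 3.1.2.
        return [
            "Source NameService Name",
            "IP Mode",
            "Source NameNode IP Address",
            "Source Path",
        ]

    if path_type == "OBS":
        # Simplification: the -LTS suffix is not compared, only 3.3.0 or later is checked.
        if version < (3, 3, 0):
            raise ValueError("OBS is available for MRS 3.3.0-LTS.1 and later versions only")
        return ["Source Path"]

    raise ValueError(f"unknown path type: {path_type!r}")
```

For example, required_parameters("RemoteHDFS", "3.1.2") returns the four parameter names of the older RemoteHDFS form, and required_parameters("OBS", "3.2.0") raises ValueError. The format of Source Path also differs by path type and version, so check the corresponding example in Table 1 before entering it.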