Overview

GaussDB provides high availability, but high availability does not protect against deletion. If a database or table is deleted, whether by accident or on purpose, the data is lost on both the primary and standby nodes and cannot be recovered from a standby node. In this case, the deleted data can be restored only from a backup. GaussDB can restore data from a backup either to the state it was in when the backup was created or to a specific point in time.

This section describes typical misoperations and the recovery method for each (see Table 1), along with typical backup and restoration use cases and their performance specifications (see Table 2). Choose a restoration method based on your service requirements.

Restoration Methods for Misoperations

**Table 1** Restoration methods for different misoperations
Scenario	Restoration Method	Restoration Scope	Instructions
An instance is deleted by mistake.	Locate the deleted instance in the recycle bin and rebuild it.	All databases and tables	Restoring an Instance from the Recycle Bin
An instance is deleted by mistake.	If a manual backup was created before the instance was deleted, restore the instance on the Backups page.	All databases and tables	Restoring an Instance from a Backup
A table is deleted by mistake.	Use the database and table restoration method to restore the table.	All databases and tables; certain databases and tables	Restoring Databases or Tables to a Specific Point in Time; Restoring Databases or Tables Using a Backup
A database is deleted by mistake.	Use the database and table restoration method to restore the database.	All databases and tables; certain databases and tables	Restoring Databases or Tables to a Specific Point in Time; Restoring Databases or Tables Using a Backup
An entire table is overwritten, or columns, rows, or data in a table are deleted or modified by mistake.	Use the database and table restoration method to restore table data.	All databases and tables; certain databases and tables	Restoring Databases or Tables to a Specific Point in Time; Restoring Databases or Tables Using a Backup

Backup and Restoration Use Cases and Performance Specifications

**Table 2** Backup and restoration use cases and performance specifications
Use Case	Key Performance Factor	Typical Data Volume	Performance Specifications
DB instance backup	Data size; network configuration	Data volume: petabytes; object quantity: about 1 million	OBS backup and restoration specifications: In a standard environment, a full backup or restoration of 2 TB of data can be completed within 8 hours. With suitable hardware, ample OBS bandwidth, a high compression ratio, and independent deployment, the full backup or restoration duration can be calculated using the following formulas:
Distributed instances: Backup or restoration duration = (Total data volume of the DB instance/Number of shards)/min(Disk I/O read bandwidth, Compression bandwidth, Single-thread OBS transmission bandwidth/Compression ratio)
Centralized instances: Backup or restoration duration = Total data volume of the DB instance/min(Disk I/O read bandwidth, Compression bandwidth, Single-thread OBS transmission bandwidth/Compression ratio)
NOTE: min() returns the smallest of the listed values.
Disk I/O read bandwidth: SATA SSD: 200 MB/s to 300 MB/s; SAS SSD: about 500 MB/s; NVMe SSD: about 1 GB/s. Reserve enough bandwidth for database workloads; otherwise, backup tasks may severely degrade database performance.
Compression bandwidth: LZ4 compression is used by default, and the compression bandwidth generally ranges from 300 MB/s to 400 MB/s. The compression level ranges from 1 (default) to 9. Higher levels compress more slowly and make the backup take longer. The exact time varies depending on data attributes.
Single-thread OBS transmission bandwidth: 100 MB/s to 300 MB/s in unrestricted mode, or the specified limit in speed-restricted mode.
Compression ratio: With the default LZ4 compression, the compression ratio is between 0.1 and 0.5, depending on data attributes.
Parallel upload: Setting the parallel upload parameter to 2 or higher increases CPU and other resource usage during backups. Backup performance improves based on the ratio of OBS single-stream transmission bandwidth to total OBS bandwidth. However, once the single-stream bandwidth multiplied by the parallel upload parameter exceeds the total bandwidth, no further performance gain is achieved.
Restoring a backup set for hash bucket tables that are undergoing scale-out and redistribution in a distributed instance: Restoration time (excluding the redistribution process after restoration) ≤ 2 × Restoration duration of a backup set with the same data volume restored in the same way outside of scale-out + Redistribution duration of hash bucket tables with the same data volume. Restoring such a backup set takes three steps:
Step 1: Restore the full backups of all nodes, restore all incremental backups of the old DNs before scale-out, and replay logs.
Step 2: Physically migrate the hash bucket files to be redistributed from the old DNs to the new DNs.
Step 3: Restore all incremental backups of the new DNs and replay logs.
Startup after restoration: The time it takes to start a distributed instance after data restoration depends on the number of sequences and databases involved. During startup, the sequence information of each database is obtained and set in ETCD, and most of the time is spent on these operations:
Connecting to each database to acquire sequence information: The more databases there are, the longer this takes.
Configuring sequences in ETCD: The more sequences there are, the longer this takes.
Updating PGXC catalog information: Each database is connected to in order to update the pgxc_class and pgxc_slice catalog information, so the more databases there are, the longer this takes.
Database-level physical restoration	Data size; network configuration	-	Database-level physical restoration based on OBS consists of four steps:
Step 1: Read all data for database-level restoration from the backup media. In a standard Huawei Cloud environment, 2 TB of data can be read within 8 hours.
Step 2: Run VACUUM FREEZE on database-level data in the auxiliary database. The VACUUM FREEZE performance is as follows. Distributed instances: 1,400 GB/hour per shard. Parallel replication is allowed between shards. Centralized instances: 1,400 GB/hour.
Step 3: Replicate the database-level data after VACUUM FREEZE to each DN replica of the production instance. The replication time is as follows. Distributed instances: Replication duration = Data volume of a single shard/min(Network bandwidth, Disk I/O bandwidth). Parallel replication is allowed between shards. Centralized instances: Replication duration = Database-level data volume to be restored/min(Network bandwidth, Disk I/O bandwidth).
Step 4: Import data to the production instance. The import performance is as follows. Distributed instances: Depends on the data volume per shard and the disk I/O bandwidth. Parallel replication is allowed between shards. Centralized instances: Depends on the database-level data volume to be restored and the disk I/O bandwidth.
Recommended scenarios:
Performance: For equivalent data volumes, database-level physical restoration achieves approximately 70% of the performance of instance-level physical restoration. If the total database-level data requiring restoration is below 70% of the instance-level data volume, database-level physical restoration is recommended.
Availability: During a database-level restoration, other databases within the same instance remain operational, so availability is higher than with an instance-level restoration. If other databases must stay accessible throughout the process, database-level physical restoration is recommended.
Impacts: Before a database-level data import, ensure that flow control is disabled and the GUC parameter recovery_time_target is set to 0. During the import, the throughput of the production environment may drop, typically to 50% of its peak capacity or, in extreme cases, to as low as 25%. To avoid impacting services, perform fine-grained restorations during off-peak hours.
Table-level physical restoration	Data size; network configuration; table storage type; table attribute (column) type	-	Table-level physical restoration based on OBS consists of three steps:
Step 1: Read all data for table-level restoration from the backup media. In a standard Huawei Cloud environment, 2 TB of data can be read within 8 hours.
Step 2: Export table data from the auxiliary database to a local file. The export performance is about 25 MB/s.
Step 3: Import the locally exported file into the production instance. When the GUC parameter page_version_check is set to off, the import speed is about 25 MB/s (setting this parameter to memory reduces the performance by about 15%). Additionally, factors such as the row count, table indexes, and triggers can further decrease import speeds to roughly 10 MB/s.
Recommended scenarios:
Performance: For equivalent data volumes, table-level physical restoration operates at approximately one-fifth the speed of instance-level restoration. Table-level physical restoration is recommended when the total data requiring restoration is below one-fifth of the instance-level data volume and does not exceed 1 TB.
Availability: During a table-level restoration, other databases and tables within the same instance remain operational, ensuring higher availability compared to an instance-level restoration. For uninterrupted access to other databases and tables throughout the process, table-level physical restoration is recommended.
Impacts: During a table-level restoration, the throughput of the production environment may be impacted, typically reduced to 50% of its peak capacity, or, in extreme cases, as low as 25%. To avoid impacting services, perform fine-grained restorations during off-peak hours.
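
The full backup or restoration duration formulas in the DB instance backup row of Table 2 can be tried out with a short calculation. The following Python sketch is illustrative only and is not part of GaussDB: the function names, the sample bandwidth values, and the assumption that 1 TB = 1024 × 1024 MB are hypothetical choices made for this example. It applies the formula as stated, dividing the per-shard data volume by the smallest of the disk I/O read bandwidth, the compression bandwidth, and the single-thread OBS transmission bandwidth divided by the compression ratio.

```python
# Assumption for this sketch: binary units (1 TB = 1024 * 1024 MB).
MB_PER_TB = 1024 * 1024


def effective_bandwidth_mb_s(disk_read, compression, obs_single_thread,
                             compression_ratio):
    """Return the bottleneck bandwidth in MB/s.

    Implements min(Disk I/O read bandwidth, Compression bandwidth,
    Single-thread OBS transmission bandwidth / Compression ratio).
    """
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    return min(disk_read, compression, obs_single_thread / compression_ratio)


def full_backup_duration_s(total_data_mb, disk_read, compression,
                           obs_single_thread, compression_ratio, shards=1):
    """Estimate the full backup or restoration duration in seconds.

    Use shards=1 for a centralized instance; for a distributed instance,
    pass the number of shards so the data volume is divided per shard.
    """
    if shards < 1:
        raise ValueError("shards must be at least 1")
    per_shard_mb = total_data_mb / shards
    bandwidth = effective_bandwidth_mb_s(
        disk_read, compression, obs_single_thread, compression_ratio)
    return per_shard_mb / bandwidth


# Hypothetical sample: 2 TB of data, SAS SSD (500 MB/s), LZ4 at 300 MB/s,
# 100 MB/s single-thread OBS bandwidth, compression ratio 0.5.
centralized_s = full_backup_duration_s(2 * MB_PER_TB, 500, 300, 100, 0.5)
distributed_s = full_backup_duration_s(2 * MB_PER_TB, 500, 300, 100, 0.5,
                                       shards=4)
print(f"Centralized: {centralized_s / 3600:.2f} h")
print(f"Distributed (4 shards): {distributed_s / 3600:.2f} h")
```

With these sample values the bottleneck is the OBS term (100/0.5 = 200 MB/s), which gives roughly 2.9 hours for a centralized instance and about 0.73 hours with four shards. These are ideal-case estimates under the stated assumptions; the 8-hour figure for 2 TB in a standard environment remains the documented baseline.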