Updated on 2023-04-03 GMT+08:00

gs_restore

Context

gs_restore is a tool provided by GaussDB(DWS) to import data that was exported using gs_dump. It can also be used to import files that were exported using gs_dump.

It has the following functions:

  • Imports data to the database.

    If a database is specified, data is imported in the database. For parallel import, the password for connecting to the database is required.

  • Imports data to the script file.

    If the database storing imported data is not specified, a script containing the SQL statement to recreate the database is created and written to a file or standard output. This script output is equivalent to the plain text output format of gs_dump.

Syntax

gs_restore [OPTION]... FILE
  • The FILE does not have a short or long parameter. It is used to specify the location for the archive files.
  • The dbname or -l parameter is required as prerequisites. Users cannot enter dbname and -l parameters at the same time.
  • gs_restore incrementally imports data by default. To prevent data exception caused by multiple import jobs, use the -e and -c parameters during the jobs. In this way, existing database objects in a target database are deleted before import; and errors during import will be ignored to proceed the import and the error information will be displayed after the import.

Parameter Description

Common parameters:

  • -d, --dbname=NAME

    Connects to the dbname database and imports data to the database.

  • -f, --file=FILENAME

    Specifies the output file for the generated script, or uses the output file in the list specified using -l.

    The default is the standard output.

    -f cannot be used in conjunction with -d.

  • -F, --format=c|d|t

    Specifies the format of the archive. The format does not need to be specified because the gs_restore determines the format automatically.

    Value range:

    • c/custom: The archive form is the customized format in section 4.21-gs_dump.
    • d/directory: The archive form is a directory archive format.
    • t/tar: The archive form is a tar archive format.
  • -l, --list

    Lists the forms of the archive. The operation output can be used for the input of the -L parameter. If filtering parameters, such as -n or -t, are used together with -l, they will restrict the listed items.

  • -v, --verbose

    Specifies the verbose mode.

  • -V, --version

    Prints the gs_restore version and exits.

  • -?, --help

    Shows help information about the parameters of gs_restore and exits.

Import parameters

  • -a, -data-only

    Imports only the data, not the schema (data definition). gs_restore incrementally imports data.

  • -c, --clean

    Cleans (deletes) existing database objects in the database to be restored before recreating them.

  • -C, --create

    Creates the database before importing data to it. (When this parameter is used, the database named with -d is used to issue the initial CREATE DATABASE command. All data is imported to the database that appears in the archive files.)

  • -e, --exit-on-error

    Exits if an error occurs when you send the SQL statement to the database. If you do not exit, the commands will still be sent and error information will be displayed when the import ends.

  • -I, --index=NAME

    Imports only the definition of the specified index. Multiple indexes can be imported. Enter -I index multiple times to import multiple indexes.

    For example:

    gs_restore -h host_name -p port_number -d gaussdb -I Index1 -I Index2 backup/MPPDB_backup.tar

    In this example, Index1 and Index2 will be imported.

  • -j, --jobs=NUM

    Specifies the number of concurrent, the most time-consuming jobs of gs_restore (such as loading data, creating indexes, or creating constraints). This parameter can greatly reduce the time to import a large database to a server running on a multiprocessor machine.

    Each job is one process or one thread, depending on the OS; and uses a separate connection to the server.

    The optimal value of this option depends on the hardware settings of the server, the client, the network, the number of CPU cores, and hard disk settings. It is recommended that the parameter be set to the number of CPU cores on the server. In addition, a larger value can also lead to faster import in many cases. However, an overly large value will lead to decreased performance because of thrashing.

    This parameter supports custom-format archives only. The input file must be a regular file (not the pipe file). This parameter can be ignored when you select the script method rather than connect to a database server. In addition, multiple jobs cannot be used in conjunction with the --single-transaction parameter.

  • -L, --use-list=FILENAME

    Imports only archive elements that are listed in list-file and imports them in the order that they appear in the file. If filtering parameters, such as -n or -t, are used in conjunction with -L, they will further limit the items to be imported.

    list-file is normally created by editing the output of a previous -l parameter. File lines can be moved or removed, and can also be commented out by placing a semicolon (;) at the beginning of the row. An example is provided in this document.

  • -n, --schema=NAME

    Restores only objects that are listed in schemas.

    This parameter can be used in conjunction with the -t parameter to import a specific table.

    Entering -n schemaname multiple times can import multiple schemas.

    For example:

    gs_restore -h host_name -p port_number -d gaussdb -n sch1 -n sch2 backup/MPPDB_backup.tar

    In this example, sch1 and sch2 will be imported.

  • -O, --no-owner

    Do not output commands to set ownership of objects to match the original database. By default, gs_restore issues the ALTER OWNER or SET SESSION AUTHORIZATION statement to set ownership of created schema elements. Unless the system administrator or the user who has all the objects in the script initially accesses the database. Otherwise, the statement will fail. Any user name can be used for the initial connection using -O, and this user will own all the created objects.

  • -P, --function=NAME(args)

    Imports only listed functions. You need to correctly spell the function name and the parameter based on the contents of the dump file in which the function exists.

    Entering -P alone means importing all function-name(args) functions in a file. Entering -P with -n means importing the function-name(args) functions in a specified schema. Entering -P multiple times and using -n once means that all imported functions are in the -n schema by default.

    You can enter -n schema-name -P 'function-name(args)' multiple times to import functions in specified schemas.

    For example:

    ./gs_restore -h host_name -p port_number -d gaussdb -n test1 -P 'Func1(integer)' -n test2 -P 'Func2(integer)' backup/MPPDB_backup.tar

    In this example, both Func1 (i integer) in the test1 schema and Func2 (j integer) in the test2 schema will be imported.

  • -s, --schema-only

    Imports only schemas (data definitions), instead of data (table content). The current sequence value will not be imported.

  • -S, --sysadmin=NAME

    Specifies a reserved port for function expansion. This parameter is not recommended.

  • -t, --table=NAME

    Imports only listed table definitions or data, or both. This parameter can be used in conjunction with the -n parameter to specify a table object in a schema. When -n is not entered, the default schema is PUBLIC. Entering -n schemaname -t tablename multiple times can import multiple tables in a specified schema.

    For example:

    Import table1 in the PUBLIC schema.

    gs_restore -h host_name -p port_number -d gaussdb -t table1 backup/MPPDB_backup.tar

    Import test1 in the test1 schema and test2 in the test2 schema.

    gs_restore -h host_name -p port_number -d gaussdb -n test1 -t test1 -n test2 -t test2 backup/MPPDB_backup.tar

    Import table1 in the PUBLIC schema and test1 in the test1 schema.

    gs_restore -h host_name -p port_number -d gaussdb -n PUBLIC -t table1 -n test1 -t table1 backup/MPPDB_backup.tar

    -t does not support the schema_name.table_name input format.

  • -T, --trigger=NAME

    This parameter is reserved for extension.

  • -x, --no-privileges/--no-acl

    Prevents the import of access permissions (GRANT/REVOKE commands).

  • -1, --single-transaction

    Executes import as a single transaction (that is, commands are wrapped in BEGIN/COMMIT).

    This parameter ensures that either all the commands are completed successfully or no application is changed. This parameter means --exit-on-error.

  • --disable-triggers

    Specifies a reserved port for function expansion. This parameter is not recommended.

  • --no-data-for-failed-tables

    By default, table data will be imported even if the statement to create a table fails (for example, the table already exists). Data in such table is skipped using this parameter. This operation is useful if the target database already contains the desired table contents.

    This parameter takes effect only when you import data directly into a database, not when you output SQL scripts.

  • --no-security-labels

    Specifies a reserved port for function expansion. This parameter is not recommended.

  • --no-tablespaces

    Does not issue commands to select tablespaces. If this parameter is used, all objects will be created during the import process no matter which tablespace is selected.

  • --section=SECTION

    Imports the listed sections (such as pre-data, data, or post-data).

  • --use-set-session-authorization

    Is used for plain-text backup.

    Outputs the SET SESSION AUTHORIZATION statement instead of the ALTER OWNER statement to determine object ownership. This parameter makes dump more standards-compatible. If the records of objects in exported files are referenced, import may fail. Only administrators can use the SET SESSION AUTHORIZATION statement to dump data, and the administrators must manually change and verify the passwords of exported files by referencing the SET SESSION AUTHORIZATION statement before import. The ALTER OWNER statement requires lower permissions.

  • --with-key=KEY

    Specifies that the key length of AES128 must be 16 bytes.

    If the dump is encrypted, enter the --with-key <keyname> parameter in the gs_restore command. If it is not entered, you will receive an error message.

    Enter the same key while entering the dump.

  • If any local additions need to be added to the template1 database during the installation, restore the output of gs_restore into an empty database with caution. Otherwise, you are likely to obtain errors due to duplicate definitions of the added objects. To create an empty database without any local additions, copy data from template0 rather than template1. Example:
CREATE DATABASE foo WITH TEMPLATE template0;
  • gs_restore cannot import large objects selectively. For example, it can only import the objects of a specified table. If an archive contains large objects, all large objects will be imported, or none of them will be restored if they are excluded by using -L, -t, or other parameters.

1. The -d/--dbname and -f/--file parameters do not coexist.

2. The -s/--schema-only and -a/--data-only parameters do not coexist.

3. The -c/--clean and -a/--data-only parameters do not coexist.

4. When --single-transaction is used, -j/--jobs must be a single job.

5. --role must be used in conjunction with --rolepassword.

Connection parameters:

  • -h, --host=HOSTNAME

    Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the PGHOST environment variable. If it is not set, a UNIX domain socket connection is attempted.

    This parameter is used only for defining names of the hosts outside a cluster. The names of the hosts inside the cluster must be 127.0.0.1.

  • -p, --port=PORT

    Specifies the TCP port listened to by the server or the local UNIX domain socket file name extension to ensure a correct connection. The default value is the PGPORT environment variable.

  • -U, --username=NAME

    Specifies the user name to connect to.

  • -w, --no-password

    Never issue a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required.

  • -W, --password=PASSWORD

    Specifies the user password to connect to. If the host uses the trust authentication policy, the administrator does not need to enter the -W parameter. If the -W parameter is not provided and you are not a system administrator, gs_restore will ask you to enter a password.

  • --role=ROLENAME

    Specifies a role name for the import operation. If this parameter is selected, the SET ROLE statement will be issued after gs_restore connects to the database. It is useful when the authenticated user (specified by -U) lacks the permissions required by gs_restore. This parameter allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as the initial user. This parameter allows data to be imported without violating the policy.

  • --rolepassword=ROLEPASSWORD

    Specifies the password of the specific role.

Examples

Special case: Execute the gsql tool. Run the following commands to import the MPPDB_backup.sql file in the exported folder (in plain-text format) generated by gs_dump/gs_dumpall to the gaussdb database:

gsql -d gaussdb -p 8000 -W {password} -f /home/omm/test/MPPDB_backup.sql
SET
SET
SET
SET
SET
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
SET
CREATE INDEX
REVOKE
REVOKE
GRANT
GRANT
total time: 30476  ms

gs_restore is used to import the files exported by gs_dump.

Example 1: Execute the gs_restore tool to import the exported MPPDB_backup.dmp file (in custom format) to the gaussdb database.

gs_restore -W {password} backup/MPPDB_backup.dmp -p 8000 -d gaussdb
gs_restore: restore operation successful
gs_restore: total time: 13053  ms

Example 2: Execute the gs_restore tool to import the exported MPPDB_backup.tar file (in tar format) to the gaussdb database.

gs_restore backup/MPPDB_backup.tar -p 8000 -d gaussdb 
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21203  ms

Example 3: Execute the gs_restore tool to import the exported MPPDB_backup file (in directory format) to the gaussdb database.

gs_restore backup/MPPDB_backup -p 8000 -d gaussdb
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21003  ms

Example 4: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import all the object definitions and data in the PUBLIC schema. Existing objects are deleted from the target database before the import. If an existing object references to an object in another schema, you need to manually delete the referenced object first.

gs_restore backup/MPPDB_backup.dmp -p 8000 -d gaussdb -e -c -n PUBLIC
gs_restore: [archiver (db)] Error while PROCESSING TOC:
gs_restore: [archiver (db)] Error from TOC entry 313; 1259 337399 TABLE table1 gaussdba
gs_restore: [archiver (db)] could not execute query: ERROR:  cannot drop table table1 because other objects depend on it
DETAIL:  view t1.v1 depends on table table1
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
    Command was: DROP TABLE public.table1;

Manually delete the referenced object and create it again after the import is complete.

gs_restore backup/MPPDB_backup.dmp -p 8000 -d gaussdb -e -c -n PUBLIC
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 2203  ms

Example 5: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import only the definition of table1 in the PUBLIC schema.

gs_restore backup/MPPDB_backup.dmp -p 8000 -d gaussdb -e -c -s -n PUBLIC -t table1
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21000  ms

Example 6: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import only the data of table1 in the PUBLIC schema.

gs_restore backup/MPPDB_backup.dmp -p 8000 -d gaussdb -e -a -n PUBLIC -t table1
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 20203  ms

When a cluster is created, the scheduler is started and some resources of the scheduler are created, including the schema scheduler where the tables of the scheduler are located, and the tables created when the scheduler is running, such as bandwidth_history_table, cpu_template_storage, io_template_storage, mem_template_storage, scheduler_config, scheduler_storage, task_history_storage, task_storage, vacuum_full_rslt, function scheduler_workload_query_func, and pg_task. When gs_restore is executed, the tables, schemas, and indexes of the scheduler are also restored. The scheduler is a resident process, and the new cluster automatically creates these objects. Therefore, an error message is displayed when gs_restore is executed, indicating that the objects of the scheduler exist. This error has no impact on normal cluster operations and can be ignored.

Helpful Links

gs_dump and gs_dumpall