Exporting Data in Parallel
In high-concurrency scenarios, you can use GDS to export a large volume of data from a database to a common file system. To export data in parallel using a foreign table, you must enable the stream operator first.
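As a rough illustration of the prerequisite above, the stream operator could be switched on for the current session as follows. This sketch assumes the setting is exposed as a session-level parameter named enable_stream_operator, as in GaussDB(DWS)-family releases; confirm the parameter name and scope in the reference for your version.

```sql
-- Assumption: the stream operator is controlled by the
-- enable_stream_operator parameter and can be set per session.
SET enable_stream_operator = on;

-- Confirm the value before starting the export.
SHOW enable_stream_operator;
```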
Overview
- The CN plans data export tasks and delivers the tasks to DNs. Then the CN is released to process other tasks.
- The computing capabilities and bandwidths of all the DNs are fully leveraged to export data.
Figure 1 Exporting data using foreign tables
Concepts
- Data file: a TEXT, CSV, or FIXED file that stores data exported from GaussDB.
- Foreign table: a table that stores information about a data file, such as its format, location, and encoding.
- GDS: a data service tool. To export data, deploy GDS on the server where data files are stored.
- Table: a table in the database, including row-store tables and column-store tables. Data in data files is exported from these tables.
- Local mode: Service data in a cluster is exported to hosts in the cluster.
- Remote mode: Service data in a cluster is exported to hosts outside the cluster.
Export Modes
In GaussDB, data can be exported in local or remote mode.
- Remote mode: Service data in a cluster is exported to hosts outside the cluster.
  - In this mode, multiple GDS instances export data concurrently. One GDS instance can export data for only one cluster at a time.
  - The export rate of a GDS instance on the same intranet as the cluster nodes is limited by the network bandwidth. A 10 GE configuration is recommended.
  - Data files in TEXT, CSV, or FIXED format are supported. A single row of data must be smaller than 1 GB.
- Local mode: Service data in a cluster is exported to hosts in the cluster. This mode is intended for exporting a large number of small files.
  - In this mode, data is evenly divided and stored in the specified directories on cluster nodes, occupying disk space on those nodes.
  - Data files in TEXT, CSV, or FIXED format are supported. A single row of data must be smaller than 1 GB.
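The two modes are chosen through the foreign table definition. The sketch below shows how the definitions might differ, assuming the GaussDB(DWS)-style syntax in which a gsfs:// location points at a GDS listener (remote mode) and a file:// location points at a directory on the cluster nodes (local mode). The table names, columns, address, port, and directory are hypothetical.

```sql
-- Remote mode (assumed syntax): LOCATION names a GDS listener on a
-- host outside the cluster. Address and port are hypothetical.
CREATE FOREIGN TABLE sales_export_remote
(
    sale_id   integer,
    sale_note varchar(100)
)
SERVER gsmpp_server
OPTIONS (
    LOCATION 'gsfs://192.168.0.90:5000/',
    FORMAT 'CSV',
    ENCODING 'utf8'
)
WRITE ONLY;

-- Local mode (assumed syntax): LOCATION names a directory on the
-- cluster nodes. The directory is hypothetical.
CREATE FOREIGN TABLE sales_export_local
(
    sale_id   integer,
    sale_note varchar(100)
)
SERVER gsmpp_server
OPTIONS (
    LOCATION 'file:///output_data/',
    FORMAT 'CSV',
    ENCODING 'utf8'
)
WRITE ONLY;
```

In remote mode, listing several gsfs:// addresses separated by a vertical bar is the documented way to spread the export over multiple GDS instances; verify this against the syntax reference for your version before relying on it.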
Export Process
Process | Description | Sub-task |
---|---|---|
Plan data export | Prepare data to export and plan the export path. For details, see Planning Data Export. | N/A |
Check whether the Local mode is selected | Check the export mode specified during foreign table creation to determine whether the Local mode is selected. | N/A |
Start GDS | If the Remote mode is selected, install, configure, and start GDS on data servers. For details, see Installing, Configuring, and Starting GDS. | N/A |
Create a foreign table | Create a foreign table to help GDS specify information about a data file. The foreign table stores information, such as the location, format, encoding, and inter-data delimiter of a data file. For details, see Creating a GDS Foreign Table. | N/A |
Export data | After the foreign table is created, run the INSERT statement to efficiently export data to data files. For details, see Exporting Data. | N/A |
Stop GDS | Stop GDS after data is exported. For details, see Stopping GDS. | N/A |
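The "Create a foreign table" and "Export data" steps might look like the following in remote mode. This is a minimal sketch, not the reference procedure: it assumes GDS is already listening at the hypothetical address 192.168.0.90:5000, that a source table named orders exists with the columns shown, and that the foreign table syntax matches the GaussDB(DWS)-style CREATE FOREIGN TABLE ... WRITE ONLY form.

```sql
-- Hypothetical write-only foreign table describing the target data
-- files: location, format, encoding, and field delimiter.
CREATE FOREIGN TABLE orders_export
(
    order_id    integer,
    customer_id integer,
    order_note  varchar(200)
)
SERVER gsmpp_server
OPTIONS (
    LOCATION 'gsfs://192.168.0.90:5000/',
    FORMAT 'TEXT',
    ENCODING 'utf8',
    DELIMITER '|'
)
WRITE ONLY;

-- Export: rows selected from the source table are written to the
-- data files described by the foreign table.
INSERT INTO orders_export
SELECT order_id, customer_id, order_note
FROM orders;

-- Optional cleanup once the export has finished.
DROP FOREIGN TABLE orders_export;
```

Because the export is an ordinary INSERT ... SELECT, the SELECT can filter or project columns, as long as its output matches the column list of the foreign table.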