Overview

Updated at: Jul 14, 2021 GMT+08:00
GaussDB(DWS) allows you to export ORC data to MRS using an HDFS foreign table. You specify the export mode and the export data format in the foreign table. Multiple DNs in GaussDB(DWS) export the data in parallel and store it in HDFS, which improves the overall export performance.
  • The CN only plans data export tasks and delivers them to DNs. This frees the CN to process other tasks.
  • The computing capabilities and bandwidths of all the DNs are fully leveraged to export data.
  • You can concurrently export data using multiple HDFS servers. The export path can be empty. The naming rules must be the same as those of the exported files.
  • MRS connects to GaussDB(DWS) cluster nodes. The export rate is affected by the network bandwidth.
  • Data files in the ORC format are supported.

Naming Rules of Exported Files

The rules for naming ORC data files exported from GaussDB(DWS) are as follows:

  1. Export to MRS (HDFS): Data exported from DNs is stored as segments in HDFS. Each file is named in the format mpp_<Database name>_<Schema name>_<Table name>_<Node name>_segment_<n>.orc, where n is a natural number starting from 0, for example, 0, 1, 2, 3.
  2. You are advised to export data from different clusters or databases to different paths. The maximum size of an ORC file is 128 MB, and that of a stripe is 64 MB.
  3. After the export is complete, the _SUCCESS file is generated.
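
The naming rules above can be illustrated with a short Python sketch. It is not part of GaussDB(DWS). The helper names segment_file_name and export_finished, as well as the sample database, schema, table, and node names, are hypothetical and used only for illustration. The sketch assumes the name components are joined with underscores exactly as in the format shown in rule 1.

```python
def segment_file_name(database, schema, table, node, n):
    """Build a file name of the form
    mpp_<database>_<schema>_<table>_<node>_segment_<n>.orc."""
    if n < 0:
        raise ValueError("the segment number n starts from 0")
    return f"mpp_{database}_{schema}_{table}_{node}_segment_{n}.orc"


def export_finished(file_names):
    """Return True if the _SUCCESS marker file appears in a directory listing."""
    return "_SUCCESS" in file_names


# Hypothetical values: none of these names come from a real cluster.
names = [
    segment_file_name("salesdb", "public", "orders", "dn_1", n)
    for n in range(3)
]
for name in names:
    print(name)

# A listing without the _SUCCESS file is treated as an unfinished export.
print(export_finished(names))
print(export_finished(names + ["_SUCCESS"]))
```

A downstream job could use a check like export_finished on the listing of the export path before it starts reading the segment files.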
