Importing Data from MRS to GaussDB(DWS)
Importing Data from MRS to a Data Warehouse Cluster
MapReduce Service (MRS) provides big data clusters based on the open-source Hadoop ecosystem, with storage and analysis capabilities for massive volumes of data. For details about MRS, see the MapReduce Service User Guide.
You can use Hive/Spark (in an MRS analysis cluster) to store massive volumes of service data. Hive/Spark data files are stored in HDFS. When the clusters are on the same network, you can connect a GaussDB(DWS) data warehouse cluster to an MRS cluster, read data from HDFS files, and write the data to GaussDB(DWS).
Import Process
Perform the following operations to import data from MRS to a data warehouse cluster:
- In the data warehouse cluster, create an MRS data source connection as described in Creating an MRS Data Source Connection.
  Note: Multiple MRS data sources can exist on the same network, but one GaussDB(DWS) cluster can connect to only one MRS cluster at a time.
- Create an HDFS foreign table to query data in the MRS cluster through the APIs of a foreign server.
  For details, see Data Import > Importing Data from MRS to a Cluster in the Data Warehouse Service Database Development Guide.
- (Optional) When the HDFS configuration of the MRS cluster changes, update the MRS data source configuration on GaussDB(DWS). For details, see Updating the MRS Data Source Configuration.
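The foreign table step can be sketched as follows. This is a minimal illustration, not the procedure from the Database Development Guide: it only builds the SQL text for an HDFS foreign table and for the statement that copies its rows into a local table. The server name `hdfs_server`, the table names, the column list, and the HDFS folder are hypothetical. The `format`, `encoding`, and `foldername` options and the `DISTRIBUTE BY ROUNDROBIN` clause reflect the commonly documented GaussDB(DWS) HDFS foreign table syntax and should be verified against the guide for your cluster version.

```python
def build_import_statements(foreign_table, target_table, server, folder,
                            columns, file_format="orc"):
    """Return (create_foreign_table_sql, load_sql) as strings.

    Identifiers are interpolated directly, so pass only trusted values.
    All names used by callers below are hypothetical examples.
    """
    column_defs = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    # Assumed option names: format, encoding, foldername (check the guide).
    create_foreign_table_sql = (
        f"CREATE FOREIGN TABLE {foreign_table} (\n"
        f"    {column_defs}\n"
        f") SERVER {server}\n"
        f"OPTIONS (format '{file_format}', encoding 'utf8', foldername '{folder}')\n"
        f"DISTRIBUTE BY ROUNDROBIN;"
    )
    # Reading from the foreign table pulls rows from HDFS; the INSERT
    # writes them into an existing GaussDB(DWS) table.
    load_sql = f"INSERT INTO {target_table} SELECT * FROM {foreign_table};"
    return create_foreign_table_sql, load_sql


ddl, load = build_import_statements(
    foreign_table="ft_product_info",          # hypothetical foreign table
    target_table="product_info",              # hypothetical local table
    server="hdfs_server",                     # hypothetical foreign server name
    folder="/user/hive/warehouse/demo.db/product_info_orc/",  # hypothetical path
    columns=[("product_id", "char(10)"), ("product_price", "integer")],
)
print(ddl)
print(load)
```

The two generated statements would then be run in order from a SQL client connected to the data warehouse cluster, after the target table has been created with matching columns.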