Connecting to an MRS Hive Data Source
Overview
ROMA Connect can use the MRS Hive data source for data integration tasks. Before using the MRS Hive data source, you need to connect it to ROMA Connect.
- If two data integration tasks use MRS data sources of different versions (including MRS Hive, MRS HDFS, and MRS HBase) and Kerberos authentication is enabled for the MRS data sources, the two data integration tasks cannot be executed at the same time. Otherwise, the integration tasks fail.
- A maximum of one million data records can be integrated.
Prerequisites
- Each connected data source must belong to an integration application. Ensure that an integration application is available before connecting a data source, or create one first.
- Kerberos authentication has been enabled for the MRS cluster where the MRS Hive data source is located. The execution permission has been configured for machine-machine interaction users. For details, see Preparing a Development User.
Procedure
- Log in to the ROMA Connect console. On the Instances page, click View Console of an instance.
- In the navigation pane on the left, choose Data Sources. In the upper right corner of the page, click Access Data Source.
- On the Default tab page, select MRS Hive and click Next.
- Configure the data source connection information.
Table 1 Data source connection information
Parameter
Description
Name
Enter a data source name. Following a consistent naming convention makes the data source easier to find later.
Encoding Format
Default: utf-8
Integration Application
Select the integration application to which the data source belongs.
Description
Enter a brief description of the data source.
HDFS URL
Enter the name of the MRS Hive file system to access.
- If the root directory is used, set this parameter to hdfs:///. This operation requires administrator permissions.
- If the default directory is used, set this parameter to hdfs:///hacluster. This operation requires administrator permissions.
- If a planned directory is used, set this parameter to the planned directory.
- If a user database directory is used, for example, /user/hive/testdb, the user must have the permission on the directory.
Machine-machine Username
Enter the machine-machine username for connecting to MRS Hive.
Configuration File
Click Upload to upload the MRS Hive configuration files. For details about how to obtain the files, see "Obtaining MRS Hive configuration files".
Obtaining MRS Hive configuration files
- Obtain the krb5.conf and user.keytab files.
Download the user authentication file from MRS Manager by following the procedure described in Downloading a User Authentication File, and decompress the file to obtain the krb5.conf and user.keytab files.
- Obtain the hiveclient.properties, core-site.xml, hdfs-site.xml, and hosts files.
Download the client configuration file from the MRS console by following the procedure described in Updating a Client Configuration File. After the file is decompressed:
- Obtain the hosts file from xxx_Services_ClientConfig_ConfigFiles.
- Obtain the hiveclient.properties file from xxx_Services_ClientConfig_ConfigFiles > Hive > config.
- Obtain the core-site.xml and hdfs-site.xml files from xxx_Services_ClientConfig_ConfigFiles > HDFS > config.
Check whether the value of dfs.client.failover.proxy.provider.hacluster in the hdfs-site.xml file is org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider. If not, change it to this value.
- Create a Version file.
Create a text file named Version without an extension, and add version=MRS 3.1.0 to the file.
- Package the MRS Hive configuration files.
Save the obtained files to a new directory and compress them into a .zip package. All files must be stored in the root directory of the .zip package.
- The file name can contain a maximum of 255 characters and only letters and digits.
- The file size cannot exceed 2 MB.
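The file preparation described above can be scripted. The following is a minimal Python sketch, not an official tool. It assumes that all files obtained in the previous steps have already been copied into one local directory. The function names (read_hdfs_property, build_package) are hypothetical, and the file name check reflects one reading of the naming rule (letters and digits before the .zip extension). The sketch creates the Version file, checks the failover proxy provider value in hdfs-site.xml, and writes every file to the root of the .zip package.

```python
import re
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path

# Files named in this section; all of them must sit in the root of the .zip package.
REQUIRED_FILES = [
    "krb5.conf", "user.keytab", "hiveclient.properties",
    "core-site.xml", "hdfs-site.xml", "hosts", "Version",
]
PROVIDER_KEY = "dfs.client.failover.proxy.provider.hacluster"
PROVIDER_VALUE = (
    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
)
MAX_ZIP_BYTES = 2 * 1024 * 1024  # 2 MB upload limit stated in this section


def read_hdfs_property(hdfs_site: Path, key: str):
    """Return the value of a Hadoop <property> entry, or None if it is absent."""
    root = ET.parse(hdfs_site).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == key:
            return (prop.findtext("value") or "").strip()
    return None


def build_package(src_dir: Path, zip_path: Path, mrs_version: str = "MRS 3.1.0") -> Path:
    """Validate the collected files and compress them into a flat .zip package."""
    # Assumption: the naming rule applies to the part before ".zip".
    if len(zip_path.name) > 255 or not re.fullmatch(r"[A-Za-z0-9]+", zip_path.stem):
        raise ValueError(f"Invalid package name: {zip_path.name}")

    # Create the Version file (no extension) with the content given in this section.
    (src_dir / "Version").write_text(f"version={mrs_version}")

    missing = [name for name in REQUIRED_FILES if not (src_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing files: {', '.join(missing)}")

    actual = read_hdfs_property(src_dir / "hdfs-site.xml", PROVIDER_KEY)
    if actual != PROVIDER_VALUE:
        raise ValueError(f"{PROVIDER_KEY} is {actual!r}; change it to {PROVIDER_VALUE}")

    # arcname=name keeps every file in the root directory of the archive.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in REQUIRED_FILES:
            zf.write(src_dir / name, arcname=name)

    if zip_path.stat().st_size > MAX_ZIP_BYTES:
        raise ValueError("The package exceeds the 2 MB limit")
    return zip_path
```

For example, calling build_package(Path("hiveconf"), Path("hiveconf.zip")) on a directory named hiveconf (a placeholder) would produce a package that can be uploaded in the Configuration File field. The sketch only reports a wrong provider value instead of rewriting hdfs-site.xml, so the file is still edited by hand.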
- Click Check Connectivity to check the connectivity between ROMA Connect and the data source.
- If the test result is Data source connected successfully, go to the next step.
- If the test result is Failed to connect to the data source, check the data source status and connection parameters, and click Recheck until the connection is successful.
- Click Create.