Managing Data Connections

Updated at: Mar 25, 2021 GMT+08:00

MRS data connections are used to manage external source connections used by components in a cluster. For example, if Hive metadata uses an external relational database, a data connection can be used to associate the external relational database with the Hive component.

This function is not supported in MRS 3.x.

When Hive metadata is switched between different clusters, MRS synchronizes only the permissions in the metadata database of the Hive component. The permission model on MRS is maintained on MRS Manager. Therefore, when Hive metadata is switched between clusters, the permissions of users or user groups cannot be automatically synchronized to MRS Manager of another cluster.

Creating a Data Connection

  1. Log in to the MRS management console, and choose Data Connections in the left navigation pane.
  2. Click Create Data Connection.
  3. Set parameters according to Table 1.

    Table 1 Data connection parameters

    Parameter

    Description

    Type

    Select the type of an external source connection.

    • RDS for PostgreSQL database. Clusters of MRS 1.9.2 or later that support Hive can connect to this type of database.
    • RDS for MySQL database. Clusters of MRS 1.9.x that support Hive or Ranger can connect to this type of database.

    Name

    Name of a data connection

    RDS Instance

    RDS database instance, which must be created in RDS before being referenced here. You can click View RDS Instance to view the created instance.

    NOTE:
    • To ensure network communication between the cluster and the RDS database, you are advised to create the instance in the same VPC and subnet as the cluster.
    • The inbound rule of the security group of the RDS instance must allow access to port 3306. To configure the rule, click the instance name on the RDS console to go to the instance management page. In the Connection Information area, click the security group name next to Security Group. On the page that is displayed, click the Inbound Rules tab, and click Add Rule. In the displayed dialog box, select TCP and enter port number 3306 in the Protocol & Port area. In the Source area, enter the IP addresses of all nodes where the Hive MetaStore instances reside.
    • Currently, MRS supports PostgreSQL 9.5 and PostgreSQL 9.6 on RDS.
    • Currently, MRS supports only MySQL 5.7.x and later versions on RDS.

    Database

    Name of the database to be connected to

    Username

    Username for logging in to the database to be connected

    Password

    Password for logging in to the database to be connected

    Figure 1 Parameters for creating a data connection

    If the selected data connection type is RDS for MySQL database, ensure that the database user is the root user. If the database user is not the root user, log in to the database as user root and run the following SQL statements to grant permissions to the database user. In the statements, ${db_name} and ${db_user} indicate the database name and username entered during data connection creation.

    grant all privileges on mysql.* to '${db_user}'@'%' with grant option;
    grant all privileges on ${db_name}.* to '${db_user}'@'%' with grant option;
    grant reload on *.* to '${db_user}'@'%' with grant option;
    flush privileges;

  4. Click OK.
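
The GRANT statements in step 3 differ only in the database name and the username. As a minimal sketch, they could be rendered from those two values with a small helper. The function name render_grants and the identifier check are assumptions made for illustration; they are not part of MRS or RDS.

```python
import re
from typing import List


def render_grants(db_name: str, db_user: str) -> List[str]:
    """Return the statements from step 3 with the placeholders filled in.

    Hypothetical helper for illustration only. The statements are meant
    to be run as user root on the RDS for MySQL instance.
    """
    for value in (db_name, db_user):
        # Assumption: accept only letters, digits, and underscores so the
        # values can be placed into SQL text without further escaping.
        if not re.fullmatch(r"[A-Za-z0-9_]+", value):
            raise ValueError(f"unsupported identifier: {value!r}")
    return [
        f"grant all privileges on mysql.* to '{db_user}'@'%' with grant option;",
        f"grant all privileges on {db_name}.* to '{db_user}'@'%' with grant option;",
        f"grant reload on *.* to '{db_user}'@'%' with grant option;",
        "flush privileges;",
    ]
```

For example, render_grants("hive", "mrs_user") returns the four statements for a database named hive and a user named mrs_user (both example values), ready to be pasted into a MySQL client session.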

Editing a Data Connection

  1. Log in to the MRS management console, and choose Data Connections in the left navigation pane.
  2. In the Operation column of the data connection list, click Edit in the row where the data connection to be edited is located.
  3. Modify parameters according to Table 1.

    If the selected data connection has been associated with a cluster, the configuration changes will be synchronized to the cluster.

Deleting a Data Connection

  1. Log in to the MRS management console, and choose Data Connections in the left navigation pane.
  2. In the Operation column of the data connection list, click Delete in the row where the data connection to be deleted is located.

    If the selected data connection has been associated with a cluster, the deletion does not affect the cluster.

Configuring a Data Connection During Cluster Creation

  1. Log in to the MRS management console.
  2. Click Buy Cluster. The Buy Cluster page is displayed.
  3. On the page for purchasing a cluster, click the Custom Config tab.
  4. In the software configuration area, set Use External Data Sources to Store Metadata by referring to Table 2. For other parameters, see Custom Purchase of a Cluster for configuration and cluster creation.

    Table 2 Data connection parameters

    Parameter

    Description

    Use External Data Sources to Store Metadata

    Whether to use external data sources to store metadata. Click to enable this function. If this function is enabled, metadata will not be affected if a cluster is abnormal or deleted. This function applies to scenarios where storage and computing are separated.

    This function is available for components such as Hive and Ranger in clusters of MRS 1.9.2 or later.

    Component

    This parameter is valid only when Use External Data Sources to Store Metadata is enabled. It indicates the component whose metadata is stored in the external data source.

    • Hive
    • Ranger

    Data Connection Type

    This parameter is valid only when Use External Data Sources to Store Metadata is enabled. It indicates the type of an external data source.

    • Hive supports the following data connection types:
      • RDS PostgreSQL database
      • RDS MySQL database
      • Local database
    • Ranger supports the following data connection types:
      • RDS MySQL database
      • Local database

    Data Connection Instance

    This parameter is valid only when Data Connection Type is set to RDS PostgreSQL database or RDS MySQL database. It indicates the name of the connection between the MRS cluster and the RDS database. The connection must be created before being referenced here. You can click Create Data Connection to create one. For details, see Creating a Data Connection. In addition, you must manually create a database named hive in the instance that connects to the Hive component. Otherwise, the cluster fails to be created. Ranger automatically creates the corresponding database.

    Figure 2 Configuring a data connection during cluster creation
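
The combinations listed in Table 2 can be summarized as a small lookup. The following is a minimal sketch of such a check, not console behavior; the names SUPPORTED_TYPES and check_selection are hypothetical, and the returned reminders only restate the table.

```python
# Data connection types per component, copied from Table 2.
SUPPORTED_TYPES = {
    "Hive": {"RDS PostgreSQL database", "RDS MySQL database", "Local database"},
    "Ranger": {"RDS MySQL database", "Local database"},
}


def check_selection(component: str, connection_type: str) -> str:
    """Validate a Table 2 selection and return a short reminder.

    Hypothetical helper for illustration; it only encodes the table.
    """
    if component not in SUPPORTED_TYPES:
        raise ValueError(f"unknown component: {component}")
    if connection_type not in SUPPORTED_TYPES[component]:
        raise ValueError(f"{component} does not support {connection_type}")
    if connection_type == "Local database":
        # Data Connection Instance is valid only for the RDS types.
        return "Data Connection Instance does not apply."
    if component == "Hive":
        return "Create a database named hive in the RDS instance first."
    return "Ranger creates its database automatically."
```

Calling check_selection("Ranger", "RDS PostgreSQL database") raises ValueError, because Table 2 lists only RDS MySQL database and Local database for Ranger.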

Managing Data Connections in an Existing Cluster

  1. Log in to the MRS management console. In the left navigation pane, choose Clusters > Active Clusters.
  2. Click the name of the cluster to enter its details page.
  3. On the Dashboard tab page of the cluster details page, click Manage next to Data Connection.
  4. In the Data Connection dialog box, the data connections associated with the cluster are displayed. You can click Edit or Delete to edit or delete a data connection.
  5. If no data connection is associated with the cluster, click Configure Data Connection in the Data Connection dialog box to add one.

    Only one data connection can be configured for a module type. For example, after a data connection is configured for Hive metadata, no other data connection can be configured for it. If no module type is available, the Configure Data Connection button is unavailable.

    Table 3 Parameters for configuring a data connection

    Parameter

    Description

    Component Name

    • Hive
    • Ranger

    Module Type

    If Component Name is set to Hive, Hive metadata is supported.

    If Component Name is set to Ranger, Ranger metadata is supported.

    Data Connection Type

    • Hive supports the following data connection types:
      • RDS PostgreSQL database
      • Local database
    • Ranger supports the following data connection types:
      • RDS MySQL database
      • Local database

    Instance

    This parameter is valid only when Data Connection Type is set to RDS PostgreSQL database or RDS MySQL database. Select the name of the connection between the MRS cluster and the RDS database. This instance must be created before being referenced here. You can click Create Data Connection to create a data connection. For details, see Creating a Data Connection.

    Figure 3 Parameters for configuring a data connection

  6. Click Test to test connectivity of the data connection.
  7. After the connectivity test succeeds, click OK.

    After Hive/Ranger metadata is configured, restart Hive/Ranger. Hive/Ranger will create necessary database tables in the specified database. (If tables exist, they will not be created.)
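
The rule in step 5 (one data connection per module type, and no Configure Data Connection button when every module type is taken) can be modeled in a few lines. This is a toy sketch for reasoning about the rule; the class ClusterConnections and its methods are hypothetical and do not correspond to any MRS API.

```python
class ClusterConnections:
    """Toy model: at most one data connection per module type."""

    # Module types taken from Table 3.
    MODULE_TYPES = ("Hive metadata", "Ranger metadata")

    def __init__(self):
        # Maps a module type to the name of its data connection.
        self._by_module = {}

    def available_module_types(self):
        """Module types that do not have a data connection yet."""
        return [m for m in self.MODULE_TYPES if m not in self._by_module]

    def can_configure(self):
        """Mirrors whether Configure Data Connection would be available."""
        return bool(self.available_module_types())

    def configure(self, module_type, connection_name):
        if module_type not in self.MODULE_TYPES:
            raise ValueError(f"unknown module type: {module_type}")
        if module_type in self._by_module:
            raise ValueError(f"{module_type} already has a data connection")
        self._by_module[module_type] = connection_name
```

After a connection is configured for "Hive metadata", a second configure call for the same module type raises ValueError, and can_configure() returns False once both module types have a connection.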
