Updated on 2024-05-29 GMT+08:00

Before You Start

HetuEngine supports fast joint queries across multiple data sources, as well as GUI-based data source configuration and management. You can add a data source on the HSConsole page.

Table 1 lists the data sources supported by the current version of HetuEngine.

Table 1 Data sources that HetuEngine can connect to

HetuEngine Mode   Data Source   Data Source Mode   Supported Data Source Version
---------------   -----------   ----------------   -----------------------------
Security mode     Hive          Security mode      MRS 3.x and FusionInsight 6.5.1
                  HBase                            MRS 3.x and FusionInsight 6.5.1
                  HetuEngine                       MRS 3.1.1 or later
                  GaussDB                          GaussDB 200 and GaussDB A 8.0.0 or later
                  Hudi                             MRS 3.1.2 or later
                  ClickHouse                       MRS 3.1.1 or later
                  IoTDB                            MRS 3.2.0 or later
                  MySQL                            MySQL 5.7, MySQL 8.0, and later
Normal mode       Hive          Normal mode        MRS 3.x and FusionInsight 6.5.1
                  HBase                            MRS 3.x and FusionInsight 6.5.1
                  Hudi                             MRS 3.1.2 or later
                  ClickHouse                       MRS 3.1.1 or later
                  IoTDB                            MRS 3.2.0 or later
                  GaussDB       Security mode      GaussDB 200 and GaussDB A 8.0.0 or later
                  MySQL                            MySQL 5.7, MySQL 8.0, and later

Operations such as adding, configuring, and deleting a HetuEngine data source take effect dynamically, without restarting the cluster.

Dynamic loading of data sources cannot be disabled. By default, the interval at which data source changes take effect is 60 seconds. To change the interval, set catalog.scanner-interval in both coordinator.config.properties and worker.config.properties, as described in 3.e in Creating a HetuEngine Compute Instance. For example:

catalog.scanner-interval = 120s
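
Because the parameter is set in two files, it can be worth confirming that both carry the same value. The following Python sketch is an illustration only, not part of HetuEngine: the helper names parse_properties and interval_seconds are hypothetical, only simple key=value lines are handled, and the value is assumed to be an integer followed by s, m, or h, as in the example above.

```python
import re


def parse_properties(text):
    """Parse simple key=value lines; blank lines and # comments are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Assumption: the value is an integer followed by s, m, or h.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def interval_seconds(value):
    """Convert a value such as '120s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)\s*([smh])", value)
    if match is None:
        raise ValueError(f"unrecognized interval: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


# Stand-ins for the contents of the two properties files.
coordinator = parse_properties("catalog.scanner-interval =120s")
worker = parse_properties("catalog.scanner-interval = 120s")

key = "catalog.scanner-interval"
same = interval_seconds(coordinator[key]) == interval_seconds(worker[key])
print(same, interval_seconds(coordinator[key]))
```

Comparing the values in seconds, rather than as raw strings, means that 120s and 2m are treated as the same interval.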

HetuEngine supports query pushdown: it can push a query, or part of a query, down to a connected data source. Specific predicates, aggregate functions, or other operations are then processed by the underlying database or file system. Query pushdown brings the following benefits:

  1. Improves the overall query performance.
  2. Reduces the network traffic between HetuEngine and data sources.
  3. Reduces the load of remote data sources.

Whether a query can be pushed down depends on the specific connector and on the underlying data source or storage system that the connector works with.
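
The effect of pushdown on data transfer can be illustrated with a small Python sketch. This is a conceptual model, not HetuEngine code: an in-memory SQLite database stands in for a remote data source, and the orders table and both function names are hypothetical. The first function pulls every row and filters in the "engine"; the second sends the predicate and the aggregate to the source.

```python
import sqlite3

# Stand-in for a remote data source: an in-memory SQLite table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "east" if i % 4 == 0 else "west", i * 10) for i in range(1, 101)],
)


def total_without_pushdown(conn):
    """Fetch every row, then filter and aggregate on the engine side."""
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()
    total = sum(amount for region, amount in rows if region == "east")
    return total, len(rows)


def total_with_pushdown(conn):
    """Send the predicate and the aggregate to the source."""
    rows = conn.execute(
        "SELECT SUM(amount) FROM orders WHERE region = ?", ("east",)
    ).fetchall()
    return rows[0][0], len(rows)


# Each result is (total, number of rows returned by the source).
print(total_without_pushdown(source))  # (13000, 100)
print(total_with_pushdown(source))     # (13000, 1)
```

Both approaches return the same total, but with pushdown the source returns a single row instead of all 100, which is where the reduction in network traffic comes from.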

  • The data source cluster and the HetuEngine cluster must use different domain names. Two data sources of type Hive, HBase, or Hudi that have the same domain name cannot be connected to HetuEngine at the same time.
  • Nodes in the data source cluster and nodes in the HetuEngine cluster must be able to communicate with each other on the service plane.
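
The domain name constraints can be checked before data sources are added. The Python sketch below is a hypothetical helper, not a HetuEngine tool. It reads the rule literally: any data source with the same domain name as the HetuEngine cluster is flagged, and so are any two Hive, HBase, or Hudi sources that share a domain name. Domain names are compared as exact strings, which is an assumption, and all names and domains in the usage are made up for illustration.

```python
from collections import defaultdict

# Source types that, per the text, cannot share a domain name with each other.
DOMAIN_SENSITIVE_TYPES = {"Hive", "HBase", "Hudi"}


def find_domain_conflicts(hetu_domain, sources):
    """Return a list of problems for (name, type, domain) source tuples."""
    problems = []
    by_domain = defaultdict(list)
    for name, source_type, domain in sources:
        if domain == hetu_domain:
            problems.append(f"{name}: same domain name as the HetuEngine cluster")
        if source_type in DOMAIN_SENSITIVE_TYPES:
            by_domain[domain].append(name)
    for domain, names in by_domain.items():
        if len(names) > 1:
            problems.append(f"{', '.join(names)}: share domain name {domain}")
    return problems


# Hypothetical data sources and domain names.
sources = [
    ("hive_a", "Hive", "HADOOP_A.COM"),
    ("hive_b", "Hive", "HADOOP_A.COM"),
    ("hudi_c", "Hudi", "HADOOP_C.COM"),
]
for problem in find_domain_conflicts("HADOOP.COM", sources):
    print(problem)
```

An empty result means that no conflict was found under these assumptions. The sketch does not check service-plane connectivity between the clusters.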