Using the HetuEngine Cross-Source Function
Enterprises usually store massive amounts of data from sources such as databases and data warehouses for management and information collection. However, diversified data sources, hybrid dataset structures, and scattered data storage raise the development cost of cross-source queries and prolong their execution time.
Key Technologies and Advantages
- Compute pushdown: When HetuEngine is used for cross-source collaborative analysis, it enhances compute pushdown in the following dimensions to improve access efficiency:
  - Basic pushdown: predicate, projection, subquery, and LIMIT
  - Aggregate pushdown: GROUP BY, ORDER BY, COUNT, SUM, MIN, and MAX
  - Operator pushdown: <, >, LIKE, and OR
- Multi-source heterogeneous data: Collaborative analysis supports both structured data sources such as Hive, GaussDB, and ClickHouse, and unstructured data sources such as HBase and Elasticsearch.
- Global metadata: A mapping table is provided to map unstructured schemas to structured schemas, enabling HetuEngine to access HBase using SQL statements. Global management for data source information is provided.
- Global permission control: Data source permissions can be exposed to Ranger through HetuEngine for centralized management and control.
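As an illustration of the compute pushdown dimensions listed above, the following query is a minimal sketch that combines a predicate, a projection, an aggregate, and a limit against a single data source. The catalog, schema, and table names are taken from the example later on this page; the `amount` column is hypothetical and only used for illustration. Whether each operation is actually executed at the source depends on the connector and the HetuEngine version.

```sql
-- Assumption: "amount" is a hypothetical numeric column; cardNo appears in the join example on this page.
-- Candidates for pushdown to the dws source:
--   predicate  (WHERE amount > 100)
--   projection (only cardNo and amount are read)
--   aggregate  (SUM with GROUP BY, ORDER BY)
--   limit      (LIMIT 10)
SELECT cardNo, SUM(amount) AS total_amount
FROM dws.schema02.table4
WHERE amount > 100
GROUP BY cardNo
ORDER BY total_amount DESC
LIMIT 10;
```

Prefixing the statement with `EXPLAIN` prints the query plan, which can help check which parts of the query are handled by the data source rather than by HetuEngine.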
Usage Guide of Cross-Source Function
HetuEngine supports quick joint query of multiple data sources and GUI-based data source configuration and management. You can quickly add the following data sources on the HSConsole page by referring to Before You Start:
- Configuring a Hive Data Source
- Configuring a Hudi Data Source
- Configuring a ClickHouse Data Source
- Configuring an Elasticsearch Data Source
- Configuring a GaussDB Data Source
- Configuring an HBase Data Source
- Configuring a HetuEngine Data Source
- Configuring an IoTDB Data Source
- Configuring a MySQL Data Source
Process of Using Cross-Source Collaborative Analysis
- Log in to the HetuEngine client by referring to Using the HetuEngine Client.
- Register data sources such as Hive, HBase, and GaussDB A, and then check the registered catalogs:
hetuengine> show catalogs;
 Catalog
----------
 dws
 hive
 hive_dg
 hbase
 system
 systemremote
(6 rows)
- Write SQL statements for cross-source collaborative analysis. Each JOIN has its own ON condition:
select * from hive_dg.schema1.table1 t1 join hbase.schema3.table3 t2 on t1.name = t2.item join dws.schema02.table4 t3 on t2.id = t3.cardNo;
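The cross-source join in the last step can be narrowed so that less data is moved between the sources. The following is a sketch only: it reuses the catalogs, tables, and join columns from the example above, while the filter value 'example' is a placeholder and assumes that `name` is a string column.

```sql
-- Optional: inspect what a registered catalog exposes before writing the join.
SHOW SCHEMAS FROM hive_dg;
SHOW TABLES FROM hive_dg.schema1;

-- Select only the needed columns instead of "*", filter early, and cap the result size.
-- Assumption: 'example' is a placeholder value, not taken from the original page.
SELECT t1.name, t2.id, t3.cardNo
FROM hive_dg.schema1.table1 t1
JOIN hbase.schema3.table3 t2 ON t1.name = t2.item
JOIN dws.schema02.table4 t3 ON t2.id = t3.cardNo
WHERE t1.name = 'example'
LIMIT 100;
```

Listing explicit columns and adding a filter gives the engine projection and predicate operations that it may be able to push down to the individual data sources.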