Updated on 2023-04-28 GMT+08:00

Adjusting the Advanced Feature MDE for GaussDB

Scenario

When HetuEngine writes data to GaussDB using the CREATE TABLE AS SELECT (CTAS) or INSERT INTO SELECT syntax, data can be transferred to GaussDB through both the coordinator and worker nodes of GaussDB. This reduces the pressure on the GaussDB coordinator node and improves data write performance.

The GaussDB MDE function depends on the SQL on Hadoop feature of GaussDB and uses an external server as the carrier for data transmission. Currently, HetuEngine supports only an external HDFS server, which must be in the same Hadoop cluster as the HetuEngine service.

Currently, when HetuEngine executes an SQL statement to write data to GaussDB, the GaussDB MDE function can improve performance only for the following two types of syntax:

  • CREATE TABLE AS SELECT FROM hive
  • INSERT INTO SELECT FROM hive

Currently, only GaussDB(DWS) 8.0.0 or later is supported.
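
As an illustration of the two supported statement types, the following is a minimal sketch. The data source name gaussdb_1, the schemas public and default, and the table and column names are hypothetical placeholders and do not come from this guide. The name hive matches the example Hive data source name used in Table 1.

```sql
-- Hypothetical names: gaussdb_1 is assumed to be the GaussDB data source,
-- hive the Hive data source; schemas, tables, and columns are placeholders.

-- CREATE TABLE AS SELECT: create a GaussDB table from a Hive query result.
CREATE TABLE gaussdb_1.public.orders_copy AS
SELECT order_id, customer_id, amount
FROM hive.default.orders;

-- INSERT INTO SELECT: append Hive query results to an existing GaussDB table.
INSERT INTO gaussdb_1.public.orders_copy
SELECT order_id, customer_id, amount
FROM hive.default.orders_new;
```

In both statements the target is a GaussDB table and the source is a Hive table, which is the pattern the GaussDB MDE function applies to.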

Procedure

  1. Create an HDFS server in the cluster where the GaussDB service is deployed.
  2. Configure the GaussDB data source by referring to Configuring a GaussDB Data Source. In step 3.e of that procedure, add the gaussdba.mde.enabled parameter to enable the GaussDB MDE function.

    Table 1 Custom parameters of GaussDB MDE

    | Parameter | Description | Example Value |
    |---|---|---|
    | gaussdba.mde.enabled | Whether to enable the GaussDB MDE function. | true |
    | gaussdba.hdfs-server-name | Name of the HDFS server created in the GaussDB cluster when GaussDB MDE is enabled. | hdfs_server |
    | gaussdba.hdfs-server-hive-datasource | Name of the Hive data source corresponding to the HDFS server created in the GaussDB cluster when GaussDB MDE is enabled. | hive |
    | gaussdba.hdfs-file-size-for-mde | Size of temporary files when GaussDB MDE is enabled, in MB. You are advised to set this parameter to 64 or 128. | 64 |
    | allow-drop-table | Whether to automatically delete temporary tables in GaussDB when GaussDB MDE is enabled. true: Temporary tables can be deleted. For details about how to configure automatic deletion, see the notice after this table. false: Temporary tables cannot be deleted. | true |

    If allow-drop-table is set to true, you can set mde.clean.task.enabled to true to enable automatic deletion of temporary tables in GaussDB. By default, temporary tables are queried every hour and automatically deleted 24 hours later.

    On FusionInsight Manager, click Cluster, choose Services > HetuEngine, click Configuration, and then All Configurations. On the displayed page, click HSBroker (Role), click Customization, and choose application.customized.properties. Add a custom parameter mde.clean.task.enabled and set its value to true.
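
For quick reference, the settings described in this procedure can be summarized as follows. This is only a sketch: the values are the example values from Table 1, and the key=value notation is an assumption used here for compactness, since the data source parameters are added as custom parameters during data source configuration.

```properties
# Custom parameters added to the GaussDB data source (example values from Table 1)
gaussdba.mde.enabled=true
gaussdba.hdfs-server-name=hdfs_server
gaussdba.hdfs-server-hive-datasource=hive
gaussdba.hdfs-file-size-for-mde=64
allow-drop-table=true

# Custom parameter added to application.customized.properties of HSBroker (Role)
mde.clean.task.enabled=true
```

The last parameter takes effect only when allow-drop-table is set to true, as described in the notice above.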