Adding an Oracle Data Source
This topic is available for MRS 3.5.0 and later versions only.
HetuEngine allows you to configure, access, and query Oracle data sources. This topic describes how to add an Oracle JDBC data source on the HSConsole page of the cluster.
Prerequisites
- The data source and the HetuEngine cluster nodes can communicate with each other.
- If Kerberos authentication is enabled for the cluster (the cluster is in security mode), create a HetuEngine administrator user. If Kerberos authentication is disabled for the cluster (the cluster is in normal mode), create a HetuEngine service user and assign the HDFS administrator permission to the user, that is, add the user to both the hadoop and hadoopmanager user groups. For details about how to create a user, see Creating a HetuEngine Permission Role.
- A HetuEngine compute instance has been created. For details, see Creating a HetuEngine Compute Instance.
- You have obtained the IP address, port, database instance name or PDB name, username, and password of the Oracle database.
Constraints on the Interconnection with Oracle Data Sources
- Currently, the Oracle data source is read-only by default. Oracle 12 and later versions are supported.
- The schema and table names of Oracle data sources supported by HetuEngine are case insensitive.
- The following syntaxes are not supported: CREATE SCHEMA, ALTER SCHEMA, DROP SCHEMA, INSERT INTO ... SELECT, INSERT OVERWRITE, UPDATE, ANALYZE, and VIEW-related syntax.
- Columns of the Oracle CLOB, NCLOB, BLOB, or RAW(n) types, or columns of Trino data types that are mapped to these types, do not support predicate pushdown.
- If you specify a WHERE clause, DELETE can be executed only when the predicate in the WHERE clause can be completely pushed down to Oracle, as shown in the example after this list.
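For example, the following is a minimal SQL sketch of the DELETE constraint, assuming a hypothetical catalog oracle_1 with a table test.orders in which order_id is a NUMBER column and note is a CLOB column (all names are illustrative only):
-- The predicate on the NUMBER column can be fully pushed down, so this DELETE can run:
DELETE FROM oracle_1.test.orders WHERE order_id = 1001;
-- The predicate on the CLOB column cannot be pushed down, so this DELETE is rejected:
DELETE FROM oracle_1.test.orders WHERE note = 'obsolete';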
Configuring the Oracle Data Source
Installing a cluster client
- Install the cluster client that contains the HetuEngine service in the /opt/hadoopclient directory.
Preparing the Oracle driver
- Obtain the Oracle driver file from the Oracle official website. The file name format is ojdbcxxx.jar, for example, ojdbc8.jar.
The name of the driver file to be uploaded must meet the following verification rules: the file name must start with ojdbc and end with jar, and the version string in between can be any characters. The total length of the file name cannot exceed 80 characters, and the file cannot be larger than 100 MB.
- Upload the Oracle driver file to the cluster where HetuEngine is deployed.
You can use either of the following methods:
- Upload the file to HDFS on FusionInsight Manager.
- Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HDFS.
- In the Basic Information area on the Dashboard page, click the link next to NameNode Web UI.
- Choose Utilities > Browse the file system and create the /user/hetuserver/fiber/extra_file/driver/oracle directory.
- Go to the /user/hetuserver/fiber/extra_file/driver/oracle directory and upload the Oracle driver file obtained in 2.
- Click the value in the Permission column in the row containing the driver file, select Read and Write in the User column, Read in the Group column, and Read in the Other column, and click Set.
- Run HDFS commands to upload the file.
- Log in to the node where the HDFS service client is deployed and switch to the client installation directory, for example, /opt/hadoopclient.
cd /opt/hadoopclient
- Configure environment variables.
source bigdata_env
- If the cluster is in security mode, run the following command to authenticate the user. In normal mode, user authentication is not required.
kinit HetuEngine administrator username
Enter the password as prompted.
- Create the /user/hetuserver/fiber/extra_file/driver/oracle directory, upload the Oracle driver obtained in 2, and modify the permission.
hdfs dfs -mkdir -p /user/hetuserver/fiber/extra_file/driver/oracle
hdfs dfs -put ./Oracle driver file /user/hetuserver/fiber/extra_file/driver/oracle
hdfs dfs -chmod -R 644 /user/hetuserver/fiber/extra_file/driver/oracle
Configuring the Oracle data source
- Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HetuEngine.
- In the Basic Information area in the Dashboard tab, click the link next to HSConsole Web UI to access the HSConsole page.
- Choose Data Source and click Add Data Source. Configure parameters on the Add Data Source page.
- Configure the basic information, enter the data source name, and select JDBC > Oracle as the data source type.
- Configure the parameters in the Oracle Configuration area. For details, see Table 1 Oracle configuration.
Table 1 Oracle configuration

| Parameter | Description | Example Value |
|---|---|---|
| Driver Name | Select the Oracle driver uploaded in 2. The format is ojdbcxxx.jar. | ojdbc8.jar |
| JDBC URL | JDBC URL for connecting to the Oracle database. The default port is 1521. The following formats are available: jdbc:oracle:thin:@IP address of the Oracle database:Port/Oracle PDB name, or jdbc:oracle:thin:@IP address of the Oracle database:Port:Oracle database instance name | jdbc:oracle:thin:@192.168.1.1:1521/orclpdb (PDB name) or jdbc:oracle:thin:@192.168.1.1:1521:orcl (instance name) |
| Username | Username for connecting to the Oracle data source. | - |
| Password | Password of the user connecting to the Oracle data source. | - |
- (Optional) Customize the configuration.
Click Add to add custom configuration parameters. For details, see Table 2.
Table 2 Custom parameters for the Oracle data source

| Parameter | Description | Example Value |
|---|---|---|
| case-insensitive-name-matching | Whether schema and table names of the Oracle data source are matched ignoring case (see the example below). false (default value): only schemas and tables whose names contain only lowercase letters can be queried. true: a schema or table whose name is unique when case is ignored can be queried; schemas or tables whose names collide when case is ignored cannot be queried. | false |
| case-insensitive-name-matching.cache-ttl | Timeout interval for caching the case-insensitive schema and table name mappings of the Oracle data source. The default value is 1m (1 minute). | 1m |
| dynamic-filtering.enabled | Whether dynamic filters are pushed down to JDBC queries. true (default value): enable pushdown. false: disable pushdown. | true |
| dynamic-filtering.wait-timeout | Maximum duration for which HetuEngine waits to collect dynamic filters from the build side of a join before starting a JDBC query. A larger value may produce a more selective dynamic filter, but it increases the latency of some queries. The default value is 20s. | 20s |
| unsupported-type-handling | How data types that are not supported by the connector are handled. CONVERT_TO_VARCHAR: convert unsupported types to VARCHAR and allow only read operations on them; only the types in the data type mapping table are supported. IGNORE (default value): do not expose columns of unsupported types. | IGNORE |
| join-pushdown.enabled | Whether to enable join pushdown. Join pushdown may adversely affect some query statements. true: enable join pushdown. false (default value): disable join pushdown. | false |
| join-pushdown.strategy | Strategy used to decide whether a join is pushed down. AUTOMATIC (default value): enable cost-based join pushdown. EAGER: push down joins whenever possible; even if table statistics are unavailable, EAGER pushes down joins, which may degrade query performance, so use EAGER only in test and troubleshooting scenarios. | AUTOMATIC |
| oracle.number.default-scale | Scale (number of decimal places) of the HetuEngine DECIMAL type that Oracle NUMBER columns without specified precision and scale are mapped to. By default, this parameter is not set and such columns are not supported. | 5 |
| oracle.remarks-reporting.enabled | Whether to expose metadata comments through the REMARKS column. true: expose metadata comments. false (default value): do not expose metadata comments. | false |
| oracle.synonyms.enabled | Whether to enable Oracle SYNONYM support, which HetuEngine disables by default for performance reasons. true: the synonym function is enabled. false (default value): the synonym function is disabled. | false |
| oracle.source-encoding | Character set of the remote data source environment, used to prevent garbled characters. Options: ZHS16GBK or GBK; AL32UTF8 or UTF-8 (default value). | UTF-8 |
You can click Delete to delete custom configuration parameters.
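The following is a minimal SQL sketch of the effect of case-insensitive-name-matching, assuming a hypothetical Oracle schema TEST containing a table ORDERS (names stored in uppercase in Oracle, as is the Oracle default):
-- With case-insensitive-name-matching=false (default), the uppercase Oracle names do not
-- match the lowercase names used in HetuEngine, so this query fails:
SELECT count(*) FROM oracle_1.test.orders;
-- With case-insensitive-name-matching=true, and provided no other schema or table name
-- collides with these names when case is ignored, the same query succeeds:
SELECT count(*) FROM oracle_1.test.orders;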
- Click OK.
- Log in to the node where the cluster client is deployed and run the following commands to switch to the client installation directory and authenticate the user:
cd /opt/hadoopclient
source bigdata_env
kinit User performing HetuEngine operations (If the cluster is in normal mode, skip this command.)
- Log in to the catalog of the data source.
hetu-cli --catalog Data source name --schema Database name
For example, run the following command:
hetu-cli --catalog oracle_1 --schema oracle
- Run the following command. If the database table information can be viewed or no error is reported, the connection is successful.
show tables;
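You can then run simple queries against a table to verify read access. The following is a minimal sketch, assuming a hypothetical table named orders exists in the schema:
describe orders;
select * from orders limit 10;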
Mapping Between Oracle and HetuEngine Data Types
Constraints:
- HetuEngine cannot directly read Oracle NUMBER columns whose precision and scale are not specified. Add the custom parameter oracle.number.default-scale=s to the data source configuration to map such columns to the decimal(38, s) type (see the example after this list).
- HetuEngine cannot read Number(p, s) columns where p – s > 38.
- The Oracle DATE type stores values with second precision, so it is mapped to timestamp(0) in HetuEngine.
- If the fractional seconds precision of a timestamp queried by HetuEngine is greater than 3, the value is truncated to three decimal places instead of being rounded.
- The JDBC driver and HetuEngine support different time and date formats, so an error may occur when dates and times earlier than 1582-10-15 are inserted or queried.
- VARCHAR(n) in HetuEngine is mapped to VARCHAR2(n CHAR) in Oracle; if n is greater than 4000, it is mapped to NCLOB. CHAR(n) is mapped to CHAR(n CHAR); if n is greater than 2000, it is also mapped to NCLOB.
- HetuEngine cannot write CHAR or VARCHAR values to columns whose declared lengths do not match.
- When you use CREATE TABLE AS to create an NCLOB column from a CHAR value, trailing spaces in the initial value are removed. However, if you insert a CHAR value into an existing NCLOB column, trailing spaces are retained.
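For example, the following is a minimal sketch of the oracle.number.default-scale mapping, assuming a hypothetical Oracle table test.t with a column c declared as NUMBER (no precision or scale) and the custom parameter oracle.number.default-scale=5 configured on the data source:
-- In Oracle, the column is declared without precision or scale:
--   CREATE TABLE t (c NUMBER);
-- In HetuEngine, the column is expected to be exposed as decimal(38, 5):
DESCRIBE oracle_1.test.t;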
Table 3 Mapping from Oracle data types to HetuEngine data types

| Oracle Type | HetuEngine Data Type |
|---|---|
| NUMBER(p, s) | DECIMAL(p, s) |
| NUMBER(p) | DECIMAL(p, 0) |
| FLOAT[(p)] | DOUBLE |
| BINARY_FLOAT | REAL |
| BINARY_DOUBLE | DOUBLE |
| VARCHAR2(n CHAR) | VARCHAR(n) |
| VARCHAR2(n BYTE) | VARCHAR(n) |
| NVARCHAR2(n) | VARCHAR(n) |
| CHAR(n) | CHAR(n) |
| NCHAR(n) | CHAR(n) |
| CLOB | VARCHAR |
| NCLOB | VARCHAR |
| RAW(n) | VARBINARY |
| BLOB | VARBINARY |
| DATE | TIMESTAMP(0) |
| TIMESTAMP(p) | TIMESTAMP(3) |
| TIMESTAMP(p) WITH TIME ZONE | TIMESTAMP(3) WITH TIME ZONE |
Table 4 Mapping from HetuEngine data types to Oracle data types

| HetuEngine Type | Oracle Type |
|---|---|
| TINYINT | NUMBER(3) |
| SMALLINT | NUMBER(5) |
| INTEGER | NUMBER(10) |
| BIGINT | NUMBER(19) |
| DECIMAL(p, s) | NUMBER(p, s) |
| REAL | BINARY_FLOAT |
| DOUBLE | BINARY_DOUBLE |
| VARCHAR | NCLOB |
| VARCHAR(n) | VARCHAR2(n CHAR) or NCLOB |
| CHAR(n) | CHAR(n CHAR) or NCLOB |
| VARBINARY | BLOB |
| DATE | DATE |
| TIMESTAMP | TIMESTAMP(3) |
| TIMESTAMP WITH TIME ZONE | TIMESTAMP(3) WITH TIME ZONE |
Function Enhancement
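The Oracle connector supports passing a raw query statement directly through to the Oracle data source. Assuming the catalog is named oracle, the oracle.system.query table function executes the statement in Oracle and exposes its result as a table that can be queried in HetuEngine: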
SELECT * FROM TABLE( oracle.system.query( query => 'SELECT * FROM tpch.nation'));
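Because the statement inside query is executed by Oracle itself, this function can be useful for Oracle-specific syntax that HetuEngine does not natively support; the example above returns the contents of the tpch.nation table from the Oracle database.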