Supported Data Types
In the big data field, the mainstream file format is ORC, which is supported by GaussDB(DWS). You can use Hive to export data to an ORC file and use a read-only foreign table to query and analyze the data in the ORC file. Therefore, you need to map the data types supported by the ORC file format with the data types supported by GaussDB(DWS). For details, see Table 1 Mapping between ORC read-only foreign tables and Hive data types. Similarly, GaussDB(DWS) exports data through a write-only foreign table, and stores the data in the ORC format. Using Hive to read the ORC file content also requires matched data types. Table 2 shows the matching relationship.
Type |
Type Supported by GaussDB(DWS) Foreign Tables |
Hive Table Type |
---|---|---|
1-byte integer |
TINYINT (not recommended) |
TINYINT |
SMALLINT (recommended) |
TINYINT |
|
2-byte integer |
SMALLINT |
SMALLINT |
4-byte integer |
INTEGER |
INT |
8-byte integer |
BIGINT |
BIGINT |
Single-precision floating point number |
FLOAT4 (REAL) |
FLOAT |
Double-precision floating point number |
FLOAT8(DOUBLE PRECISION) |
DOUBLE |
Scientific data type |
DECIMAL[p (,s)] (The maximum precision can reach up to 38.) |
DECIMAL (The maximum precision can reach up to 38.) (HIVE 0.11) |
Date type |
DATE |
DATE |
Time type |
TIMESTAMP |
TIMESTAMP |
Boolean type |
BOOLEAN |
BOOLEAN |
CHAR type |
CHAR(n) |
CHAR (n) |
VARCHAR type |
VARCHAR(n) |
VARCHAR (n) |
String (large text object) |
TEXT(CLOB) |
STRING |
Type |
Type Supported by GaussDB(DWS) Internal Tables (Data Source Table) |
Type Supported by GaussDB(DWS) Write-only Foreign Tables |
Hive Table Type |
---|---|---|---|
1-byte integer |
TINYINT |
TINYINT (not recommended) |
SMALLINT |
SMALLINT (recommended) |
SMALLINT |
||
2-byte integer |
SMALLINT |
SMALLINT |
SMALLINT |
4-byte integer |
INTEGER, BINARY_INTEGER |
INTEGER |
INT |
8-byte integer |
BIGINT |
BIGINT |
BIGINT |
Single-precision floating point number |
FLOAT4, REAL |
FLOAT4, REAL |
FLOAT |
Double-precision floating point number |
DOUBLE PRECISION, FLOAT8, BINARY_DOUBLE |
DOUBLE PRECISION, FLOAT8, BINARY_DOUBLE |
DOUBLE |
Scientific data type |
DECIMAL, NUMERIC |
DECIMAL[p (,s)] (The maximum precision can reach up to 38.) |
precision ≤ 38: DECIMAL; precision > 38: STRING |
Date type |
DATE |
TIMESTAMP[(p)] [WITHOUT TIME ZONE] |
TIMESTAMP |
Time type |
TIME [(p)] [WITHOUT TIME ZONE], TIME [(p)] [WITH TIME ZONE] |
TEXT |
STRING |
TIMESTAMP[(p)] [WITHOUT TIME ZONE], TIMESTAMP[(p)][WITH TIME ZONE], SMALLDATETIME |
TIMESTAMP[(p)] [WITHOUT TIME ZONE] |
TIMESTAMP |
|
INTERVAL DAY (l) TO SECOND (p), INTERVAL [FIELDS] [(p)] |
VARCHAR(n) |
VARCHAR(n) |
|
Boolean type |
BOOLEAN |
BOOLEAN |
BOOLEAN |
CHAR type |
CHAR(n), CHARACTER(n), NCHAR(n) |
CHAR(n), CHARACTER(n), NCHAR(n) |
n ≤ 255: CHAR(n); n > 255: STRING |
VARCHAR type |
VARCHAR(n), CHARACTER VARYING(n), VARCHAR2(n) |
VARCHAR(n) |
n ≤ 65535: VARCHAR(n); n > 65535: STRING |
NVARCHAR2(n) |
TEXT |
STRING |
|
String (large text object) |
TEXT, CLOB |
TEXT, CLOB |
STRING |
Monetary type |
MONEY |
NUMERIC |
BIGINT |
- The GaussDB(DWS) foreign table supports the NULL definition, and the Hive data table supports and uses the corresponding NULL definition.
- The value range of TINYINT in the Hive data table is [-128, 127], and the value range of TINYINT in GaussDB(DWS) is [0, 255]. You are advised to use the SMALLINT type when creating a GaussDB(DWS) read-only foreign table for TINYINT in the Hive table. If TINYINT is used, the read value may be different from the actual value. Similarly, when exporting data of the TINYINT type from GaussDB(DWS), you are advised to use the SMALLINT type for write-only foreign tables and Hive tables.
- The time zone definition is not supported by the date and time types of the GaussDB(DWS) foreign table, or by the Hive table.
- The DATE type in Hive contains only date. The DATE type in GaussDB(DWS) contains date and time.
- In GaussDB(DWS), ORC files can be compressed in ZLIB, SNAPPY, LZ4, or NONE mode. The FLOAT4 format itself is not accurate, and the sum operation results in different effect in various environments. You are advised to use the DECIMAL type in the high-precision scenarios.
- In Teradata-compatible mode, foreign tables do not support the DATE type.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.