Data Type Mapping Between DWS Foreign Tables and Hive/Spark, ORC, and Parquet Data Sources
When DWS foreign tables read data from Hive/Spark or from ORC/Parquet files generated by third-party components, incompatibilities may occur because the platforms define their data types differently. This section describes the data type mapping between DWS foreign tables and these data sources. By configuring the correct data types for foreign table columns, you can ensure that data is accurately identified, processed, and queried in DWS without data loss.
For details about the foreign table syntax, see "CREATE FOREIGN TABLE (SQL on OBS or Hadoop)" in Data Warehouse Service (DWS) SQL Syntax Reference.
Data Type Mapping Between DWS Foreign Tables and Hive/Spark Data Sources
Table 1 is based on the types supported by DWS foreign tables. Some types supported by Hive/Spark, such as INTERVAL and UNION, are not supported by DWS foreign tables.
| Type | DWS Foreign Table Type | Hive Type | Spark Type |
|---|---|---|---|
| 1-byte integer | | TINYINT | BYTE, TINYINT |
| 2-byte integer | SMALLINT | SMALLINT | SHORT, SMALLINT |
| 4-byte integer | INTEGER | INT, INTEGER | INT, INTEGER |
| 8-byte integer | BIGINT | BIGINT | LONG, BIGINT |
| Single-precision floating point | FLOAT, REAL | FLOAT | FLOAT, REAL |
| Double-precision floating point | FLOAT8, DOUBLE | DOUBLE | DOUBLE |
| Scientific data | DECIMAL[p (,s)] | DECIMAL | DECIMAL |
| Date type | DATE | DATE | DATE |
| Time type | TIMESTAMP WITHOUT TIME ZONE | TIMESTAMP | TIMESTAMP_NTZ |
| Time type | TIMESTAMP WITH TIME ZONE | | TIMESTAMP, TIMESTAMP_LTZ |
| Boolean | BOOLEAN | BOOLEAN | BOOLEAN |
| Char | CHAR(n) | CHAR | CHAR(length) |
| Varchar | VARCHAR(n) | VARCHAR | VARCHAR(length) |
| Varchar | NVARCHAR2 | | |
| Text | TEXT(CLOB) | STRING | STRING |
| Binary | BYTEA | BINARY | BINARY |
| Complex type | JSON | ARRAY | ARRAY |
| Complex type | JSON | MAP | MAP |
| Complex type | JSON | STRUCT | STRUCT |
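For example, based on the mappings in Table 1, a Hive table stored as ORC with INT, STRING, DECIMAL, and TIMESTAMP columns could be mapped to a DWS foreign table as in the following sketch. The server object hdfs_server, the folder path, and the column names are illustrative assumptions; see CREATE FOREIGN TABLE (SQL on OBS or Hadoop) in the SQL syntax reference for the authoritative syntax.

```sql
-- Minimal sketch, not a verbatim example from the syntax reference:
-- maps a Hive table (id INT, name STRING, amount DECIMAL(10,2), created_at TIMESTAMP)
-- stored as ORC to a DWS foreign table using the Table 1 mappings.
-- hdfs_server and the foldername path are illustrative assumptions.
CREATE FOREIGN TABLE ft_orders_hive (
    id         INTEGER,                     -- Hive INT       -> DWS INTEGER
    name       TEXT,                        -- Hive STRING    -> DWS TEXT(CLOB)
    amount     DECIMAL(10,2),               -- Hive DECIMAL   -> DWS DECIMAL
    created_at TIMESTAMP WITHOUT TIME ZONE  -- Hive TIMESTAMP -> DWS TIMESTAMP WITHOUT TIME ZONE
)
SERVER hdfs_server
OPTIONS (
    format 'orc',
    foldername '/user/hive/warehouse/orders'
)
DISTRIBUTE BY ROUNDROBIN;
```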
Data Type Mapping Between DWS Foreign Tables and ORC Data Sources
- ORC types are internal types defined in the ORC format specifications. Generally, you do not need to pay special attention to these types. They can be used as a reference for creating DWS foreign tables if the Hive or Spark types are not clearly defined.
- When DWS foreign table data is written to ORC files, data types are mapped one-to-one. In the read scenario, however, some additional mappings are supported beyond the recommended ones. Table 2 lists these mappings.
| Type | ORC Type | DWS Foreign Table Type (Read) | DWS Foreign Table Type (Write) |
|---|---|---|---|
| Integer | boolean (1 bit) | BOOLEAN | BOOLEAN |
| Integer | tinyint (8 bit) | | TINYINT (not recommended) |
| Integer | smallint (16 bit) | SMALLINT | SMALLINT |
| Integer | int (32 bit) | INTEGER | INTEGER |
| Integer | bigint (64 bit) | BIGINT | BIGINT |
| Floating point | float | FLOAT4 | FLOAT4 |
| Floating point | double | FLOAT8 | FLOAT8 |
| String | string | | |
| String | char | | CHAR(n) |
| String | varchar | VARCHAR(n) | VARCHAR(n) |
| Binary | binary | BYTEA | BYTEA |
| Scientific data | decimal | DECIMAL[p (,s)] | DECIMAL[p (,s)] |
| Time | timestamp | | |
| Time | timestamp with local time zone | | Not supported |
| Time | date | | DATE |
| Complex type | struct | | Not supported |
| Complex type | list | | Not supported |
| Complex type | map | | Not supported |
| Complex type | union | Not supported | Not supported |
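To illustrate the one-to-one write mappings in Table 2, the following sketch exports DWS data to ORC through a write-only foreign table. The server object obs_server, the folder path, and the columns are illustrative assumptions; the WRITE ONLY clause and its exact placement are described in CREATE FOREIGN TABLE (SQL on OBS or Hadoop).

```sql
-- Minimal sketch of the write scenario: each DWS column type is written
-- to the corresponding ORC type from Table 2 (one-to-one mapping).
-- obs_server and the foldername are illustrative assumptions.
CREATE FOREIGN TABLE ft_orders_export (
    id      BIGINT,       -- written as ORC bigint (64 bit)
    price   FLOAT8,       -- written as ORC double
    label   VARCHAR(64),  -- written as ORC varchar
    payload BYTEA         -- written as ORC binary
)
SERVER obs_server
OPTIONS (
    format 'orc',
    foldername '/my-bucket/dws-export/orders/'
)
WRITE ONLY;
```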
Data Type Mapping Between DWS Foreign Tables and Parquet Data Sources
Parquet types are the internal storage types defined in the Parquet format specifications. Parquet uses converted_type (deprecated) or logical_type to further describe type behavior. When DWS writes data to Parquet, only logical_type is used. When data is read from Parquet, converted_type is also accepted for compatibility.
| Parquet Type | DWS Foreign Table Type (Read) | DWS Foreign Table Type (Write) |
|---|---|---|
| BOOLEAN | BOOLEAN | BOOLEAN |
| INT32 | | |
| INT64 | | |
| INT96 | | Not supported |
| FLOAT | FLOAT4 | FLOAT4 |
| DOUBLE | FLOAT8 | FLOAT8 |
| BYTE_ARRAY | | |
| FIXED_LEN_BYTE_ARRAY | DECIMAL[p (,s)] | DECIMAL[p (,s)] |
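Similarly, Parquet files can be read through a foreign table whose column types follow the read mappings above. A minimal sketch, again assuming a pre-created server object (obs_server) and an illustrative folder path:

```sql
-- Minimal sketch of reading Parquet files through a DWS foreign table.
-- Column types follow the Parquet read mappings above.
-- obs_server and the foldername are illustrative assumptions.
CREATE FOREIGN TABLE ft_metrics_parquet (
    flag   BOOLEAN,       -- Parquet BOOLEAN
    ratio  FLOAT4,        -- Parquet FLOAT
    score  FLOAT8,        -- Parquet DOUBLE
    amount DECIMAL(18,4)  -- Parquet FIXED_LEN_BYTE_ARRAY with a DECIMAL logical type
)
SERVER obs_server
OPTIONS (
    format 'parquet',
    foldername '/my-bucket/metrics/'
)
DISTRIBUTE BY ROUNDROBIN;
```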