Updated on 2024-08-30 GMT+08:00

Auto Table Creation

CDM converts the field type of the source to the field type of the destination based on the default rule and creates a table at the destination.

Field Mapping in Automatic Table Creation

Figure 1 describes the field mapping between the DWS tables created by CDM and source tables. For example, if you use CDM to migrate the Oracle database to DWS, CDM automatically creates a table on DWS and maps the NUMBER(3,0) field of the Oracle database to the SMALLINT field of DWS.

Figure 1 Field mapping in automatic table creation

Table 1, Table 2, Table 3, and Table 4 describe the field type mapping between Hive tables and source tables when CDM automatically creates tables in Hive. For example, if you use CDM to migrate the MySQL database to Hive, CDM automatically creates a table on Hive and maps the YEAR field of the MySQL database to the DATE field of Hive.

  • For the DECIMAL type, if the length of the source data exceeds the Hive length, the precision may be lost.
  • For the DECIMAL type, the precision is greater than or equal to 1 and less than or equal to 38, and the scale is greater than or equal to 0. If the precision for the source is greater than 38 bits, the precision for Hive table creation is 38 bits. If the scale is less than 0, the scale for Hive table creation is 0. In this case, precision loss may occur after data is written.
Table 1 Field mapping in automatic table creation for MySQL-to-Hive migration

Data Type (MySQL)

Data Type (Hive)

Description

Value

tinyint(1), bit(1)

BOOLEAN

-

TINYINT

SMALLINT

-

TINYINT UNSIGNED

SMALLINT

-

SMALLINT

SMALLINT

-

SMALLINT UNSIGNED

INTEGER

-

MEDIUMINT

INTEGER

-

MEDIUMINT UNSIGNED

BIGINT

-

INT

INTEGER

-

INT UNSIGNED

BIGINT

-

BIGINT

BIGINT

-

BIGINT UNSIGNED

DECIMAL(38,0)

-

DECIMAL(P,S)

DECIMAL(P,S)

The MySQL database supports a maximum of 65 bits. For Hive, the precision is greater than or equal to 1 and less than or equal to 38, and the scale is greater than or equal to 0. If the precision for the MySQL database is greater than 38 bits, the precision for Hive table creation is 38 bits. If the scale is less than 0, the scale for Hive table creation is 0.

FLOAT

FLOAT

-

FLOAT UNSIGNED

FLOAT

-

DOUBLE

DOUBLE

-

DOUBLE UNSIGNED

DOUBLE

-

Time

DATE

DATE

-

YEAR

DATE

-

DATETIME

TIMESTAMP

-

TIMESTAMP

TIMESTAMP

-

TIME

STRING

-

Character

CHAR(N)

CHAR(N*3)

If the value of (n*3<255) is greater than 255 (CHAR_MAX_LENGTH), varchar(N*3) is created. If the value of (n*3<255) is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

VARCHAR(N)

VARCHAR(N*3)

If the value is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

BINARY

BINARY

-

VARBINARY

BINARY

-

TINYBLOB

BINARY

-

MEDIUMBLOB

BINARY

-

BLOB

BINARY

-

LONGBLOB

BINARY

-

TINYTEXT

VARCHAR(765)

-

MEDIUMTEXT

STRING

-

TEXT

STRING

-

LONGTEXT

STRING

-

Others

STRING

-

Table 2 Field mapping in automatic table creation for Oracle-to-Hive migration

Data Type (Oracle)

Data Type (Hive)

Description

Character

CHAR(N)

CHAR(N*3)

If the value of (n*3<255) is greater than 255 (CHAR_MAX_LENGTH), varchar(N*3) is created. If the value of (n*3<255) is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

VARCHAR(N)

VARCHAR(N*3)

If the value is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

VARCHAR2

VARCHAR(N*3)

If the value is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

NCHAR

CHAR(N*3)

-

NVARCHAR2

STRING

-

Value

NUMBER

DECIMAL(P,S)

For Hive, the precision is greater than or equal to 1 and less than or equal to 38, and the scale is greater than or equal to 0. If the precision for the MySQL database is greater than 38 bits, the precision for Hive table creation is 38 bits. If the scale is less than 0, the scale for Hive table creation is 0.

BINARY_FLOAT

FLOAT

-

BINARY_DOUBLE

DOUBLE

-

FLOAT

FLOAT

-

Time

DATE

TIMESTAMP

-

TIMESTAMP

TIMESTAMP

-

TIMESTAMP WITH TIME ZONE

STRING

-

TIMESTAMP WITH LOCAL TIME ZONE

STRING

-

INTERVAL

STRING

-

Binary

BLOB

BINARY

-

CLOB

STRING

-

NCLOB

STRING

-

LONG

STRING

-

LONG_RAW

BINARY

-

RAW

BINARY

-

Other

STRING

-

Table 3 Field mapping in automatic table creation for PostgreSQL/DWS-to-Hive migration

Data Type (PostgreSQL/DWS)

Data Type (Hive)

Description

Value

int2

SMALLINT

-

int4

INT

-

int8

BIGINT

-

real

FLOAT

-

float4

FLOAT

-

float8

DOUBLE

-

smallserial

SMALLINT

-

serial

INT

-

bigserial

BIGINT

-

numeric(p,s)

DECIMAL(P,S)

For Hive, the precision is greater than or equal to 1 and less than or equal to 38, and the scale is greater than or equal to 0. If the precision for the MySQL database is greater than 38 bits, the precision for Hive table creation is 38 bits. If the scale is less than 0, the scale for Hive table creation is 0.

money

DOUBLE

-

bit(1)

TINYINT

-

varbit

STRING

-

Character

varchar(n)

VARCHAR(N*3)

If the value is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

bpchar(n)

CHAR(N*3)

If the value of (n*3<255) is greater than 255 (CHAR_MAX_LENGTH), varchar(N*3) is created. If the value of (n*3<255) is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

char(n)

CHAR(N*3)

If the value of (n*3<255) is greater than 255 (CHAR_MAX_LENGTH), varchar(N*3) is created. If the value of (n*3<255) is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

bytea

BINARY

-

text

STRING

-

Time

interval

STRING

-

date

DATE

-

time

STRING

-

timetz

STRING

-

timestamp

TIMESTAMP

-

timestamptz

TIMESTAMP

-

Boolean

bool

BOOLEAN

-

Other

STRING

-

Table 4 Field mapping in automatic table creation for SQL Server-to-Hive migration

Data Type (SQL Server)

Data Type (Hive)

Description

Value

TINYINT

SMALLINT

-

SMALLINT

SMALLINT

-

INT

INT

-

BIGINT

BIGINT

-

DECIMAL

DECIMAL(P,S)

For Hive, the precision is greater than or equal to 1 and less than or equal to 38, and the scale is greater than or equal to 0. If the precision for the MySQL database is greater than 38 bits, the precision for Hive table creation is 38 bits. If the scale is less than 0, the scale for Hive table creation is 0.

NUMERIC

DECIMAL(P,S)

For Hive, the precision is greater than or equal to 1 and less than or equal to 38, and the scale is greater than or equal to 0. If the precision for the MySQL database is greater than 38 bits, the precision for Hive table creation is 38 bits. If the scale is less than 0, the scale for Hive table creation is 0.

FLOAT

DOUBLE

-

REAL

FLOAT

-

SMALLMONEY

DECIMAL(10,4)

-

MONEY

DECIMAL(19,4)

-

BIT(1)

TINYINT

-

Time

DATE

DATE

-

DATETIME

TIMESTAMP

-

DATETIME2

TIMESTAMP

-

DATETIMEOFFSET

STRING

-

TIME(p)

STRING

-

TIMESTAMP

BINARY

-

Character

CHAR(n)

CHAR(n*3)

If the value of (n*3<255) is greater than 255 (CHAR_MAX_LENGTH), varchar(N*3) is created. If the value of (n*3<255) is greater than 65535 (VARCHAR_MAX_LENGTH), a string is created.

VARCHAR(n)

VARCHAR(n*3)

If the value of (n*3<255) is greater than 255 (CHAR_MAX_LENGTH), varchar(N*3) is created. If the value of (n*3<255) is greater than 65536 (VARCHAR_MAX_LENGTH), a string is created.

NCHAR(n)

VARCHAR(n*3)

If the value of (n*3<255) is greater than 255 (CHAR_MAX_LENGTH), varchar(N*3) is created. If the value of (n*3<255) is greater than 65537 (VARCHAR_MAX_LENGTH), a string is created.

NVARCHAR(n)

VARCHAR(n*3)

If the value of (n*3<255) is greater than 255 (CHAR_MAX_LENGTH), varchar(N*3) is created. If the value of (n*3<255) is greater than 65538 (VARCHAR_MAX_LENGTH), a string is created.

Binary

BINARY

BINARY

-

VARBINARY

BINARY

-

TEXT

STRING

-

Other

STRING

-