CREATE TABLE

Function

This command creates a Hudi table from a list of fields and a set of table options.

Syntax

CREATE TABLE [ IF NOT EXISTS ] [database_name.]table_name
[ (columnTypeList) ]
USING hudi
[ COMMENT table_comment ]
[ LOCATION location_path ]
[ OPTIONS (options_list) ]

Parameter Description

**Table 1** Parameters
Parameter	Description
database_name	Name of the database, consisting of letters, digits, and underscores (_).
table_name	Name of the table, consisting of letters, digits, and underscores (_).
columnTypeList	Comma-separated list of column names, each with a data type and an optional default value. Column names consist of letters, digits, and underscores (_).
using	Specifies hudi as the data source, so that the table is defined and created as a Hudi table.
table_comment	Description of the table.
location_path	HDFS path. If this parameter is set, the Hudi table will be created as an external table.
options_list	List of Hudi table options.
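
As an illustration of how the clauses in Table 1 fit together, the following sketch qualifies the table name with a database and attaches a comment. The database name hudi_db, the table name hudi_table_c0, and the comment text are hypothetical, and the database is assumed to exist already. The options used here are described in Table 2.

```sql
-- Hypothetical names: hudi_db (assumed to exist) and hudi_table_c0.
create table if not exists hudi_db.hudi_table_c0 (
id int,
name string,
ts bigint
) using hudi
comment 'Sample Hudi table with a comment'
options (
primaryKey = 'id',
preCombineField = 'ts'
);
```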

**Table 2** Table options
Parameter	Description
primaryKey	Mandatory. Name of the primary key. Separate multiple primary key names with commas (,).
type	Type of the table. 'cow' indicates a copy-on-write (COW) table, and 'mor' indicates a merge-on-read (MOR) table. The default value is 'cow'.
preCombineField	Mandatory. Pre-combine field of the table. When multiple records share the same primary key, this field is used to decide which record to keep.
payloadClass	Payload class that defines how preCombineField is used to filter data. DefaultHoodieRecordPayload is used by default. Other preset payloads are also provided, such as OverwriteNonDefaultsWithLatestAvroPayload, OverwriteWithLatestAvroPayload, and EmptyHoodieRecordPayload.
useCache	Whether to cache table relationships in Spark. You do not need to configure this parameter. It defaults to false so that incremental view queries on COW tables work in Spark SQL.
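
The options in Table 2 can be combined in a single options list. The sketch below assumes a hypothetical table hudi_table_ck0 with a composite primary key and an explicitly chosen payload class. The fully qualified class name is an assumption based on the package layout of open-source Hudi, so verify it against the Hudi version in use.

```sql
-- Hypothetical table name: hudi_table_ck0.
-- primaryKey lists two columns, separated by a comma.
-- The payloadClass value is an assumed fully qualified name of one of the preset payloads.
create table if not exists hudi_table_ck0 (
id bigint,
name string,
price double,
ts bigint
) using hudi
options (
type = 'mor',
primaryKey = 'id,name',
preCombineField = 'ts',
payloadClass = 'org.apache.hudi.common.model.OverwriteNonDefaultsWithLatestAvroPayload'
);
```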

Examples

Create a non-partitioned table.

create table if not exists hudi_table0 (
id int,
name string,
price double
) using hudi
options (
type = 'cow',
primaryKey = 'id',
preCombineField = 'price'
);

Create a partitioned table.

create table if not exists hudi_table_p0 (
id bigint,
name string,
ts bigint,
dt string,
hh string
) using hudi
options (
type = 'cow',
primaryKey = 'id',
preCombineField = 'ts'
)
partitioned by (dt, hh);

Create a table in a specified path.

create table if not exists h3(
id bigint,
name string,
price double
) using hudi
options (
primaryKey = 'id',
preCombineField = 'price'
)
location '/path/to/hudi/h3';

Create a table and specify table attributes.

create table if not exists h4 (
id bigint,
name string,
price double
) using hudi
options (
primaryKey = 'id',
type = 'mor',
preCombineField = 'name',
hoodie.cleaner.fileversions.retained = '20',
hoodie.keep.max.commits = '20'
);

Create a table and specify column default values.

create table if not exists h5 (
id bigint,
name string,
price double default 12.34
) using hudi
options (
primaryKey = 'id',
type = 'mor',
preCombineField = 'name'
);

Precautions

Currently, Hudi does not support the CHAR, VARCHAR, TINYINT, or SMALLINT data types. Use STRING or INT instead.
Currently, only the following data types support default values: int, bigint, float, double, decimal, string, date, timestamp, boolean, and binary.
You must specify primaryKey and preCombineField for Hudi tables.
When you create a table in a specified path that already contains a Hudi table, you do not need to specify columns in the CREATE TABLE statement.
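
The last precaution can be sketched as follows, assuming a Hudi table already exists at /path/to/hudi/h3 (the path used in the earlier example) with id as its primary key and price as its pre-combine field. The table name h3_existing is hypothetical. The column list is omitted, and the options are kept in line with the precaution that primaryKey and preCombineField must be specified.

```sql
-- Hypothetical table name: h3_existing.
-- Assumes a Hudi table already exists at /path/to/hudi/h3, so no column list is given.
create table if not exists h3_existing
using hudi
options (
primaryKey = 'id',
preCombineField = 'price'
)
location '/path/to/hudi/h3';
```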