Updated on 2025-02-22 GMT+08:00

CREATE TABLE

Function

This command creates a Hudi table from a list of fields and a set of table options. With the metadata service provided by DLI, only foreign tables can be created, so you must specify the table path with LOCATION.

Syntax

CREATE TABLE [ IF NOT EXISTS ] [database_name.]table_name

[ (columnTypeList) ]

USING hudi

[ COMMENT table_comment ]

[ LOCATION location_path ]

[ OPTIONS (options_list) ]

Parameter Description

Table 1 Parameter descriptions

Parameter

Description

database_name

Database name, consisting of letters, digits, and underscores (_).

table_name

Table name, consisting of letters, digits, and underscores (_).

columnTypeList

Comma-separated list of columns with their data types. Each column name consists of letters, digits, and underscores (_).

USING

Keyword that specifies the data source. Set it to hudi to define and create a Hudi table.

table_comment

Description of the table.

location_path

OBS path. If specified, the Hudi table will be created as a foreign table.

options_list

List of Hudi table options.

Table 2 Table options

Parameter

Description

primaryKey

(Mandatory) Primary key name. Separate multiple primary key names with commas (,).

type

Type of the table. 'cow' indicates a copy-on-write (COW) table, and 'mor' indicates a merge-on-read (MOR) table. If this parameter is not specified, the default value is 'cow'.

preCombineField

(Mandatory) preCombine field of the table. When records are combined before being written, records with the same primary key are compared by this field to decide which one is kept.

payloadClass

Payload class that defines how preCombineField is used to merge and filter records. DefaultHoodieRecordPayload is used by default. Other preset payloads are also provided, such as OverwriteNonDefaultsWithLatestAvroPayload, OverwriteWithLatestAvroPayload, and EmptyHoodieRecordPayload.

useCache

Whether to cache table relationships in Spark. You do not need to configure this parameter. It defaults to false to support incremental view queries on COW tables in Spark SQL.
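
As a sketch of how the options in Table 2 combine, the following statement creates a MOR table and sets payloadClass explicitly. The table name hudi_table_mor and the OBS path are hypothetical. Passing the fully qualified class name org.apache.hudi.common.model.OverwriteWithLatestAvroPayload is an assumption based on open-source Hudi, so check which form your DLI version accepts.

```sql
-- Hypothetical table name and OBS path, used only for illustration.
create table if not exists hudi_table_mor (
  id bigint,
  name string,
  ts bigint
) using hudi
options (
  type = 'mor',              -- merge-on-read instead of the default 'cow'
  primaryKey = 'id',         -- mandatory
  preCombineField = 'ts',    -- mandatory
  -- Assumption: fully qualified class name, as in open-source Hudi.
  payloadClass = 'org.apache.hudi.common.model.OverwriteWithLatestAvroPayload'
)
location 'obs://bucket/path/to/hudi/hudi_table_mor';
```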

Example

  • Create a non-partitioned table.
    create table if not exists hudi_table0 (
    id int,
    name string,
    price double
    ) using hudi
    options (
    type = 'cow',
    primaryKey = 'id',
    preCombineField = 'price'
    );
  • Create a partitioned table.
    create table if not exists hudi_table_p0 (
    id bigint,
    name string,
    ts bigint,
    dt string,
    hh string
    ) using hudi
    options (
    type = 'cow',
    primaryKey = 'id',
    preCombineField = 'ts'
    )
    partitioned by (dt, hh);
  • Create a table in a specified path.
    create table if not exists h3(
    id bigint,
    name string,
    price double
    ) using hudi
    options (
    primaryKey = 'id',
    preCombineField = 'price'
    )
    location 'obs://bucket/path/to/hudi/h3';
  • Specify table properties when creating a table. (This operation is supported but not recommended, because properties written in the table creation statement are inconvenient to modify later.)
    create table if not exists h3(
    id bigint,
    name string,
    price double
    ) using hudi
    options (
    primaryKey = 'id',
    preCombineField = 'price',
    type = 'mor',
    hoodie.cleaner.fileversions.retained = '20',
    hoodie.keep.max.commits = '20'
    );
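
With the metadata service provided by DLI, a table path must be given through LOCATION, so a partitioned table combines the partitioned-table and specified-path examples. A minimal sketch, in which the table name hudi_table_p1 and the OBS path are placeholders:

```sql
-- Placeholder table name and OBS path; replace them with your own.
create table if not exists hudi_table_p1 (
  id bigint,
  name string,
  ts bigint,
  dt string,
  hh string
) using hudi
options (
  type = 'cow',
  primaryKey = 'id',
  preCombineField = 'ts'
)
partitioned by (dt, hh)
location 'obs://bucket/path/to/hudi/hudi_table_p1';
```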

Caveats

  • Currently, Hudi does not support the CHAR, VARCHAR, TINYINT, and SMALLINT types. You are advised to use the string or INT type instead.
  • Currently, default values can be set only for columns of the int, bigint, float, double, decimal, string, date, timestamp, boolean, and binary types.
  • You must specify primaryKey and preCombineField for Hudi tables.
  • When creating a table at a specified path, if a Hudi table already exists at the path, you do not need to specify columns during table creation, and you cannot modify the table's original properties.
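
The first and last caveats can be sketched as follows. The table names and OBS paths are placeholders, and the second statement assumes that a Hudi table already exists at the given path.

```sql
-- Use string and int in place of the unsupported CHAR/VARCHAR and TINYINT/SMALLINT types.
create table if not exists hudi_types_demo (
  id int,         -- rather than tinyint or smallint
  code string,    -- rather than char(n) or varchar(n)
  ts bigint
) using hudi
options (
  primaryKey = 'id',
  preCombineField = 'ts'
)
location 'obs://bucket/path/to/hudi/hudi_types_demo';

-- Create a table over a path that already holds a Hudi table:
-- no column list is given, and the original table properties are kept.
create table if not exists h3_existing
using hudi
location 'obs://bucket/path/to/hudi/h3';
```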

Permission Requirements

Metadata service provided by DLI

  • SQL permissions:

    database: CREATE_TABLE

    table: None

  • Fine-grained permission: dli:database:createTable

Metadata service provided by LakeFormation: Refer to the LakeFormation documentation for details on permission configuration.

System Response

The table is successfully created. To view the created Hudi table, log in to the DLI console, choose Data Management > Databases and Tables in the left navigation pane, and click the name of the database that contains the table.