Updated on 2025-08-25 GMT+08:00

CREATE TABLE

Function

CREATE TABLE is specific to DataArts Fabric SQL. It is used to create a table with a specified name on LakeFormation.

Precautions

  • Columns appearing in PARTITIONED BY must not be described among the regular columns of the table. Partition columns are always positioned after the regular columns.
  • The CLUSTERED BY syntax is applicable only to ORC and Parquet tables, and not to Hudi and Iceberg tables.

Syntax

1
2
3
4
5
6
7
8
9
CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] [schema_name.]table_name 
[ ( col_name col_type [ COMMENT col_comment ] [, ... ] ) ]
[ COMMENT table_comment ]
[ PARTITION BY ( col_name col_type COMMENT col_comment [, ... ] ) ]
[ CLUSTERED BY (col_name [,...]) INTO ( bucket_num ) BUCKETS ]
[ TABLEPROPERTIES ( option_key = option_value [, ... ] ) ]
[ STORE AS table_format ]
[ LOCATION table_path ]
AS select_stmt;

Parameter Description

  • EXTERNAL

    Specifying this keyword creates an EXTERNAL table. Omitting it defaults to creating a MANAGED table, where files, metadata, and statistical information are all managed by DataArts Fabric SQL.

  • IF NOT EXISTS

    Issues a notice rather than throwing an error if a table with the same name already exists.

  • schema_name

    The name of the database to which the table belongs. If no database name is specified, the table is created within the current_schema.

  • table_name

    The name of the table to be created, which should not exceed 63 characters in length.

  • col_name

    The name of the column to be created in the table, also limited to 63 characters. The combined total of regular and partition columns must not exceed 5,000.

  • col_type

    The data type of the column to be created, with supported ranges as follows:

    Column Type

    Can Be Declared As Partition Column

    Supports ORC Format

    Supports Parquet Format

    Supports Iceberg Format

    SmallInt

    Int

    BigInt

    Float

    ×

    Double

    ×

    Decimal

    Numeric

    Timestamp

    Date

    Varchar

    Char

    Bool

    ×

    Bytea

    ×

    Text

  • col_comment

    Comment for the column, which can be any string.

  • table_comment

    Comment for the table, which can also be any string.

  • bucket_num

    Number of buckets.

  • option_key = option_value

    Setting parameters at the table level, with supported parameter ranges as follows:

    Table 1 Value range of the option_key parameter

    option_key

    option_value

    Description

    Applicable Scope

    orc.compress

    zlib, snappy, lz4

    Compression method for ORC files.

    ORC

    parquet.compression

    zlib, snappy, lz4

    Compression method for Parquet files.

    PARQUET

    julian_adjust

    true, false

    Whether to convert to Julian date.

    PARQUET

    checkencoding

    high, low, no

    Whether to check character encoding.

    ORC, PARQUET

    column_index_access

    true, false

    Matching method between table-defined columns and file columns during reading. Defaults to true for column index matching, false for column name matching.

    ORC, PARQUET

    filesize

    Certificates ranging from 1 to 1,024 bits

    Size of external table files generated.

    ORC, PARQUET

    write.delete.mode

    copy-on-write, merge-on-read

    Mode setting for deletion: cow or mor.

    Iceberg

    write.update.mode

    copy-on-write, merge-on-read

    Mode setting for updates: cow or mor.

    write.merge.mode

    copy-on-write, merge-on-read

    Mode setting for merges: cow or mor.

    write.parquet.compression-codec

    zstd, lz4, snappy, gzip,uncompressed

    Compression method for Parquet files.

    write.merge.isolation-level

    snapshot, serializable

    Isolation level for merge commands.

    write.metadata.delete-after-commit.enabled

    true, false

    Controls whether the oldest tracked version metadata file is deleted after committing.

    write.update.isolation-level

    snapshot, serializable

    Isolation level for update commands.

    write.delete.isolation-level

    snapshot, serializable

    Isolation level for delete commands.

    write.metadata.previous-versions-max

    An integer greater than 0

    Number of older metadata files to retain.

    • table_format

      Table storage formats, supporting ORC, PARQUET, and ICEBERG.

    • table_path

      Table storage path. The value must be a valid OBS path. OBS buckets and parallel file systems are supported. If the path is an OBS bucket path, the table is read-only. Otherwise, the table can be read and written. If the table to be created is a managed table, the table storage path cannot be specified. The table storage path is specified by the system as the path with the same name as the table name in the schema path, and the path must be empty during table creation.

    • select_stmt

      Query statement. External tables do not support CTAS statements.

Examples

Create an ORC table.

1
CREATE TABLE test_table1 (a int, b bool, c text) store as orc;