CREATE TABLE
Syntax
①
CREATE TABLE [ IF NOT EXISTS ]
[catalog_name.][db_name.]table_name (
{ column_name data_type [ NOT NULL ]
[ COMMENT col_comment]
[ WITH ( property_name = expression [, ...] ) ]
| LIKE existing_table_name
[ { INCLUDING | EXCLUDING } PROPERTIES ]
}
[, ...]
)
[ COMMENT table_comment ]
[ WITH ( property_name = expression [, ...] ) ]
②
CREATE [EXTERNAL] TABLE [IF NOT EXISTS]
[catalog_name.][db_name.]table_name (
{ column_name data_type [ NOT NULL ]
[ COMMENT comment ]
[ WITH ( property_name = expression [, ...] ) ]
| LIKE existing_table_name
[ { INCLUDING | EXCLUDING } PROPERTIES ]
}
[, ...]
)
[COMMENT 'table_comment']
[PARTITIONED BY (col_name data_type [, ...])]
[CLUSTERED BY (col_name [, col_name ...]) [SORTED BY (col_name [, col_name ...])] INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION 'hdfs_path']
[TBLPROPERTIES (orc_table_property = value [, ...] ) ]
③
CREATE [EXTERNAL] TABLE [IF NOT EXISTS]
[catalog_name.][db_name.]table_name (
{ column_name data_type [ NOT NULL ]
[ COMMENT comment ]
[ WITH ( property_name = expression [, ...] ) ]
| LIKE existing_table_name
[ { INCLUDING | EXCLUDING } PROPERTIES ]
}
[, ...]
)
[PARTITIONED BY (col_name data_type [, ...])]
[SORT BY (col_name [, col_name ...])]
[COMMENT 'table_comment']
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION 'hdfs_path']
[TBLPROPERTIES (orc_table_property = value [, ...] ) ]
Remarks
- bucket_count can be set as a session property. Its default value is -1, indicating that bucket_count is not set. When a partitioned table is created, if bucket_count is -1 and buckets is not specified in the CREATE TABLE statement, the default value 16 is used.
- By default, external tables are stored in /user/hive/warehouse/{schema_name}/{table_name}, where schema_name is the schema in which the table is created and table_name is the table name.
- Setting "transactional=true" enables ACID support (atomicity, consistency, isolation, and durability) for a table. However, once a table is defined as a transactional table, it cannot be downgraded to a non-transactional table by setting "transactional=false".
The value of transactional must be a boolean literal. If a string such as 'true' or '0' is used, no type conversion is performed and one of the following exceptions is thrown:
Cannot convert ['true'] to boolean
Cannot convert ['0'] to boolean
- By default, data cannot be inserted into a managed table whose external property is set to true. To enable such writes, add the custom Hive property hive.non-managed-table-writes-enabled=true. For details, see Precautions.
- The MPPDB restricts database identifiers to a maximum of 63 characters. If an identifier exceeds this length, the excess characters are automatically truncated.
- Table creation is not supported in cross-domain scenarios.
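The transactional remark above can be illustrated with a minimal sketch. The table and column names here are illustrative only, and the sketch assumes the ORC format commonly required for ACID tables:

```sql
-- Illustrative only: creates a transactional (ACID) table.
-- transactional must be a boolean literal; a quoted value such as
-- 'true' throws: Cannot convert ['true'] to boolean
CREATE TABLE txn_orders (
  orderkey bigint,
  orderstatus varchar
)
WITH (transactional = true, format = 'ORC');
```

Once created this way, the table cannot be reverted to a non-transactional table by setting transactional = false.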
Description
CREATE TABLE creates a new, empty table with the specified columns. To create a table that contains data, use the CREATE TABLE AS statement.
- When the optional parameter IF NOT EXISTS is used, no error is reported if the table already exists.
- The WITH clause can be used to set properties for a newly created table or a single column, such as the storage location of the table and whether the table is an external table.
- The LIKE clause includes all column definitions from an existing table in the new table. Multiple LIKE clauses can be specified, allowing columns to be copied from multiple tables. If INCLUDING PROPERTIES is specified, all table properties are copied to the new table; this option can be specified for at most one table. If a property name specified in the WITH clause matches a copied property name, the value from the WITH clause takes precedence. The default behavior is EXCLUDING PROPERTIES.
- PARTITIONED BY specifies the partition columns. CLUSTERED BY specifies the bucketing columns. SORT BY and SORTED BY sort the specified bucket columns. BUCKETS specifies the number of buckets. EXTERNAL creates an external table. STORED AS specifies the file storage format. LOCATION specifies the storage path in HDFS.
To view the supported column properties, run the following statement:
SELECT * FROM system.metadata.column_properties;
To view the supported table properties, run the following statement:
SELECT * FROM system.metadata.table_properties;
The following table lists the query result when the catalog is hive.
SELECT * FROM system.metadata.table_properties where catalog_name = 'hive';
| catalog_name | property_name | default_value | type | description |
|---|---|---|---|---|
| hive | auto_purge | false | boolean | Skip trash when table or partition is deleted |
| hive | avro_schema_url | - | varchar | URI pointing to Avro schema for the table |
| hive | bucket_count | 0 | integer | Number of buckets |
| hive | bucketed_by | [] | array(varchar) | Bucketing columns |
| hive | bucketing_version | - | integer | Bucketing version |
| hive | csv_escape | - | varchar | CSV escape character |
| hive | csv_quote | - | varchar | CSV quote character |
| hive | csv_separator | - | varchar | CSV separator character |
| hive | external_location | - | varchar | File system location URI for external table |
| hive | format | ORC | varchar | Hive storage format for the table. Possible values: [ORC, PARQUET, AVRO, RCBINARY, RCTEXT, SEQUENCEFILE, JSON, TEXTFILE, TEXTFILE_MULTIDELIM, CSV] |
| hive | orc_compress | GZIP | varchar | Compression codec used. Possible values: [NONE, SNAPPY, LZ4, ZSTD, GZIP, ZLIB] |
| hive | orc_compress_size | 262144 | bigint | orc compression size |
| hive | orc_row_index_stride | 10000 | integer | no. of row index strides |
| hive | orc_stripe_size | 67108864 | bigint | orc stripe size |
| hive | orc_bloom_filter_columns | [] | array(varchar) | ORC Bloom filter index columns |
| hive | orc_bloom_filter_fpp | 0.05 | double | ORC Bloom filter false positive probability |
| hive | partitioned_by | [] | array(varchar) | Partition columns |
| hive | sorted_by | [] | array(varchar) | Bucket sorting columns |
| hive | textfile_skip_footer_line_count | - | integer | Number of footer lines |
| hive | textfile_skip_header_line_count | - | integer | Number of header lines |
| hive | transactional | false | boolean | Is transactional property enabled |
Example
- Create a new table orders and use the WITH clause to specify the storage format, storage location, and whether the table is an external table.
The auto.purge parameter specifies whether related data is permanently removed when data removal operations (such as DROP, DELETE, INSERT OVERWRITE, and TRUNCATE TABLE) are performed.
- If it is set to true, both metadata and data files are cleared.
- If it is set to false, only metadata is cleared and data files are moved to the HDFS trash bin. The default value is false. You are advised not to change it; otherwise, deleted data cannot be restored.
CREATE TABLE orders (
  orderkey bigint,
  orderstatus varchar,
  totalprice double,
  orderdate date
)
WITH (format = 'ORC', location = '/user', orc_compress = 'ZLIB', external = true, "auto.purge" = false);

-- You can run the DESC FORMATTED statement to view details about table creation.
desc formatted orders;

Describe Formatted Table
------------------------------------------------------------------------------
# col_name      data_type       comment
orderkey        bigint
orderstatus     varchar
totalprice      double
orderdate       date

# Detailed Table Information
Database:       default
Owner:          admintest
LastAccessTime: 0
Location:       hdfs://hacluster/user
Table Type:     EXTERNAL_TABLE

# Table Parameters:
EXTERNAL                TRUE
auto.purge              false
orc.compress.size       262144
orc.compression.codec   ZLIB
orc.row.index.stride    10000
orc.stripe.size         67108864
presto_query_id         20220812_084110_00050_srknk@default@HetuEngine
presto_version          1.2.0-h0.cbu.mrs.320.r1-SNAPSHOT
transient_lastDdlTime   1660293670

# Storage Information
SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:    org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:   org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed:     No
Num Buckets:    -1
Bucket Columns: []
Sort Columns:   []
Storage Desc Params:
serialization.format    1
(1 row)
- Create a table with the specified row format.
-- When creating a table, use commas (,) to separate the fields of each record in the data file.
CREATE TABLE student (
  id string,
  birthday string,
  grade int,
  memo string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- When creating a table, specify the field delimiter and line separator as '\t' and '\n', respectively.
CREATE TABLE test (
  id int,
  name string,
  tel string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
- If the orders table does not exist, create the orders table and add table comments and column comments:
CREATE TABLE IF NOT EXISTS orders (
  orderkey bigint,
  orderstatus varchar,
  totalprice double COMMENT 'Price in cents.',
  orderdate date
)
COMMENT 'A table to keep track of orders.';

insert into orders values
(202011181113, 'online', 9527, date '2020-11-11'),
(202011181114, 'online', 666, date '2020-11-11'),
(202011181115, 'online', 443, date '2020-11-11'),
(202011181115, 'offline', 2896, date '2020-11-11');
- Create the bigger_orders table using the column definition of the orders table:
CREATE TABLE bigger_orders (
  another_orderkey bigint,
  LIKE orders,
  another_orderdate date
);

SHOW CREATE TABLE bigger_orders;
 Create Table
---------------------------------------------------------------------
 CREATE TABLE hive.default.bigger_orders (
    another_orderkey bigint,
    orderkey bigint,
    orderstatus varchar,
    totalprice double,
    orderdate date,
    another_orderdate date
 )
 WITH (
    external = false,
    format = 'ORC',
    location = 'hdfs://hacluster/user/hive/warehouse/bigger_orders',
    orc_compress = 'GZIP',
    orc_compress_size = 262144,
    orc_row_index_stride = 10000,
    orc_stripe_size = 67108864
 )
(1 row)
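The LIKE clause can also carry over table properties, as described above. A minimal sketch, assuming the orders table from the earlier example and an illustrative table name:

```sql
-- INCLUDING PROPERTIES copies the source table's properties to the new
-- table; a property repeated in the WITH clause overrides the copied value.
CREATE TABLE orders_like (
  LIKE orders INCLUDING PROPERTIES
)
WITH (orc_compress = 'SNAPPY');
```

Because INCLUDING PROPERTIES can be specified for at most one table, any additional LIKE clauses in the same statement must use the default EXCLUDING PROPERTIES behavior.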
- Example of creating a table by using syntax ③:
CREATE EXTERNAL TABLE hetu_test (
  orderkey bigint,
  orderstatus varchar,
  totalprice double,
  orderdate date
)
PARTITIONED BY (ds int)
SORT BY (orderkey, orderstatus)
COMMENT 'test'
STORED AS ORC
LOCATION '/user'
TBLPROPERTIES (
  orc_compress = 'SNAPPY',
  orc_compress_size = 6710422,
  orc_bloom_filter_columns = 'orderstatus,totalprice'
);
- Example of creating a table by using syntax ②:
CREATE EXTERNAL TABLE hetu_test1 (
  orderkey bigint,
  orderstatus varchar,
  totalprice double,
  orderdate date
)
COMMENT 'test'
PARTITIONED BY (ds int)
CLUSTERED BY (orderkey, orderstatus)
SORTED BY (orderkey, orderstatus)
INTO 16 BUCKETS
STORED AS ORC
LOCATION '/user'
TBLPROPERTIES (
  orc_compress = 'SNAPPY',
  orc_compress_size = 6710422,
  orc_bloom_filter_columns = 'orderstatus,totalprice'
);
- Example of creating a table by using syntax ①:
CREATE TABLE hetu_test2 (
  orderkey bigint,
  orderstatus varchar,
  totalprice double,
  orderdate date,
  ds int
)
COMMENT 'This table is in Hetu syntax'
WITH (
  partitioned_by = ARRAY['ds'],
  bucketed_by = ARRAY['orderkey', 'orderstatus'],
  sorted_by = ARRAY['orderkey', 'orderstatus'],
  bucket_count = 16,
  orc_compress = 'SNAPPY',
  orc_compress_size = 6710422,
  orc_bloom_filter_columns = ARRAY['orderstatus', 'totalprice'],
  external = true,
  format = 'orc',
  location = '/user'
);
- Run the following statement to view the table definition:
show create table hetu_test1;
 Create Table
------------------------------------------------------------------
 CREATE TABLE hive.default.hetu_test1 (
    orderkey bigint,
    orderstatus varchar,
    totalprice double,
    orderdate date,
    ds integer
 )
 COMMENT 'test'
 WITH (
    bucket_count = 16,
    bucketed_by = ARRAY['orderkey','orderstatus'],
    bucketing_version = 1,
    external_location = 'hdfs://hacluster/user',
    format = 'ORC',
    orc_bloom_filter_columns = ARRAY['orderstatus','totalprice'],
    orc_bloom_filter_fpp = 5E-2,
    orc_compress = 'SNAPPY',
    orc_compress_size = 6710422,
    orc_row_index_stride = 10000,
    orc_stripe_size = 67108864,
    partitioned_by = ARRAY['ds'],
    sorted_by = ARRAY['orderkey','orderstatus']
 )
(1 row)
- Create a partitioned table.
--Create a schema.
CREATE SCHEMA hive.web WITH (location = 'hdfs://hacluster/user');

--Create a partitioned table.
CREATE TABLE hive.web.page_views (
  view_time timestamp,
  user_id bigint,
  page_url varchar,
  ds date,
  country varchar
)
WITH (
  format = 'ORC',
  partitioned_by = ARRAY['ds', 'country'],
  bucketed_by = ARRAY['user_id'],
  bucket_count = 50
);

--Insert an empty partition.
CALL system.create_empty_partition(
  schema_name => 'web',
  table_name => 'page_views',
  partition_columns => ARRAY['ds', 'country'],
  partition_values => ARRAY['2020-07-17', 'US']);

CALL system.create_empty_partition(
  schema_name => 'web',
  table_name => 'page_views',
  partition_columns => ARRAY['ds', 'country'],
  partition_values => ARRAY['2020-07-18', 'US']);

--View the partitions.
SELECT * FROM hive.web."page_views$partitions";
     ds     | country
------------|---------
 2020-07-18 | US
 2020-07-17 | US

--Insert data.
insert into hive.web.page_views values (timestamp '2020-07-17 23:00:15', bigint '15141', 'www.local.com', date '2020-07-17', 'US');
insert into hive.web.page_views values (timestamp '2020-07-18 23:00:15', bigint '18148', 'www.local.com', date '2020-07-18', 'US');

--Query data.
select * from hive.web.page_views;
        view_time        | user_id |   page_url    |     ds     | country
-------------------------|---------|---------------|------------|---------
 2020-07-17 23:00:15.000 |   15141 | www.local.com | 2020-07-17 | US
 2020-07-18 23:00:15.000 |   18148 | www.local.com | 2020-07-18 | US