Creating an OBS Table Using the Hive Syntax
Function
This statement is used to create an OBS table using the Hive syntax.
Syntax
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
[(col_name1 col_type1 [COMMENT col_comment1], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name2 col_type2, [COMMENT col_comment2], ...)]
[ROW FORMAT row_format]
[STORED AS file_format]
LOCATION 'obs_path'
[AS select_statement];
row_format:
: SERDE serde_cls [WITH SERDEPROPERTIES (key1=val1, key2=val2, ...)]
| DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]]
[COLLECTION ITEMS TERMINATED BY char]
[MAP KEYS TERMINATED BY char]
[LINES TERMINATED BY char]
[NULL DEFINED AS char]
|
Keyword
- EXTERNAL: Creates an OBS table.
- IF NOT EXISTS: Prevents system errors when the created table exists.
- COMMENT: Field or table description.
- PARTITIONED BY: Partition field.
- ROW FORMAT: Row data format.
- STORED AS: Specifies the format of the file to be stored. Currently, only the TEXTFILE, AVRO, ORC, SEQUENCEFILE, RCFILE, PARQUET, and CARBON format are supported.
- LOCATION: Specifies the path of OBS. This keyword is mandatory when you create OBS tables.
- TBLPROPERTIES: The TBLPROPERTIES clause allows you to add the key/value attribute to a table.
- AS: Run the CREATE TABLE AS statement to create a table.
Parameter Description
| Parameter | Description |
|---|---|
| db_name | Database name that contains letters, digits, and underscores (_). It cannot contain only digits and cannot start with an underscore (_). |
| table_name | Table name of a database that contains letters, digits, and underscores (_). It cannot contain only digits and cannot start with an underscore (_). The matching rule is ^(?!_)(?![0-9]+$)[A-Za-z0-9_$]*$. If special characters are required, use single quotation marks ('') to enclose them. |
| col_name | Field name |
| col_type | Field type |
| col_comment | Field description |
| row_format | Line data format |
| file_format | OBS table storage format. TEXTFILE, AVRO, ORC, SEQUENCEFILE, RCFILE, PARQUET, and CARBON are supported. |
| table_comment | Table description |
| select_statement | The CREATE TABLE AS statement is used to insert the SELECT query result of the source table or a data record to a new table in OBS bucket. |
Precautions
- The table and column names are case-insensitive.
- Descriptions of table names and column names support only string constants.
- During table creation, you need to specify the column name and corresponding data type. The data type is primitive type.
- If a folder and a file have the same name in the OBS directory, the file is preferred as the path when creating an OBS table.
- During partition table creation, ensure that the specified partition column in PARTITIONED BY is not a column in the table and the data type needs to be specified. The partition column supports only the open-source Hive table types including string, boolean, tinyint, smallint, short, int, bigint, long, decimal, float, double, date, and timestamp.
- Multiple partition fields can be specified. The partition fields need to be specified after the PARTITIONED BY keyword, instead of the table name. Otherwise, an error occurs.
- A maximum of 100,000 partitions can be created in a single table.
- The CREATE TABLE AS statement cannot specify table attributes or create partition tables.
Example
- To create a Parquet table named student, in which the id, name, and score fields are contained and the data types of the respective fields are INT, STRING, and FLOAT, run the following statement:
1
CREATE TABLE student (id INT, name STRING, score FLOAT) STORED AS PARQUET LOCATION 'obs://bucketName/filePath';
- To create a table named student, for which classNo is the partition field and two fields name and score are specified, run the following statement:
1
CREATE TABLE IF NOT EXISTS student(name STRING, score DOUBLE) PARTITIONED BY (classNo INT) STORED AS PARQUET LOCATION 'obs://bucketName/filePath';
classNo is a partition field and must be specified after the PARTITIONED BY keyword, that is, PARTITIONED BY (classNo INT). It cannot be specified after the table name as a table field.
- To create table t1 and insert data of table t2 into table t1 by using the Hive syntax, run the following statement:
1
CREATE TABLE t1 STORED AS parquet LOCATION 'obs://bucketName/filePath' as select * from t2;
Last Article: Creating an OBS Table Using the DataSource Syntax
Next Article: Creating a DLI Table
Did this article solve your problem?
Thank you for your score!Your feedback would help us improve the website.