NULL Is Displayed When Data Is Inserted After the Partition Column Is Added to a Hive Table
Symptom
- Run the following command to create a table:
create table test_table( col1 string, col2 string ) PARTITIONED BY(p1 string) STORED AS orc tblproperties('orc.compress'='SNAPPY');
- Modify the table structure, add partitions, and insert data.
alter table test_table add partition(p1='a'); insert into test_table partition(p1='a') select col1,col2 from temp_table;
- Modify the table structure, add columns, and insert data.
alter table test_table add columns(col3 string); insert into test_table partition(p1='a') select col1,col2,col3 from temp_table;
- Query data in the test_table table. In the returned result, the values in the col3 column are all NULL.
select * from test_table where p1='a'
- Add a table partition and insert data.
alter table test_table add partition(p1='b'); insert into test_table partition(p1='b') select col1,col2,col3 from temp_table;
- Query data in the test_table table. In the returned result, the value of col3 is not all NULL.
select * from test_table where p1='b'
Cause Analysis
RESTRICT is the default option for altering a table. In the RESTRICT mode, only the metadata is changed, while the table's partition structure created before the altering operation remains unchanged. However, new partitions created after the altering operation are changed. Therefore, when values of the old partitions are queried, they are all NULL.
Procedure
alter table test_table add columns(col3 string) cascade;
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot