Updated on 2024-11-29 GMT+08:00

Setting Default Values for Hudi Columns

This feature lets you set default values for columns when you add them to a table. When you query historical data, the system returns the default values for the new columns.
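The behavior described above can be sketched as follows. This is a minimal illustration, not a reference test: the table name h_demo and the column city are hypothetical, and the table options mirror those used in the Example section.

```sql
-- Hypothetical table h_demo, used only for illustration.
create table if not exists h_demo(
  id bigint,
  name string
) using hudi
options (
  primaryKey = 'id',
  type = 'mor',
  preCombineField = 'name'
);

-- This row is written before the new column exists,
-- so it counts as historical data for that column.
insert into h_demo(id, name) values(1, 'aaa');

-- Add a column and give it a default value.
alter table h_demo add columns(city string default 'unknown');

-- According to the description above, the historical row should return
-- the default value for city, provided its data has not been rewritten
-- (for example by an update, compaction, or clustering) in the meantime.
select id, name, city from h_demo;
```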

Constraints

  • If data is rewritten before a default value is set for a new column, the default value cannot be returned when historical data is queried, and NULL is returned instead. Some or all data is rewritten when data is imported, updated, compacted, or clustered.
  • A default value must match the column type. If it does not, it is forcibly converted to the column type, which can cause the default value to lose precision or become NULL.
  • For historical data, the default value is the one set for the column the first time. Changing the column's default value later, any number of times, does not affect query results for historical data.
  • After a default value is set, a rollback operation cannot revert it.
  • Spark SQL does not currently support viewing default column values. To view them, run the show create table command in Hive Beeline.
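For the last constraint, the statement to run is short. A minimal example, assuming the table h3 from the Example section exists and is visible to Hive; the exact layout of the output depends on the environment and is not shown here:

```sql
-- Run this in Hive Beeline, not in Spark SQL.
-- The returned table definition is where the default column values can be viewed.
show create table h3;
```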

Scope

Currently, only the int, bigint, float, double, decimal, string, date, timestamp, boolean, and binary data types are supported.

Table 1 Supported engines

Engine             DDL Operation Support    Write Operation Support    Read Operation Support
SparkSQL           Y                        Y                          Y
Spark DataSource   N                        N                          Y
Flink              N                        N                          Y
HetuEngine         N                        N                          Y
Hive               N                        N                          Y

Example

For details about the SQL syntax, see Hudi SQL Syntax Reference.

The following statements show typical usage:

  • Create a table and specify default values for columns.
    create table if not exists h3(
    id bigint,
    name string,
    price double default 12.34
    ) using hudi
    options (
    primaryKey = 'id',
    type = 'mor',
    preCombineField = 'name'
    );
  • Add columns and specify default values for the columns.
    alter table h3 add columns(col1 string default 'col1_value');
    alter table h3 add columns(col2 string default 'col2_value', col3 int default 1);
  • Change default values of columns.
    alter table h3 alter column price set default 14.56;
  • Insert data and use column default values.
    insert into h3(id, name) values(1, 'aaa');
    insert into h3(id, name, price) select 2, 'bbb', 12.5;
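To check the result of the statements above, the table can be queried directly. This is a sketch that assumes all of the preceding statements ran in order on an empty table h3; the values actually returned depend on the engine and version, so no output is shown.

```sql
-- Row 1 was inserted without price, col1, col2, or col3,
-- so those columns are expected to be filled from their default values.
-- Row 2 supplied price explicitly (12.5), so only col1, col2, and col3
-- are expected to come from defaults.
select id, name, price, col1, col2, col3
from h3
order by id;
```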