Updated on 2025-11-18 GMT+08:00

Additional Fields

Scenario

  • You can add an additional field to the destination table. The value of the additional field can be the conversion result of an existing field in the source table or the metadata related to the source and destination database tables.
  • You can overwrite the value of an existing field with that of an additional field with the same name as the existing field. (Only some links support this function.)

    Supported links: Additional fields can be added to destination tables for most entire DB migration jobs. For details about whether additional fields are supported, see Using Real-Time Migration Jobs.

Notes and Constraints

  • If the name of an additional field is the same as that of an existing field at the source, the table may fail to be automatically created, and a message is displayed indicating duplicate name. In this case, you need to manually create a table.
  • When adding additional fields to an existing table, ensure that the destination table structure contains the additional fields. Otherwise, the job may fail to be executed.
  • When using UDFs, read Setting Values for Additional Fields carefully to learn how to use UDF field values.

Configuring Additional Fields

When editing a job, you can add additional fields to a specified table after refreshing the mapping between the source table and destination table.
Figure 1 Configuring additional fields

Configure an additional field as follows:

  • Field Name: Do not use special characters in the field name. Only letters, digits, and underscores (_) are recommended.
  • Field Type: Select an appropriate field type as needed.
  • Field Value: Currently, the following field value types are supported: constant, built-in variable, field variable, and UDF.

    During automatic table creation, additional fields are added to the destination table structure by default.

Setting Values for Additional Fields

  • Constants
    • Parameter description: When the field value is a constant, set it to a fixed value.
    • Job example: As shown in the following figure, the query result of the mysql_constant_extra_col field at the migration destination is fixed at AAAA.
      Figure 2 Example constant value
  • Built-in variables
    • Parameter description: Built-in variables support the following values. All of them are related to the data source metadata.
      • Source host IP address: source.
      • Source database name: source.schema
      • Source table name: source.table
      • Destination database name: target.schema
      • Destination table name: target.table
    • Example job: See the following figure.
      Figure 3 Example built-in variable

      View the output at the destination.

      Figure 4 Destination result
  • Field variables
    • Parameter description: A field variable fills an additional field with a field value in the source table. You can select a field name from the drop-down list or manually enter a source field name.
    • Example job: See the following figure.
      Figure 5 Example value of a field variable

      The data in the source table llch96.rds_source_tbl_961 is as follows.

      Figure 6 Source table data

      After data is migrated to the destination, the input is as follows.

      Figure 7 Destination table data
  • UDFs
    • Parameter description: If you select UDF for Field Value, you can select a UDF for performing secondary processing on the source field value before it is written to the additional field. The following UDFs are supported:
      • substring(#col, pos[, len]): obtains a substring of a specified length from the source column name. The substring range is [pos, pos+len).
      • date_format(#col, time_format[, src_tz, dst_tz]): formats the source column name based on a specified time format. The time zone can be converted using src_tz and dst_tz.
      • now([tz]): obtains the current time in a specified time zone.
      • if(cond_exp, str1, str2): returns str1 if the condition expression cond_exp is met and returns str2 otherwise.
      • concat(#col[, #str, ...]): concatenates multiple parameters, including source columns and strings.
      • from_unixtime(#col[, time_format]): formats a Unix timestamp based on a specified time format.
      • unix_timestamp(#col[, precision, time_format]): converts a time into a Unix timestamp of a specified time format and precision.

        Nested UDFs are supported. Example: substring(date_format(#col, yyyyMMdd), 0, 4)

      The usage and examples of UDFs are as follows:
      • substring

        substring(#col, pos[, len]) description:

        • Truncates the value of the source col field from a specified character specified by pos (including that character) to the end by default.
        • The len parameter is optional. It indicates the length of the truncated characters.
        • pos is 1 for the first character, 2 for the second character, and so on.
        • If pos is 0, it also indicates the first character.
          Set col to abcdefg, a string of 7 characters.
          substring(#col, 1) = "abcdefg"
          substring(#col, 1, 3) = "abc"
          substring(#col, 0, 3) = "abc"
          substring(#col, 3, 4) = "cdef"
          substring(#col, 1, 100) = "abcdefg"
      • date_format
        Description of date_format(#col, time_format[, src_tz, dst_tz]):
        • Formats the time format field col at the source based on the new format time_format. You can set the time zone of the source col field to src_tz and that of the destination to dst_tz.
        • The value of col must be in the time format, for example, '2025-12-11 00:49:00.123456'. Otherwise, a job exception may occur during the conversion.
        • The time_format parameter must be in a valid time format and can contain yyyy, MM, dd, HH, mm, ss and SS, for example, yyyy-MM-dd HH:mm:ss.SSS. Do not use quotation marks to enclose time_format when spelling the date_format expression. Otherwise, a job exception will occur. For example, date_format(#col, yyyy) is valid, but date_format(#col, 'yyyy') is invalid.
        • src_tz indicates the time zone of the input time field col. The expression template for configuring the input time zone is date_format(#col, time_format, src_tz, dst_tz). For example, in the date_format(#col, yyyy-MM-dd, -8, +8) expression, -8 indicates the source time zone.
        • dst_tz indicates the time zone expected from the UDF. The expression template for configuring the output time zone is date_format(#col, time_format, dst_tz) or date_format(#col, time_format, src_tz, dst_tz). For example, date_format(#col, yyyy-MM-dd, +8) and date_format(#col, yyyy-MM-dd, -8, +8) both indicate that the output time zone is +8.
        • If the col type of the source field is timestamp, the field value contains the input time zone information by default. In this case, the configured input time zone does not take effect.
        • date_format(#col, time_format): If the source col is not a timestamp, the source and destination are in the same time zone, and time zone conversion is not performed. If the source col is a timestamp, the result is output based on the system time zone by default.
        • date_format(#col, time_format): If the source col is not a timestamp, the default time zone at the source is UTC.
          Table 1 Time zone rules

          Expression

          Source Field Type

          Source Time Zone

          Destination Time Zone

          date_format(#col, time_format)

          String/datetime

          Time zone of the current region

          Time zone of the current region

          Timestamp

          Source time zone

          Time zone of the current region

          date_format(#col, time_format, dst_tz)

          String/datetime

          UTC

          dst_tz

          Timestamp

          Source time zone

          dst_tz

          date_format(#col, time_format, src_tz, dst_tz)

          String/datetime

          src_tz

          dst_tz

          Timestamp

          Source time zone

          dst_tz

          Example of a datetime field:

          Set col to 2023-08-01 09:00:00.000000. The field type is datetime or string.
          #If the time zone is not configured, it is the same as that of the destination by default.
          date_format(#col, yyyy-MM-dd HH:mm:ss) = "2023-08-01 09:00:00"   
          #If the source field type is not timestamp, the source time zone is UTC by default. In this example, UTC is configured for the destination, and the time zone of the source is the same as that of the destination.
          date_format(#col, yyyy-MM-dd HH:mm:ss, UTC) = "2023-08-01 09:00:00"   
          # If the source field type is not timestamp, the default time zone at the source is UTC. In this example, the time zone of the destination is +8.
          date_format(#col, yyyy-MM-dd HH:mm:ss, +8) = "2023-08-01 17:00:00"   
          date_format(#col, yyyy-MM-dd HH:mm:ss, GMT+8) = "2023-08-01 17:00:00"
          date_format(#col, yyyy-MM-dd HH:mm:ss, UTC+8) = "2023-08-01 17:00:00" 
          #If the source field type is not timestamp, set the time zone at the source to +1 and that at the destination to +8.
          date_format(#col, yyyy-MM-dd HH:mm:ss, +1, +8) = "2023-08-01 16:00:00"  
          #If the source field type is not timestamp, set the time zone at the source to -8 and that at the destination to +8.
          date_format(#col, yyyy-MM-dd HH:mm:ss, -8, +8) = "2023-08-02 01:00:00"  

          Example of a timestamp field:

          Assume that col is set to 2023-07-05 00:00:15.123456, the field type is timestamp, the source time zone is +8, and the time zone of the current Huawei Cloud region is +8 (for example, Asia/Singapore).
          #The source field type is timestamp, the source time zone is +8, and the destination time zone is the time zone of the Huawei Cloud region, that is, +8.
          date_format(#col, yyyy-MM-dd HH:mm:ss) = "2023-07-05 00:00:15"   
          #The source field type is timestamp, the source time zone is +8, and the destination time zone is +16.
          date_format(#col, yyyy-MM-dd HH:mm:ss, +16) = "2023-07-05 08:00:15"  
          #The source field type is timestamp. The configured src_tz does not take effect. The actual time zone of the source is +8, and the time zone of the destination is +8.
          date_format(#col, yyyy-MM-dd HH:mm:ss, -8, +8) = "2023-07-05 00:00:15"  
          #The source field type is timestamp. The configured src_tz does not take effect. The actual time zone of the source is +8, and the time zone of the destination is +16.
          date_format(#col, yyyy-MM-dd HH:mm:ss, -8, +16) = "2023-07-05 08:00:15" 
      • now

        now([tz]) description:

        • System time when data is read by a real-time job
        • tz is optional and specifies the time zone of the system time. The default value is the time zone of the region where Huawei Cloud services are located.

        Example:

        now() = "2025-07-30 14:46:48.969"
        now(+16) = "2025-07-30 22:46:48.969"
      • if

        if(cond_exp, str1, str2) description:

        • The output is determined based on the configured logic judgment expression cond_exp. If the judgment result of cond_exp is true, the output is str1. If the judgment result of cond_exp is false, the output is str2.
        • Currently, only the "equal to" and "not equal to" logic judgment of built-in variables is supported. The constant on one side of the logic judgment expression must be enclosed in single quotation marks, for example, if(#{mgr.source.table} = 'src_tb', 'a', 'b').

        Example:

        if(#{mgr.source.table} = 'src_tb', 'a', 'b') = 'a' 
        if(#{source.host} = '10.xxx.x.xx', 'abcd', 'qaz') = 'qaz'
      • concat

        concat(#col[, #str, ...]) description:

        • It concatenates field values, including source field values and string constants, for example, concat(#col, '&', #col2, '_', aaa).
        • If any input parameter is null, the output of the entire expression is null.

        Example:

        # col is the source field and its value is col_data. #{mgr.source.table} is the reference of the built-in variable source table name and its value is src_tb.
        concat(#col, '_', #{mgr.source.table}) = "col_data_src_tb"  
        # col is the source field and its value is col_data. col2 is the source field and its value is col2_data. The third input parameter is a string constant aaa.
        concat(#col, '&', #col2, '_', aaa) = "col_data&col2_data_aaa"  
        # col_null is a null field. The output of the entire expression is null.
        concat(#col, #col2, #col_null) = null  
      • from_unixtime

        from_unixtime(#col[, time_format]) description:

        • It converts a long integer timestamp field into a time value. The col field is the long integer Unix timestamp at the source , and the time_format parameter is the expected output time format.
        • It cannot return the time zone of a specified time. By default, the output is based on the time zone of the current Huawei Cloud region.
        • If time_format is not specified, the output is in yyyy-MM-dd HH:mm:ss.SSS format by default.

        Example:

        # col = 17042536161231
        from_unixtime(#col, yyyy-MM-dd HH:mm) = "2024-01-03 11:46"
        from_unixtime(#col, yyyy-MM-dd HH:mm:ss:SSSSSS) = "2024-01-03 11:46:56:123100" 
        # col= 1704253616
        from_unixtime(#col) = "2024-01-03 11:46:56.000"
        # col= 942379893
        from_unixtime(#col, yyyy-MM-dd) = "1999-11-12" 
        # col= 17042536161
        from_unixtime(#col, yyyy-MM-dd) = "2024-01-03"
      • unix_timestamp

        unix_timestamp(#col[, precision, time_format]) description:

        • It converts a time into a long integer Unix timestamp. col is the source field in the time format. precision specifies the precision of the output Unix timestamp below the second. The default precision is 3. time_format specifies the time format of the input data.
        • time_format specifies the time format of the input data. time_format must match the format of the col field. Otherwise, precision loss or inaccurate conversion may occur. If time_format is not configured, the input data is parsed in yyyy-MM-dd HH:mm:ss[.fffffffff] format.
        • unix_timestamp cannot specify the time zone information of the col field. If the col field is a timestamp, the input value is parsed based on the source time zone. If the col field is a datetime or string, the input value is parsed based on the time zone of the region where Huawei Cloud services are located by default.

        Example:

        # col = "2024-01-03 11:46:56.123"
        unix_timestamp(#col) = "1704253616123" 
        # col = "2024-01-03 11:46:56"
        unix_timestamp(#col) = "1704253616000"
        # col = "2024-01-03 11:46:56.123456"
        unix_timestamp(#col, 9) = "1704253616123456000"
        # col = "2024-01-03 11:46:56.123456789"
        unix_timestamp(#col, 9) = "1704253616123456789"
        # col = "2024-01-03 11:46"
        unix_timestamp(#dateStr1, 0, yyyy-MM-dd HH:mm) = "1704253560"
      • Nested expressions

        You can use nested UDFs.

        Example:

        # col = "2025-11-11 00:14:23"
        substring(date_format(#col, 'yyyyMMdd'), 0, 4) = "2025"
    • Job example: The following figure shows an example.
    Figure 8 Example of the structure for creating a source table

    View the source data.

    Figure 9 Source data

    View the job configuration.

    Figure 10 Job configuration

    View the synchronization result.

    Figure 11 Synchronization result

Overwriting the Value of an Existing Field with That of an Additional Field with the Same Name

This function is supported by some links. In this scenario, you need to manually create tables. Tables cannot be automatically created.

Example job:

  1. View the structure of the source table.
    Figure 12 Structure of the source table
  2. View the source data.
    Figure 13 Source data
  3. View the result of manual table creation at the destination.
    Figure 14 Manual table creation result at the destination
  4. Configure job parameters.
    Add sink.extra.column.first = true to the advanced settings of the destination.
    Figure 15 Adding custom parameters
  5. Check the change result.
    Figure 16 Refresh

    The value of the age field has been overwritten by that of the additional field.

    Figure 17 Viewing the change result