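What should I do if an input field configured in the expression of a UDF-type additional field does not exist in a real-time integration job?
Problem Description
In a real-time integration job, an additional field is configured with its field value type set to UDF. After the job is submitted, it fails at run time with the following error: data transform error: required col [xxx] from xxx is not existed.
Sample error message: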
2025-07-28 14:19:16,528 ERROR com.huawei.clouds.dataarts.migration.connector.hudi.sink.transform.RowDataToHoodieFunction [] - serialize error, table identifier: llch96.rds_source_upper_tbl_0726_2, raw data: {"before":{},"after":{"ID":5,"COL_TEXT_NEW":"2023-07-05 22:11:01","COL_INT_NEW":3,"COL_FLOAT":1.3265,"COL_DOUBLE":null},"source":{"name":"mysql_binlog_source","db":"TCDB","table":"RDS_SOURCE_UPPER_TBL","pkNames":["ID"],"sqlType":{"ID":4,"COL_TEXT_NEW":12,"COL_INT_NEW":4,"COL_FLOAT":6,"COL_DOUBLE":8},"mysqlType":{"ID":"INT(11)","COL_TEXT_NEW":"VARCHAR(50)","COL_INT_NEW":"INT(11)","COL_FLOAT":"FLOAT(-1)","COL_DOUBLE":"DOUBLE(-1)"}},"tableChange":{"type":"CREATE","previousId":null,"id":{},"table":{}},"tsMs":1753683511000,"op":"c","dataSourceName":"mysql_binlog_source","dbType":"mysql"}
java.lang.IllegalArgumentException: migration.10000430: Transforming extra columns failed, please check your extra column configs. Error stack: [TCDB.RDS_SOURCE_UPPER_TBL]->[llch96.rds_source_upper_tbl_0726_2]:
transform expression failed, expression is : [from_unixtime(#aaa, yyyyMMdd)], cause: data transform error: required col [aaa] from TCDB.RDS_SOURCE_UPPER_TBL is not existed
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_362]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_362]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_362]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_362]
at org.apache.inlong.common.exception.DMExceptions.create(DMExceptions.java:81) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at org.apache.inlong.common.exception.DMExceptions.create(DMExceptions.java:61) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at org.apache.inlong.sort.base.util.transformer.UDFTransformer.transformInternal(UDFTransformer.java:181) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at org.apache.inlong.sort.base.util.transformer.UDFTransformer.transform(UDFTransformer.java:92) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at org.apache.inlong.sort.base.util.transformer.ExtraColTransformer.getUDFTransformData(ExtraColTransformer.java:53) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at org.apache.inlong.sort.base.util.transformer.ExtraColTransformer.transform(ExtraColTransformer.java:36) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at com.huawei.clouds.dataarts.migration.connector.hudi.sink.schema.RowDataExtendTool.setExtraColumn(RowDataExtendTool.java:526) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at com.huawei.clouds.dataarts.migration.connector.hudi.sink.schema.RowDataExtendTool.generateOneRow(RowDataExtendTool.java:438) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at com.huawei.clouds.dataarts.migration.connector.hudi.sink.schema.RowDataExtendTool.generateGenericRowData(RowDataExtendTool.java:351) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at com.huawei.clouds.dataarts.migration.connector.hudi.sink.transform.RowDataToHoodieFunction.toHoodieRecord(RowDataToHoodieFunction.java:258) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at com.huawei.clouds.dataarts.migration.connector.hudi.sink.transform.RowDataToHoodieFunction.processElement(RowDataToHoodieFunction.java:233) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-35ce8ffabe04d9976ded23a41ab44d38:?]
at com.huawei.clouds.dataarts.migration.connector.hudi.sink.transform.RowDataToHoodieFunction.processElement(RowDataToHoodieFunction.java:86) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-
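Cause Analysis
This error occurs when a field name referenced in the UDF expression does not actually exist in the Hudi table. In the sample above, the field aaa is found in neither the Hudi table nor the source table.
In addition, earlier versions of the Migration engine do not support referencing the audit fields cdc_last_update_date, logical_is_deleted, and _hoodie_event_time in UDF expressions.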
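Solution
- Change the field name referenced in the UDF expression to a valid field, that is, one that actually exists.
- If the failure is caused by a reference to a Hudi audit field, contact O&M personnel for help with upgrading the engine package.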
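Before resubmitting the job, it can help to check the expression against the known column list offline. The following is a minimal sketch, not part of the Migration engine. It assumes that fields are referenced as #name, as in the sample expression from the log, and that names are compared case-sensitively. The helper names find_missing_fields and classify_missing are hypothetical and exist only for this illustration.

```python
import re

# Audit fields named in this article; earlier engine versions
# cannot reference them in UDF expressions.
AUDIT_FIELDS = {"cdc_last_update_date", "logical_is_deleted", "_hoodie_event_time"}

# Assumption: a field reference is written as #name, as in the
# sample expression from_unixtime(#aaa, yyyyMMdd).
FIELD_REF = re.compile(r"#(\w+)")


def find_missing_fields(expression, known_columns):
    """Return referenced names absent from known_columns, in order of appearance."""
    known = set(known_columns)
    missing = []
    for name in FIELD_REF.findall(expression):
        # Assumption: exact, case-sensitive match against the column list.
        if name not in known and name not in missing:
            missing.append(name)
    return missing


def classify_missing(missing):
    """Split missing names into (audit fields, ordinary fields)."""
    audit = [n for n in missing if n in AUDIT_FIELDS]
    ordinary = [n for n in missing if n not in AUDIT_FIELDS]
    return audit, ordinary


# Column names taken from the "after" record in the sample log.
columns = ["ID", "COL_TEXT_NEW", "COL_INT_NEW", "COL_FLOAT", "COL_DOUBLE"]
missing = find_missing_fields("from_unixtime(#aaa, yyyyMMdd)", columns)
audit, ordinary = classify_missing(missing)
print("missing audit fields:", audit)
print("missing ordinary fields:", ordinary)
```

A name reported as an ordinary field points to the first solution (correct the field name). A name reported as an audit field points to the second (engine upgrade).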