Updated: 2025-11-03 GMT+08:00

What Should I Do If a Job with Hudi as the Destination Fails to Start and the Error Message Contains "Custom defined extra column is missing in schema which is predefined in config"?

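Symptom

After additional fields are configured for a job, the job fails to start. A configured additional field name cannot be found in the Hudi table, and the error message contains the keyword "Custom defined extra column is missing in schema which is predefined in config".

Error details: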

java.lang.IllegalArgumentException: Custom defined extra column is missing in schema which is predefined in config.
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.schema.SchemaParser.parseExtraField(SchemaParser.java:120) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-5d1a8c677d8652bd7d54ecbb6d3ae814:?]
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.schema.SchemaParser.<init>(SchemaParser.java:70) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-5d1a8c677d8652bd7d54ecbb6d3ae814:?]
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.schema.RowDataExtendTool.<init>(RowDataExtendTool.java:130) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-5d1a8c677d8652bd7d54ecbb6d3ae814:?]
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.transform.RowDataToHoodieFunction.open(RowDataToHoodieFunction.java:177) ~[blob_p-b7cce999877870001e778eb40e09ff21b660374d-5d1a8c677d8652bd7d54ecbb6d3ae814:?]
    at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34) ~[flink-core-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:101) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.streaming.api.operators.AbstractProcessOperator.open(AbstractProcessOperator.java:68) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:713) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:688) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) [flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:751) [flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:573) [flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_362]
2025-07-28 12:00:28,736 WARN  org.apache.flink.runtime.taskmanager.Task                    [] - Call stack:
    at java.lang.Thread.getStackTrace(Thread.java:1564)
    at org.apache.flink.runtime.taskmanager.Task.transitionState(Task.java:1139)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:801)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:573)
    at java.lang.Thread.run(Thread.java:750)

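Possible Causes

Every additional field configured for the job must be an existing field in the Hudi table. This is verified when the job starts: if an additional field does not exist in the Hudi table, the verification fails and the job terminates abnormally.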
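
The startup check can be pictured roughly as follows. This is a minimal Python sketch for illustration only, not the connector's actual code (the real check is performed by the Java class `SchemaParser` that appears in the stack trace). The function name `validate_additional_fields`, the sample column names, and the exact, case-sensitive name comparison are all assumptions.

```python
def validate_additional_fields(table_columns, additional_fields):
    """Raise an error if any configured additional field is absent from the table.

    Hypothetical helper: names are compared exactly (case-sensitive),
    which is an assumption about how the real check behaves.
    """
    existing = set(table_columns)
    missing = [name for name in additional_fields if name not in existing]
    if missing:
        raise ValueError(
            "Custom defined extra column is missing in schema "
            f"which is predefined in config: {missing}"
        )


# Hypothetical Hudi table columns and job configuration.
table_columns = ["id", "name", "update_time"]

# Every configured field exists in the table, so the check passes silently.
validate_additional_fields(table_columns, ["update_time"])
```

Calling the function with a name that is not a table column, for example `["src_db"]`, raises the error, which corresponds to the failure described above.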

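Solution

Check whether each additional field name in the job configuration is an existing field in the Hudi table. If it is not, either correct the additional field name or re-create the Hudi table.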
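
When many additional fields are configured, a small script can help spot which names do not match. The sketch below assumes you have already copied the Hudi table's column names and the job's additional field names into two lists by hand; the function name `suggest_fixes` and all sample names are hypothetical. It uses `difflib.get_close_matches` from the Python standard library to propose similarly spelled columns, which is useful when the mismatch is a typo.

```python
import difflib


def suggest_fixes(table_columns, additional_fields):
    """Map each additional field missing from the table to similar column names."""
    existing = set(table_columns)
    return {
        name: difflib.get_close_matches(name, table_columns, n=3)
        for name in additional_fields
        if name not in existing
    }


# Hypothetical column list of the Hudi table and job configuration.
table_columns = ["id", "name", "update_time", "src_database"]
additional_fields = ["update_time", "src_databse", "op_type"]

for field, candidates in suggest_fixes(table_columns, additional_fields).items():
    if candidates:
        print(f"{field}: not in table, did you mean {candidates}?")
    else:
        print(f"{field}: not in table, no similar column found")
```

A field with a close match most likely needs its name corrected in the job configuration. A field with no similar column may mean the Hudi table has to be re-created.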
