What should I do when concurrent writes to a Hudi table cause a real-time integration job to fail with an error containing the keyword "Receive an unexpected event for instant"?
Problem Description
A real-time integration job fails at runtime with an error containing the keyword "Receive an unexpected event for instant".
Full error message:
java.lang.IllegalStateException: Receive an unexpected event for instant 20250728162940486 from task 0
    at com.huawei.clouds.dataarts.shaded.org.apache.hudi.common.util.ValidationUtils.checkState(ValidationUtils.java:73) ~[?:?]
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.StreamWriteOperatorCoordinator.handleWriteMetaEvent(StreamWriteOperatorCoordinator.java:734) ~[?:?]
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.StreamWriteOperatorCoordinator.lambda$null$6(StreamWriteOperatorCoordinator.java:545) ~[?:?]
    at com.huawei.clouds.dataarts.shaded.org.apache.hudi.sink.utils.NonThrownExecutor.lambda$execute$0(NonThrownExecutor.java:99) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_362]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_362]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_362]
2025-07-28 16:29:42,401 INFO  org.apache.flink.runtime.jobmaster.JobMaster                 [] - Trying to recover from a global failure.
org.apache.flink.util.FlinkException: Global failure triggered by OperatorCoordinator for 'Sink: bucket_write_llch96.rds_source_tbl_961_0726' (operator 2460c47fb2039644fa5ad82006f0efce).
    at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.failJob(OperatorCoordinatorHolder.java:556) ~[flink-dist-1.15.0-h0.cbu.dli.321.r2.jar:1.15.0-h0.cbu.dli.321.r2]
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.StreamWriteOperatorCoordinator.lambda$null$1(StreamWriteOperatorCoordinator.java:256) ~[?:?]
    at com.huawei.clouds.dataarts.shaded.org.apache.hudi.sink.utils.NonThrownExecutor.lambda$execute$0(NonThrownExecutor.java:109) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_362]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_362]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
Caused by: com.huawei.clouds.dataarts.shaded.org.apache.hudi.exception.HoodieException: Executor executes action [handle write metadata event for instant 20250728162827526] error
    ... 5 more
Caused by: java.lang.IllegalStateException: Receive an unexpected event for instant 20250728162940486 from task 0
    at com.huawei.clouds.dataarts.shaded.org.apache.hudi.common.util.ValidationUtils.checkState(ValidationUtils.java:73) ~[?:?]
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.StreamWriteOperatorCoordinator.handleWriteMetaEvent(StreamWriteOperatorCoordinator.java:734) ~[?:?]
    at com.huawei.clouds.dataarts.migration.connector.hudi.sink.StreamWriteOperatorCoordinator.lambda$null$6(StreamWriteOperatorCoordinator.java:545) ~[?:?]
    at com.huawei.clouds.dataarts.shaded.org.apache.hudi.sink.utils.NonThrownExecutor.lambda$execute$0(NonThrownExecutor.java:99) ~[?:?]
Cause Analysis
In real-time integration jobs, a Hudi table does not support concurrent writes from multiple jobs (Flink, Spark, or any other real-time integration jobs). When several jobs write to the same Hudi table at the same time, the writers advance the table's timeline independently, which causes data loss in the Hudi table and job failures.
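The mechanism behind the error message can be illustrated with a simplified sketch (this is a hypothetical model for explanation only, not the actual Hudi `StreamWriteOperatorCoordinator` code): the coordinator tracks the instant it is currently committing, and a write-metadata event that carries a different instant, e.g. one started by a second concurrent writer, fails a state check.

```python
class UnexpectedInstantError(RuntimeError):
    """Raised when an event references an instant the coordinator is not handling."""


class CoordinatorSketch:
    """Hypothetical, simplified model of the coordinator-side instant check."""

    def __init__(self, current_instant: str):
        # The instant this coordinator opened and expects events for.
        self.current_instant = current_instant

    def handle_write_meta_event(self, event_instant: str, task_id: int) -> str:
        # If another job has advanced the table's timeline, subtask events
        # arrive tagged with an instant this coordinator never started,
        # which trips the state check and fails the job globally.
        if event_instant != self.current_instant:
            raise UnexpectedInstantError(
                f"Receive an unexpected event for instant {event_instant} "
                f"from task {task_id}"
            )
        return "accepted"


coord = CoordinatorSketch("20250728162827526")
coord.handle_write_meta_event("20250728162827526", 0)   # matches: accepted
# coord.handle_write_meta_event("20250728162940486", 0) # mismatched instant: raises
```

In the log above, the coordinator was committing instant 20250728162827526 while an event for 20250728162940486 arrived, which is consistent with a second writer having started its own instant on the same table.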
Solution
Check whether multiple jobs are writing to the same Hudi table concurrently, and stop the redundant jobs promptly.
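One rough way to check for multiple writers is to list the completed instants on the table's timeline (the `.hoodie` directory under the table path): unexpectedly dense or interleaved commit timestamps can indicate more than one writer. The sketch below is a minimal illustration assuming the 17-digit millisecond instant format seen in the log and a locally accessible path; for a table on HDFS or object storage, list the directory with the corresponding storage client instead.

```python
import os
import re


def list_instants(hoodie_dir: str) -> list[str]:
    """Return sorted completed commit instants found in a Hudi .hoodie directory.

    hoodie_dir is a placeholder path, e.g. "/path/to/hudi_table/.hoodie".
    Only completed commit/deltacommit files with 17-digit timestamps are
    matched; metadata files such as hoodie.properties are ignored.
    """
    pat = re.compile(r"^(\d{17})\.(commit|deltacommit)$")
    instants = []
    for name in os.listdir(hoodie_dir):
        m = pat.match(name)
        if m:
            instants.append(m.group(1))
    return sorted(instants)
```

If the listing shows instants created at nearly the same time while only one job is supposed to be running, identify the extra writer (for example, via the Flink or Spark job lists on the cluster) and stop it.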