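Updated: 2025-07-12 GMT+08:00
Flink Client Reports "ClusterRetrieveException" When Running flink Commands
Symptom and Background
Running the flink run, flink list, or flink cancel command on the client fails with the following error: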
org.apache.flink.util.FlinkException: Failed to retrieve job list.
    at org.apache.flink.client.cli.CliFrontend.listJobs(CliFrontend.java:448)
    at org.apache.flink.client.cli.CliFrontend.lambda$list$0(CliFrontend.java:430)
    at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:985)
    at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:427)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: java.util.concurrent.TimeoutException
    at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:795)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
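Possible Causes
- The client is installed outside the cluster.
- The Flink cluster that the client connects to no longer exists, or it has failed and exited.
- The wrong client is being used: its configuration differs from that of the client used to submit the job.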
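Solution
- Check whether the client is installed outside the cluster. If it is, add the client IP address to the cluster's /etc/hosts file.
- Log in to the native YARN web UI and check whether the Flink cluster that the client connects to is in the Running state.
- Check whether the current client is the same one that was used to submit the job, and whether the value of "high-availability.zookeeper.path.root" is identical in both client configurations.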
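
The first and third checks can be scripted. The following is a minimal Python sketch, not part of the product. It assumes that each client keeps its settings in a flat "key: value" file such as flink-conf.yaml and that the hosts file uses the standard "IP hostname" layout. The function names (parse_flink_conf, same_ha_root, hosts_has_ip) are hypothetical helpers introduced here for illustration.

```python
import sys
from typing import Dict

# Configuration key named in the solution steps above.
HA_ROOT_KEY = "high-availability.zookeeper.path.root"


def parse_flink_conf(text: str) -> Dict[str, str]:
    """Parse flat 'key: value' lines (assumed flink-conf.yaml layout)."""
    conf: Dict[str, str] = {}
    for raw in text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # Split on the first colon only, so values such as hdfs://... survive.
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def same_ha_root(conf_a: str, conf_b: str) -> bool:
    """True only if both configs define the HA root path with the same value."""
    a = parse_flink_conf(conf_a).get(HA_ROOT_KEY)
    b = parse_flink_conf(conf_b).get(HA_ROOT_KEY)
    return a is not None and a == b


def hosts_has_ip(hosts_text: str, ip: str) -> bool:
    """True if an uncommented hosts-file entry starts with the given IP."""
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if fields and fields[0] == ip:
            return True
    return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_client.py <conf_of_client_a> <conf_of_client_b>
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        print("same HA root:", same_ha_root(fa.read(), fb.read()))
```

If same_ha_root returns False for the two clients' configuration files, the current client is probably not the one that submitted the job. hosts_has_ip can be run against a copy of the cluster's /etc/hosts file to confirm that the client IP address has been added.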