Error Message "Could Not Connect to the Leading JobManager" Is Displayed When a Command Is Executed on the Flink Client
Symptom
During the creation of the Flink cluster, the following error message is displayed after the yarn-session.sh command execution is suspended for a while:
2018-09-20 22:51:16,842 | WARN | [main] | Unable to get ClusterClient status from Application Client | org.apache.flink.yarn.YarnClusterClient (YarnClusterClient.java:253) org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running. at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:861) at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248) at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:516) at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:717) at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:514) at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:511) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:511) Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway. at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:79) at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:856) ... 10 common frames omitted Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
Possible Causes
The SSL communication encryption is enabled for Flink, but no correct SSL certificate is configured.
Solution
For MRS 2.x or earlier, perform the following operations:
Method 1:
security.ssl.internal.enabled: false
Method 2:
Enable the Flink SSL communication encryption and retain the default value of security.ssl.internal.enabled.
- If the keystore or truststore file path is a relative path, allow the Flink client directory where the command is executed to access this relative path directly.
security.ssl.internal.keystore: ssl/flink.keystore security.ssl.internal.truststore: ssl/flink.truststore
Add -t option to the CLI yarn-session.sh command of Flink to transmit the KeyStore and TrustStore files to each execution node.
yarn-session.sh -t ssl/ 2
- If the keystore or truststore file path is an absolute path, the keystore or truststore files must exist in the absolute path on Flink Client and all nodes.
security.ssl.internal.keystore: /opt/client/Flink/flink/conf/flink.keystore security.ssl.internal.truststore: /opt/client/Flink/flink/conf/flink.truststore
For MRS 3.x or later, perform the following operations:
Method 1:
security.ssl.enabled: false
Method 2:
Enable the Flink SSL communication encryption and retain the default value of security.ssl.enabled.
- If the keystore or truststore file path is a relative path, allow the Flink client directory where the command is executed to access this relative path directly.
security.ssl.keystore: ssl/flink.keystore security.ssl.truststore: ssl/flink.truststore
Add -t option to the CLI yarn-session.sh command of Flink to transmit the KeyStore and TrustStore files to each execution node.
yarn-session.sh -t ssl/ 2
- If the keystore or truststore file path is an absolute path, the keystore or truststore files must exist in the absolute path on Flink Client and all nodes.
security.ssl.keystore: /opt/client/Flink/flink/conf/flink.keystore security.ssl.truststore: /opt/client/Flink/flink/conf/flink.truststore
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.