DataArts Studio Failed to Schedule Spark Jobs
Symptom
DataArts Studio fails to schedule jobs, and the following error is reported indicating that data in the /thriftserver/active_thriftserver directory cannot be read:
Can not get JDBC Connection, due to KeeperErrorCode = NoNode for /thriftserver/active_thriftserver
Cause Analysis
When DataArts Studio submits a Spark job, Spark JDBC is invoked. Spark starts a ThriftServer process to provide JDBC connections to clients. During startup, JDBCServer creates the active_thriftserver subdirectory in the /thriftserver directory of ZooKeeper and registers its connection information there. If this registration information cannot be read, JDBC connections fail.
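The error in the symptom can be recognized mechanically. Below is a minimal sketch, assuming the error text is captured from the failed job's log; the function name and result messages are illustrative, not part of any DataArts Studio or Spark tool:

```shell
#!/bin/sh
# Classify a JDBC connection error message from a failed job.
# A "NoNode" error for /thriftserver/active_thriftserver means the
# ThriftServer registration described above is missing in ZooKeeper.
diagnose_jdbc_error() {
  # $1: error message captured from the failed DataArts Studio job
  case "$1" in
    *"NoNode for /thriftserver/active_thriftserver"*)
      echo "ThriftServer registration missing in ZooKeeper" ;;
    *)
      echo "other JDBC failure" ;;
  esac
}

diagnose_jdbc_error "Can not get JDBC Connection, due to KeeperErrorCode = NoNode for /thriftserver/active_thriftserver"
```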
Procedure
Check whether the ZooKeeper directory contains the target directory and registration information.
1. Log in to any master node as user root and run the following command to initialize environment variables:
   source /opt/client/bigdata_env
2. Run the zkCli.sh -server 'ZookeeperIp:2181' command to log in to ZooKeeper, where ZookeeperIp is the service IP address of a ZooKeeper instance.
3. Run the ls /thriftserver command to check whether the active_thriftserver directory exists.
4. Log in to Manager and check whether the active/standby status of the Spark JDBCServer instances is unknown.
   - If yes, go to 5.
   - If no, contact O&M personnel.
5. Restart the two JDBCServer instances. Check whether the active and standby instances return to normal and whether the target directory and data exist in ZooKeeper. If so, the job is restored. If the instance status does not recover, contact Huawei Cloud technical support.
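The ZooKeeper check above can be scripted once the child list of /thriftserver has been captured. A minimal sketch, assuming the input is the child list printed by zkCli.sh (for example "[active_thriftserver]" or "[]"); the function name and messages are illustrative:

```shell
#!/bin/sh
# Decide the next troubleshooting step from the output of `ls /thriftserver`.
# In a real cluster the input would come from:
#   zkCli.sh -server 'ZookeeperIp:2181'   then   ls /thriftserver
next_step() {
  # $1: child list printed by zkCli.sh for /thriftserver
  case "$1" in
    *active_thriftserver*)
      echo "active_thriftserver is registered" ;;
    *)
      echo "registration missing: check JDBCServer instance status in Manager" ;;
  esac
}

next_step "[]"
next_step "[active_thriftserver]"
```

The string match is deliberately loose because zkCli.sh output surrounds the child list with brackets and may prepend connection log lines.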