
DataArts Studio Failed to Schedule Spark Jobs

Issue

DataArts Studio fails to schedule jobs, and a message is displayed indicating that data in the /thriftserver/active_thriftserver directory cannot be read.

Symptom

DataArts Studio fails to schedule jobs, and the following error is reported, indicating that data in the /thriftserver/active_thriftserver directory of ZooKeeper cannot be read:

Can not get JDBC Connection, due to KeeperErrorCode = NoNode for /thriftserver/active_thriftserver

Cause Analysis

When DataArts Studio submits a Spark job, it connects through Spark JDBC. Spark starts a ThriftServer (JDBCServer) process to provide JDBC connections for clients. During startup, JDBCServer creates the active_thriftserver subdirectory under the /thriftserver directory in ZooKeeper and registers its connection information there. If this connection information cannot be read, the JDBC connection fails and the job cannot be scheduled.
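For context, Spark ThriftServer exposes a HiveServer2-compatible JDBC endpoint, so the address registered in ZooKeeper is what a client ultimately connects to. The following is a minimal connectivity sketch, assuming beeline is available in the Spark client and omitting any security parameters your cluster may require:

    # <JDBCServer-host> and <port> are hypothetical placeholders for the values
    # registered in /thriftserver/active_thriftserver; a secured cluster needs
    # additional authentication parameters in the JDBC URL.
    beeline -u 'jdbc:hive2://<JDBCServer-host>:<port>/default'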

Procedure

Check whether the ZooKeeper directory contains the target directory and registration information.

  1. Log in to any master node as user root and initialize environment variables.

    source /opt/client/bigdata_env

  2. Run the zkCli.sh -server 'ZookeeperIp:2181' command to log in to ZooKeeper, where ZookeeperIp indicates the service IP address of any ZooKeeper instance.
  3. Run the ls /thriftserver command to check whether the active_thriftserver directory exists.

    • If the active_thriftserver directory exists, run the get /thriftserver/active_thriftserver command to check whether it contains the registered configuration information.
      • If yes, contact Huawei Cloud technical support.
      • If no, go to 4.
    • If the active_thriftserver directory does not exist, go to 4.

  4. Log in to Manager and check whether the active/standby status of the Spark JDBCServer instances is unknown.

    • If yes, go to 5.
    • If no, contact O&M personnel.

  5. Restart the two JDBCServer instances. Check whether the active and standby instances return to the normal state and whether the target directory and registration data exist in ZooKeeper (see the example check after this procedure). If both conditions are met, job scheduling is restored. If the instance status does not recover, contact Huawei Cloud technical support.
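
The following is a minimal verification sketch for the check in step 5, assuming the client is installed in /opt/client and ZooKeeper listens on the default port 2181; the registered data varies by cluster:

    source /opt/client/bigdata_env
    zkCli.sh -server 'ZookeeperIp:2181'

In the ZooKeeper CLI, run ls /thriftserver to confirm that active_thriftserver is listed, and run get /thriftserver/active_thriftserver to confirm that the registered connection information is returned.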