Updated on 2023-11-30 GMT+08:00

DataArts Studio Failed to Schedule Spark Jobs

Symptom

DataArts Studio fails to schedule Spark jobs and reports the following error, which indicates that data in the /thriftserver/active_thriftserver directory of ZooKeeper cannot be read:

Can not get JDBC Connection, due to KeeperErrorCode = NoNode for /thriftserver/active_thriftserver.

Cause Analysis

When DataArts Studio submits a Spark job, it connects through Spark JDBC. Spark runs a ThriftServer process (the JDBCServer instance) that provides JDBC connections to clients. During startup, JDBCServer creates the active_thriftserver subdirectory in the /thriftserver directory of ZooKeeper and registers its connection information there. If a client cannot read this connection information, it cannot obtain a JDBC connection.
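
The path named in the error is the same ZooKeeper path that JDBCServer is expected to create at startup, so the error text already tells you which registration is missing. The following is a minimal sketch, assuming the error message is available as a string; the function name missing_znode_from_error is hypothetical and not part of any MRS or DataArts Studio tool.

```shell
# Print the ZooKeeper path named in a
# "KeeperErrorCode = NoNode for <path>" message.
# Prints nothing and returns non-zero if the message has another shape.
missing_znode_from_error() {
  # Capture the token that starts with "/" right after "NoNode for ".
  znode=$(printf '%s\n' "$1" |
    sed -n 's|.*KeeperErrorCode = NoNode for \(/[^ ]*\).*|\1|p')
  # Drop a trailing full stop that belongs to the sentence, not the path.
  znode=${znode%.}
  [ -n "$znode" ] && printf '%s\n' "$znode"
}
```

Passing the error line from the Symptom section to this function yields the path to inspect in ZooKeeper during the procedure.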

Procedure

Check whether the ZooKeeper directory contains the target directory and registration information.

  1. Log in to any master node as user root and initialize environment variables.

    source /opt/client/bigdata_env

  2. Run the zkCli.sh -server 'ZooKeeper instance IP address:ZooKeeper connection port' command to log in to ZooKeeper.

    The ZooKeeper connection port is usually 2181. The actual value is set by the ZooKeeper configuration parameter clientPort.

  3. Run the ls /thriftserver command to check whether the active_thriftserver directory exists.

    • If the active_thriftserver directory exists, run the get /thriftserver/active_thriftserver command to check whether it contains the registered connection information.
      • If it does, contact Huawei Cloud technical support.
      • If it does not, go to step 4.
    • If the active_thriftserver directory does not exist, go to step 4.

  4. Log in to Manager and check whether the active/standby status of the Spark JDBCServer instances is unknown.

    • If yes, go to step 5.
    • If no, contact O&M personnel.

  5. Restart both JDBCServer instances. Then check whether the active and standby instances return to the normal status and whether the /thriftserver/active_thriftserver directory and its registration data exist in ZooKeeper. If both conditions are met, job scheduling is restored. If the instance status does not recover, contact Huawei Cloud technical support.
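
The existence check in step 3 and the re-check after the restart in step 5 can also be run non-interactively. The following is a minimal sketch under stated assumptions, not a supported tool: it assumes that zkCli.sh is on the PATH after sourcing /opt/client/bigdata_env and that it accepts a single command after the -server argument. The ZK_SERVER and ZKCLI variables and both function names are hypothetical.

```shell
# Hypothetical settings: replace the address with
# 'ZooKeeper instance IP address:ZooKeeper connection port'.
ZK_SERVER="${ZK_SERVER:-127.0.0.1:2181}"
ZKCLI="${ZKCLI:-zkCli.sh}"

# Return 0 if "ls /thriftserver" lists active_thriftserver, non-zero otherwise.
active_thriftserver_exists() {
  "$ZKCLI" -server "$ZK_SERVER" ls /thriftserver 2>/dev/null |
    grep -q 'active_thriftserver'
}

# Poll for the directory after a JDBCServer restart.
# $1 = maximum number of attempts, $2 = seconds to wait between attempts.
wait_for_active_thriftserver() {
  attempts="$1"
  delay="$2"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if active_thriftserver_exists; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (run manually after the restart in step 5):
#   source /opt/client/bigdata_env
#   wait_for_active_thriftserver 10 6 && echo "registration found"
```

This sketch only confirms that the directory is listed. Whether it contains valid registration data still needs to be checked with the get command, because the output of get differs between ZooKeeper versions.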