Updated on 2022-12-14 GMT+08:00

Failed to Start a Local Task

Symptom

  1. When an operation such as JOIN is performed on a small amount of data, Hive starts a local task. The local task fails and the following error is reported:
    jdbc:hive2://10.*.*.*:21066/> select a.name ,b.sex from student a join student1 b on (a.name = b.name);
    ERROR : Execution failed with exit status: 1
    ERROR : Obtaining error information
    ERROR : 
    Task failed!
    Task ID:
      Stage-4
    ...
    Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask (state=08S01,code=1)
    ...
  2. The HiveServer log shows that the local task fails to start.
    2018-04-25 16:37:19,296 | ERROR | HiveServer2-Background-Pool: Thread-79 | Execution failed with exit status: 1 | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    2018-04-25 16:37:19,296 | ERROR | HiveServer2-Background-Pool: Thread-79 | Obtaining error information | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    2018-04-25 16:37:19,297 | ERROR | HiveServer2-Background-Pool: Thread-79 |
    Task failed!
    Task ID:
      Stage-4
    Logs:
     | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    2018-04-25 16:37:19,297 | ERROR | HiveServer2-Background-Pool: Thread-79 | /var/log/Bigdata/hive/hiveserver/hive.log | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    2018-04-25 16:37:19,297 | ERROR | HiveServer2-Background-Pool: Thread-79 | Execution failed with exit status: 1 | org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInChildVM(MapredLocalTask.java:342)
    2018-04-25 16:37:19,309 | ERROR | HiveServer2-Background-Pool: Thread-79 | FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    ...
    2018-04-25 16:37:36,438 | ERROR | HiveServer2-Background-Pool: Thread-88 | Error running hive query:  | org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:248)
    org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
            at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:339)
            at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:169)
            at org.apache.hive.service.cli.operation.SQLOperation.access$200(SQLOperation.java:75)
            at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:245)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:422)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
            at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:258)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
  3. The hs_err_pid_*****.log file in the HiveServer log directory /var/log/Bigdata/hive/hiveserver contains an error about insufficient memory.
    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (mmap) failed to map 20776943616 bytes for committing reserved memory.
      ...
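To put the size in the hs_err_pid log into perspective, the byte count can be converted to GiB. The following is a minimal sketch; the only input is the number taken from the log line above, and the helper name bytes_to_gib is made up for illustration.

```python
# Byte count copied from the "failed to map ... bytes" line in the hs_err_pid log.
FAILED_MMAP_BYTES = 20776943616


def bytes_to_gib(n: int) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return n / 1024**3


if __name__ == "__main__":
    # Prints the amount of memory the JVM tried to commit, in GiB.
    print(f"{bytes_to_gib(FAILED_MMAP_BYTES):.2f} GiB")
```

In this example, the JVM tried to commit roughly 19.35 GiB in a single allocation, which is why a host with little free memory cannot start the process.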

Cause Analysis

When Hive executes a JOIN on a small amount of data, the JOIN is converted into a MapJoin. Executing a MapJoin starts a local task, and the JVM launched for the local task inherits the memory settings of the parent process.

When multiple JOIN operations are executed, multiple local tasks are started. If the host does not have enough free memory for them, the local tasks fail to start.
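The effect of several local tasks competing for host memory can be illustrated with a rough estimate. The sketch below is a simplified model, not a description of how Hive schedules local tasks: it assumes every local task JVM needs to commit about the same amount of memory as the failed allocation in the hs_err_pid log, and the free-memory figures are hypothetical.

```python
GIB = 1024**3

# Byte count from the hs_err_pid log in the Symptom section.
COMMIT_BYTES_PER_TASK = 20776943616


def max_concurrent_local_tasks(free_bytes: int, commit_bytes_per_task: int) -> int:
    """Estimate how many local task JVMs can start with the given free memory.

    Simplified model: each JVM must commit commit_bytes_per_task at startup.
    """
    if commit_bytes_per_task <= 0:
        raise ValueError("commit_bytes_per_task must be positive")
    return max(free_bytes, 0) // commit_bytes_per_task


if __name__ == "__main__":
    # Hypothetical free-memory values for a host, in GiB.
    for free_gib in (16, 32, 64):
        count = max_concurrent_local_tasks(free_gib * GIB, COMMIT_BYTES_PER_TASK)
        print(f"{free_gib} GiB free -> about {count} local task(s) can start")
```

Under these assumptions, a host with 32 GiB of free memory can start only one such local task, so a second concurrent JOIN would fail in the way shown in the Symptom section.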

Solution

  1. Go to the Hive configuration page.

    • For versions earlier than MRS 2.0.1: Log in to MRS Manager, choose Services > Hive > Service Configuration, and select All from the Basic drop-down list.
    • For MRS 2.0.1 or later: Click the cluster name on the MRS console, choose Components > Hive > Service Configuration, and select All from the Basic drop-down list.

      If the Components tab is unavailable, complete IAM user synchronization first. (On the Dashboard page, click Synchronize on the right side of IAM User Sync to synchronize IAM users.)

    • For MRS 3.x or later: Log in to FusionInsight Manager and choose Cluster. Click the name of the target cluster, and choose Services > Hive > Configurations > All Configurations.

  2. Search for the hive.auto.convert.join parameter and change its value to false. Save the configuration and restart the service.

    Setting this parameter to false may degrade service performance. To avoid the performance impact, you can perform the next step.

  3. Search for the HIVE_GC_OPTS parameter and decrease the value of Xms based on service requirements. Xms cannot be set lower than half of the Xmx value. After the modification, save the configuration and restart the service.
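The rule in step 3 (Xms must be at least half of Xmx) can be checked before saving the configuration. The following is a minimal sketch: the helper names and the sample option strings are hypothetical, and the actual HIVE_GC_OPTS value in a cluster will contain other options and different sizes.

```python
import re

# Multipliers for the size suffixes accepted by -Xms/-Xmx.
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_heap_option(gc_opts: str, flag: str) -> int:
    """Return the size in bytes of the given flag ("Xms" or "Xmx") in gc_opts."""
    match = re.search(rf"-{flag}(\d+)([kKmMgG]?)(?=\s|$)", gc_opts)
    if match is None:
        raise ValueError(f"-{flag} not found in: {gc_opts!r}")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def min_allowed_xms(gc_opts: str) -> int:
    """Smallest Xms in bytes permitted by the rule in step 3: half of Xmx."""
    return parse_heap_option(gc_opts, "Xmx") // 2


def xms_is_valid(gc_opts: str) -> bool:
    """True if Xms is at least half of Xmx and does not exceed Xmx."""
    xms = parse_heap_option(gc_opts, "Xms")
    xmx = parse_heap_option(gc_opts, "Xmx")
    return min_allowed_xms(gc_opts) <= xms <= xmx


if __name__ == "__main__":
    # Hypothetical values, not the defaults of any MRS version.
    for opts in ("-Xms2G -Xmx4G", "-Xms1G -Xmx4G"):
        print(opts, "->", "ok" if xms_is_valid(opts) else "Xms is too small")
```

For example, with a hypothetical Xmx of 4G, Xms can be lowered to 2G but not to 1G.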