Updated on 2022-12-14 GMT+08:00

Failed to Start a Local Task

Symptom

  1. When an operation such as JOIN is performed on a small amount of data, Hive starts a local task. The local task fails, and the following error is reported:
    jdbc:hive2://10.*.*.*:21066/> select a.name ,b.sex from student a join student1 b on (a.name = b.name);
    ERROR : Execution failed with exit status: 1
    ERROR : Obtaining error information
    ERROR : 
    Task failed!
    Task ID:
      Stage-4
    ...
    Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask (state=08S01,code=1)
    ...
  2. The HiveServer log shows that the local task fails to start.
    2018-04-25 16:37:19,296 | ERROR | HiveServer2-Background-Pool: Thread-79 | Execution failed with exit status: 1 | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    2018-04-25 16:37:19,296 | ERROR | HiveServer2-Background-Pool: Thread-79 | Obtaining error information | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    2018-04-25 16:37:19,297 | ERROR | HiveServer2-Background-Pool: Thread-79 |
    Task failed!
    Task ID:
      Stage-4
    Logs:
     | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    2018-04-25 16:37:19,297 | ERROR | HiveServer2-Background-Pool: Thread-79 | /var/log/Bigdata/hive/hiveserver/hive.log | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    2018-04-25 16:37:19,297 | ERROR | HiveServer2-Background-Pool: Thread-79 | Execution failed with exit status: 1 | org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInChildVM(MapredLocalTask.java:342)
    2018-04-25 16:37:19,309 | ERROR | HiveServer2-Background-Pool: Thread-79 | FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask | org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:1016)
    ...
    2018-04-25 16:37:36,438 | ERROR | HiveServer2-Background-Pool: Thread-88 | Error running hive query:  | org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:248)
    org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
            at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:339)
            at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:169)
            at org.apache.hive.service.cli.operation.SQLOperation.access$200(SQLOperation.java:75)
            at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:245)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:422)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
            at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:258)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
  3. The hs_err_pid_*****.log file in the HiveServer log directory /var/log/Bigdata/hive/hiveserver contains an error about insufficient memory.
    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (mmap) failed to map 20776943616 bytes for committing reserved memory.
      ...
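To gauge how much memory the failed JVM tried to commit, the byte count in the hs_err message can be converted to GiB. The following is a minimal Python sketch, not part of any Hive tooling: the helper name `failed_commit_bytes` is hypothetical, and the sample line is the one quoted in the symptom above.

```python
import re

# Matches the hs_err line that reports a failed native memory allocation.
_MMAP_FAILURE = re.compile(r"failed to map (\d+) bytes")


def failed_commit_bytes(line):
    """Return the byte count from an allocation-failure line, or None."""
    match = _MMAP_FAILURE.search(line)
    return int(match.group(1)) if match else None


# Sample line copied from the hs_err_pid log excerpt in this article.
sample = ("# Native memory allocation (mmap) failed to map 20776943616 bytes "
          "for committing reserved memory.")
size = failed_commit_bytes(sample)
print(f"{size / 2**30:.2f} GiB")  # prints "19.35 GiB"
```

A request of this size at JVM startup suggests that the child process was launched with a large initial heap.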

Cause Analysis

When Hive executes a JOIN on a small amount of data, it generates a MapJoin. During MapJoin execution, a local task is started in a child JVM, and that JVM inherits the memory settings of its parent process.

When multiple JOIN operations are executed, multiple local tasks are started. If the host does not have enough free memory for all of them, the local tasks fail to start.
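As a rough illustration of this reasoning, assume that each local task JVM must commit its inherited initial heap (Xms) when it starts. Under that simplified assumption, the number of local tasks that can start at the same time is bounded by the free memory on the host. The function name and the figures below are hypothetical and are not measurements from a real cluster.

```python
def local_tasks_that_fit(free_memory_gib, inherited_xms_gib):
    """Return how many local-task JVMs can commit their initial heap at once.

    Simplified model: each child JVM needs inherited_xms_gib of free memory
    at startup, and nothing else competes for that memory.
    """
    if inherited_xms_gib <= 0:
        raise ValueError("inherited_xms_gib must be positive")
    return int(free_memory_gib // inherited_xms_gib)


# Hypothetical host: 48 GiB free, each child JVM inherits a 20 GiB initial heap.
print(local_tasks_that_fit(48, 20))  # prints 2; a third concurrent task would not fit
```

With a smaller inherited Xms, more local tasks fit into the same amount of free memory.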

Solution

  1. Search for the hive.auto.convert.join parameter in the Hive configuration and set it to false. Save the configuration and restart the service.

    Setting this parameter to false may degrade service performance. To avoid this impact, you can perform the next step.

  2. Search for the HIVE_GC_OPTS parameter and decrease the value of Xms based on service requirements. Do not set Xms lower than half the value of Xmx. After the modification, save the configuration and restart the service.
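Before saving a new HIVE_GC_OPTS value, it can help to check that Xms still meets the "at least half of Xmx" guideline. The Python sketch below is one possible way to do this; the helper names and the sample option strings are hypothetical, and it only handles sizes written as a number with an optional k, m, or g suffix.

```python
import re

# Multipliers for the size suffixes accepted by -Xms and -Xmx.
_UNITS = {"": 1, "k": 2**10, "m": 2**20, "g": 2**30}


def heap_option_bytes(opts, flag):
    """Return the size in bytes of -Xms or -Xmx in an option string, or None."""
    match = re.search(rf"-{flag}(\d+)([kKmMgG]?)(?=\s|$)", opts)
    if match is None:
        return None
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def xms_within_guideline(opts):
    """Return True if Xms is at least half of Xmx and not larger than Xmx."""
    xms = heap_option_bytes(opts, "Xms")
    xmx = heap_option_bytes(opts, "Xmx")
    if xms is None or xmx is None:
        raise ValueError("both -Xms and -Xmx must be present")
    return xmx <= 2 * xms <= 2 * xmx


# Hypothetical option strings, not taken from a real cluster.
print(xms_within_guideline("-Xms4G -Xmx8G -XX:+UseG1GC"))  # prints True
print(xms_within_guideline("-Xms1G -Xmx8G"))               # prints False
```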