Updated on 2024-11-29 GMT+08:00

Spark Log Overview

Log Description

Log paths:

  • Executor run log: ${BIGDATA_DATA_HOME}/hadoop/data${i}/nm/containerlogs/application_${appid}/container_${contid}

    Logs of running tasks are stored in this path. After a task finishes, the system decides, based on the YARN configuration, whether to aggregate the logs to an HDFS directory. For details, see Common YARN Parameters.

  • Other logs: /var/log/Bigdata/spark
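
The executor run log path above is assembled from four placeholders. The following minimal sketch shows how such a path could be built for one container. The function name `executor_log_dir` and all sample values (data home, disk index, application ID, container ID) are hypothetical and chosen only for illustration; substitute the values from your own cluster.

```python
from pathlib import PurePosixPath


def executor_log_dir(data_home: str, disk_index: int,
                     app_id: str, container_id: str) -> PurePosixPath:
    """Build the executor run-log directory for one container.

    Mirrors the documented layout:
    <data home>/hadoop/data<i>/nm/containerlogs/application_<appid>/container_<contid>
    """
    return (
        PurePosixPath(data_home)
        / "hadoop"
        / f"data{disk_index}"
        / "nm"
        / "containerlogs"
        / f"application_{app_id}"
        / f"container_{container_id}"
    )


if __name__ == "__main__":
    # Hypothetical values, not taken from a real cluster.
    print(executor_log_dir("/srv/BigData", 1,
                           "1700000000000_0001",
                           "1700000000000_0001_01_000002"))
```

Because a node can have several data disks (data${i}), a lookup for a given container may need to try each disk index in turn.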

Log archiving rules:

  • When tasks are submitted in yarn-client or yarn-cluster mode, an executor log file is rolled over each time it reaches 50 MB. A maximum of 10 log files are retained, and they are not compressed.
  • By default, JobHistory log files are compressed and archived each time the file size reaches 100 MB. A maximum of 100 log files are retained.
  • By default, JDBCServer log files are compressed and archived each time the file size reaches 100 MB. A maximum of 100 log files are retained.
  • By default, IndexServer log files are compressed and archived each time the file size reaches 100 MB. A maximum of 100 log files are retained.
  • By default, JDBCServer audit log files are compressed and archived each time the file size reaches 20 MB. A maximum of 20 log files are retained.
  • The log file size threshold and the maximum number of retained compressed files can be configured on FusionInsight Manager.
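
For capacity planning, the default thresholds above give a rough upper bound on the disk space that retained logs can occupy. The sketch below multiplies the rollover size by the maximum file count for each log type. It assumes that every retained file is at most the threshold size, and it ignores both compression and the file currently being written, so the real footprint of compressed logs is normally smaller. The dictionary and function names are hypothetical.

```python
# Defaults copied from the archiving rules above:
# log type -> (rollover size in MB, maximum number of retained files).
DEFAULT_ROLLING = {
    "executor": (50, 10),
    "JobHistory": (100, 100),
    "JDBCServer": (100, 100),
    "IndexServer": (100, 100),
    "JDBCServer audit": (20, 20),
}


def max_retained_mb(log_type: str) -> int:
    """Upper bound (in MB, before compression) for retained files of one log type."""
    size_mb, max_files = DEFAULT_ROLLING[log_type]
    return size_mb * max_files


if __name__ == "__main__":
    for name in DEFAULT_ROLLING:
        print(f"{name}: up to {max_retained_mb(name)} MB before compression")
```

If the thresholds are changed on FusionInsight Manager, the values in the dictionary need to be updated to match.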

Table 1 Spark log file list

| Log Type | Name | Description |
|---|---|---|
| SparkResource log | spark.log | Spark initialization log |
| SparkResource log | prestart.log | Prestart script log |
| SparkResource log | cleanup.log | Cleanup log file for instance installation and uninstallation |
| SparkResource log | spark-availability-check.log | Spark health check log |
| SparkResource log | spark-service-check.log | Spark service check log |
| JDBCServer log | JDBCServer-start.log | JDBCServer startup log |
| JDBCServer log | JDBCServer-stop.log | JDBCServer stop log |
| JDBCServer log | JDBCServer.log | JDBCServer run log on the server |
| JDBCServer log | jdbc-state-check.log | JDBCServer health check log |
| JDBCServer log | jdbcserver-omm-pid***-gc.log.*.current | JDBCServer process GC log |
| JDBCServer log | spark-omm-org.apache.spark.sql.hive.thriftserver.HiveThriftProxyServer2-***.out* | JDBCServer process startup log. If the process stops, the jstack information is printed. |
| JobHistory log | jobHistory-start.log | JobHistory startup log |
| JobHistory log | jobHistory-stop.log | JobHistory stop log |
| JobHistory log | JobHistory.log | JobHistory running process log |
| JobHistory log | jobhistory-omm-pid***-gc.log.*.current | JobHistory process GC log |
| JobHistory log | spark-omm-org.apache.spark.deploy.history.HistoryServer-***.out* | JobHistory process startup log. If the process stops, the jstack information is printed. |
| IndexServer log | IndexServer-start.log | IndexServer startup log |
| IndexServer log | IndexServer-stop.log | IndexServer stop log |
| IndexServer log | IndexServer.log | IndexServer run log on the server |
| IndexServer log | indexserver-state-check.log | IndexServer health check log |
| IndexServer log | indexserver-omm-pid***-gc.log.*.current | IndexServer process GC log |
| IndexServer log | spark-omm-org.apache.spark.sql.hive.thriftserver.IndexServerProxy-***.out* | IndexServer process startup log. If the process stops, the jstack information is printed. |
| Audit log | jdbcserver-audit.log; ranger-audit.log | JDBCServer audit log |

Log levels

Table 2 describes the log levels provided by Spark. In descending order of priority, the levels are ERROR, WARN, INFO, and DEBUG. Only logs whose level is higher than or equal to the configured level are printed, so the higher the configured level, the fewer logs are printed.

Table 2 Log levels

| Level | Description |
|---|---|
| ERROR | Error information about the current event processing |
| WARN | Exception information about the current event processing |
| INFO | Logs of this level record normal running status information about the system and events. |
| DEBUG | Logs of this level record the system information and system debugging information. |
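
The filtering rule for log levels can be illustrated with a short sketch. It only models the rule stated here (a log is printed when its level is higher than or equal to the configured level); it is not Spark's or log4j's actual implementation, and the names `LEVEL_PRIORITY`, `is_printed`, and `printed_levels` are hypothetical.

```python
# Priorities follow the documented order: ERROR > WARN > INFO > DEBUG.
LEVEL_PRIORITY = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}


def is_printed(message_level: str, configured_level: str) -> bool:
    """Return True if a message at message_level passes the configured level."""
    return LEVEL_PRIORITY[message_level] >= LEVEL_PRIORITY[configured_level]


def printed_levels(configured_level: str) -> list[str]:
    """List every level that is printed for a given configured level."""
    ordered = ("ERROR", "WARN", "INFO", "DEBUG")
    return [lvl for lvl in ordered if is_printed(lvl, configured_level)]


if __name__ == "__main__":
    for configured in ("ERROR", "WARN", "INFO", "DEBUG"):
        print(configured, "->", printed_levels(configured))
```

Setting the level to DEBUG prints all four levels, while setting it to ERROR prints only ERROR logs.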

By default, the service does not need to be restarted after Spark log levels are changed.

To modify a log level, perform the following operations:

  1. Log in to FusionInsight Manager.
  2. Choose Cluster > Services > Spark and click Configurations.
  3. Select All Configurations.
  4. On the menu bar on the left, select the log menu of the target role.
  5. Select a desired log level.
  6. Click Save. Then, click OK.

Log Format

Table 3 Log format

  • Type: Run log
  • Format: <yyyy-MM-dd HH:mm:ss,SSS>|<Log level>|<Name of the thread that generates the log>|<Message in the log>|<Location where the log event occurs>
  • Example: 2014-09-22 11:16:23,980 INFO DAGScheduler: Final stage: Stage 0(reduce at SparkPi.scala:35)
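
A run log line in the five-field, pipe-delimited format above can be split into its parts with a few lines of Python. This is a minimal sketch, not a tool shipped with the product: the `RunLogRecord` class, the `parse_run_log_line` function, and the sample line are hypothetical. It assumes that the thread name contains no "|" character, and it tolerates "|" inside the message by taking the location from the last field.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class RunLogRecord:
    timestamp: datetime
    level: str
    thread: str
    message: str
    location: str


def parse_run_log_line(line: str) -> RunLogRecord:
    """Parse '<timestamp>|<level>|<thread>|<message>|<location>'."""
    head = line.rstrip("\n").split("|", 3)
    if len(head) != 4:
        raise ValueError(f"expected 5 pipe-delimited fields: {line!r}")
    timestamp, level, thread, rest = head
    # The message may itself contain '|', so the location is the last field.
    message, sep, location = rest.rpartition("|")
    if not sep:
        raise ValueError(f"missing location field: {line!r}")
    return RunLogRecord(
        # %f accepts the three-digit millisecond part after the comma.
        timestamp=datetime.strptime(timestamp.strip(), "%Y-%m-%d %H:%M:%S,%f"),
        level=level.strip(),
        thread=thread.strip(),
        message=message,
        location=location.strip(),
    )


if __name__ == "__main__":
    # Hypothetical sample line written to match the documented format.
    sample = "2014-09-22 11:16:23,980|INFO|main|Final stage: Stage 0|SparkPi.scala:35"
    print(parse_run_log_line(sample))
```

Lines that do not use the pipe-delimited layout raise ValueError, so a caller can skip or report them separately.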