
Updated on 2025-08-19 GMT+08:00


"ArrayIndexOutOfBoundsException: 0" Occurs When HDFS Invokes getSplits of FileInputFormat

Issue

When HDFS invokes the getSplits method of FileInputFormat, "ArrayIndexOutOfBoundsException: 0" is reported. The log is as follows:

java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.mapred.FileInputFormat.identifyHosts(FileInputFormat.java:708)
at org.apache.hadoop.mapred.FileInputFormat.getSplitHostsAndCachedHosts(FileInputFormat.java:675)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:359)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:210)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)

Procedure

The rack information of each block is in the format /default/rack0/datanodeip:port. An entry of the form /default/rack0/: has an empty DataNode IP address and port number.

This exception occurs when blocks are damaged or lost, which leaves the IP address and port number of the host for those blocks empty. To resolve the problem, run hdfs fsck to check the health of the file blocks (for example, hdfs fsck <path> -files -blocks -locations), delete the files that contain damaged or lost blocks (for example, hdfs fsck <path> -delete), and run the task again.
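
The following is a minimal Java sketch of why an empty IP address and port number can lead to "ArrayIndexOutOfBoundsException: 0". It assumes that the host name is obtained by splitting the host:port string on ":" and taking the first element; this is an assumption about what FileInputFormat.identifyHosts does, not something this page confirms. The class name EmptyHostSplitDemo, the hostOf helper, and the sample address 192.168.0.10:9866 are hypothetical.

```java
class EmptyHostSplitDemo {
    // Hypothetical helper (not the Hadoop source): returns the host part
    // of a "host:port" node name by splitting on ':' and taking element 0.
    static String hostOf(String nodeName) {
        return nodeName.split(":")[0];
    }

    public static void main(String[] args) {
        // Healthy entry: last component of /default/rack0/192.168.0.10:9866
        System.out.println(hostOf("192.168.0.10:9866"));

        // Damaged or lost block: last component of /default/rack0/:
        // String.split drops trailing empty strings, so ":".split(":")
        // returns an empty array and reading index 0 throws.
        try {
            hostOf(":");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getClass().getSimpleName());
        }
    }
}
```

Under that assumption, the healthy entry yields the host name, while the empty entry fails at index 0, which matches the exception in the log above.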
