My Hadoop job crashes with "Split metadata size exceeded"

Symptoms

My Hadoop job crashes with an exception like the following:

java.io.IOException: Split metadata size exceeded 10000000. Aborting job ...

Solutions

Open conf/mapred-site.xml under your Hadoop installation directory. Add the property “mapreduce.jobtracker.split.metainfo.maxsize” and set its value to -1, like this:

<!-- In: conf/mapred-site.xml -->
<property>
    <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
    <value>-1</value>
</property>
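
Then restart the JobTracker so it picks up the new value.

If you prefer to try this per job instead of cluster-wide, a driver program can set the same property on the job configuration, as in the minimal sketch below (old “mapred” API; the class name and input/output paths are placeholders). Note that on Hadoop 1.x the JobTracker reads this limit from its own mapred-site.xml, so the cluster-wide change above is the reliable fix; newer releases use the per-job property “mapreduce.job.split.metainfo.maxsize”.

// Minimal driver sketch -- names and paths are hypothetical.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class MyJobDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MyJobDriver.class);
        conf.setJobName("my-job");

        // Raise the split metadata size limit (or disable it with -1) for this job.
        // On Hadoop 1.x the JobTracker-side setting takes precedence, so treat
        // this as a per-job hint rather than a guaranteed override.
        conf.set("mapreduce.jobtracker.split.metainfo.maxsize", "-1");

        // Hypothetical input and output paths, passed on the command line.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}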

Description

When a Hadoop job is submitted, the MapReduce framework divides the whole set of input files into slices called “splits” and writes their metadata to HDFS along with the job. There is a limit on the size of this split metadata: the property “mapreduce.jobtracker.split.metainfo.maxsize” sets the limit, and its default value is 10,000,000 bytes (about 10 MB). You can work around the limit by increasing this value or, more drastically, remove it entirely by setting the value to -1.
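
To judge whether a job is likely to hit this limit, it helps to know how many splits its input produces: inputs made of very many files or blocks generate many splits, and the metadata for each split adds up. The sketch below, using the old “mapred” API, simply prints the split count for a given input path; the class name and the path argument are placeholders.

// Minimal sketch: count the splits an input would produce (Hadoop 1.x, old "mapred" API).
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

public class SplitCount {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SplitCount.class);

        // Hypothetical input path, passed on the command line.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));

        // Compute the splits the same way a job submission would.
        TextInputFormat inputFormat = new TextInputFormat();
        inputFormat.configure(conf);
        InputSplit[] splits = inputFormat.getSplits(conf, 1);

        System.out.println("Number of splits: " + splits.length);
    }
}

A very large split count (for example, hundreds of thousands of splits from many small files) is the usual reason the metadata file grows past the 10 MB default.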

* Note: This post is based on Hadoop 1.0.3, Ubuntu 12.10, and OpenJDK 7.