Changing the ssh port number of a Hadoop cluster

Symptoms

The machines in my Hadoop cluster cannot connect to and communicate with each other, even though my master/slave configurations are fine.

Solutions

You must change your ssh port number. Assume that the port number to be used in the Hadoop cluster is 20002. What you have to do is twofold: first, change the port number used for ssh communication; second, tell the Hadoop cluster about the updated port number.

* Disclaimer: The following instructions are based on Hadoop 1.2.1 and Ubuntu 12.04. With different versions and systems, the commands may differ slightly.

Step 1: changing the ssh daemon configuration

Open /etc/ssh/sshd_config with your favorite text editor. In this example, I will use vi:

sudo vi /etc/ssh/sshd_config

Update the following part:

From:

# What ports, IPs and protocols we listen for
Port 22

To:

# What ports, IPs and protocols we listen for
Port 20002

After updating, restart the ssh daemon:

sudo service ssh restart
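
Before touching the Hadoop side, it is worth checking that sshd now answers on the new port. A quick test from the master node (the user and host names below are placeholders for your own):

ssh -p 20002 hduser@slave1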

Step 2: changing the Hadoop configuration

Update your conf/hadoop-env.sh:

From:

# Extra ssh options.  Empty by default.
# export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"

To:

# Extra ssh options.  Empty by default.
export HADOOP_SSH_OPTS="-p 20002 -o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"

After updating, restart the cluster:

stop-all.sh
start-all.sh
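
To confirm that the daemons actually came up, you can run the JDK's jps tool on each machine; on a healthy Hadoop 1.x cluster the master typically shows NameNode, SecondaryNameNode and JobTracker, while each slave shows DataNode and TaskTracker (the exact list depends on your setup):

jps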

Description

Hadoop uses the ssh protocol for intra-cluster control: the master node logs on to the slave machines and runs commands on them via ssh. Port 22 is the default ssh port, and it is frequently disabled because malicious attackers can target this channel to acquire administrator rights. That is why many organisations (e.g., universities or research institutes) prohibit this port number¹.

Changing the port number is the only way to work around it.

  1. For details, refer here.

My Hadoop system gets stuck while initializing

Symptoms

When I run “start-all.sh” to boot the cluster, it freezes in the “Initializing” status and never reaches the “Running” status. If I stop the cluster by running “stop-all.sh”, the message “no datanode to stop” is displayed.

Solutions

Unfortunately, the cause of this error is not clear. Just reset your HDFS system: clear all HDFS-related directories set by the configuration variables listed below and restart your HDFS daemons (a sketch of the reset sequence is shown after the list).

  • hadoop.tmp.dir (in conf/core-site.xml): This variable specifies the temporary directory used while running Hadoop operations.
  • dfs.name.dir (in conf/hdfs-site.xml): This variable specifies the directory that holds the HDFS namenode data. (The HDFS name table is stored in the directory specified by this parameter, on the master node.)
  • dfs.data.dir (in conf/hdfs-site.xml): This variable specifies the directory that holds the HDFS data. The HDFS data is stored in the directory specified by this parameter on each slave node.
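
As a rough sketch, a full reset might look like the following. The directory paths are only examples - use whatever your own core-site.xml and hdfs-site.xml specify - and note that reformatting the namenode erases all HDFS metadata, so any data on the cluster is lost:

stop-all.sh
rm -rf /app/hadoop/tmp/*    # hadoop.tmp.dir (example path, every node)
rm -rf /app/hadoop/name/*   # dfs.name.dir (example path, master node)
rm -rf /app/hadoop/data/*   # dfs.data.dir (example path, each slave node)
hadoop namenode -format     # rebuild an empty HDFS name table
start-all.sh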

To avoid a critical situation, I strongly recommend backing up your whole HDFS data to secure storage periodically. It is the only way to avoid data loss.
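
One way to do such a backup is the distcp tool that ships with Hadoop; the source and destination URIs below are placeholders for your own clusters:

hadoop distcp hdfs://master:9000/user/data hdfs://backup-master:9000/user/data-backup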

Safe Unboxing in Java

In Java, there are two categories of data types: primitives and wrapper classes. Wrapper classes are special kinds of objects, in other words, reference types. Each primitive has its corresponding wrapper class, for example, int for Integer, double for Double, and so on. You can easily convert each primitive variable into a variable of its corresponding wrapper class. This feature is called boxing. In the opposite direction, wrapper class variables can also be converted into primitive variables. This is called unboxing, and a short example of both is shown below:

float f1 = 3.14f;         // Primitive Variable
Float f2 = new Float(f1); // Boxing

System.out.println(f2);   // print '3.14'

Double d1 = new Double(4.2);  // Wrapper Class Variable
double d2 = d1.doubleValue(); // Unboxing
System.out.println(d2);       // print '4.2'

Since primitives cannot be stored in collections (List, Set, …), this feature is used very frequently. But there is a crack: all variables of object types, including wrapper classes, can be null, whereas primitive variables cannot.

So, if you try to unbox a null value, a NullPointerException is thrown. Because of that, safe unboxing tends to be cumbersome:

Double d1 = new Double(4.2);
double d2 = 0.0;

// try - catch: it is cumbersome!!
try {
    d2 = d1.doubleValue();
} catch (NullPointerException e) {
    e.printStackTrace();
}

System.out.println(d2);

The autoboxing/unboxing feature introduced in Java 5 made the situation worse. Because this feature performs boxing and unboxing automatically, the NullPointerException is easy to miss. That is why this feature is controversial.

Double d1 = null;
double d2 = d1; // auto unboxing - a NullPointerException will be thrown.

One easy solution for this problem is this: if you can be sure that the variable you are trying to unbox cannot be null, use auto-unboxing. In other situations, use a utility method to handle the null case. What I use looks like the following:

public class Numbers {
    private Numbers() { }

    public static boolean valueOf(Boolean b) {
        return valueOf(b, false);
    }

    public static boolean valueOf(Boolean b1, boolean b2) {
        if (null == b1) {
            return b2;
        } else {
            return b1.booleanValue();
        }
    }

    public static byte valueOf(Byte b) {
        return valueOf(b, (byte)0);
    }

    public static byte valueOf(Byte b1, byte b2) {
        if (null == b1) {
            return b2;
        } else {
            return b1.byteValue();
        }
    }

    public static char valueOf(Character c) {
        return valueOf(c, '\u0000');
    }

    public static char valueOf(Character c1, char c2) {
        if (null == c1) {
            return c2;
        } else {
            return c1.charValue();
        }
    }

    public static short valueOf(Short s) {
        return valueOf(s, (short)0);
    }

    public static short valueOf(Short s1, short s2) {
        if (null == s1) {
            return s2;
        } else {
            return s1.shortValue();
        }
    }

    public static int valueOf(Integer i) {
        return valueOf(i, 0);
    }

    public static int valueOf(Integer i1, int i2) {
        if (null == i1) {
            return i2;
        } else {
            return i1.intValue();
        }
    }

    public static long valueOf(Long l) {
        return valueOf(l, 0L);
    }

    public static long valueOf(Long l1, long l2) {
        if (null == l1) {
            return l2;
        } else {
            return l1.longValue();
        }
    }

    public static float valueOf(Float f) {
        return valueOf(f, 0.0f);
    }

    public static float valueOf(Float f1, float f2) {
        if (null == f1) {
            return f2;
        } else {
            return f1.floatValue();
        }
    }

    public static double valueOf(Double d) {
        return valueOf(d, 0.0d);
    }

    public static double valueOf(Double d1, double d2) {
        if (null == d1) {
            return d2;
        } else {
            return d1.doubleValue();
        }
    }
}
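
With this utility in place, a call site stays short. For example:

Double price = null;                      // e.g. a missing value from a map lookup
double d1 = Numbers.valueOf(price);       // 0.0 - falls back to the built-in default
double d2 = Numbers.valueOf(price, -1.0); // -1.0 - caller-supplied default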

My Hadoop job crashes with "Split metadata size exceeded"

Symptoms

My Hadoop job crashes with an exception like the following:

java.io.IOException: Split metadata size exceeded 10000000. Aborting job ...

Solutions

Open conf/mapred-site.xml under your Hadoop installation path. Add the property “mapreduce.jobtracker.split.metainfo.maxsize” and set its value to -1, like this:

<!-- In: conf/mapred-site.xml -->
<property>
    <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
    <value>-1</value>
</property>
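
As far as I understand, this property is read on the JobTracker side when the job is initialized, so setting it only in the job's own configuration may not take effect; put it in the JobTracker's mapred-site.xml and restart the MapReduce daemons:

stop-mapred.sh
start-mapred.sh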

Description

When a Hadoop job is submitted, the whole set of input files is cut into slices called “splits”, and their metadata is stored so that tasks can be scheduled on the nodes that hold the data. But there is a limit on the size of this split metadata: the property “mapreduce.jobtracker.split.metainfo.maxsize” determines the limit, and its default value is 10,000,000 bytes. You can work around the limit by increasing this value or, more radically, remove it entirely by setting the value to -1.

* Note: This post is written based on Hadoop 1.0.3, Ubuntu 12.10, and OpenJDK 7.