Friday, April 22, 2016

Decommission a Data Node

Existing Hadoop cluster: hadoop1 through hadoop6. Data node to decommission: hadoop5.

1. Edit hdfs-site.xml on hadoop1 to include:
<property>
<name>dfs.hosts.exclude</name>
<value>/home/hadoop/hadoop/etc/hadoop/dfs.exclude</value>
</property>

2. Edit /home/hadoop/hadoop/etc/hadoop/dfs.exclude to add a line:
hadoop5

3. Run: distribute-exclude.sh dfs.exclude

4. Run: refresh-namenodes.sh
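
The steps above start the decommission but do not show when it has finished. As a rough sketch, the status of hadoop5 could be read from the output of hdfs dfsadmin -report. The function names decommission_status and wait_for_decommission are hypothetical helpers, not part of Hadoop, and the parsing assumes the Hadoop 2.x report layout, where each node section has a "Hostname:" line followed by a "Decommission Status :" line.

```shell
# Print the decommission status of one data node.
# Reads the text of `hdfs dfsadmin -report` from standard input.
# Assumption: each node section contains "Hostname: <host>" followed by
# "Decommission Status : <state>" (the Hadoop 2.x report layout).
decommission_status() {
    # $1 = hostname to look up
    awk -v host="$1" '
        $1 == "Hostname:" { current = $2 }
        /^Decommission Status/ && current == host {
            sub(/^Decommission Status *: */, "")
            print
            exit
        }
    '
}

# Poll until the node reports "Decommissioned" (hypothetical helper).
# $1 = hostname, $2 = seconds between polls (defaults to 30)
wait_for_decommission() {
    until [ "$(hdfs dfsadmin -report | decommission_status "$1")" = "Decommissioned" ]; do
        sleep "${2:-30}"
    done
}
```

For example, running wait_for_decommission hadoop5 after step 4 would block until the report shows the node as Decommissioned, at which point the data node can presumably be shut down safely.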

Thursday, April 21, 2016

Adding Data Node to an Existing Hadoop Cluster

Existing Hadoop cluster: hadoop1 through hadoop5. New data node to add: hadoop6.

1. Clone an existing data node VM to hadoop6.

2. Edit the /etc/hosts file to include the IP address and hostname of hadoop6, then copy it to the rest of the nodes in the cluster.

3. Edit the slaves file to include the hadoop6 hostname, then copy it to the rest of the nodes in the cluster.

4. Delete HADOOP_DATA_DIR on hadoop6, so the clone does not reuse the source node's blocks and data node identity.

5. Start the data node on hadoop6:
    hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode

6. Rebalance data across the data nodes:
    hdfs balancer
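
Steps 2 and 3 both come down to "add one line to a file, then copy the file everywhere". A minimal sketch of that pattern follows. The helpers add_line_once and push_to_nodes are hypothetical names, and the copy assumes passwordless ssh between the nodes and the same file path on every host.

```shell
# Append a line to a file only if that exact line is not already present,
# so re-running the step does not create duplicate entries.
# Assumption: the file, if it exists, ends with a newline.
add_line_once() {
    # $1 = file, $2 = line to add
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Copy a file to the same path on each listed node (hypothetical helper;
# assumes passwordless ssh is already set up).
push_to_nodes() {
    # $1 = file, remaining arguments = hostnames
    file="$1"
    shift
    for node in "$@"; do
        scp "$file" "$node:$file"
    done
}

# Example for step 3 (not executed here):
#   add_line_once "$HADOOP_CONF_DIR/slaves" hadoop6
#   push_to_nodes "$HADOOP_CONF_DIR/slaves" hadoop2 hadoop3 hadoop4 hadoop5 hadoop6
```

The exact-line match (grep -x with -F) keeps an entry such as hadoop6 from being confused with a longer hostname that merely contains it.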