Hadoop 2.7.0 Multi Node Cluster Setup on Ubuntu 15.04



Namenode  > hadoopmnmaster > 192.168.56.11

Datanodes > hadoopmnslave1 > 192.168.56.12
            hadoopmnslave2 > 192.168.56.13
            hadoopmnslave3 > 192.168.56.14

Clone Hadoop Single node cluster as hadoopmaster

Hadoopmaster Node

          $ sudo nano /etc/hosts

                      192.168.56.11   hadoopmnmaster
                      192.168.56.12   hadoopmnslave1
                      192.168.56.13   hadoopmnslave2
                      192.168.56.14   hadoopmnslave3
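The hosts file edit above can be sanity-checked with a short script (a hypothetical helper, not part of the original tutorial) that parses /etc/hosts-style text and reports any cluster hostname that is missing or mapped to the wrong IP:

```python
# Hostnames and IPs taken from this tutorial's cluster layout.
EXPECTED = {
    "hadoopmnmaster": "192.168.56.11",
    "hadoopmnslave1": "192.168.56.12",
    "hadoopmnslave2": "192.168.56.13",
    "hadoopmnslave3": "192.168.56.14",
}

def parse_hosts(text):
    """Return {hostname: ip} from /etc/hosts-style lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()              # IP first, then aliases
        for name in names:
            mapping[name] = ip
    return mapping

def missing_or_wrong(text):
    """Return the expected entries that the given hosts text gets wrong."""
    found = parse_hosts(text)
    return {h: ip for h, ip in EXPECTED.items() if found.get(h) != ip}
```

On any node you could run `missing_or_wrong(open("/etc/hosts").read())`; an empty dict means all four entries are present and correct.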

          $ sudo nano /etc/hostname

                      hadoopmnmaster

          $ cd /usr/local/hadoop/etc/hadoop

          $ sudo nano core-site.xml

                       replace localhost with hadoopmnmaster
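After the edit, the property should look roughly like this (assuming the single-node setup used the usual hdfs://localhost:9000 value; if your core-site.xml uses the older fs.default.name key or a different port, keep those and change only the hostname):

```xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoopmnmaster:9000</value>
</property>
```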

          $ sudo nano hdfs-site.xml

                       replace the value 1 with 3 in the dfs.replication property (the replication factor, here set to the number of datanodes)
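The edited property should then read (dfs.replication is the standard HDFS key for the replication factor):

```xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
```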

          $ sudo nano yarn-site.xml

                       add the following configuration

                       <configuration>
                              <property>
                                  <name>yarn.resourcemanager.resource-tracker.address</name>
                                  <value>hadoopmnmaster:8025</value>
                              </property>
                              <property>
                                  <name>yarn.resourcemanager.scheduler.address</name>
                                  <value>hadoopmnmaster:8030</value>
                              </property>
                              <property>
                                  <name>yarn.resourcemanager.address</name>
                                  <value>hadoopmnmaster:8050</value>
                              </property>
                       </configuration>

          $ sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

                       remove the dfs.namenode.name.dir property section (this node will be cloned to create the slaves, so only the dfs.datanode.data.dir property should remain for now)
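What remains in hdfs-site.xml at this point should be roughly the following (the path is assumed to match the datanode directory created in the next step; the file: prefix is the usual convention for local HDFS storage paths):

```xml
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
</property>
```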

          $ sudo rm -rf /usr/local/hadoop/hadoop_data

          $ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/datanode

          $ sudo chown -R chaal:chaal /usr/local/hadoop

Reboot hadoopmaster node

Clone Hadoopmaster Node as hadoopslave1, hadoopslave2, hadoopslave3

Hadoopslave Nodes (the following configuration should be done on each slave node)

          $ sudo nano /etc/hostname

                      hadoopmnslave<number>

reboot all nodes

Hadoopmaster Node

          $ sudo nano /usr/local/hadoop/etc/hadoop/masters

                       hadoopmnmaster

          $ sudo nano /usr/local/hadoop/etc/hadoop/slaves

                       remove localhost and add 

                       hadoopmnslave1
                       hadoopmnslave2
                       hadoopmnslave3

          $ sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

                       replace the dfs.datanode.data.dir property section with dfs.namenode.name.dir

                       (this node runs the namenode, so both the property name and the directory path change)
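After the change the property should look roughly like this (the path is assumed to match the namenode directory created in the next step):

```xml
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
</property>
```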

          $ sudo rm -rf /usr/local/hadoop/hadoop_data

          $ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/namenode

          $ sudo chown -R chaal:chaal /usr/local/hadoop

          $ hdfs namenode -format

                       (the older "hadoop namenode -format" form still works but is deprecated in Hadoop 2.x)

          $ start-dfs.sh && start-yarn.sh

                       (start-all.sh still works but is deprecated)

          $ jps (run on the master and on all 3 datanodes)

                       the master should list NameNode, SecondaryNameNode, and ResourceManager;
                       each datanode should list DataNode and NodeManager


http://hadoopmnmaster:8088/   (YARN ResourceManager web UI)

http://hadoopmnmaster:50070/  (HDFS NameNode web UI)

6 comments:

  1. Sascha Kruszka (TinyDragon), May 24, 2015 at 3:15 PM

    If you experience trouble with the datanodes and you see something like "in_use.lock acquired by nodename" in the datanode log file, you can try this.

    On Master:

    stop-all.sh

    On Slaves:

    sudo rm -Rf /usr/local/hadoop/hadoop_store/hdfs/datanode
    sudo mkdir -p /usr/local/hadoop/hadoop_store/hdfs/datanode
    sudo chown -R chaal:chaal /usr/local/hadoop

    On Master:

    start-all.sh

    Further information: the slave nodes may have old namenode information. With these steps you are clearing the datanode directory; each slave node will re-initialize it on the next start.

    BR
  2. Is YARN required? Why should I use it here? And please tell me how to set this up if I don't want to use YARN.

  3. Don't mention it in the XML; check the documentation for the exact steps.

  4. OK, but tell me what the use of YARN is.

  5. http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html

 
