Namenode  > hadoopmnmaster > 192.168.56.11
Datanodes > hadoopmnslave1 > 192.168.56.12
          > hadoopmnslave2 > 192.168.56.13
          > hadoopmnslave3 > 192.168.56.14
Clone the Hadoop single-node cluster as hadoopmaster
Hadoopmaster Node
$ sudo nano /etc/hosts
add the cluster entries (in /etc/hosts the IP address comes first, then the hostname):
192.168.56.11 hadoopmnmaster
192.168.56.12 hadoopmnslave1
192.168.56.13 hadoopmnslave2
192.168.56.14 hadoopmnslave3
$ sudo nano /etc/hostname
hadoopmnmaster
$ cd /usr/local/hadoop/etc/hadoop
$ sudo nano core-site.xml
replace localhost with hadoopmnmaster
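The fs.defaultFS entry in core-site.xml should then look like this (on older Hadoop versions the property may be named fs.default.name; port 9000 is assumed from a typical single-node setup, so keep whatever port yours already uses):
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoopmnmaster:9000</value>
</property>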
$ sudo nano hdfs-site.xml
replace the value 1 with 3 (the replication factor; here it equals the number of datanodes)
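The replication property should then read:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>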
$ sudo nano yarn-site.xml
add the following configuration
<configuration>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoopmnmaster:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoopmnmaster:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoopmnmaster:8050</value>
</property>
</configuration>
$ sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml
remove the dfs.namenode.name.dir property section (this config will be cloned to the datanodes, which do not need it)
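After removing it, hdfs-site.xml should keep only the datanode directory, something like this (the path matches the mkdir step below; the file: prefix is the usual URI form):
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
</property>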
$ sudo rm -rf /usr/local/hadoop/hadoop_data
$ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/datanode
$ sudo chown -R chaal:chaal /usr/local/hadoop
Reboot hadoopmaster node
Clone Hadoopmaster Node as hadoopslave1, hadoopslave2, hadoopslave3
Hadoopslave Nodes (the following configuration should be done on each slave node)
$ sudo nano /etc/hostname
hadoopmnslave<number> (1, 2, or 3, matching the /etc/hosts entries)
Reboot all nodes
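After the reboot, a quick sanity check (a suggested step, not in the original walkthrough) is to ping each slave from the master using the names added to /etc/hosts:
$ ping -c 2 hadoopmnslave1
$ ping -c 2 hadoopmnslave2
$ ping -c 2 hadoopmnslave3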
Hadoopmaster Node
$ sudo nano /usr/local/hadoop/etc/hadoop/masters
hadoopmnmaster
$ sudo nano /usr/local/hadoop/etc/hadoop/slaves
remove localhost and add
hadoopmnslave1
hadoopmnslave2
hadoopmnslave3
$ sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml
replace the dfs.datanode.data.dir property section with dfs.namenode.name.dir (the master runs the namenode only, not a datanode)
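The resulting section should look like this (path matching the mkdir step below):
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
</property>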
$ sudo rm -rf /usr/local/hadoop/hadoop_data
$ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/namenode
$ sudo chown -R chaal:chaal /usr/local/hadoop
$ hadoop namenode -format
$ start-all.sh
$ jps (run on the master and on all 3 datanodes)
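If everything started cleanly, jps should show roughly the following (process IDs will differ):
On hadoopmnmaster: NameNode, SecondaryNameNode, ResourceManager, Jps
On each hadoopmnslave: DataNode, NodeManager, Jps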
http://hadoopmnmaster:8088/
http://hadoopmnmaster:50070/
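You can also confirm from the master that all three datanodes registered with the namenode:
$ hdfs dfsadmin -report
The report should list 3 live datanodes along with their capacity.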
If you experience trouble with the datanodes and see something like "in_use.lock acquired by nodename" in the datanode log file, you can try this:
On Master:
stop-all.sh
On Slaves:
sudo rm -Rf /usr/local/hadoop/hadoop_store/hdfs/datanode
sudo mkdir -p /usr/local/hadoop/hadoop_store/hdfs/datanode
sudo chown -R chaal:chaal /usr/local/hadoop
On Master:
start-all.sh
Further Information:
The slave nodes may still hold old namenode information. These steps clear the datanode directory; each slave node will reinitialize it on the next start.
BR
Yes, that's what I do every time :)
Is YARN required? Why should I use it here? And how do I set this up if I don't want to use YARN?
Don't configure it in the XML.
Check the documentation for clear steps.
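A rough sketch of what that means in practice (assuming Hadoop 2.x defaults): skip the yarn-site.xml changes above, leave mapreduce.framework.name at its default value of local in mapred-site.xml, and start only the HDFS daemons:
$ start-dfs.sh
$ jps
MapReduce jobs will then run in-process in local mode instead of on the cluster.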
OK, but tell me: what is the use of YARN?
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html