Once you are logged into your Ubuntu 16.04 instance, run the following commands to update your base system with the latest available packages.
apt-get update
apt-get upgrade -y
reboot
Next, create a new user account for Hadoop. You can do this by running the following command:
adduser hadoop (set and confirm a password when prompted)
Next, you will also need to set up SSH key-based authentication. First, log in as the hadoop user with the following command:
su - hadoop
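Once you are the hadoop user, generate a key pair and authorize it for localhost, since Hadoop's start-up scripts connect over SSH even on a single node. A minimal sketch, assuming the default key path:
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
After this, ssh localhost should succeed without asking for a password.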
2 – Getting Started
Hadoop is written in Java, so you will need to install Java on your server. You can install it by running the following command:
apt-get install default-jdk -y
Once Java is installed, verify the installed version using the following command:
java -version
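Some Hadoop components also expect JAVA_HOME to be set. One common way to derive it from the installed binary (the resulting path is system-dependent, so verify it on your machine):
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:/bin/java::")
echo "export JAVA_HOME=$JAVA_HOME" >> ~/.bashrc
echo $JAVA_HOME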
Install Python
apt-get install python2.7 python-pip -y
apt-get install python-dev -y
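To confirm both the interpreter and pip landed on the PATH:
python2.7 --version
pip --version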
Install GCC
apt-get install build-essential -y
apt-get install manpages-dev -y
gcc --version
Maximum Open Files Requirements
Ambari recommends a maximum open file descriptors limit of 10,000 or higher. Check the current soft and hard limits, then raise the limit for the session:
ulimit -Sn
ulimit -Hn
ulimit -n 10000
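ulimit -n only affects the current shell. To make the limit survive reboots and new logins, you can append entries to /etc/security/limits.conf (a sketch; adjust the value to your needs):
cat >> /etc/security/limits.conf <<'EOF'
*    soft    nofile    10000
*    hard    nofile    10000
EOF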
Set Up Password-less SSH (from the NameNode to the Cluster Nodes)
ssh-keygen -t rsa -b 4096
cat ~/.ssh/id_rsa.pub
Copy the NameNode's public key into each new node's ~/.ssh/authorized_keys. ssh-copy-id does this in one step:
ssh-copy-id root@10.1.1.4
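If you have several DataNodes, the same key can be pushed in a loop (the hostnames here match the hosts file prepared below):
for h in dn01 dn02 dn03; do ssh-copy-id root@$h; done
ssh root@dn01 hostname
The last command should print the remote hostname without prompting for a password.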
Enable NTP on the Cluster and on the Browser Host
apt-get install ntp
update-rc.d ntp defaults
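To start the service immediately and confirm it is syncing against its peers:
service ntp start
ntpq -p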
Edit the Network Configuration File on all cluster nodes
vi /etc/sysconfig/network (if the directory does not exist, create it along with the file)
NETWORKING=yes
HOSTNAME=<the node's hostname, e.g. nn01>
Prepare your hosts file on all of your machines
nano /etc/hosts
10.1.1.1 nn01
10.1.1.2 dn01
10.1.1.3 dn02
10.1.1.4 dn03
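Each machine should also report the hostname you mapped above. On Ubuntu 16.04 this can be set with hostnamectl (repeat with the matching name on each node):
hostnamectl set-hostname nn01
hostname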
Configure Firewall and iptables
ufw disable
iptables -F
iptables -X
iptables -t nat -F
iptables -t nat -X
iptables -t mangle -F
iptables -t mangle -X
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
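You can confirm the rules are gone and all policies are open with:
iptables -L -n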
Ambari Agent Installation
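Ambari normally installs its agents over SSH during host registration, but a manual install is also possible. A sketch, assuming Ambari 2.6.2.2 from the public Hortonworks Ubuntu 16 repository and nn01 as the Ambari server host; the version, repository URL, and key ID are assumptions to verify against your Ambari release:
wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates/2.6.2.2/ambari.list
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
apt-get update
apt-get install ambari-agent -y
sed -i 's/hostname=localhost/hostname=nn01/' /etc/ambari-agent/conf/ambari-agent.ini
ambari-agent start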
If adding a DataNode fails with a host key verification error, pre-seed the node's key into known_hosts:
ssh-keyscan -H dn03 >> ~/.ssh/known_hosts
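The same pre-seeding works for every host in the cluster (hostnames from the hosts file above):
for h in nn01 dn01 dn02 dn03; do ssh-keyscan -H $h >> ~/.ssh/known_hosts; done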
The remaining steps are carried out from the Ambari web UI. Have fun!