Apache Hadoop Installation on Single Node

This tutorial describes, step by step, how to install Hadoop 3 on Ubuntu 18.04.4 LTS (Bionic Beaver). Once the installation is complete, you can run HDFS and MapReduce commands.

Platform

Operating System (OS). You can use Ubuntu 18.04.4 LTS or a later version. You can also use other flavors of Linux such as Red Hat, CentOS, etc.
Hadoop. We have used Apache Hadoop 3.1.2. You can also use the Cloudera distribution or another distribution.

Download Software

VMWare Player for Windows

https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/7_0

Ubuntu

http://releases.ubuntu.com/18.04.4/ubuntu-18.04.4-desktop-amd64

Eclipse for Windows

https://www.eclipse.org/downloads/

PuTTY for Windows

http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

WinSCP for Windows

http://winscp.net/eng/download.php

Hadoop

https://archive.apache.org/dist/hadoop/common/hadoop-3.1.2/hadoop-3.1.2.tar.gz

Steps to Install Ubuntu 18.04.4 LTS on VMware 7 64 Bit Platform

Follow the steps below to install Ubuntu 18.04.4 LTS on the VMware 7 64-bit platform.

Step 1. Start the downloaded VMware Player, open the File menu, and select New Virtual Machine.

Step 2. Select the downloaded Ubuntu 18.04.4 LTS image.

Step 3. Enter the username and password, then click Next.

Step 4. Enter the name of the virtual machine and click Next.

Step 5. Leave the default configuration and click Next to proceed with the installation. Installing Ubuntu 18.04.4 LTS takes around 30 minutes, after which the system is ready for the Hadoop installation.

Once the software above is downloaded and Ubuntu 18.04.4 LTS is set up on the VM, follow the steps below.

Steps to Install Hadoop 3 on Ubuntu 18.04.4

Step 1. Please download Hadoop 3.1.2 from the below link.

On Windows: https://archive.apache.org/dist/hadoop/common/hadoop-3.1.2/hadoop-3.1.2.tar.gz

On Linux: $ wget https://archive.apache.org/dist/hadoop/common/hadoop-3.1.2/hadoop-3.1.2.tar.gz

Step 2. Install Java 8 using the below command.

cloudduggu@ubuntu:~$ sudo apt-get install openjdk-8-jdk

Press Y to continue the installation.

Once the Java installation is completed, verify it by running the below command.

cloudduggu@ubuntu:~$ java -version
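The verification above can also be scripted so that a missing installation produces a clear message instead of a "command not found" error. The following is a minimal sketch; have_cmd is a hypothetical helper name introduced here for illustration and is not part of Ubuntu or Hadoop.

```shell
# Sketch: confirm a command is on the PATH before relying on it.
# have_cmd is a hypothetical helper name used only for this example.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    # java -version writes to stderr, so redirect it before filtering
    java -version 2>&1 | head -n 1
else
    echo "java not found; re-run the install command from Step 2"
fi
```

The same helper can be reused later for ssh and pdsh, for example: have_cmd pdsh || echo "pdsh is missing".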

The Hadoop scripts require SSH to be installed and running. PDSH should also be installed to provide better SSH resource management.

Step 3. Install SSH on your system using the below command.

cloudduggu@ubuntu:~$ sudo apt-get install ssh

Please enter the password for the sudo user and press enter.

Press Y to continue the installation.

Wait for the installation to finish before moving to the next step.

Step 4. Now install PDSH using the below command.

cloudduggu@ubuntu:~$ sudo apt-get install pdsh

Press Y to continue installation.

Step 5. Now open the .bashrc file in any editor (we use nano here) and add the line export PDSH_RCMD_TYPE=ssh.

Press CTRL+O and then Enter to save the file. Once the file is saved, press CTRL+X to exit from the editor.
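If you prefer not to edit the file by hand, the same line can be appended from the shell. The sketch below adds the line only when it is not already present, so running it twice does not create duplicates. append_once is a hypothetical helper name chosen for this example.

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line already exists.
# append_once is a hypothetical helper name, not a standard command.
append_once() {
    local file="$1" line="$2"
    touch "$file"    # make sure the file exists before searching it
    # -q quiet, -x match the whole line, -F treat LINE as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (modifies your real .bashrc):
#   append_once ~/.bashrc 'export PDSH_RCMD_TYPE=ssh'
```

After changing .bashrc, run source ~/.bashrc or open a new terminal so that the variable is set in your current session.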

Step 6. Now we will configure SSH by running the below command.

cloudduggu@ubuntu:~$ ssh-keygen -t rsa -P ""

Press Enter when it asks for a filename.

Step 7. Now append the public key to the authorized_keys file using the below command.

cloudduggu@ubuntu:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Step 8. Verify the SSH setup by running the below command. It should log you in without asking for a password.

cloudduggu@ubuntu:~$ ssh localhost
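If ssh localhost still asks for a password, one common cause is file permissions: sshd can ignore authorized_keys when the file or the .ssh folder is writable by other users. The sketch below restricts both to the owner. tighten_ssh_perms is a hypothetical helper name, and the default folder of ~/.ssh is an assumption based on the paths used in Step 7.

```shell
# tighten_ssh_perms [DIR]: restrict the SSH folder and authorized_keys to the owner.
# tighten_ssh_perms is a hypothetical helper name; DIR defaults to ~/.ssh.
tighten_ssh_perms() {
    local dir="${1:-$HOME/.ssh}"
    chmod 700 "$dir" || return 1                 # owner-only access to the folder
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys"         # owner read/write only
    fi
}

# Usage:
#   tighten_ssh_perms
```

To test the login without being prompted, you can run ssh -o BatchMode=yes localhost true, which returns a non-zero exit status instead of asking for a password when key-based login does not work.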

Step 9. Now refresh the package index using the below command.

cloudduggu@ubuntu:~$ sudo apt-get update

Step 10. Now we are ready to install Hadoop. Locate the Hadoop tar file that you downloaded in Step 1 (check your download folder). In our case, it is present at the below location.

/home/cloudduggu/hadoop-3.1.2.tar.gz

Step 11. Extract it using the below commands and rename the folder to hadoop to make it meaningful.

cloudduggu@ubuntu:~$ tar xzf hadoop-3.1.2.tar.gz

cloudduggu@ubuntu:~$ mv hadoop-3.1.2 hadoop
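One thing to watch for: if a folder named hadoop already exists, mv places hadoop-3.1.2 inside it instead of renaming it. The sketch below wraps the two commands in a function that stops in that case. unpack_as is a hypothetical helper name, and the function assumes the archive contains a single top-level folder, as the Hadoop tar file does.

```shell
# unpack_as TARBALL TARGET: extract TARBALL and rename its top-level folder to TARGET.
# unpack_as is a hypothetical helper name; it assumes one top-level folder in the archive.
unpack_as() {
    local tarball="$1" target="$2" top
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing $target" >&2
        return 1
    fi
    # First path component of the first archive entry, e.g. hadoop-3.1.2
    top=$(tar tzf "$tarball" | head -n 1 | cut -d/ -f1)
    tar xzf "$tarball" && mv "$top" "$target"
}

# Usage:
#   unpack_as hadoop-3.1.2.tar.gz hadoop
```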

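Step 12. Now we will set the Java home in the hadoop-env.sh file.

hadoop-env.sh file location: /home/cloudduggu/hadoop/etc/hadoop/

JAVA file location: /usr/lib/jvm/java-8-openjdk-i386/ (on a 64-bit amd64 installation the folder is usually /usr/lib/jvm/java-8-openjdk-amd64/, so check which one exists on your system)

Enter the Java home location in the hadoop-env.sh file as an export JAVA_HOME=<JAVA file location> line and save it (use CTRL+O).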
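Because the Java folder name depends on the architecture, it can be safer to derive it from the installed java binary than to type it. The sketch below resolves the symlink behind the java command and strips the trailing bin/java (and jre, which OpenJDK 8 uses) to get the home folder. java_home_of is a hypothetical helper name, and the folder layout is an assumption based on how OpenJDK 8 is packaged on Ubuntu.

```shell
# java_home_of JAVA_BIN: print the Java home folder for a given java binary.
# java_home_of is a hypothetical helper name. Assumed layouts:
#   <home>/jre/bin/java  (OpenJDK 8)  or  <home>/bin/java
java_home_of() {
    readlink -f "$1" | sed -e 's:/bin/java$::' -e 's:/jre$::'
}

# Usage (appends to the hadoop-env.sh file from Step 12):
#   echo "export JAVA_HOME=$(java_home_of "$(command -v java)")" \
#       >> /home/cloudduggu/hadoop/etc/hadoop/hadoop-env.sh
```

Before appending, it is worth printing the value once with java_home_of "$(command -v java)" and confirming that the folder exists.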