The objective of this tutorial is to describe step by step process to install Spark 2.4.5 (Version spark-2.4.5-bin-hadoop2.7) on Ubuntu 18.04.4 LTS (Bionic Beaver), once the installation is completed you can play with Spark.
- Operating System (OS). You can use Ubuntu 18.04.4 LTS version or later version, also you can use other flavors of Linux systems like Redhat, CentOS, etc.
- Spark. We have used Spark 2.4.5 (Version spark-2.4.5-bin-hadoop2.7).
- VMWare Player for Windows
- Eclipse for windows
For VMware and Ubuntu installation please refer to “Hadoop Installation on Single Node” in the Hadoop section.Click Here To Download - spark-2.4.5-bin-hadoop2.7.tar (264.2 MB)
Steps to Install Spark 2.4.5 on Ubuntu 18.04.4 LTS
Step 1. Please download Spark 2.4.5 from the below link.
On Windows: https://downloads.apache.org/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz
On Linux: $wget https://downloads.apache.org/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz
Step 2. Install Java 8 using the below command.
$sudo apt-get install openjdk-8-jdk
Press Y to continue the installation.
Once the java installation is completed please verify it by running the below command.
Step 3. Now install Apache Spark. Download it from cmd using the below command.
In our case, it is present at the below location.
Step 4. Now extract the tar file by using the below command and rename the folder to spark to make it meaningful.
$tar xzf spark-2.4.5-bin-hadoop2.7.tgz
$mv spark-2.4.5-bin-hadoop2.7 spark
Step 5. Now edit .bashrc file using nano editor and export JAVA home and Spark home. In our case below is the location please verify yours.
Now save the changes by pressing CTRL + O and exit from nano editor by pressing CTRL + X.
Step 6. So now Spark installation is completed. Let us start the Spark shell by using the below command. Run it from spark home.