
Apache Spark Installation on EC2

Installation Steps

1. Update System

# Update system packages
# Follow the instructions at https://docs.aws.amazon.com/linux/al2023/ug/updating.html to find the latest release version
sudo dnf --releasever=2023.8.20250818 update

2. Install Java and Scala

# Install Java 17 (Amazon Corretto, headless)
sudo dnf install java-17-amazon-corretto-headless

# Install Scala via the Coursier launcher (this is the x86_64 Linux binary)
curl -fL https://github.com/coursier/coursier/releases/latest/download/cs-x86_64-pc-linux.gz | gzip -d > cs && chmod +x cs && ./cs setup
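After this step it can help to confirm that both tools are actually reachable from the shell. The snippet below is a minimal sketch: `check_cmds` is a hypothetical helper name, not part of Java, Scala, or Coursier. It assumes that `cs setup` puts its bin directory on PATH through the shell profile, so a fresh login shell may be needed before `scala` resolves.

```shell
# Hypothetical helper: report whether each named command is on PATH.
# Returns non-zero if at least one command is missing.
check_cmds() {
  missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "ok: $cmd -> $(command -v "$cmd")"
    else
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Check the two tools installed in this step.
check_cmds java scala || echo "Open a new shell or re-source your profile, then retry." >&2
```

If both lines print `ok`, `java -version` and `scala -version` should then show the installed versions.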

3. Install pip

# python3 is already installed on Amazon Linux 2023, so only pip is needed
sudo dnf install python3-pip

4. Install Spark

# Create directory for Spark
sudo mkdir -p /opt/spark
cd /opt/spark

# Download Spark (adjust version as needed; older releases move to https://archive.apache.org/dist/spark/)
sudo wget https://dlcdn.apache.org/spark/spark-4.0.1/spark-4.0.1-bin-hadoop3.tgz
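Before unpacking, it is worth checking the download against its published checksum. Apache release artifacts are normally accompanied by a `.sha512` file on the download site. The sketch below assumes you copy the hex digest out of that file by hand; `verify_sha512` is a hypothetical helper name, and `sha512sum` is assumed to be available from coreutils.

```shell
# Hypothetical helper: compare a file's SHA-512 digest with an expected hex string.
# Usage: verify_sha512 <file> <expected-hex-digest>
verify_sha512() {
  file="$1"
  # Normalize the expected digest to lowercase, matching sha512sum output.
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  actual=$(sha512sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "checksum OK: $file"
  else
    echo "checksum MISMATCH: $file" >&2
    return 1
  fi
}

# Example call (paste the digest from the release's .sha512 file):
# verify_sha512 spark-4.0.1-bin-hadoop3.tgz "<digest>"
```

A mismatch usually means a truncated or corrupted download, so fetch the archive again rather than unpacking it.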

# Unpack Spark
sudo tar -xzf spark-4.0.1-bin-hadoop3.tgz
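Once the archive is unpacked, Spark's commands are easier to reach with `SPARK_HOME` set and its `bin` directory on PATH. The following is a minimal sketch, assuming the tarball extracted into a directory named after itself under `/opt/spark`. The `SPARK_VERSION` and `SPARK_DIST` variables are introduced here for illustration only.

```shell
# Illustrative variables; adjust the version to match the download above.
SPARK_VERSION="4.0.1"
SPARK_DIST="spark-${SPARK_VERSION}-bin-hadoop3"

# Point SPARK_HOME at the unpacked directory and expose its bin/ on PATH.
export SPARK_HOME="/opt/spark/${SPARK_DIST}"
export PATH="${SPARK_HOME}/bin:${PATH}"

# Print the Spark version if the launcher script is where we expect it.
if [ -x "${SPARK_HOME}/bin/spark-submit" ]; then
  spark-submit --version
else
  echo "spark-submit not found under ${SPARK_HOME}" >&2
fi
```

These exports only last for the current shell session. To keep them, append the two `export` lines to a profile file such as `~/.bashrc`.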