BigDataPlayground178/mjolnir-stack-scripts

Mjolnir Stack Scripts

HDFS + YARN + Spark Stack initialization

To start the cluster (HDFS + YARN + Spark), run the 'hdfs_yarn_init.sh' script from the phase1 folder.

Ingestion and HDFS (alternative)

Run the hdfs-start.sh script to prepare HDFS.

The hdfs-start.sh script creates a volume that exposes the contents of $HADOOP_HOME, so that NiFi can read the Hadoop configuration files. Inside /data you will find the contents of the $PWD/nifi folder.

When the containers start, run the initialize.sh script located in the /data/ folder.

DO NOT CLOSE THE CONTAINER'S BASH SESSION, OR THE CONTAINER WILL BE STOPPED.

Run the nifi-start.sh script from the local system. Place the dataset file (d14_filtered.csv) inside the /nifi/spooldir directory so that NiFi can start ingesting the data.
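
Placing the dataset in the spool directory can be scripted. The following is a minimal sketch, not part of the repository: the function name stage_dataset is hypothetical, and the spool directory is passed as an argument because its location on the host depends on how the NiFi volume is mounted. Copying under a temporary name and then renaming is an assumption about what is convenient here; it keeps a half-written file from appearing under its final name.

```shell
# Hypothetical helper (not part of the repository): copy a dataset file
# into the NiFi spool directory given as the second argument.
stage_dataset() {
  src="$1"
  spool_dir="$2"

  # Fail early if the dataset or the spool directory is missing.
  [ -f "$src" ] || { echo "dataset not found: $src" >&2; return 1; }
  [ -d "$spool_dir" ] || { echo "spool directory not found: $spool_dir" >&2; return 1; }

  name=$(basename "$src")

  # Copy under a temporary dot-name, then rename, so a partially written
  # file is never visible under its final name.
  cp "$src" "$spool_dir/.$name.tmp" && mv "$spool_dir/.$name.tmp" "$spool_dir/$name"
}
```

For example, stage_dataset d14_filtered.csv followed by the path of the spooldir directory on your system.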

Start a Redis cluster (on localhost:6379).

Now you are ready to run the Spark script:

$SPARK_HOME/bin/spark-submit --class it.uniroma2.sabd.mjolnir.MjolnirSparkSession path-to/mjolnir-1.0-jar-with-dependencies.jar hdfs=localhost:54310 houseid=*houseID*
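
Since the job takes a single house ID per run, several houses can be processed with a small loop. The sketch below is an illustration only: submit_house, submit_houses, MJOLNIR_JAR and DRY_RUN are hypothetical names, and the jar path is left as the same placeholder used in the command above. By default it only prints the commands (DRY_RUN=1), so nothing is submitted until you opt in.

```shell
# Hypothetical helpers (not part of the repository): build one
# spark-submit command per house ID and either print it or run it.

# Placeholder path, as in the README; override it with the real jar location.
MJOLNIR_JAR="${MJOLNIR_JAR:-path-to/mjolnir-1.0-jar-with-dependencies.jar}"

submit_house() {
  house_id="$1"
  # Rebuild the positional parameters as the full command line.
  set -- "${SPARK_HOME:-}/bin/spark-submit" \
    --class it.uniroma2.sabd.mjolnir.MjolnirSparkSession \
    "$MJOLNIR_JAR" "hdfs=localhost:54310" "houseid=${house_id}"

  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # dry run: print the command instead of running it
  else
    "$@"                 # real run: execute spark-submit
  fi
}

submit_houses() {
  # Submit the houses one after another; stop at the first failure.
  for house in "$@"; do
    submit_house "$house" || return 1
  done
}
```

Calling submit_houses with a list of house IDs prints one command per house; setting DRY_RUN=0 beforehand runs them sequentially instead.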

Because the final resolution of query3 is delegated to Redis, you can submit as many executions as there are houses. Then, using ZRANGE mjolnir/results/query3/plugsrank 0 -1 WITHSCORES, you will get the ordered values for each plug, identified by houseID_householdID_plugID.
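
When redis-cli output is piped, each member and its score are printed on separate, alternating lines. The sketch below pairs them and splits each key into its three parts. It is a hypothetical helper (format_plug_rank is not part of the repository) and assumes that none of the IDs contains an underscore.

```shell
# Hypothetical helper (not part of the repository): read alternating
# member/score lines on stdin and print tab-separated columns:
# houseID, householdID, plugID, score.
format_plug_rank() {
  awk '
    NR % 2 == 1 { member = $0; next }   # odd lines hold the member key
    {
      # even lines hold the score of the member read just before
      n = split(member, id, "_")        # id[1]=house, id[2]=household, id[3]=plug
      if (n == 3) printf "%s\t%s\t%s\t%s\n", id[1], id[2], id[3], $0
    }
  '
}

# Example (assumes redis-cli is installed and Redis listens on localhost:6379):
# redis-cli -h localhost -p 6379 ZRANGE mjolnir/results/query3/plugsrank 0 -1 WITHSCORES | format_plug_rank
```

Members that do not split into exactly three parts are skipped rather than guessed at.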

About

Bash utility scripts for properly handling the cluster deployment.
