hadoop

Data Management Systems

Software Description

The Hadoop Map/Reduce framework harnesses a cluster of machines and executes user-defined Map/Reduce jobs across the nodes in the cluster.  On itasca, a script exists to create an ephemeral Hadoop cluster on the set of nodes assigned by the scheduler.  The script setup_cluster will format an HDFS filesystem on the local scratch disks.  This resource is best suited for application benchmarking and algorithm testing.  Because the cluster lasts only as long as the job, all data must be moved to HDFS after the cluster is brought up when the job starts, and any data that you wish to save must be moved to your home directory before the job completes.  Many job scripts will follow the pattern:

Set up the cluster

Move data to HDFS with "hadoop fs -put"

Execute the test program

Move data to your home directory with "hadoop fs -get"
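
The four steps above can be sketched as a small shell function. This is only an illustration under stated assumptions, not the site's official job script: it assumes setup_cluster is on the PATH once the hadoop module is loaded and takes no arguments, and the HDFS paths /input and /output, the jar my-job.jar, and the class MyJob are hypothetical placeholders.

```shell
#!/bin/bash
# Sketch of the job pattern: set up, put, run, get.
# Assumptions: setup_cluster is on PATH after "module load hadoop" and needs
# no arguments; my-job.jar, MyJob, /input and /output are hypothetical names.

# Print each command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

hadoop_job() {
    local input="${1:-$HOME/input}"      # hypothetical local input directory
    local results="${2:-$HOME/results}"  # hypothetical destination in home

    run module load hadoop || return 1
    # Step 1: bring up the ephemeral cluster on the assigned nodes.
    run setup_cluster || return 1
    # Step 2: copy the input data from local storage into HDFS.
    run hadoop fs -put "$input" /input || return 1
    # Step 3: run the test program against the data in HDFS.
    run hadoop jar my-job.jar MyJob /input /output || return 1
    # Step 4: copy results out of HDFS before the job ends.
    run hadoop fs -get /output "$results" || return 1
}
```

Calling `DRY_RUN=1 hadoop_job` prints the commands without running them, which is a cheap way to check the paths before submitting. In a real job script the function would be called once at the end of the script.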


Info

Module Name

hadoop

Last Updated On

08/29/2023

Support Level

Secondary Support

Software Access Level

Open Access

Home Page

http://hadoop.apache.org

Documentation

General Linux

To load this module for use in a Linux environment, you can run the command:

module load hadoop

Depending on where you are working, more than one version of Hadoop may be available. To see which versions can be loaded, run:

module avail hadoop
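
To load a specific version rather than the default, the usual module convention is `name/version`. The helper below is a minimal sketch of that idea; the function name load_hadoop is hypothetical, and the version string should be one of those listed for your cluster.

```shell
# Load a given Hadoop version, or the default when no version is passed.
# load_hadoop is a hypothetical helper name, not part of the module system.
load_hadoop() {
    local version="${1:-}"
    if [ -n "$version" ]; then
        module load "hadoop/$version"
    else
        module load hadoop
    fi
}
```

For example, `load_hadoop 2.7.1` runs `module load hadoop/2.7.1`, while `load_hadoop` with no argument loads the default.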

Agate Modules

Default

1.0.3

Other Modules

1.0.3, 2.5.1, 2.7.1, 2.7.1.java8, 2.8.5, testing.3.2.0

Mangi Modules

Default

1.0.3

Other Modules

1.0.3, 2.5.1, 2.7.1, 2.7.1.java8, 2.8.5, testing.3.2.0

Mesabi Modules

Default

1.0.3

Other Modules

1.0.3, 2.5.1, 2.7.1, 2.7.1.java8, 2.8.5, testing.3.2.0