Hadoop

Software Summary

Mesabi

Default Module: 

1.0.3

Other Modules Available: 

1.0.3, 2.5.1, 2.7.1.java8, 2.7.1, 2.8.5, testing.3.2.0

Last Updated On: 

Tuesday, October 29, 2019

Mesabi K40

Default Module: 

1.0.3
Other Modules Available: 

1.0.3, 2.5.1, 2.7.1.java8, 2.7.1, 2.8.5, testing.3.2.0

Last Updated On: 

Tuesday, October 29, 2019

Mangi

Default Module: 
1.0.3
Other Modules Available: 

1.0.3, 2.5.1, 2.7.1.java8, 2.7.1, 2.8.5, testing.3.2.0

Last Updated On: 

Tuesday, October 29, 2019

Mangi v100

Default Module: 

1.0.3
Other Modules Available: 

1.0.3, 2.5.1, 2.7.1.java8, 2.7.1, 2.8.5, testing.3.2.0

Last Updated On: 

Tuesday, October 29, 2019

NICE

Default Module: 

0.23.11

Other Versions Available: 

1.0.3, 2.7.1, 2.8.5, testing.3.2.0

Last Updated On: 

Tuesday, October 29, 2019

Support Level: 
Secondary Support
Software Access Level: 
Open Access
Software Categories: 
Data Management Systems
Software Description

The Hadoop Map/Reduce framework harnesses a cluster of machines and executes user-defined Map/Reduce jobs across the nodes in the cluster.  On Itasca, a script exists to create an ephemeral Hadoop cluster on the set of nodes assigned by the scheduler.  The script, setup_cluster, formats an HDFS file system on the local scratch disks.

This resource is best suited for application benchmarking and algorithm testing.  Because the cluster exists only for the duration of the job, all data must be moved to HDFS after the cluster is brought up when the job starts.  Any data that you wish to save must be moved to your home directory before the job completes.  Many job scripts follow this pattern:

  1. Set up the cluster
  2. Move data to HDFS with "hadoop fs -put"
  3. Execute the test program
  4. Move data to your home directory with "hadoop fs -get"
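
The four steps above can be sketched as a small shell function. This is only an illustration under stated assumptions: setup_cluster is assumed to take no arguments, the jar name my_job.jar and the HDFS paths /input and /output are hypothetical placeholders, and the run helper and the DRY_RUN switch are not part of any site script. By default the sketch only prints the commands it would run, so it can be checked outside a job.

```shell
#!/bin/bash
# Sketch of the four-step job pattern. Names marked "hypothetical" are
# illustrative and are not taken from the site documentation.
set -e  # stop at the first failing step

# run: print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# run_hadoop_job LOCAL_INPUT_DIR LOCAL_OUTPUT_DIR
run_hadoop_job() {
    local input_dir="$1"
    local output_dir="$2"

    # 1. Set up the cluster (assumption: the script needs no arguments)
    run setup_cluster

    # 2. Move data to HDFS (/input is a hypothetical HDFS path)
    run hadoop fs -put "$input_dir" /input

    # 3. Execute the test program (my_job.jar is a hypothetical jar)
    run hadoop jar my_job.jar /input /output

    # 4. Move results out of HDFS before the job ends
    run hadoop fs -get /output "$output_dir"
}
```

Calling run_hadoop_job "$HOME/data" "$HOME/results" prints the four commands in order. Inside a real job, setting DRY_RUN=0 makes the same call execute them, and step 4 must finish before the job ends because the HDFS file system on local scratch does not outlive the job.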