Compute nodes on MSI systems have access to primary storage, global scratch, and local scratch disk. Occasionally, a workload demands more scratch disk than is available on a single node. In that case, you can use either global scratch or BeeOND. Global scratch has the most space available, but its performance can vary because it is a shared resource. If that variability is a concern, consider using BeeOND.

BeeOND establishes an ephemeral BeeGFS distributed filesystem by aggregating the local scratch on all the nodes used by your job. It is generally suggested that you request at least 12 nodes, as in the example job script below. Please also note that you should request all the resources on each node. The example uses Mesabi nodes, which have 60GB of memory and 220GB of local scratch disk per node. On Agate, you would set --mem to 480G and --tmp to 800G, and select the aglarge partition.
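As a rough guide to how much pooled space a request yields, the per-node scratch figures above can be multiplied by the node count. The sketch below is only an upper-bound estimate: `aggregate_scratch_gb` is a hypothetical helper (not part of BeeOND or Slurm), and the usable capacity will likely be somewhat lower once filesystem overhead is accounted for.

```bash
# Hypothetical helper: nominal pooled scratch = nodes x local scratch per node.
# This is an upper bound; real usable BeeOND capacity is likely lower.
aggregate_scratch_gb() {
    local nodes=$1 per_node_gb=$2
    echo $(( nodes * per_node_gb ))
}

aggregate_scratch_gb 12 220   # 12 Mesabi nodes at 220GB each: prints 2640
aggregate_scratch_gb 12 800   # 12 Agate nodes at 800GB each: prints 9600
```

The same arithmetic can be used to decide how many nodes to request for a given scratch requirement.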



#!/bin/bash
#SBATCH -C beeond
# 12 nodes, the suggested minimum noted above
#SBATCH -N 12
#SBATCH -n 24
#SBATCH --mem 60G
#SBATCH --tmp 220G
#SBATCH -p large

#SBATCH --mail-type=BEGIN,FAIL
#SBATCH --mail-user=<userid>

#SBATCH -e beeondjob-%j.err
#SBATCH -o beeondjob-%j.out


MY_DATA_DIR=$HOME/Slurm/BeeOND/data # source data to be staged in
BEEOND_DIR=/tmp # BeeGFS-mounted space pooled from all nodes
WORKING_DIR=${SLURM_SUBMIT_DIR} # assumed location for nodes_list; adjust as needed

# Generate a node list file

scontrol show hostnames > ${WORKING_DIR}/nodes_list
if [[ $? != 0 ]]; then
    echo "Fail to create nodes_list"
    exit 1
else
    echo "nodes list"
    cat ${WORKING_DIR}/nodes_list
fi


# Move data to beeond space using BeeOND stage in
beeond-cp stagein -n ${WORKING_DIR}/nodes_list -g ${MY_DATA_DIR} -l ${BEEOND_DIR}
if [[ $? != 0 ]]; then
    echo "Fail beeond-cp stagein"
    exit 1
else
    echo "beeond-cp stagein completed"
fi

# Run some applications

# Move data out to my own space using BeeOND stage out
beeond-cp stageout -n ${WORKING_DIR}/nodes_list -g ${MY_DATA_DIR} -l ${BEEOND_DIR}
if [[ $? != 0 ]]; then
    echo "Fail beeond-cp stageout"
    exit 1
else
    echo "beeond-cp stageout completed"
fi
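
The script repeats the same check after each step: run a command, then report failure or completion. One possible way to factor that out is a small wrapper function. This is a sketch only: `run_step` is a hypothetical helper, not something provided by BeeOND or Slurm, and the message wording simply mirrors the echo lines used in the script.

```bash
# Hypothetical helper: run a command, report the outcome, and return the
# command's own exit code so the caller can decide whether to stop.
run_step() {
    local label=$1
    shift
    if "$@"; then
        echo "${label} completed"
    else
        local rc=$?               # exit code of the command that failed
        echo "Fail ${label}" >&2
        return "$rc"
    fi
}

# Demonstration with stand-in commands:
run_step "demo step" true
run_step "demo step" false || echo "stopping the job here"

# In the job script this might look like (assumes the variables defined there):
# run_step "beeond-cp stagein" beeond-cp stagein -n ${WORKING_DIR}/nodes_list -g ${MY_DATA_DIR} -l ${BEEOND_DIR} || exit 1
```

Returning the exit code rather than exiting inside the function leaves it to each call site to decide whether a failure should end the job.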