Agate is a heterogeneous high-performance Linux cluster. It combines AMD processors and NVIDIA GPUs with a high-speed communication network. All nodes have local solid-state scratch storage and AMD EPYC 7763 processors, with 64 to 128 CPU cores per node.
GPU Compute Nodes
A total of 264 NVIDIA A100 GPUs are available in two configurations. 50 nodes have 4 A100 GPUs connected via NVLink and 512 GB of memory. Another 8 nodes have 8 A100 GPUs and 1 TB of memory.
CPU Compute Nodes
A total of 344 CPU-only compute nodes are available. 244 have 512 GB of memory, and 100 have 2 TB of memory.
Interactive GPU Nodes
Ten GPU nodes, each with 8 A40 GPUs, 128 cores, and 512 GB of memory, are available for interactive work through Jupyter or command-line sessions. [interactive nodes page]
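The node counts above can be cross-checked with a short tally. This is only an illustrative sketch: the group names are hypothetical labels, not real partition names, and 1 TB and 2 TB are taken as 1024 GB and 2048 GB, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeGroup:
    name: str                 # hypothetical label, not a real partition name
    nodes: int                # number of nodes in this group
    gpus_per_node: int        # 0 for CPU-only nodes
    gpu_model: Optional[str]  # None for CPU-only nodes
    memory_gb: int            # memory per node; TB values assumed as 1024 GB multiples


# Figures copied from the hardware description on this page.
GROUPS = [
    NodeGroup("gpu-4xa100", 50, 4, "A100", 512),
    NodeGroup("gpu-8xa100", 8, 8, "A100", 1024),
    NodeGroup("cpu-512g", 244, 0, None, 512),
    NodeGroup("cpu-2t", 100, 0, None, 2048),
    NodeGroup("interactive-8xa40", 10, 8, "A40", 512),
]


def gpu_total(groups, model):
    """Total number of GPUs of the given model across all groups."""
    return sum(g.nodes * g.gpus_per_node for g in groups if g.gpu_model == model)


def cpu_only_nodes(groups):
    """Total number of nodes that have no GPUs."""
    return sum(g.nodes for g in groups if g.gpus_per_node == 0)


print(gpu_total(GROUPS, "A100"))  # 50*4 + 8*8 = 264
print(gpu_total(GROUPS, "A40"))   # 10*8 = 80
print(cpu_only_nodes(GROUPS))     # 244 + 100 = 344
```

The A100 and CPU-only totals match the figures quoted above; the A40 total of 80 follows from ten nodes with 8 GPUs each.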
The name Agate comes from the Minnesota state rock, the Lake Superior Agate.
Agate features a heterogeneous architecture described in more detail here. You can schedule single-core, multi-core, or multi-node jobs as parallel threaded or MPI jobs. The scheduler options are detailed here.
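As a rough illustration of the job shapes mentioned above, the sketch below models a job as nodes, tasks per node, and CPUs per task, and checks the request against the largest node size quoted on this page (128 cores). The JobShape class and the slurm_directives helper are hypothetical names. The output format assumes a Slurm scheduler, which this page does not state, so consult the scheduler documentation for the options that actually apply.

```python
from dataclasses import dataclass

MAX_CORES_PER_NODE = 128  # largest per-node core count quoted on this page


@dataclass(frozen=True)
class JobShape:
    """Hypothetical description of a job's resource request."""
    nodes: int = 1
    tasks_per_node: int = 1  # e.g. MPI ranks per node
    cpus_per_task: int = 1   # e.g. threads per rank

    def cores_per_node(self) -> int:
        return self.tasks_per_node * self.cpus_per_task

    def total_cores(self) -> int:
        return self.nodes * self.cores_per_node()

    def validate(self) -> None:
        if min(self.nodes, self.tasks_per_node, self.cpus_per_task) < 1:
            raise ValueError("all counts must be at least 1")
        if self.cores_per_node() > MAX_CORES_PER_NODE:
            raise ValueError(
                f"{self.cores_per_node()} cores per node exceeds {MAX_CORES_PER_NODE}"
            )


def slurm_directives(shape: JobShape) -> list:
    """Render the shape as Slurm-style directives (assumes a Slurm scheduler)."""
    shape.validate()
    return [
        f"#SBATCH --nodes={shape.nodes}",
        f"#SBATCH --ntasks-per-node={shape.tasks_per_node}",
        f"#SBATCH --cpus-per-task={shape.cpus_per_task}",
    ]


single_core = JobShape()                                # one core on one node
threaded = JobShape(cpus_per_task=64)                   # one multi-threaded process
mpi_multi_node = JobShape(nodes=4, tasks_per_node=128)  # MPI ranks across 4 nodes

for line in slurm_directives(mpi_multi_node):
    print(line)
```

Keeping the shape separate from the rendering step means the same validation can be reused if the directive syntax differs from what is assumed here.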
All nodes are connected by an HDR100 InfiniBand network to primary storage and global scratch. In addition to these filesystems, multi-node jobs can allocate ephemeral storage space using BeeOND.
Agate can be accessed through a terminal environment or web-based interfaces such as Jupyter. Once connected to Agate, you can submit jobs to the scheduling system and run interactive applications.