From the NCCL website:
Bazel is a build tool from Google currently available in a public beta.
CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that NVIDIA produces. CUDA gives program developers direct access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.
KDiff3 is a diff and merge program that compares or merges two or three text input files or directories and shows the differences line by line and character by character. It provides an automatic merge facility and an integrated editor for resolving merge conflicts; supports Unicode, UTF-8, and other codecs, with autodetection via byte-order mark (BOM); supports KIO on KDE (allowing access to ftp, sftp, fish, smb, etc.); and offers printing of differences, manual alignment of lines, automatic merging of version control history ($Log$), and an intuitive graphical user interface.
The Hoard memory allocator is a fast, scalable, and memory-efficient memory allocator for Linux, Solaris, Mac OS X, and Windows. Hoard is a drop-in replacement for malloc that can dramatically improve application performance, especially for multithreaded programs running on multiprocessors and multicore CPUs.
The h5py package is a Pythonic interface to the HDF5 binary data format.
Eclipse is a software development environment for developing programs in a variety of languages.
Python is a high-level programming language that aims to combine remarkable power with very clear syntax. The Enthought Python Distribution is a cross-platform environment for scientific computing in Python, and includes the Canopy IDE and package manager. MSI has installed an academic-licensed version that includes hundreds of modules, including tools that enable efficient parallel computations.
The Hadoop Map/Reduce framework harnesses a cluster of machines and executes user-defined Map/Reduce jobs across the nodes in the cluster. On itasca, a script exists to create an ephemeral Hadoop cluster on the set of nodes assigned by the scheduler. The script setup_cluster will format an HDFS filesystem on the local scratch disks.
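For readers new to the Map/Reduce model, the following is a minimal single-process sketch of a word-count job in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_words, shuffle, reduce_counts, run_job) are hypothetical and chosen only for illustration. On a real cluster the framework performs the grouping step and distributes the map and reduce calls across nodes.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key (a framework does this between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for one key into a single total."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence within one process (illustrative only)."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(["the quick fox", "the lazy dog"]))
```

Because each map call depends only on its own input line, and each reduce call only on the values for its own key, both steps can be run in parallel on separate nodes, which is the property a Map/Reduce framework relies on.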
The Intel(R) VTune(TM) Amplifier XE provides information on code performance for users developing serial and multithreaded applications. On Linux systems, VTune Amplifier XE works as a standalone GUI client. You can also use the command-line interface to collect data remotely or to perform regression testing. VTune Amplifier XE helps you analyze algorithm choices and identify where and how your application can benefit from available hardware resources.