From the NCCL website:
Bazel is a build tool from Google currently available in a public beta.
CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that it produces. CUDA gives program developers direct access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.
Python is a high-level programming language that aims to combine remarkable power with very clear syntax. Anaconda is a free cross-platform Python distribution from Continuum Analytics. It comes bundled with various scientific Python packages such as NumPy, SciPy, Pandas, Matplotlib, and Numba.
KDiff3 is a diff and merge program that compares or merges two or three text input files or directories and shows the differences line by line and character by character. It provides an automatic merge facility and an integrated editor for resolving merge conflicts. It supports Unicode, UTF-8, and other codecs, with autodetection via the byte-order mark (BOM); KIO on KDE (which allows access to ftp, sftp, fish, smb, etc.); printing of differences; manual alignment of lines; and automatic merging of version control history ($Log$). It has an intuitive graphical user interface.
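To illustrate what a line-by-line comparison of two text inputs looks like, here is a minimal sketch using Python's standard difflib module. It is not KDiff3's algorithm or interface; the helper name line_diff and the sample strings are made up for illustration.

```python
import difflib


def line_diff(old_text, new_text):
    """Return a unified diff of two strings, compared line by line.

    Hypothetical helper for illustration only; KDiff3 itself is a GUI
    program and does not expose this function.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old", tofile="new", lineterm=""
        )
    )


# Two small sample inputs that differ only on the second line.
old = "alpha\nbeta\ngamma"
new = "alpha\nBETA\ngamma"

diff = line_diff(old, new)
print("\n".join(diff))
```

Lines prefixed with "-" appear only in the first input and lines prefixed with "+" appear only in the second; unchanged lines are shown as context.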
The Hoard memory allocator is a fast, scalable, and memory-efficient memory allocator for Linux, Solaris, Mac OS X, and Windows. Hoard is a drop-in replacement for malloc that can dramatically improve application performance, especially for multithreaded programs running on multiprocessors and multicore CPUs.
The h5py package is a Pythonic interface to the HDF5 binary data format.
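As a rough sketch of what the "Pythonic interface" means in practice, the example below writes a small array to an HDF5 file and reads part of it back with dictionary-style and slice-style access. The file name, dataset path, and attribute name are assumptions chosen for illustration, and NumPy is assumed to be installed alongside h5py.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Write: datasets are created by name, and intermediate groups
# ("measurements" here) are created automatically.
with h5py.File(path, "w") as f:
    dset = f.create_dataset(
        "measurements/temperature", data=np.linspace(20.0, 25.0, 6)
    )
    dset.attrs["units"] = "celsius"  # metadata stored next to the data

# Read: the file behaves like a dictionary, and datasets can be
# sliced like NumPy arrays, so only the requested part is loaded.
with h5py.File(path, "r") as f:
    dset = f["measurements/temperature"]
    first_three = dset[:3]
    units = dset.attrs["units"]

print(first_three, units)
```

Slicing the dataset rather than loading it whole is the usual pattern for files that are larger than memory.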
Eclipse is a software development environment for developing programs in a variety of languages.
Python is a high-level programming language that aims to combine remarkable power with very clear syntax. The Enthought Python Distribution is a cross-platform environment for scientific computing in Python, and includes the Canopy IDE and package manager. MSI has installed an academic-licensed version that includes hundreds of modules, including tools that enable efficient parallel computations.
The Intel(R) VTune(TM) Amplifier XE provides code performance information for users developing serial and multithreaded applications. On Linux systems, VTune Amplifier XE works as a standalone GUI client. A command-line interface is also available for collecting data remotely or for performing regression testing. VTune Amplifier XE helps you analyze algorithm choices and identify where and how your application can benefit from available hardware resources.