Intel® VTune™ Amplifier XE 2013 is the premier performance profiler for C, C++, C#, Fortran, Assembly and Java*. It is available on all MSI Linux systems so that users can evaluate the performance of their applications and identify and remove hotspots. The objective is to enable all applications to run efficiently on any MSI system. Experienced users can explore each of the performance metrics built into VTune in depth. The performance evaluation process itself can also help users learn and understand the cutting-edge technologies available in the HPC world.
The vtune module is available on all systems. Applications can be profiled not only through the graphical interface, amplxe-gui, but also through the command-line interface, amplxe-cl. The former suits short interactive profiling sessions, while the latter is useful for collecting information at run time. Users who need interactive profiling should go to the Find Hotspot section for details.
Table 1: Profiling metrics associated with the micro-architecture on different systems
|System Name||Sub-system specific features|
|Itasca - Nehalem processor||General Exploration; Read Bandwidth; Write Bandwidth; Memory Access; Cycles and Ops; Front End Investigation.|
|Itasca - Sandy Bridge processor||General Exploration; Memory Bandwidth; Access Contention; Branch Analysis; Client Analysis; Core Port Saturation; Cycles and Ops.|
|Cascade - Knights Corner, Phi processor||Lightweight Hotspots; Memory Bandwidth; General Exploration.|
|Cascade - Core i7 980x processor||Lightweight Hotspots; Hotspots; Concurrency; Locks and Waits.|
|Lab Linux workstations||Lightweight Hotspots; Hotspots; Concurrency; Locks and Waits.|
The command-line interface amplxe-cl makes it convenient to profile a real application. Users need to load the vtune module and specify the analysis type of interest. The basic format is:
module load vtune
amplxe-cl -collect $analysis_type -result-dir $yourprof_dir -- myApplication
where $analysis_type is the analysis option chosen for the processor of the sub-system in use (see Table 2 for the analysis types supported on different platforms); $yourprof_dir is the directory in which the profiling information is saved; and myApplication is the program that you want to profile. After the job finishes, you can view the profiling results with either the graphical interface, amplxe-gui,
or the command-line interface:
amplxe-cl -report $report_type -result-dir $yourprof_dir
where $report_type should match the selected $analysis_type.
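The collect and report steps above can be sketched as one short script. This is only an illustration under stated assumptions: the analysis type `hotspots`, the result directory name `r001hs`, the application `./myApplication`, and the helper `run_or_show` are placeholders chosen for the example, not settings prescribed by this page. The helper prints each command instead of running it when amplxe-cl is not on the PATH, so the script can be checked on a machine without VTune.

```shell
# Sketch: collect a profile, then print a report from the same result directory.
# Assumptions: analysis type "hotspots", result directory "r001hs", and the
# application ./myApplication are illustrative placeholders.

analysis_type="hotspots"
yourprof_dir="r001hs"

# Load the vtune module only where the module command exists.
if type module >/dev/null 2>&1; then
    module load vtune
fi

# Hypothetical helper: run the command if amplxe-cl is available,
# otherwise (or when DRY_RUN is set) just show what would be run.
run_or_show() {
    if [ -z "${DRY_RUN:-}" ] && command -v amplxe-cl >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Step 1: collect the profile in the basic format shown above.
run_or_show amplxe-cl -collect "$analysis_type" -result-dir "$yourprof_dir" -- ./myApplication

# Step 2: print a report whose type matches the analysis type.
run_or_show amplxe-cl -report "$analysis_type" -result-dir "$yourprof_dir"
```

Setting `DRY_RUN=1` before running the script prints both commands without starting a collection, which is a cheap way to review them first.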
Table 2: Available analysis types for different micro-architectures
|System Name||Options available on different sub-systems|
Sandy Bridge processor
|Nehalem/Westmere processor|| nehalem-cycles-uops
Please note that the general analysis types in Table 2 apply to every platform on which you want to use VTune. To see the details of a particular analysis type, use the following command (the second line shows it for the concurrency analysis):
amplxe-cl --help $analysis_type
amplxe-cl --help concurrency
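When comparing several analysis types, it can be handy to save each help text to a file. The loop below is a minimal sketch of that idea using the help command in the form given above. The list of type names is an assumption for illustration (only `concurrency` appears on this page), and the output file names `help_<type>.txt` and the function `help_cmd` are hypothetical.

```shell
# Sketch: save the help text of several analysis types for later reading.
# Assumption: the type names below are examples; adjust them to the types
# supported on your platform.
analysis_types="hotspots concurrency locksandwaits"

# Build the help command for one analysis type, in the form shown above.
help_cmd() {
    echo "amplxe-cl --help $1"
}

for t in $analysis_types; do
    if command -v amplxe-cl >/dev/null 2>&1; then
        # Hypothetical output file name, one per analysis type.
        amplxe-cl --help "$t" > "help_$t.txt"
    else
        # amplxe-cl is not on the PATH: only show what would be run.
        echo "would run: $(help_cmd "$t")"
    fi
done
```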
MPI jobs can be analyzed with VTune when they are run with the Intel MPI implementation. Here are simplified commands for profiling MPI jobs:
module load intel impi vtune
mpirun -r ssh -f $PBS_NODEFILE -np 256 amplxe-cl -collect $analysis_type -result-dir $yourprof_dir ./test > run.out
After the job runs successfully, one can view the profiling results with either the graphical or the command-line interface.
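The MPI commands above would normally sit inside a batch job. The script below is a rough sketch of such a job, not a tested site template. The resource request (32 nodes with 8 cores each for 256 ranks, one hour), the job name `vtune_mpi`, the analysis type `hotspots`, the result directory `mpi_prof`, and the helper `run_or_show` are all assumptions made for the example. Outside a PBS job, or when amplxe-cl is not available, the helper only prints the command.

```shell
#!/bin/bash
# Assumed resource request: 32 nodes x 8 cores = 256 MPI ranks, 1 hour.
#PBS -l nodes=32:ppn=8,walltime=01:00:00
# Hypothetical job name.
#PBS -N vtune_mpi

# Start in the submission directory (the current directory outside PBS).
cd "${PBS_O_WORKDIR:-.}" || exit 1

analysis_type="hotspots"   # assumed analysis type
yourprof_dir="mpi_prof"    # hypothetical result directory
nprocs=256

# Load the modules only where the module command exists.
if type module >/dev/null 2>&1; then
    module load intel impi vtune
fi

# Hypothetical helper: run the command inside a PBS job with amplxe-cl
# available, sending program output to run.out; otherwise just show it.
run_or_show() {
    if [ -n "${PBS_NODEFILE:-}" ] && command -v amplxe-cl >/dev/null 2>&1; then
        "$@" > run.out
    else
        echo "would run: $*"
    fi
}

# Same command form as above, with the values filled in from the variables.
run_or_show mpirun -r ssh -f "${PBS_NODEFILE:-nodefile}" -np "$nprocs" \
    amplxe-cl -collect "$analysis_type" -result-dir "$yourprof_dir" ./test
```

Running the script by hand on a login node prints the mpirun line, which makes it easy to confirm the rank count and result directory before submitting the job.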
Comprehensive information can be found in the software documentation: Analyzing MPI applications.