HPC December Workshop Part I: HPC Tuning Workshop

Monday, Dec 6 - Wednesday, Dec 8, 2010

Center for Computing and Communication
RWTH Aachen University

Seminar Rooms 1 + 2, Kopernikusstraße 6



  • Day 1:
    Monday, December 6
    09:00 - 18:00
    Presentations, Hands-on, bring in your own code!
    On Windows & Linux systems.
    Participation upon individual consultation only!

  • Day 2:
    Tuesday, December 7
    09:00 - 18:00

  • Day 3:
    Wednesday, December 8
    09:00 - 18:00


There will be a social dinner on Wednesday, December 08, at 7 p.m. at Restaurant Elisenbrunnen.


A major part of the RWTH Compute-Cluster consists of 192 nodes equipped with the latest Intel processors ("Nehalem-EP"). The Intel Xeon 5570 processors (codename “Nehalem”) are quad-core processors in which each core can run two hardware threads (Hyper-Threading). Each processor has its own memory controller and is connected to a local part of the main memory. The processors access remote memory via Intel's new interconnect, the “QuickPath Interconnect”. These machines are therefore the first Intel machines to form a ccNUMA architecture. This processor type will be the mainstay of our cluster for the foreseeable future, and it has many new features to take advantage of.
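
Because each processor owns a local part of the main memory, it matters which thread touches data first. The following is a minimal sketch in C with OpenMP, assuming the operating system places a memory page on the node of the thread that first writes to it (the "first-touch" policy, which is the Linux default) and that threads stay bound to their cores. The function names `alloc_first_touch` and `scale_in_place` are made up for this illustration and are not part of the workshop material.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate n doubles and initialize them in parallel.
 * Under a first-touch policy (an assumption, see above), each page is
 * placed on the NUMA node of the thread that writes it first, so the
 * initialization loop decides where the data lives. */
double *alloc_first_touch(size_t n, double value)
{
    assert(n > 0);
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    /* Static schedule: thread t always gets the same contiguous chunk. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i)
        a[i] = value;

    return a;
}

/* Compute loop with the same static schedule as the initialization,
 * so each thread mostly works on pages in its local memory. */
void scale_in_place(double *a, size_t n, double factor)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i)
        a[i] *= factor;
}
```

Compile with OpenMP enabled (for example `-fopenmp` with GCC). Without OpenMP support the pragmas are ignored and the code runs serially, which is still correct but shows no placement effect.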

  • This Tuning Workshop consists of presentations and hands-on sessions:
    1. A series of presentations teaches you more about tuning, especially on Nehalem-EP and Nehalem-EX processors.

    2. During extended hands-on sessions we want to give a selected number of projects an opportunity to improve the performance of their codes. Experts from Intel and from the HPC Team of the RZ will be there to assist you. We will reserve a few Nehalem-based machines running Linux and Windows for your experiments.
      Performance tuning is still often a matter of experimentation, but we can give you advice on a best-effort basis. Hopefully this will lead to a noticeable performance improvement, but we cannot guarantee it.
      Before parallelizing an application, it is important to have tuned it for single-processor ("serial") performance. Otherwise, one can more quickly run into scalability problems. Therefore most of the focus will be on serial performance, but we will also consider shared-memory parallelization with OpenMP where relevant and desired.
      To maximize the efficiency of the workshop, we would like to ask you to prepare a test case that reflects a typical production run but does not take too long to execute. Ideally, a run should not take more than 5 to 10 minutes to finish.
      It is also important to have an easy way of verifying that the results of this test run are correct.
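
One simple way to verify a test run is to compare its output against reference values from the untuned code. Tuning steps such as vectorization or different compiler settings can change floating-point results in the last digits, so a comparison with a tolerance is usually more practical than a bitwise one. The sketch below is only an illustration under that assumption; the function names `max_rel_error` and `results_match` and the choice of tolerance are hypothetical and must be adapted to your application.

```c
#include <math.h>
#include <stddef.h>

/* Return the largest deviation between result and reference.
 * The deviation is relative to |reference[i]|, except for reference
 * values smaller than 1 in magnitude, where it is absolute.
 * A NaN anywhere in the comparison yields INFINITY, so it can never
 * pass a tolerance check unnoticed. */
double max_rel_error(const double *result, const double *reference, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double scale = fabs(reference[i]);
        if (scale < 1.0)
            scale = 1.0;
        double err = fabs(result[i] - reference[i]) / scale;
        if (isnan(err))
            return INFINITY;
        if (err > worst)
            worst = err;
    }
    return worst;
}

/* Nonzero if every element agrees with the reference within tol. */
int results_match(const double *result, const double *reference,
                  size_t n, double tol)
{
    return max_rel_error(result, reference, n) <= tol;
}
```

A typical use is to store the reference output of the original code once, then call `results_match` after every tuning step before looking at the timings.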


Attendees are kindly requested to prepare and bring in their own code. Good knowledge of the code's programming language (C/C++/Fortran) is recommended, as is basic knowledge of multi-threading parallelization paradigms, especially OpenMP, and, if necessary, MPI. The presentations will be given in English. Windows as well as Linux systems will be used during the hands-on sessions. Assistance in porting your application to the Nehalem cluster prior to the event is available on request.


Topics (tentative)

      • Introduction to Nehalem microprocessor architecture, covering memory subsystem, caches, latencies of caches and memory
      • Strategies for code tuning (single core)
      • Performance analyzer tools (single core) like VTune and/or PTU
      • Optimization examples (cache optimization, vectorization)
      • Optimization of compiler settings
      • Parallel efficiency analyzer tools like Intel Trace Analyzer and Collector
      • Scalability improvement examples
      • Intel MPI runtime environment
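
As a small illustration of the kind of cache optimization listed among the topics (this example is not taken from the workshop material, and the function names are hypothetical), consider summing a matrix stored in row-major order, as is usual in C. Both functions compute the same sum, but the second one walks through memory contiguously and therefore uses every loaded cache line completely.

```c
#include <stddef.h>

/* Column-first traversal of a row-major matrix: consecutive inner
 * iterations are `cols` elements apart, so for large matrices almost
 * every access touches a different cache line. */
double sum_column_order(const double *m, size_t rows, size_t cols)
{
    double sum = 0.0;
    for (size_t j = 0; j < cols; ++j)
        for (size_t i = 0; i < rows; ++i)
            sum += m[i * cols + j];
    return sum;
}

/* Row-first traversal after interchanging the two loops: the inner
 * loop now has stride 1, which is cache friendly and also easier for
 * the compiler to vectorize. */
double sum_row_order(const double *m, size_t rows, size_t cols)
{
    double sum = 0.0;
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            sum += m[i * cols + j];
    return sum;
}
```

Note that the two versions add the elements in a different order, so for general floating-point data the results may differ slightly in the last digits.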

Monday, December 06
09:00 - 10:00 CPU Architecture Refresher (generic)
10:00 - 10:30 OpenMP Refresher
10:45 - 11:30 Performance Tuning Methodology
11:30 - 12:30 Nehalem Pipelines, Sandy Bridge & AVX close to metal
14:00 - 15:00 VTune Guide into Microarchitecture Tuning (Amplifier)
15:00 - 16:00 Hands-On Work
16:15 - 17:30 Microarchitecture Tuning for Core i7 (Nehalem)
17:30 - 18:00 Hands-On Work


Tuesday, December 07
09:00 - 10:30 Loop Independence and Compiler Vectorization and Threading of the Loops (AVX)
10:45 - 12:30 Hands-On Work
11:30 - 12:30 (optional) Visual Studio
14:00 - 15:00 Hands-On Work
15:00 - 16:00 Threading and Cluster Tools Overview
16:15 - 18:00 Hands-On Work

Wednesday, December 08
09:00 - 09:30 cc-NUMA Architectures and Multi-threading
09:30 - 10:00 Multi-threading Tuning including NUMA for Intel Nehalem
10:00 - 12:30 Hands-On Work
14:00 - 15:00 (optional) Debugging
14:00 - 16:00 Hands-On Work
15:00 - 16:00 (optional) Beyond Single Node Performance
16:15 - 18:00 Hands-On Work
19:00 - Social Dinner - Restaurant Elisenbrunnen

Course Material

Learning Material

Material from our Nehalem Tuning Workshop 2009:


HPC December Workshop Part II

Be sure to also consider Part II of our December workshop: the Array Building Blocks Tutorial (Intel Ct).


Christian Iwainsky
Tel.: +49 241 80 24362
Fax: +49 241 80 22134
E-mail: iwainsky@rz.rwth-aachen.de

Thomas Reichstein
Tel.: +49 241 80 24924
Fax: +49 241 80 22134
E-mail: reichstein@rz.rwth-aachen.de


