Monday, Oct 8 - Wednesday, Oct 10, 2012

Center for Computing and Communication

RWTH Aachen University

sponsored by:

      Intel, ScaleMP


The number of cores per processor chip is still increasing. In order to be prepared to program and use future computing systems efficiently, we provide a large number of "fat" compute nodes as part of the RWTH Compute-Cluster. These nodes are equipped with 16 eight-core Intel® Xeon™ processors each and they provide a large shared memory of up to 2 TB, which offers additional opportunities.

Furthermore, we operate a ScaleMP-Cluster, consisting of 16 nodes with 4 eight-core Intel® Xeon™ processors each and 4 TB of accumulated main memory. These nodes are coherently coupled with an innovative software solution provided by the company ScaleMP which creates a virtual shared memory over all nodes running a single Linux operating system.

The efficient use of these systems requires a sensible way of NUMA-aware programming.

While message passing with MPI is the dominant paradigm for parallel programming in high performance computing (HPC), the combination of MPI with shared memory programming is gaining importance as the number of cores per cluster node grows. Exploiting both levels of parallelism, namely shared memory programming within a node and message passing across nodes, makes good performance increasingly difficult to obtain.

This tuning workshop will cover tools and methods to program big SMP systems in detail.


Attendees are kindly requested to prepare and bring their own code, if applicable. Good knowledge of MPI and/or OpenMP and of the programming language (C/C++/Fortran) of the code is recommended. If you do not have your own code but are interested in the presented topics, you can work on prepared exercises during the lab time.
The machines are all operated under Linux, so all codes need to run there.
Participants from RWTH Aachen University are welcome, as are participants from other universities and from industry.
The presentations will be given in English.
Bring in your own code
We have tuning experts from Intel, Bull, ScaleMP and from the HPC Team of the computing center available. If you have a code that you want to tune for our machines, this is a good opportunity to get help in doing so. Instead of doing exercises during the lab sessions, you can try the presented tools and techniques on your application. To make things run as smoothly as possible, you should make sure that your application and input data fulfill the following requirements:
- The application must run on our Linux cluster.
- For some of the presented tools, it is a plus if the Intel Compiler is used to compile the application.
- A test run with a representative dataset should finish in a few minutes, so that some test runs can be made during the workshop.
- The source code must be available and you should be familiar with the build system of your application.
If you have an application that does not fulfill these requirements and you would like to tune it during the workshop, please contact us in advance. We will then try to help you port your application and prepare it for the workshop.


There is no workshop fee, but the number of seats in our seminar room is limited, so please register using the link below. If you would like to bring in your own code, please add a short note under “remarks”. This helps us to better prepare the courses and lab sessions. There will be a sponsored social dinner on Tuesday, and we also need to know how many people will attend. Please add a note “I will attend the social dinner.” or “I will not attend the social dinner.” under “remarks” as well.



We are happy to announce that some of the presentations will be given by the following guest speakers:

  • Michael Klemm (Intel)
  • Nir Paikowsky (ScaleMP)
  • Denis Gutfreund (Bull)


  • Monday, 8.10
    • OpenMP Programming:

(11:00 – 12:30 Review of OpenMP Basics)
14:00 – 14:30 Welcome and Introduction
14:30 – 15:30 NUMA-aware programming with OpenMP
15:30 – 16:00 Coffee Break
16:00 – 16:45 Tasks on NUMA architectures / Outlook on OpenMP 4.0
16:45 – 17:30 Lab Time: Bring in your own code or work on exercises

  • Tuesday, 9.10
    • Introduction of our machines and tools to use them:

09:00 – 10:00 The Bull BCS Systems
10:00 – 10:30 Using the Intel Compiler
10:30 – 11:00 Coffee Break
11:00 – 11:45 Intel VTune
11:45 – 12:30 Lab Time: Bring in your own code or work on exercises
12:30 – 14:00 Lunch Break
14:00 – 14:30 vSMP Foundation introduction and architecture
14:30 – 15:00 Success stories and ‘real-life’ examples
15:00 – 15:30 Introduction to ScaleMP's Developer Productivity Tools
15:30 – 16:00 Coffee Break
16:00 – 17:30 Lab Time: Bring in your own code or work on exercises
19:00 – 22:00 Sponsored Social Dinner

  • Wednesday, 10.10
    • Hybrid Programming:

09:00 – 09:30 Introduction to Hybrid Programming
09:30 – 10:00 Tuning for Open MPI
10:00 – 10:30 Tuning for Intel MPI
10:30 – 11:00 Coffee Break
11:00 – 12:00 Performance Tools for Hybrid Programs
12:00 – 12:30 Lab Time: Bring in your own code or work on exercises
12:30 – 14:00 Lunch Break
14:00 – 15:30 Lab Time: Bring in your own code or work on exercises
15:30 – 16:00 Coffee Break
16:00 – 17:30 Lab Time: Bring in your own code or work on exercises

Course Material

The course material can be found here.