Current Topics in High-Performance Computing (HPC)


High-performance computing (HPC) is applied to speed up long-running scientific applications, for instance the simulation of computational fluid dynamics (CFD). Today's supercomputers are often based on commodity processors, but come in different facets, ranging from clusters and (large) shared-memory systems to accelerators (e.g., GPUs). To leverage these systems, parallel programming with, e.g., MPI, OpenMP, or CUDA must be applied.

This seminar focuses on current research topics in the area of HPC and is based on conference and journal papers. Topics may cover, e.g., parallel computer architectures (multi-core systems, Xeon Phis, GPUs, etc.), parallel programming models, performance analysis & correctness checking of parallel programs, or performance modeling.


The topics are assigned at the beginning of the lecture period (October 21st, 2016, 10:00-11:30 am). The students then work on their topics over the course of the semester. The corresponding presentations take place as a block course on one day at the end of the lecture period or at the beginning of the examination period. Attendance is compulsory.
More information is available in L²P.


The goals of a seminar series are described in the corresponding Bachelor and Master modules.
In addition to the seminar thesis and its presentation, Master students will have to lead one set of presentations (roughly 1-3 presentations) as session chair. A session chair makes sure that the session runs smoothly. This includes introducing the title of each presentation and its authors, keeping track of the speaker's time, and leading a short discussion after each presentation. Further instructions will be given during the seminar.
Attendance of the lecture "Introduction to High-Performance Computing" (Müller/Bientinesi) is helpful, but not required.
We prefer and encourage students to write the report and give the presentation in English; however, German is also possible.
Some topics are already described below to give an idea of their range. A comprehensive description of all topics is coming soon.

Host-Independent Accelerators for Future HPC Systems

Accelerator technology is becoming more prevalent in current supercomputing systems because of its energy efficiency. Today's accelerators require a local host CPU to configure and operate them, which limits the number of accelerators per host. Network-attached accelerators are an architectural approach to scaling the number of accelerators and host CPUs independently. The seminar thesis is expected to describe the hardware and software design of the communication architecture in detail and to evaluate the published results.
Supervisor: Alesja Dammer

Optimization of an Algebraic Multigrid Solver for Multi-Core-Based Distributed Parallel Systems

Algebraic multigrid (AMG) is a complex linear solver known for its excellent scalability. The performance of AMG is generally limited by memory bandwidth. The seminar thesis is expected to present an optimized AMG implementation based on the Hypre library and to discuss the analysis results for this implementation.
Supervisor: Alesja Dammer

A Comparison of Visualization Techniques for Performance Data of HPC Applications

This work should survey current techniques for visualizing and interpreting performance data of parallel applications in high-performance computing. A special focus should lie on how individual techniques guide users to the relevant parts of the investigated data in the presence of large-scale parallel executions on current HPC platforms.

Supervisor: Marc-André Hermanns

Computing Under a Power Cap

As supercomputers (clusters) get bigger, more power is needed. At some point, the infrastructure cannot supply as much power as needed, and the whole cluster can no longer run at its peak performance, i.e., the overprovisioned cluster operates under a power cap. Under this cap, a core concern is how to maximize the throughput of the cluster. To this end, technologies for capping power as well as strategies for maximizing throughput should be investigated in depth. Some work has already been done, e.g.: O. Sarood, A. Langer, A. Gupta and L. Kale, "Maximizing Throughput of Overprovisioned HPC Data Centers Under a Strict Power Budget".
Supervisor: Bo Wang

SYCL - C++ Single-Source Heterogeneous Programming for OpenCL

In a heterogeneous HPC world, it is important to write code that can run on different architectures. OpenCL has been available since 2009 and enables programming a wide range of devices (such as GPUs) using C. SYCL, released in May 2015, provides high-level C++ features in a single-source code style for OpenCL devices, thereby combining the ease of use and flexibility of C++ with the portability and efficiency of OpenCL.

This seminar thesis is expected to provide a general overview of SYCL, a comparison with and its integration into the OpenCL ecosystem, and a brief description of an application example.
Supervisor: Sandra Wienke

External Impact Factors on the Total Cost of Ownership of HPC Centers

HPC centers spend a huge amount of money on purchasing cluster systems, operating them, and programming them. These costs are called "total cost of ownership" (TCO). External factors that have an impact on TCO include, for example, electricity service providers. Since HPC clusters demand a huge amount of power that may vary considerably over time, e.g., due to maintenance, electricity prices might be negotiated based on power consumption, date, or continuity.

This seminar thesis is expected to provide an overview of external factors impacting the TCO of HPC centers. In particular, electricity prices and their dependencies should be discussed.

Supervisor: Sandra Wienke

Efficient MPI collective communication on hierarchical systems

High-performance computing systems are becoming increasingly hierarchical. For example, a typical cluster node today has two or more multi-core processor packages that present at least one level of NUMA. The addition of coprocessors (such as the Intel Xeon Phi) further increases the number of levels in the hierarchy. This creates a challenge for MPI implementers when it comes to providing efficient collective operations such as barrier synchronisation, broadcasts, and global reductions. The seminar article and talk are expected to provide an overview of recent developments in algorithms for the efficient implementation of collective communication on hierarchical computer systems, as well as the state of their adoption by the main MPI implementations.
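The core idea behind hierarchy-aware collectives can be sketched without MPI (a minimal model for illustration, not an actual MPI algorithm): reduce within each node first, where communication is cheap shared memory, and let only one leader per node participate in the expensive inter-node phase. The sketch counts cross-node transfers to make the saving visible:

```cpp
#include <vector>
#include <cstddef>

// Two-level (hierarchical) sum reduction, modelled without MPI:
// values[node][rank] holds one contribution per process. Phase 1 reduces
// within each node (cheap: shared memory); phase 2 combines the per-node
// partial results across nodes (expensive: network). A flat reduction to a
// root would send one value per remote *process* over the network, while
// the hierarchical version sends only one value per remote *node*.

struct ReduceResult {
    double sum;
    std::size_t cross_node_transfers;  // values sent over the network
};

ReduceResult hierarchical_sum(const std::vector<std::vector<double>>& values) {
    ReduceResult res{0.0, 0};
    for (std::size_t node = 0; node < values.size(); ++node) {
        double partial = 0.0;                       // phase 1: intra-node reduction
        for (double v : values[node]) partial += v;
        res.sum += partial;                         // phase 2: inter-node reduction
        if (node != 0) ++res.cross_node_transfers;  // the root lives on node 0
    }
    return res;
}
```

Real implementations additionally use tree- or ring-based schemes within each phase; the two-phase structure above is what makes them hierarchy-aware.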
Supervisor: Hristo Iliev

Heterogeneous computing with FPGAs

Modern computing is largely based on the von Neumann architecture with universally programmable processing elements (PEs). This universality comes at a very high energy price, as the PEs have to pack a huge number of logic elements in the form of multiple functional and data-movement units. On the opposite side of the spectrum, Field-Programmable Gate Arrays (FPGAs) allow certain functions to be synthesised with far fewer logic gates and thus executed at a reduced power cost and possibly at greater speed. The work should focus on presenting the key aspects of FPGAs: what FPGAs are, how they are programmed and used, and which key scientific computing algorithms they are potentially best suited for.
Supervisor: Hristo Iliev

Parallel BFS in the Graph500 Benchmark

The most prominent listing of supercomputers is the Top500 list, which ranks the fastest computers by the performance they reach with the Linpack benchmark. Linpack solves a huge dense linear equation system, which is not representative of the average application workload on many computers. Therefore, other benchmarks have emerged, such as the Graph500 benchmark, which ranks computers by their performance in processing large graph structures.
The focus of this seminar topic is to give an overview of the Graph500 benchmark and of optimization techniques for its efficient parallel execution on large clusters.
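Graph500's central kernel is a breadth-first search (BFS) from random roots. Parallel implementations build on the level-synchronous formulation, sketched here serially in plain C++ (an illustration of the kernel, not Graph500's reference code):

```cpp
#include <vector>
#include <cstddef>

// Level-synchronous BFS: process the graph frontier by frontier. In parallel
// Graph500 implementations, the vertices (and thus `frontier` and `next`) are
// partitioned across processes, which exchange newly discovered vertices
// after every level; optimizations such as direction-optimizing BFS start
// from this same structure.
std::vector<int> bfs_levels(const std::vector<std::vector<int>>& adj, int root) {
    std::vector<int> level(adj.size(), -1);     // -1: not yet visited
    std::vector<int> frontier{root};
    level[root] = 0;
    for (int depth = 1; !frontier.empty(); ++depth) {
        std::vector<int> next;
        for (int u : frontier)                  // expand the current frontier
            for (int v : adj[u])
                if (level[v] == -1) {           // first discovery of v
                    level[v] = depth;
                    next.push_back(v);
                }
        frontier.swap(next);                    // next level becomes the frontier
    }
    return level;
}
```

The per-level synchronization point is exactly where large clusters spend their communication time, which is what the optimization techniques in this topic target.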
Supervisor: Dirk Schmidl


An Overview of Deep Neural Network Training on BG/Q

Deep neural networks are popular in the field of machine learning. Training such networks requires a huge amount of data and compute resources. Therefore, high-performance computing resources are attractive for solving such problems.
The focus of this seminar topic is to give an overview of deep neural networks and their use of supercomputers. As an example, optimizations made for a Blue Gene system with 4096 processors shall be investigated and explained.
Supervisor: Dirk Schmidl


Prof. Matthias S. Müller
Alesja Dammer
Marc-André Hermanns
Hristo Iliev
Dirk Schmidl
Bo Wang
Sandra Wienke
