Thursday, February 23, 2023

High performance computing

 High-Performance Computing (HPC or supercomputer) is omnipresent in today’s society. For example, every time you watch Netflix, the recommendation algorithm leverages HPC resources remotely to offer you personalized suggestions. HPC stands for High-Performance Computing. The ability to carry out large scale computations to solve complex problems, that either need to process a lot of data, or to have a lot of computing power at their disposal. Basically, any computing system that doesn’t fit on a desk can be described as HPC. 

HPC systems are actually networks of processors. The key principle of HPC lies in the possibility to run massively parallel code to benefit from a large acceleration in runtime.  A common HPC capability is around 100,000 cores. Most HPC applications are complex tasks which require the processors to exchange their results. Therefore, HPC systems need very fast memories and a low-latency, high-bandwidth communication systems (>100Gb/s) between the processors as well as between the processors and the associated memories.

We can differentiate two types of HPC systems: the homogeneous machines and the hybrid ones. Homogeneous machines only have CPUs while the hybrids have both GPUs and CPUs. Tasks are mostly run on GPUs while CPUs oversee the computation. 

They have more computing power since GPUs can handle millions of threads simultaneously and are also more energy efficient. GPUs have faster memories, require less data transfer and are capable to exchange with other GPUs, which is the most energy-intensive part of the machine.


High Performance Computing used to be strictly defined with high speed network to allow strong interconnections between cores. The rise of AI applications led to an architecture based on more independent clusters but still massively parallel.

HPC systems also include the software stack. That can be divided into three categories. First the user environment encompasses the applications known as workflows. Then the middleware linking applications and their implementation on the hardware. It includes the runtimes and frameworks. Last, the Operating system, at system level with the job scheduler, management software for load balancing and data availability. Its role is to assign tasks to the processors and organize the exchange of data between the processors and the memories to ensure the best performance.

HPC applications
HPC provides many benefits and value when used for commercial and industrial applications. Applications that can be classified in five categories:

- Fundamental research aims to improve scientific theories to better understand natural or other phenomena. HPC enables more advanced simulations leading to breakthrough discoveries.

- Design simulation allows industries to digitally improve the design of their products and test their properties. It enables companies to limit prototyping and testing, making the designing process quicker and less expensive.

- Behavior prediction enables companies to predict the behavior of a quantity which they can’t impact but depend on, such as the weather or the stock market trends. HPC simulations are more accurate and can look farther into the future thanks to their superior computing abilities. It is especially important for predictive maintenance and weather forecasts.

- Optimization is a major HPC use case. It can be found in most professional fields, from portfolio optimization to process optimization, to most manufacturing challenges faced by the industry.

HPC is more and more used for data analysis. Business models, industrial processes and companies are being built on the ability to connect, analyze and leverage data, making supercomputers a necessity in analyzing massive amounts of data.

The 5 fields of HPC Applications.

Another major application for HPC is in the fields of medical and material advancements. For instance, HPC can be deployed to:

Combat cancer: Machine learning algorithms will help supply medical researchers with a comprehensive view of the U.S. cancer population at a granular level of detail.

Identify next-generation materials: Deep learning could help scientists identify materials for better batteries, more resilient building materials and more efficient semiconductors.

Understand patterns of disease: Using a mix of artificial intelligence (AI) techniques, researchers will identify patterns in the function, cooperation, and evolution of human proteins and cellular systems.

HPC needs are skyrocketing. A lot of sectors are beginning to understand the economic advantage that HPC represents and therefore are developing HPC applications. 

Industrial companies in the field of aerospace, automotive, energy or defence are working on developing digital twins of a machine or a prototype to test certain properties. This requires a lot of data and computing power in order to accurately represent the behavior of the real machine. This will, moving forward, render prototypes and physical testing less and less standard.

The HPC dynamics and industrial landscape


The limits of a model :
Unfortunately, supercomputers are revealing some limits. First of all, some problems are not currently solvable by a supercomputer. The race to the exascale (a supercomputer able to realize 10^18 floating point operations per second) is not necessarily going to solve this issue. Some problems or simulations might remain unsolvable, or at least, unsolvable in an acceptable length of time. For example, in the case of digital twins or molecular simulation, calculations have to be greatly simplified in order for current computers to be able to make them in an acceptable length of time (for product or drug design).

Moreover, a second very important challenge is the power consumption. The consumption of computing and data centers represents 1% of power consumption in the world and this is bound to significantly increase. It shows that this model is unsustainable in the long term, especially since exascale supercomputers will most surely consume more than current ones. Not only is it technically unsustainable, it is also financially so. Indeed, a supercomputer can cost as much as 10mUSD per year in electricity consumption.

The new chips revolution
CPUs and GPUs are not the only solutions to tackle the two previously stated issues.

Although most efforts are focused on developing higher-performance CPU and GPU-powered supercomputers in order to reach the exascale, new technologies, in particular “beyond Silicon”, are emerging. Innovative chip technologies could act as accelerators like GPUs did in the 2010s and significantly increase the computing power. Moreover, some technologies, such as quantum processors for example, would be able to solve new categories of problems that are currently beyond our reach.

In addition, 70% of the energy consumption in a HPC is accounted for by the processors. Creating new chips, more powerful and more energy efficient would enable us to solve both problems at once. GPUs were the first step towards this goal. Indeed, for some applications, GPUs can replace up to 200 or 300 CPUs. Although one GPU individually consumes a bit more than a CPU (400W against 300W approximately), overall, a hybrid supercomputer will consume less than a homogeneous supercomputer of equal performance.

The model needs to be reinvented to include disruptive technologies. Homogeneous supercomputers should disappear, and it is already underway. In 2016, only 10 out of the supercomputers in the Top500 were hybrid. By 2020, within only four years, it rose to 333 out of 500, including 6 in the top 10.

At Quantonation, are convinced that innovative chips integrated in hetereogeneous supercomputing architectures, as well as optimized softwares and workflows, will be key enablers to face societal challenges by significantly increasing sustainability and computing power. We trust that these teams are ready to face the challenge and be part of the future of compute:

  1. Pasqal’s neutral atoms quantum computer, highly scalable and energy efficient;
  2. Lighton’s Optical Processing Unit, a special purpose ligh-based AI chip fitted for tasks such as Natural Language Processing;
  3. ORCA Computing’s fiber based photonic systems for simulation and fault tolerant quantum computing;
  4. Quandela’s photonic qubit sources that will fuel next generation of photonic devices;
  5. QuBit Pharmaceuticals software suites leveraging HPC and quantum computing resources to accelerate drug discovery ;
  6. Multiverse Computing’s solutions using disruptive mathematics to resolve finance’s most complex problems on a range of classical and quantum technologies.

IBM Cloud HPC IaaS for building HPC environments using IBM’s Virtual Private Cloud (VPC). It enables you to create your own configuration for Compute Instances; High-performance Storage and Networking like Public Gateways, Load Balancers and Routers. Multiple connectivity options are available upto 80Gbps and IBM Cloud offers the highest level of security and encryption with FIPS 140-2 Level 4. Also available is IBM Code Engine, a fully managed serverless platform to run containers, applications or batch jobs.
– Spectrum Computing provides intelligent dynamic hybrid cloud capabilities which enables organizations to use cloud resources according to defned policies. Spectrum LSF and Symphony allows you to burst workloads to the cloud, dynamically provision cloud resources and intelligently move data to manage egress costs. It also enables the ability for auto scaling to take full advantage of consumption-based pricing and pay for cloud resources only when they are needed.
– Spectrum Scale is an enterprise grade High Performance File System (HPFS) that delivers scalable capacity and performance to handle demanding data analytics, content repositories and HPC workloads. Spectrum Scale architecture allows it to handle tens of thousands of clients, billions of fles and petabytes of data written and retrieved as fles or objects with low latency. Optionally, IBM Aspera can be used for high speed data movement using the FASP protocol.

Use Cases
– Financial Services: Monte Carlo simulation, risk modeling, actuarial sciences
– Health and Life Sciences: Genome analysis, drug discovery, bio-sequencing, clinical treatments, molecular modeling
– Automotive: Vehicle drag coeffcient analysis, crash simulation, engine combustion analysis, air flow modeling
– Aerospace: Structural, fluid dynamics, thermal, electromagnetic and turbine flow analysis
– Electronic Design Automation (EDA): Integrated Circuit (IC) and Printed Circuit Board (PCB) design and analysis
– Oil and Gas: Subsurface terrain modeling, reservoir simulation, seismic analysis
– Transportation: Routing logistics, supply chain optimization
– Energy & Utility: Severe storm prediction, climate, weather and wind modelling
– Education/Research: High energy physics, computational chemistry 

The reign and modern challenges of the Message Passing Interface (MPI)

All good, but why do you guys doing numerical linear algebra and parallel computing always use the Message Passing Interface to communicate between the processors?”

MPI begun about 25 years ago and has been since then, undoubtedly, the “King” of HPC. What were the characteristics of MPI that made it the de-facto language of HPC?

MPI-3 has added several interfaces to enable more powerful communication scheduling, for example nonblocking collective operations and neighborhood collective operations. 

Much of the big data community moved from single-nodes to parallel and distributed computing to process larger amounts of data using relatively short-lived programs and scripts. Thus, programmer productivity only played a minor role in MPI/HPC while it was one of the major requirements for big data analytics. While MPI codes are often orders-of-magnitude faster than many big data codes, they also take much longer to develop. And that is most often a good trade-off.

MPI I/O has been introduced nearly two decades ago to improve the handling of large datasets in parallel settings. It is successfully used in many large applications and I/O libraries such as HDF-5. 

MPI predates the time when the use of accelerators became commonplace. However, when these accelerators are used in distributed-memory settings such as computer clusters, then MPI is the common way to program them. The current model, often called MPI+X (e.g., MPI+CUDA), combines traditional MPI with accelerator programming models (e.g., CUDA, OpenACC, OpenMP etc.) in a simple way. In this model, MPI communication is performed by the CPU. Yet, this can be inefficient and inconvenient and recently, we have proposed a programming model called distributed CUDA (dCUDA) to perform communication from within a CUDA compute kernel [3]. This allows to use the powerful GPU warp scheduler for communication latency hiding. In general, integrating accelerators and communication functions is an interesting research topic.

Programming at the transport layer, where every exchange of data has to be implemented with lovingly hand-crafted sends and receives or gets and puts, is an incredibly awkward fit for numerical application developers, who want to think in terms of distributed arrays, data frames, trees, or hash tables. 

Everyone uses MPI” has made it nearly impossible for even made-in-HPC-land tools like Chapel or UPC to make any headway, much less quite different systems like Spark or Flink, meaning that HPC users are largely stuck with using an API which was a big improvement over anything else available 25 years ago, 

Chapel is a modern programming language designed for productive parallel computing at scale. Chapel's design and implementation have been undertaken with portability in mind, permitting Chapel to run on multicore desktops and laptops, commodity clusters, and the cloud, in addition to the high-end supercomputers for which it was originally undertaken.



No comments:

Post a Comment