In the past, many would say that you simply can’t virtualize high-performance computing workloads. They require dedicated sets of resources, the workloads themselves are very heavy, and a lot of architectures never took virtualization into consideration.
Today, that view is pretty different. Virtualization technologies allow HPC and other workloads to use resources more efficiently and to scale further by bursting into the cloud. For example, VMs are a convenient way to package and deploy scientific applications across heterogeneous systems. Applications can be packaged with their required libraries and support programs, including (perhaps) a distributed file system that would otherwise be difficult or impossible to install without special privileges.
From a vHPC perspective, there are a number of key aspects to consider when creating a virtual HPC architecture.
The modern hypervisor has come a long way. So, if you’re in the HPC world and are still skeptical about creating a vHPC cluster, consider what it actually means to run compute-intensive code on a modern hypervisor. Because of direct paravirtualization optimizations, that workload is basically running on the bare-metal architecture. Yes, there is another level in the memory hierarchy, but powerful virtualization technologies have shown that this generally does not introduce performance issues, given hardware support from chip vendors. In fact, the continuing development of hardware acceleration techniques for virtualization is another major point to consider:
- Advancements around CPU virtualization
- Optimized memory virtualization
- Modern I/O virtualization techniques
These all help to reduce the overheads for applications running virtualized HPC workloads.
Resource and Data Control
Once moved to a virtualized environment, VM abstraction offers some additional benefits beyond being able to bring your own software onto the cluster. Separation of workloads into multiple VMs can add value as well:
- For organizations centralizing multiple groups onto a cluster or for teams with per-project data security issues (for example in a Life Sciences environment where access to genomic data may need to be controlled and restricted to specific researchers), VM abstraction offers security separation that isn’t available in traditional HPC environments. In those environments, the batch scheduler schedules jobs based on available compute resources, placing jobs from different teams within the same OS instance. Using multiple batch queues for separation results in lower cluster utilization and therefore isn’t a good approach.
- In bare-metal environments, running multiple users’ jobs within the same OS instance can result in more than just data leakage. If jobs disrupt the OS (fill /tmp, crash daemons, etc.), those failures can affect other, unrelated jobs. VM abstraction can protect a user’s jobs from failures caused by other workloads.
- Visibility into the resource layer is critical. Not only are you able to create cost-based scenarios for what end-users need to operate, you’re also enabling better security and data controls in your environment. For example, archiving a VM is an easy way to ensure that the exact software environment used for a given HPC workload is saved. Similarly, for academic and other institutions concerned about the reproducibility of their scientific research, or about subsequent auditing of their results, being able to save and later restore the exact software environment used during that research can be very important.
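To make the first point above more concrete, here is a minimal sketch of why splitting a cluster into fixed per-team batch queues tends to lower utilization compared with one shared pool (which VM-level separation makes workable). The team names, node counts, and demand figures are invented for illustration, and the model ignores real scheduler behavior such as backfill and priorities.

```python
# Hypothetical demand: how many nodes each team could keep busy right now.
DEMAND = {"genomics": 14, "chemistry": 2, "physics": 4}
TOTAL_NODES = 24  # assumed cluster size


def partitioned_utilization(demand, total_nodes):
    """Each team gets an equal, fixed slice of the cluster.

    Idle nodes in one team's slice cannot serve another team's jobs.
    """
    share = total_nodes // len(demand)
    busy = sum(min(wanted, share) for wanted in demand.values())
    return busy / total_nodes


def pooled_utilization(demand, total_nodes):
    """All teams draw from one shared pool of nodes."""
    busy = min(sum(demand.values()), total_nodes)
    return busy / total_nodes


if __name__ == "__main__":
    print(f"partitioned: {partitioned_utilization(DEMAND, TOTAL_NODES):.0%}")
    print(f"pooled:      {pooled_utilization(DEMAND, TOTAL_NODES):.0%}")
```

With these made-up numbers, the genomics team is capped at its 8-node slice while nodes in the other slices sit idle, so the partitioned cluster keeps 14 of 24 nodes busy and the pooled one keeps 20 of 24 busy.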
Virtualization and cloud computing play a big part in creating a powerful vHPC cluster. Imagine an ecosystem where commercial, government, and pharma platforms can dynamically scale their HPC workloads into a controlled cloud environment. Using these technologies, organizations are able to create private cloud environments with self-service portals. These portals then allow entire research groups to effectively check out a pre-configured vHPC cluster that has been sized precisely to their requirements. Now, imagine being able to scale this private cloud architecture into a hybrid cloud model.
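As a rough illustration of what "sized precisely to their requirements" could mean behind a self-service portal, the sketch below works out how many VMs of one shape a research group's request would need. The `ClusterRequest` and `VmShape` types, their fields, and the example numbers are all hypothetical; a real portal would also account for hypervisor overhead, interconnect, and storage.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterRequest:
    """What a research group asks for (hypothetical fields)."""
    total_cores: int
    mem_per_core_gb: float


@dataclass
class VmShape:
    """One VM template offered by the private cloud (hypothetical)."""
    vcpus: int
    mem_gb: float


def vms_needed(request: ClusterRequest, shape: VmShape) -> int:
    """Return the number of VMs that satisfies both the core and memory demand."""
    by_cores = math.ceil(request.total_cores / shape.vcpus)
    total_mem_gb = request.total_cores * request.mem_per_core_gb
    by_memory = math.ceil(total_mem_gb / shape.mem_gb)
    # Whichever resource runs out first determines the cluster size.
    return max(by_cores, by_memory)


if __name__ == "__main__":
    request = ClusterRequest(total_cores=256, mem_per_core_gb=4)
    shape = VmShape(vcpus=16, mem_gb=48)
    print(vms_needed(request, shape))
```

In this example the request is memory-bound: 16 VMs would cover the cores, but 1024 GB of memory needs 22 VMs of this shape, so a memory-heavier template would be the better fit.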
Is it time to virtualize your HPC workloads? Maybe. The power of the cloud, the ability to replicate data, and the need to process even more information make virtualization a very real option for HPC workloads. Remember, the nature of information and the ability to quantify it quickly will only continue to evolve. The digital world is ever-expanding, and more parallel applications are being deployed to compute very complex processes. Fears around overhead, system stability, and even management can largely be set aside, as virtualization technologies have reached a point where HPC systems can directly benefit from this type of architecture. The idea, after all, is to make your systems run more efficiently and help your IT functions better align with the goals of your organization.