Figen Ülgen, PhD, is General Manager of Rack Scale Design for Intel's Data Center Group.
As we saw in my previous Industry Perspective, many aspects of model training—a foundational technique for many deep learning and artificial intelligence (AI) use cases—require the computational horsepower and throughput of high performance computing (HPC). Fortunately, much of this work is highly parallelizable. By taking advantage of HPC and parallel processing for model training, AI developers can improve both the turnaround time and the accuracy of their working models. Developers can also create models capable of tackling more complex inferences and solving harder problems by churning through more, and deeper, layers in their neural networks.
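To make the parallelism concrete, here is a minimal sketch of data-parallel training—the pattern HPC clusters scale across many nodes (typically via MPI rather than a single machine's process pool). All function names and parameters below are illustrative, not part of any specific framework: each worker computes the gradient on its own shard of the data, and the partial results are summed.

```python
# Hypothetical sketch of data-parallel gradient computation for a
# one-parameter linear model y = w * x with mean-squared-error loss.
from multiprocessing import Pool

def shard_gradient(args):
    """Gradient of squared error w.r.t. w, summed over one data shard."""
    w, shard = args
    # d/dw of (w*x - y)^2, summed over the shard's samples
    return sum(2 * (w * x - y) * x for x, y in shard)

def parallel_gradient(w, data, workers=4):
    """Split data into shards, compute partial gradients in parallel, average."""
    shards = [data[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        partials = pool.map(shard_gradient, [(w, s) for s in shards])
    return sum(partials) / len(data)  # mean gradient over all samples

if __name__ == "__main__":
    data = [(x, 3.0 * x) for x in range(100)]  # ground truth: y = 3x
    w = 0.0
    for _ in range(50):  # plain gradient descent on the parallel gradient
        w -= 1e-4 * parallel_gradient(w, data)
    print(round(w, 2))  # converges toward the true weight, 3.0
```

On a real cluster the shards live on different nodes and the partial gradients are combined with a collective reduction, but the division of labor is the same—which is why model training maps so naturally onto HPC resources.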
But for many data scientists and domain experts who want to incorporate AI capabilities into their business or research projects, HPC resources are too difficult to access. These experts may have a powerful cluster available for training their deep learning models, but they cannot tap its parallel power unless their data center, cloud service provider, or algorithms provide ready access to the cluster's parallelization assets. Instead, these teams have little choice but to run their codes sequentially, or to climb a steep learning curve and construct a parallel environment themselves.
Simplifying HPC Adoption with OpenHPC
If we are going to achieve the vision of AI everywhere, we have to make it easier for business and research projects to leverage the power of HPC for compute-intensive AI tasks such as model training. Fortunately, the HPC community has focused intensely for more than a decade on removing entry barriers and democratizing access to HPC, leading to progress on a number of fronts.
One that I’ve been closely involved with has led to the establishment of OpenHPC. Launched by the Linux Foundation in November 2015, OpenHPC has grown to become a global ecosystem of more than 30 organizations working together to build a consistent software platform for HPC systems.
OpenHPC is an open, vendor-neutral software stack for HPC infrastructure. Much as Linux has spurred rapid innovation by providing an open-source operating system for scientific and technical computing, OpenHPC fosters HPC innovation by acting as a virtual operating system for HPC software on multiple platforms running Linux. Organizations can download pre-built software ingredients from the OpenHPC package repository, as well as use validated recipes to create standards-based HPC platforms from the ground up. The stack includes Linux operating systems, I/O libraries and services, numerical and scientific libraries, and provisioning and management tools, among other components. This helps reduce the work of creating and maintaining HPC systems on the software side while exposing the full parallel power of the underlying hardware.
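As a rough illustration of how those pre-built ingredients come together, the flow of an OpenHPC master-node recipe looks roughly like the following. This is a sketch only: the package names follow the OpenHPC 1.x install guides for RPM-based distributions, the ohpc-release RPM must first be obtained from the OpenHPC download page, and the current install guide for your distribution is the authoritative reference.

```shell
# Sketch of the OpenHPC bare-metal recipe flow on a master node
# (package names per the OpenHPC 1.x install guides; verify against
# the current guide for your distribution and version).
yum install -y ohpc-release        # release RPM enabling the OpenHPC repo
yum install -y ohpc-base           # core software stack for the master node
yum install -y ohpc-warewulf       # Warewulf node provisioning tools
yum install -y ohpc-slurm-server   # Slurm resource manager (server side)
```

From there, the validated recipes walk through provisioning compute-node images and bringing up the resource manager, so an administrator assembles a working cluster from tested meta-packages rather than building each component from source.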
Building on OpenHPC
AI researchers, data scientists and domain experts: Don’t be afraid of HPC. Recognize its ability to improve model training. Talk with your data center manager or cloud service provider about your need to access parallel computing constructs and HPC resources. Encourage them to take advantage of the OpenHPC software stack to help facilitate your work.
Chief information officers, data center managers, and cloud service providers: Work with your users to understand their AI roadmaps and assess how your IT environment can best support them. Implement scalable, industry-standard infrastructure and a consistent architecture across model training, inference deployment, and other data center workloads. Use OpenHPC software or a commercial version to simplify system administration, including setting up, monitoring, maintaining, and managing an integrated HPC platform. Provide users with easy, parameter-based access so they can focus on their research and development objectives without needing deep-dive knowledge of the platform hardware.
Commercial algorithm developers: AI and deep learning are picking up industry momentum. Recognize the opportunities AI presents for your customers. Add value to your algorithms and frameworks by layering them on top of the OpenHPC software platform. You’ll simplify your own workload and maximize your development efforts while empowering users to concentrate on (and accelerate progress in) their areas of expertise. Don’t overlook the need for innovation to simplify data cleansing—you’ll earn customer loyalty and market share if you can crack this nut!
This is an exciting time for the AI and HPC communities. By embracing HPC’s power to advance AI and by using the OpenHPC stack to facilitate access to HPC resources, we can help create a smarter world and realize the vision of the AI revolution.
Opinions expressed in the article above do not necessarily reflect the opinions of Data Center Knowledge and Informa.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating.