At Data Center World 2023, Omdia’s Vlad Galabov and AMD’s Kumaran Siva discussed processors and AI, specifically how they affect performance, cost, power consumption, and sustainability.
This is Part 2 of their conversation at DCW.
The transcript follows below. Minor edits have been made for clarity.
Vlad Galabov: We’re talking about the latest process node, one of the very hot topics. What’s often correlated with process nodes and correlated with computing performance is AI. Could you talk about what people have found to be the most useful feature of a processor? Very often we talk about matrix multiplication and tend to obsess about benchmarking in our industry. But the reality is that people don't run benchmarks on their servers – they use real workloads. So, what aspects of AI application performance have you seen be impacted by what aspects of a processor?
Kumaran Siva: So, if you look at the AMD processors today, they're deployed with hyperscalers that are doing AI on them as we speak. So, we have some major companies that have talked about their AI usage. For example, Tencent and WeChat talked about how they utilized the recommendation engine solution. And many, many others are utilizing AMD's silicon for it.
Where we shine is that we provide just the best general-purpose processor. And you look at inference and it's a series of pipeline steps. So, you have to get the data, you have to pre-process the data, then you have to do the inference, and then you have to do something with the result. And so, that whole pipeline lends itself to having a very efficient core that scales. And that's the benefit that many of our customers have been able to take advantage of.
If you just focus on, for example, a benchmark or a matrix multiply, yeah, you can probably find more efficient solutions. But then end to end, it gets diluted to the point where it doesn't matter. And that is one of the keys for AI on both general-purpose CPUs and in particular the AMD architecture.
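The dilution described here follows Amdahl's law: speeding up one stage helps only in proportion to that stage's share of total time. A minimal Python sketch, assuming hypothetical per-request stage timings (illustrative numbers, not AMD or customer measurements) and a hypothetical helper name `end_to_end_speedup`:

```python
def end_to_end_speedup(stage_ms, stage, factor):
    """Amdahl's law: overall speedup when only `stage` runs `factor` times faster."""
    before = sum(stage_ms.values())
    # Only the chosen stage shrinks; every other stage costs the same as before.
    after = before - stage_ms[stage] + stage_ms[stage] / factor
    return before / after


# Hypothetical timings in milliseconds for the four pipeline steps
# (get data, pre-process, infer, act on the result). Assumed for illustration.
pipeline_ms = {
    "fetch": 30.0,
    "preprocess": 40.0,
    "inference": 20.0,
    "postprocess": 10.0,
}

speedup = end_to_end_speedup(pipeline_ms, "inference", 10.0)
print(f"10x faster inference -> {speedup:.2f}x end to end")
```

Under these assumed timings, inference is a fifth of the request, so even a tenfold gain on that step alone leaves most of the end-to-end time untouched.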
Vlad: Interestingly, I was with Tencent last week, and one of the things that they were looking at is figuring out how to innovate for sustainability – how to lower their power consumption for all workloads. I think what you're indicating is that it comes with the ingredients within the rack, with the server and the CPUs that you use. And most interestingly, they're also looking at how to reduce the physical infrastructure power consumption within their data center. Their early experiment is liquid cooling, and there's kind of a very nice synergy there in terms of being efficient while also running one of the hottest workloads in the world.
So, I think that we can't sit down in 2023 to talk about computing without talking about the elephant in the room. Since this time last year when we did our first Omdia Analyst Summit, one of the things that kind of took the world by storm is generative AI. ChatGPT has become mainstream. It was adopted so fast. And one of the things that we've done a lot as analysts is research how the training is done on ChatGPT. That's been quite well documented. It uses a high-performance computing cluster and an architecture that most of us are familiar with.
But from what we have seen as analysts, the big challenge is how to turn a solution like ChatGPT into a commercially viable product and to be able to do it cost-effectively – to run the inference cost-effectively, to run the inference in a way that you can be sustainable and not use too much power. Could you talk about AI inference as a workload? What requirements do you believe it has? How can we optimally run AI inference? Because I don't think it's just about computing; it’s also about cost. It's also about power efficiency and sustainability.
Kumaran: Yeah, absolutely. So, there are a few thoughts there. Obviously, AMD being a broad semiconductor provider, we have our GPUs, as well. So, we're involved in the AI trend and generative AI. So, we're involved in different aspects here. One of the things from a CPU standpoint you can think about again is that end-to-end picture, where it isn't just the response that you need to get back, but there's some analysis, some pre-processing. All of those add up as well as the actual inference itself. And so, CPUs do offer a unique value proposition.
Within the actual inference itself, we have partnered with folks working on sparsity. One company, Neural Magic, for example, has shown some tremendous results on Genoa CPUs – up to 1,000 times better than kind of off-the-shelf, unoptimized, ONNX Runtime type of code. And this opens up doors for even thinking about how you could do generative AI inference, perhaps on a CPU.
But in the broader picture, we’re seeing AI becoming part of a toolbox that programmers use. The headlines are taken up by ChatGPT, but the reality is that even in everyday programming, you’re starting to see smaller and medium-sized models come up. And these get integrated as part of the code flow, and then naturally fall into applications.
I think this is one of the ways that we're going to see, for example, enterprises adopting AI. In the ISV applications themselves, they will have smaller and medium models that are just used to help analyze data and to present data visuals more appropriately. I mean, you'll see right now, even with Microsoft Office, they have that little sidebar that shows all the different recommendations or how you want your slide to look, and that's probably a small AI model. So, you have things like that starting to kind of percolate their way into user interface design and data analytics. Those kinds of elements will start to proliferate.
Vlad: Oh, absolutely. One of the things that we looked at with my colleagues that specialize in looking at how you program an AI computer is that, in reality, there's a whole bunch of boring AI that we don't talk about that has tremendous business value. They've done a few case studies of just how much money a retail store has been able to save by deploying some very simple AI models. So, I think, yes, we did talk about the headlines of the coolest thing, but there's a lot of untapped business value in pretty simple AI models that could easily be run even on your phone. So, certainly on a highly efficient processor.
Kumaran: Yeah, absolutely.
Vlad: Well, we're very much looking forward to hearing more about the AMD sustainability journey and how one can utilize CPU innovation to change how the data center is structured. So, I look forward to you joining us at the Omdia Analyst Summit, and I thank you for your time.
Kumaran: Thank you very much, Vlad. It was a pleasure.