Moore’s Law doesn’t appear to be working anymore.
In 1965, Intel co-founder Gordon Moore observed that there was a virtuous cycle to increasing the number of transistors “crammed,” as Moore himself put it, onto an integrated circuit. By managing the rate of decline in unit costs, he wrote, a manufacturer could effectively double that transistor count every 18 months to two years. Consumers’ expectations for performance increases would then meet that demand perfectly, ensuring the revenue necessary to justify the development and production expenditures.
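The doubling rule described above amounts to simple exponential growth. Here is a minimal sketch, assuming a hypothetical starting count of 1,000 transistors and a hypothetical function name; only the 18-month and two-year doubling periods come from the text.

```python
def projected_count(initial_count: float, years: float, doubling_period_years: float) -> float:
    """Project a count that doubles once every `doubling_period_years`."""
    return initial_count * 2 ** (years / doubling_period_years)

# Hypothetical starting point: 1,000 transistors, projected six years out.
fast = projected_count(1_000, 6, 1.5)  # doubling every 18 months: 4 doublings
slow = projected_count(1_000, 6, 2.0)  # doubling every two years: 3 doublings

print(f"18-month doubling: {fast:,.0f}")
print(f"24-month doubling: {slow:,.0f}")
```

Over the same six years, the shorter doubling period yields twice as many transistors as the longer one, which is why the difference between "18 months" and "two years" matters to anyone using the rule as a forecast.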
College texts still in use continue to teach Moore’s Law as a forecasting tool. Intel itself suggests the Law may be used to predict not only future processing power but also electricity use. In an e-book titled “The Second Era of Digital Retail” [PDF], the company writes, “Moore’s Law doesn’t just give you double the computing capability every few years. It can also be used to deliver the same computing capability at half the price, and half the power consumption.”
At one point, auto maker Toyota employed the double-every-18-months coefficient as a measure of how much it should increase the inventory of parts in its factories. Water treatment plant engineering firm Brown and Caldwell wrote, “It’s only a matter of time before Moore’s Law overtakes brick-and-mortar law,” suggesting the principle could soon dictate the rate of flow of concrete in pouring foundations for construction projects.
A Matter of Time
In the middle of the last decade, the laws of physics finally had their say, denying transistors the ability to be shrunk any further than they already were. Betrayed by what one songwriter called “the flash of a neon light that split the night,” industries and economies everywhere began dumping the Law like an electorate dumping a losing candidate.
“After June 2013, all segments in the Top500 basically slowed down to the new growth rates,” reported Erich Strohmaier, senior scientist at Lawrence Berkeley National Laboratory, and one of the “Four Horsemen” who oversee the semi-annual Top500 rankings of the world’s fastest supercomputers. “Those rates are substantially lower than before.”
For the Top500’s first 25 years in existence, said Strohmaier, the growth rate in supercomputer performance was not only predictable but constant: about 80 percent annually. There were fits and starts from year to year, but in three-year increments, the growth rate stayed firm. In the interval from 2002 to 2013, performance grew by a staggering 1000x.
“Then, in 2013, that collapsed very rapidly and very instantaneously,” he continued. “Since 2013, we see again exponential growth, but at a rate that is now much lower” — approximately 40 percent per year.
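To see what the two growth rates Strohmaier cites mean in practice, the sketch below converts an annual growth rate into a doubling time and into a cumulative gain over a three-year increment. The function names are hypothetical; the 80 percent and 40 percent rates are the ones quoted above.

```python
import math

def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant annual growth rate (0.80 means 80%)."""
    return math.log(2) / math.log(1 + annual_growth)

def cumulative_factor(annual_growth: float, years: float) -> float:
    """Total multiplication in performance after `years` of compound growth."""
    return (1 + annual_growth) ** years

for rate in (0.80, 0.40):
    print(
        f"{rate:.0%} per year: doubles every {doubling_time_years(rate):.2f} years, "
        f"{cumulative_factor(rate, 3):.1f}x over three years"
    )
```

At 80 percent a year, performance doubles roughly every 14 months; at 40 percent, it takes a little over two years. The same hardware budget therefore stays competitive noticeably longer under the slower rate.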
There were unforeseen consequences, especially for the commercial high-performance computing industry. Investments slowed. Activity lulled. Predictions about when already installed computing hardware would become obsolete were pushed further out. Gordon Moore wasn’t around anymore to bang the gong.
“Moore’s Law certainly is slowing down,” declared Professor Jack Dongarra of the University of Tennessee, Knoxville, co-creator of the High-Performance Linpack (HPL) algorithm upon which Top500 scores have been based since 1993. “It’s not ending; it’s slowing down. We’re not quite getting the same push from hardware.”
“This issue about Moore’s Law slowing down, it’s been with us for a long time,” remarked Arm CEO Simon Segars during a recent company conference. Since its inception, Arm has always been considered a rival, in one form or another, of Intel. That said, even among competitors in a field, a law is a law.
“While there’s still a lot of innovation going on in materials science and semiconductor development,” continued Segars, “and there’s new transistor structures being designed as we speak, one thing that’s been really important is how architecture needs to evolve. As we go forward, putting different die together that were manufactured under completely different processes in really sophisticated packaging, that all matters. Being able to stack die together, do true 3DIC [three-dimensional integrated circuitry] — these are the innovations that are going to keep semiconductor performance improving and improving, generation after generation.”
The emerging theme here is that the Law should be resuscitated, that materials science and manufacturing processes should do whatever is necessary to make the requisite leaps and bounds to bridge the gap between the past and the future, ensuring a smoothly paved route between yesterday’s performance and tomorrow’s. And while that may seem frivolous on the surface, keep this in mind: Without a consistent, back-of-the-napkin formula for projecting performance improvements, even if much of it is conjecture, systems analysts and data center operators will fall back on conservative guesstimates as to when their systems and servers will need replacement.
As Strohmaier’s chart depicts, that’s been happening for the last seven years.
“I don’t think that Moore’s Law is over yet, or will be over in the near future,” declared Shekhar Borkar, Qualcomm’s senior director of technology and a former Intel Fellow.
To back up that assertion, during a talk at the recent Supercomputing 2020 conference, Borkar submitted the above chart into evidence. While it suggests that the various observed qualities of microprocessors scale at separate rates from one another, they all do scale up — not linearly, but along some definable, upward trajectory. His implication is that the level itself is not the law, but rather that there may be, perhaps floating in space, some neutral coefficient of upward mobility, from which every other perceived performance factor takes its cue.
About 20 years ago, Borkar noted, architectures hit what he calls the “Power Wall.” At the same time, processor clock speeds (frequency) leveled off. That’s when the multicore era officially began. Multiplying processing cores on a die led to greater throughput performance, effectively continuing the performance scaling trend, just through other means.
Borkar’s theme is that engineers have typically taken the necessary steps to continue, or extend, Moore’s Law, even if it meant rethinking designs. Even though the Law has been taught to future systems analysts as a phenomenon that dependably just happens, history shows that significant investments and genuine sacrifices have been made over the last century just to keep up appearances. Those appearances are vitally necessary, it would appear, for the technology economy as a whole to thrive, because they reassure the purse-string holders for organizations that service life may be maximized, and obsolescence can be forecast.
So, what will engineers have to do this time to ensure a smooth glide path for post-pandemic growth? Borkar’s suggestions aren’t new, at least to the people who comprise his typical audience. But for the data center industry, which in recent years has been consuming technologies first proven in high-performance laboratories, what he proposes may sound as radical as abandoning Earth and colonizing Mars as a solution to climate change.
“In the longer term, ten to twenty years, we must look for new devices to replace CMOS,” stated Borkar, referring to the Complementary Metal-Oxide-Semiconductor fabrication process with which all commercial semiconductors are produced. “Along with that comes the circuits, the architectures that go with it, and the system-level research to use these new devices.”
Until that point, he suggests that there are ways to apply system-level innovations to improve CMOS performance up to its physical limits. But in the near term, during which “CMOS is it,” as Borkar put it, system efficiencies can be improved by simply deploying systems in more optimal locations. Without invoking the metaphor (to his credit), he suggests that edge computing, which distributes processing power to those geographical locations where networking distances are minimized, is among the very last system optimizations we can make to a CMOS-oriented world before its lifecycle as a scalable-performance architecture runs out.
Borkar also suggests that systems should continue to be designed for improved power efficiency. To that end, he advises that the current HPL benchmark used for ranking the Top500 doesn’t tell an efficiency story that end users may be looking for. In other words, enterprise customers down the road won’t be able to adopt HPC technologies with an eye towards power efficiency, because those technologies were designed using a metric that may have applied to a stand-alone Cray supercomputer in 1993, but not to anything made in this century.
Comparing the scores yielded by HPL tests for high-performing Top500 systems (upper row above) to scores for the same systems on the High-Performance Conjugate Gradients (HPCG) test (lower row), Borkar notes that HPCG scores are approximately 50 times lower. For example, the first chart on the top row indicates that systems over the years consistently tended to deliver 72 percent of their theoretical peak Linpack performance levels, meaning that their best measured performance, Rmax, came in at just below three-fourths of their theoretical maximum, Rpeak. Throughout the existence of HPCG, however, that performance delivery level has averaged just 2.3 percent, according to the first chart on the bottom row.
“The question I’m asking is, is this the value metric for today’s HPC apps?” posed Borkar, referring to HPL. “Is this what you think the end user is really looking for? Because if you believe a metric like HPCG is the one that represents the final value that the end user is going to get, then the first graph shows it’s only 2.3 percent. . . There’s a 50x difference from the metric that you believe is providing the value, versus a realistic metric as to what an end user is expecting. If you really want to go follow Moore’s Law, all you have to do is double that percentage — say, instead of 2 percent, 4 percent of Rpeak HPCG performance, and that would give you an equivalent doubling of Rmax performance. So, architect the system as an example of HPCG performance. . . even at the expense of Rmax.”
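Borkar’s point about doubling the delivered fraction of peak can be checked with a short sketch. The 72 percent and 2.3 percent fractions are the averages cited above, and the 2 percent and 4 percent figures come from his quote; the 100 PFLOP/s peak and the function name are hypothetical, chosen only for illustration.

```python
def delivered_performance(rpeak: float, fraction_of_peak: float) -> float:
    """Performance actually delivered, given theoretical peak and the fraction achieved."""
    return rpeak * fraction_of_peak

RPEAK_PFLOPS = 100.0  # hypothetical theoretical peak, in PFLOP/s

hpl = delivered_performance(RPEAK_PFLOPS, 0.72)    # average HPL share of peak
hpcg = delivered_performance(RPEAK_PFLOPS, 0.023)  # average HPCG share of peak

# Borkar's proposal: raise the HPCG share from 2 percent to 4 percent of Rpeak.
hpcg_now = delivered_performance(RPEAK_PFLOPS, 0.02)
hpcg_doubled = delivered_performance(RPEAK_PFLOPS, 0.04)

print(f"HPL delivers  {hpl:.1f} PFLOP/s of {RPEAK_PFLOPS:.0f}")
print(f"HPCG delivers {hpcg:.1f} PFLOP/s of {RPEAK_PFLOPS:.0f}")
print(f"Doubling the HPCG share multiplies delivered performance by {hpcg_doubled / hpcg_now:.1f}")
```

The peak itself never changes in this sketch. The gain comes entirely from using more of the hardware that is already there, which is the architectural trade-off Borkar is arguing for.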
How do we know for certain that HPCG represents some real-world expectation of efficiency, whereas HPL is a relic of the past? We asked Dongarra, who had a hand in creating both benchmarks.
“The Linpack Top500 benchmark solves a dense matrix problem,” Dongarra responded. “We have a matrix, all the elements are non-zero, and we use a standard algorithm — that’s what the benchmark prescribes. In 1979, that was a good way to measure computers. Today, it’s not so good, to be honest, because we don’t really solve dense matrix problems at scale.”
The HPL benchmark involves solving a dense system of linear equations, Dongarra told us, in which all the values in the matrix are populated. In real-world simulations, some or perhaps most matrix values are unpopulated. These simulations typically resolve Partial Differential Equations (PDE), which requires the form of the equations to be changed, the proper word for which is “discretized.” This way, instead of a continuous function which may not represent reality, the system resolves for discrete values. A related technique, which models a system as a sequence of discrete events, is called Discrete Event Simulation (DES). At Los Alamos National Laboratory, software developers have come up with a library they call SimCore [PDF], which stages DES for extremely complex simulations but opens up exploration into these simulations using a language as simple as Python.
“The idea,” said Dongarra, “would be to come up with the benchmark that looks at that problem — a large sparse matrix problem. Because that’s what our computers are used for.”
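The difference between the two kinds of problem can be illustrated with a toy example. The sketch below is not the HPCG benchmark; it is a minimal conjugate gradient solver in plain Python, applied to an assumed one-dimensional test problem whose matrix has at most three non-zero values per row. All function names are hypothetical.

```python
def poisson_1d(n):
    """Sparse matrix for a 1D discretized Laplacian; each row stores only its non-zeros."""
    rows = []
    for i in range(n):
        row = {i: 2.0}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        rows.append(row)
    return rows

def matvec(rows, x):
    """Multiply the sparse matrix by a vector, touching only stored non-zeros."""
    return [sum(value * x[col] for col, value in row.items()) for row in rows]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def conjugate_gradient(rows, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite sparse matrix A."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, with x starting at zero
    p = list(r)  # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = matvec(rows, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

n = 50
A = poisson_1d(n)
b = [1.0] * n
x = conjugate_gradient(A, b)

nonzeros = sum(len(row) for row in A)
residual = max(abs(bi - axi) for bi, axi in zip(b, matvec(A, x)))
print(f"stored non-zeros: {nonzeros} of {n * n} matrix entries")
print(f"largest residual: {residual:.2e}")
```

Only a small fraction of the matrix is ever stored or touched, so the work is dominated by irregular memory access rather than raw arithmetic. That is a plausible reason a sparse benchmark reports a far smaller share of peak than a dense one does.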
The problem is, if Shekhar Borkar’s prediction plays out, the methods and algorithms currently in production for resolving these problems on a real-world scale would likely have to be completely reconfigured once the architecture that replaces CMOS (whether it’s based on Josephson junctions, qubits, or something else) becomes imminent.
“Application scientists are going to have to work hard to implement their algorithms and applications on this next-generation of architectures,” Dongarra said, “because the architectural features of those machines are different than what we’ve used before, and cause us to have to rethink and re-implement our methods.” He noted that the US Department of Energy has already planned to invest $1.8 billion in new and innovative CMOS hardware, and another $1.8 billion in the software to drive it.
But if the road ahead for CMOS dead-ends, can we count on more investments of this magnitude in the future? We asked Tom Conte, who directs the Center for Research into Novel Computing Hierarchies at the Georgia Institute of Technology.
“In general, I think the inertia in maintaining current technologies based on CMOS is large,” Conte told DCK, “but also there’s needed investment for the technologies you point out. Luckily, there is a lot of investment in technologies that relate to this.” He cited government-funded contests in quantum computing research as one example, the “afterglow” of which, as he put it, benefits all other realms of computing research.
Emerging from the laboratory with a new approach to fabricating semiconductors, Conte believes, may take another decade, much of which may be spent, if Borkar and Dongarra are correct, tweaking software here and there to uncover small efficiencies along the way. “But after we get out of the lab,” he said, “it’ll be a pull. It’ll be that they really need these technologies, and they’ll end up in the marketplace.”
We need to believe in steady, predictable, calculable progress, even if it takes tumultuous, revolutionary, world-altering gambles under the surface just to keep up appearances.