Resource orchestration may be the most important software-based technology to impact the management, facilitation, and even the design of data centers this year. The ability to drive server utilization and keep data center footprints small depends, in very large part, upon whether cloud infrastructure systems — today, mainly OpenStack — make optimum use of storage, compute, memory, and bandwidth resources throughout the data center, once they’ve all been pooled together.
OpenStack’s resource orchestration component, called Heat and introduced with the “Havana” release back in October 2013, was the first to tackle automated orchestration of resources. But Heat originally used templates — effectively, scripted recipes for how to spin up a requested resource on-demand for the specific infrastructure that resource would require.
“Heat provides an orchestration service for OpenStack clouds,” explained OpenStack Foundation Executive Director Jonathan Bryce, in a note to Datacenter Knowledge. “A template-driven workflow engine allows a cloud user to describe exactly the set of resources that are needed for an application, and Heat will deploy and auto-scale those resources.”
Here’s the problem: In today’s virtualized data centers, all of these resource classes are variables. With that many variables in play, deployment is hard to automate unless simple scripts evolve into something closer to forecasting engines.
OpenStack began addressing this problem with last October’s “Liberty” release by introducing a feature that remains, even now, not well documented. Called Convergence or Convergence Engine (and begging to be called something else), this new feature of Heat would be capable of orchestrating workloads in something closer to real time, by way of an “observe-and-notify” approach to monitoring performance.
The idea here is to enable a database table to represent the desired state of the workload or the application — the properties it should exhibit, given the current conditions of the hosting platform and the OpenStack cluster as a whole. To accomplish this while, at the same time, continuing to support the templates approach that already existed, Heat’s Convergence would maintain separate database tables for resource properties observed and properties desired, enabling changes to be made to both by way of remote procedure calls (RPCs) placed by monitoring software.
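The observed-versus-desired idea can be illustrated with a minimal Python sketch. It is not Heat’s actual code or schema: the `StateTable` class, the `plan_convergence` function, and the resource names and properties are all hypothetical, and a plain dictionary stands in for each database table.

```python
from dataclasses import dataclass, field


@dataclass
class StateTable:
    """Hypothetical stand-in for a database table of resource properties."""
    rows: dict = field(default_factory=dict)  # resource name -> properties

    def update(self, name, props):
        # Stand-in for the RPC that monitoring software would place.
        self.rows[name] = dict(props)


def plan_convergence(desired, observed):
    """List the actions that would move the observed state toward the desired one."""
    actions = []
    for name, want in desired.rows.items():
        have = observed.rows.get(name)
        if have is None:
            actions.append(("create", name, want))
            continue
        # Only the properties that differ need to be changed.
        changed = {k: v for k, v in want.items() if have.get(k) != v}
        if changed:
            actions.append(("update", name, changed))
    for name in observed.rows:
        if name not in desired.rows:
            actions.append(("delete", name, {}))
    return actions


desired = StateTable()
observed = StateTable()
desired.update("web-1", {"flavor": "m1.small", "status": "ACTIVE"})
desired.update("db-1", {"flavor": "m1.large", "status": "ACTIVE"})
observed.update("web-1", {"flavor": "m1.small", "status": "ERROR"})
observed.update("cache-old", {"flavor": "m1.tiny", "status": "ACTIVE"})

for action in plan_convergence(desired, observed):
    print(action)
```

In this toy run, the plan repairs `web-1`, creates the missing `db-1`, and removes `cache-old`, which is the reconciliation step in miniature.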
OpenStack’s engineers didn’t say so at the time, but their new system was well aligned with a property that telecommunications engineers have been desiring for quite some time: intent-based configuration. Put another way, if the orchestrator is capable of expressing the properties it needs a workload to exhibit, the “convergence” would be the process of reconciling desire with practicality. (Nearly every successful marriage is forged on this principle.) This way, the orchestrator responds to the intent of the configuration as best it can.
“As an example, I could describe a Web application that requires 3 Web servers, 2 database servers, a caching server, and needs to run a post-install script to cluster the database servers together. Rather than running all of the calls independently, I can create a Heat template that describes the desired state and the Heat service will automatically deploy everything,” explained Bryce.
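Bryce’s example might be sketched roughly as follows. Heat templates are normally written in YAML; here the same structure is built as a Python dictionary so it can be inspected programmatically. The resource types `OS::Heat::ResourceGroup` and `OS::Nova::Server` are real Heat types, but the `server_group` helper, the flavor and image names, and the resource keys are assumptions made for illustration, and the post-install clustering script is left out.

```python
import json


def server_group(count, name_prefix, flavor, image):
    """Build a group of identical servers (helper name is hypothetical)."""
    return {
        "type": "OS::Heat::ResourceGroup",
        "properties": {
            "count": count,
            "resource_def": {
                "type": "OS::Nova::Server",
                "properties": {
                    # %index% is replaced by Heat with each member's index.
                    "name": name_prefix + "-%index%",
                    "flavor": flavor,  # assumed flavor name
                    "image": image,    # assumed image name
                },
            },
        },
    }


template = {
    "heat_template_version": "2015-10-15",
    "description": "3 Web servers, 2 database servers, 1 caching server",
    "resources": {
        "web_servers": server_group(3, "web", "m1.small", "ubuntu-14.04"),
        "db_servers": server_group(2, "db", "m1.large", "ubuntu-14.04"),
        "cache_server": server_group(1, "cache", "m1.medium", "ubuntu-14.04"),
    },
}

print(json.dumps(template, indent=2))
```

The template states only what should exist; the order of the API calls needed to get there is left to the orchestration service.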
The Telco Factor
Up to now, telcos and big data centers had been relegating OpenStack to orchestrating “low-priority” workloads, such as periodic accounting and database control, as opposed to actual service delivery. Their complaint had been that OpenStack hasn’t been able to scale to the speed and demands they require.
The Mitaka release addresses that complaint directly, said the OpenStack Foundation’s Bryce. “The updates to Heat allow the Heat service itself to be clustered across multiple machines,” he told us. “This allows horizontal scaling across a cluster of servers that can break a Heat template up and run the steps in parallel. This provides better performance and scaling for executing orchestration workflows, as well as better reliability.”
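The notion of breaking a template up and running its steps in parallel can be approximated with a dependency graph. The sketch below is only an analogy: it uses threads in one process rather than a cluster of Heat engines, and the dependency map, the `deploy` stub, and the `run_in_waves` function are hypothetical. It relies on `graphlib` from the Python standard library (3.9 or later).

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical map: resource -> resources it must wait for.
DEPENDS_ON = {
    "network": set(),
    "db-1": {"network"},
    "db-2": {"network"},
    "cache": {"network"},
    "db-cluster-script": {"db-1", "db-2"},
    "web-1": {"db-cluster-script", "cache"},
    "web-2": {"db-cluster-script", "cache"},
    "web-3": {"db-cluster-script", "cache"},
}


def deploy(name):
    """Stub for the real work of creating one resource."""
    return name + " deployed"


def run_in_waves(depends_on, max_workers=4):
    """Deploy every resource whose dependencies are met, one wave at a time."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            # Everything in a wave is independent, so it can run concurrently.
            list(pool.map(deploy, ready))
            sorter.done(*ready)
            waves.append(ready)
    return waves


for number, wave in enumerate(run_in_waves(DEPENDS_ON), start=1):
    print(number, wave)
```

Resources with no unmet dependencies share a wave, so the two database servers and the cache are deployed side by side, while the clustering script waits for both databases.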
Yet an intent-based configuration system would be another major step in their direction. At the Open Networking Summit a few weeks ago in Santa Clara, CA, Huawei Chief Architect David Lenrow told attendees that intent-based architecture could be the catalyst that would drive multiple telcos together toward forging a common standard for northbound interfaces (NBI) — a way for applications, especially SDN, to contact network components using a common API.
“What we’re really focusing on is intent-based networking,” said Lenrow, “and a different model for operating the network. We’re trying to establish that as this foundation for a common interface, and then get lots of major vendors and lots of major operators to work together and cause this interface to become widely deployed.” Once that deployment level reaches critical mass, he believes, both major and minor communications players will want to become involved simply because others in their space are doing so.
So there’s a lot at stake in how soon OpenStack’s open source contributors can get Convergence fully up to speed. For instance, the success of AT&T’s hyper-accelerated effort to modernize its service and data centers, by open sourcing its ECOMP service delivery platform, may rest upon, among many other matters, whether intent-based configuration is production-ready.
‘Use with Caution’
While OpenStack Liberty brought Convergence formally into the production stack, release notes published at the time warned that the tool “has not been production tested and thus should be considered beta quality – use with caution.” That came as a confusing signal for some who thought OpenStack had plenty of opportunity to perfect “beta-quality” code during the actual beta process.
Last Thursday’s formal release into general availability of the “Mitaka” edition of OpenStack came with some warm reassurances that Convergence has been battle-tested now, and is ready for prime time. But as OpenStack Foundation Chief Operating Officer Mark Collier tells Datacenter Knowledge, Mitaka’s propagation into the space of deployed OpenStack platforms may not be immediate.
“Upgrade urgency really depends on the individual user and their needs,” stated Collier in a note to DCK. “Now that we are on the 13th release, the compute, storage, and networking APIs, and code behind them, have been stable for several releases, so users certainly don’t feel forced to upgrade immediately. That said, one of the big improvements in the software is actually in the area of upgrades, to make those upgrades less painful.
“Secondly, Mitaka brings a lot of improvements in ease of use both for operators and end users, so those are as big of a draw as features per se and are based on feedback from those companies operating Juno, Kilo, and Liberty — so many are eager to put them to use. The last nuance to understand is that many users rely on downstream commercial distributions that typically take a few weeks to produce the latest release so that’s part of the timing to keep in mind.”
As Rackspace Senior Product Director Bryan Thompson told Datacenter Knowledge, Rackspace is currently concluding what it calls the “design process” for integrating OpenStack Mitaka into its Private Cloud services. As part of that process, it’s looking into the extent to which certain additions made to OpenStack some months earlier have matured — specifically, whether they’ve matured to the degree that Rackspace may consider them “fully supported.”
“All major releases, including new features in those releases, are rolled out only after thorough testing, documentation and training to enable our teams to fully support these components for our customers,” wrote Thompson. “Major upgrades (e.g., Juno to Liberty) are typically done between three to six months after GA by the OpenStack Foundation, as we work through all of the processes to update tooling and augmenting components to the new OpenStack bits, complete thorough testing, and roll out training and documentation for our supporting teams. We typically introduce at least one ‘minor’ release within a given OpenStack series, where we will introduce new projects or extended features to our prescribed deployment of OpenStack, and critical updates to address any vulnerabilities and/or high-impact defects are performed as needed, in the form of revision releases after testing.”
Bug fixes and security patches are considered critical, Thompson added, and can be rolled out quite soon; but every other stage requires a thorough testing process. It seems as though there are now two testing phases for OpenStack. The first is conducted by the open source developers; the second, by implementers working with code that OpenStack itself has made generally available, even if it warns from time to time that the code may be only “beta-quality.”
If you do the math, a feature being worked into OpenStack may take between 18 months and two years to be considered “mature.” Historically speaking, that’s actually a short period of time. And for data centers, two years of thorough testing may be an absolute requirement anyway. But for some customers in the telco field, especially AT&T, whose milestone dates for ECOMP still read “2017,” two years may be too long.