By David Flynn, CTO, Primary Data
Many enterprises today are adopting a “cloud first” mentality that directs IT to evaluate whether cloud storage is a workable option for every request it receives. This is understandable, as the cloud offers many benefits, including facilitating collaborative work, increasing flexibility and agility with elastic performance and capacity, and providing a cost-effective data archive. In fact, Sid Nag of Gartner reports, “Growth of public cloud is supported by the fact that organizations are saving 14 percent of their budgets as an outcome of public cloud adoption.”
Yet the implementation of “cloud first” policies remains slow. As Nag also noted, “…the aspiration for using cloud services outpaces actual adoption. There's no question there is great appetite within organizations to use cloud services, but there are still challenges for organizations as they make the move to the cloud.”
Many factors hold enterprises back from the cloud, but accelerating cloud adoption can be much easier than it is today. Let’s take a closer look at how.
See What Data Is Hot and What Is Not
Enterprises commonly begin their cloud initiatives with archival, as moving data that is no longer being used is low risk. However, low risk does not mean cloud archival projects are easy. IT must conduct extensive research into which applications are retired and where their data is located. They must then identify which storage resources are co-hosting business-critical data, and plan migrations around active application use. Finally, IT must schedule and perform the migration to the cloud during off-hours to protect business continuity. This can take time, since data is typically migrated to the cloud over limited internet bandwidth. In fact, some enterprises have even run sneakernets, physically transporting storage media, to ensure data can be moved quickly and securely without interrupting business.
A metadata engine makes this process much simpler. As a data management software layer, it can enable enterprises to add cloud storage as another tier in a global namespace. Once the cloud storage is added, the metadata engine can automatically move cold data to the new cloud resource, according to policies set by admins. For example, a metadata engine can automatically track data activity and archive any data that has not been active within a time window that IT defines, such as 30 days, six months, or three years. Data can move between on-premises storage and one or multiple clouds without disrupting an application's access, even while the data is in flight.
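To make the idea of an activity-based archive policy concrete, here is a minimal sketch in Python. It is not the implementation of any particular metadata engine; the FileRecord type, the tier names, and the select_cold_files function are hypothetical names chosen for illustration, and the sample paths and dates are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Hypothetical metadata entry: where a file lives and when it was last read."""
    path: str
    last_access: datetime
    tier: str = "on_premises"


def select_cold_files(records, now, window):
    """Return on-premises records whose last access is older than the window."""
    cutoff = now - window
    return [
        r for r in records
        if r.tier == "on_premises" and r.last_access < cutoff
    ]


# Illustrative data only: one recently used file and two idle ones.
now = datetime(2017, 6, 1)
records = [
    FileRecord("/projects/q1_report.doc", datetime(2017, 5, 25)),
    FileRecord("/projects/2015_audit.xls", datetime(2015, 11, 2)),
    FileRecord("/media/launch_video.mov", datetime(2016, 9, 14)),
]

# Policy set by an admin: archive anything idle for roughly six months.
cold = select_cold_files(records, now, timedelta(days=180))
cold_paths = [r.path for r in cold]
```

With a 180-day window, only the two idle files are selected. Note that the decision is driven by last access rather than creation date, so a file that was created years ago but read last week stays on premises.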
Importantly, a metadata engine can help IT archive data to the cloud more intelligently than typical archival solutions. First, rather than base movement decisions on simple file creation dates, as is common with popular archival tools, a metadata engine can see whether data is being accessed at all (either by applications or users) and keep it on premises if so. Second, data is migrated to the cloud only when movement won’t impact other running applications. This protects business continuity, while allowing archive migration to occur around the clock, without IT intervention. Transfer times can be reduced through WAN optimization techniques that de-duplicate and compress data before it’s sent to the cloud, while security can be ensured through encryption for both in-flight data and data at rest.
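The effect of de-duplicating and compressing data before transfer can be sketched with Python's standard library. This is an illustrative example under simple assumptions (fixed-size chunks, SHA-256 digests, zlib compression), not a description of how any specific WAN optimization product works; the function names are hypothetical, and encryption is left out to keep the sketch short.

```python
import hashlib
import zlib


def prepare_for_transfer(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep one compressed copy per unique chunk.

    Returns (unique_chunks, recipe): a dict of digest -> compressed chunk,
    and the ordered list of digests needed to rebuild the original data.
    """
    unique_chunks = {}
    recipe = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in unique_chunks:
            # Only chunks not seen before are compressed and queued for sending.
            unique_chunks[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return unique_chunks, recipe


def restore(unique_chunks, recipe) -> bytes:
    """Rebuild the original data from the unique chunks and the recipe."""
    return b"".join(zlib.decompress(unique_chunks[d]) for d in recipe)


# Illustrative payload: three identical blocks followed by one different block.
payload = (b"A" * 4096) * 3 + (b"B" * 4096)
unique_chunks, recipe = prepare_for_transfer(payload)
bytes_to_send = sum(len(c) for c in unique_chunks.values())
```

Here only two unique chunks need to cross the wire instead of four, and each is compressed first, so bytes_to_send is far smaller than the original payload while restore still returns the data unchanged.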
Application Awareness Delivers Cloud Cost Savings for Active Applications
IBM has reported that, “approximately 75 percent of the data stored is typically inactive, rarely accessed by any user, process or application. An estimated 90 percent of all data access requests are serviced by new data—usually data that is less than a year old.” This means that most of the storage capacity used by applications is being wasted on data that is not being accessed.
Of course, most enterprises would love to extend cloud cost savings beyond retired applications, but it’s easy to understand why they don’t. Three key challenges make it difficult to move cold data that is associated with a live application to the cloud. First, if applications need the data again, IT must scramble to restore it to on-premises storage. Second, as public cloud providers typically charge for bandwidth to retrieve stored data, enterprises must consider when the cost to restore data outweighs the cost savings. Third, cloud archive data is typically stored as objects, which means that applications must be modified to use retrieved object data.
A metadata engine can resolve all three challenges. It ensures all data in its global namespace remains accessible and can automatically retrieve it should applications need it again. Bandwidth charges can be minimized since a metadata engine can offer the ability to retrieve single files. Conventionally, if a company needed to restore a single file from a backup, it would still need to pay the bandwidth charge to move the entire backup bundle on premises and then rehydrate the bundle to restore the file. These bandwidth charges can be substantial if the dataset contains video and audio files. The ability to keep data accessible as files also means that enterprises don’t have to modify applications to use object data.
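The trade-off between storage savings and retrieval charges comes down to simple arithmetic, sketched below in Python. All figures are hypothetical placeholders rather than any provider's actual pricing, and the function names are illustrative; amounts are kept in whole cents to avoid rounding issues.

```python
def retrieval_cost_cents(retrieved_gb: int, egress_cents_per_gb: int) -> int:
    """Bandwidth charge for pulling data back from the cloud."""
    return retrieved_gb * egress_cents_per_gb


def net_saving_cents(stored_gb: int, months: int, saving_cents_per_gb_month: int,
                     retrieved_gb: int, egress_cents_per_gb: int) -> int:
    """Storage savings over the period minus the cost of one retrieval."""
    saved = stored_gb * months * saving_cents_per_gb_month
    return saved - retrieval_cost_cents(retrieved_gb, egress_cents_per_gb)


# Hypothetical inputs: a 500 GB backup bundle archived for 12 months,
# saving 2 cents per GB per month, with retrieval billed at 9 cents per GB.
BUNDLE_GB = 500
SINGLE_FILE_GB = 2
EGRESS_CENTS_PER_GB = 9

whole_bundle = retrieval_cost_cents(BUNDLE_GB, EGRESS_CENTS_PER_GB)
single_file = retrieval_cost_cents(SINGLE_FILE_GB, EGRESS_CENTS_PER_GB)

net_if_bundle = net_saving_cents(BUNDLE_GB, 12, 2, BUNDLE_GB, EGRESS_CENTS_PER_GB)
net_if_file = net_saving_cents(BUNDLE_GB, 12, 2, SINGLE_FILE_GB, EGRESS_CENTS_PER_GB)
```

Under these assumed numbers, restoring the whole bundle to recover one file costs 4,500 cents in bandwidth, while retrieving just the 2 GB file costs 18 cents, which leaves nearly all of the year's storage savings intact.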
In its 2017 Roadmap for Storage, Gartner predicts that “by 2021, more than 80 percent of enterprise unstructured data will be stored in scale-out file system and object storage systems in enterprise and cloud data centers, an increase from 30 percent today.” Using a metadata engine to manage data across the enterprise and into the cloud can make this transition simple. Petabyte-scale enterprises gain the ability to automate the movement of data from creation to archival across all storage types, including the integration of public clouds as an active archive. Many core management tasks can also be automated, making it easy for companies to maximize storage efficiency and cost savings with the cloud, while ensuring the performance and protection required to meet service levels.
Opinions expressed in the article above do not necessarily reflect the opinions of Data Center Knowledge and Penton.