Jim McGann is vice president of information management company Index Engines. Connect with him on LinkedIn.
It’s a story that plays out in public view all too often. A laptop with an unencrypted hard drive full of personal client information is left in a cab, and the entire company goes into damage control.
This is a prime example of data being in the wrong place at the wrong time. But client records aren’t the only sets of data in the wrong place. Companies can have up to 80 percent of their data in the wrong place at the wrong time.
Beyond the breach, storing your data
Well beyond data breaches, having data in the wrong place eats into company resources, can add up to millions in unnecessary expenses and creates a considerable legal risk.
Data has a habit of dying where it was born. An email sent 10 years ago and backed up to a legacy tape is sitting in an offsite storage vault. The iTunes library the intern downloaded two years ago is saved on the user share server that’s backing up to disk.
Data dies where it was born: where it was first backed up or first used. As this data ages, it loses context and its location is often forgotten. The data becomes sensitive, useless and/or expensive, making its current home the wrong place at the wrong time.
Backup tape shouldn’t double as file storage and it’s definitely not an archive. Yet, there they are – boxes of tapes with nightly backups created for disaster recovery and doubling as storage and an archive in case of an eDiscovery event.
The fact is, as this data ages, its value is lost. On average, less than 5 percent of tape data has business or legal value. After five years, that figure drops to 1-3 percent. This percentage is likely composed of contracts, client/employee records and other sensitive content you can’t destroy. The likelihood of these documents being accessed again is slim, but they must be maintained for compliance and legal reasons.
Disk-based backup has been widely accepted because of its better features and convenience, but it has two faults. First, tapes are often still created off the back end for long-term retention, and second, the systems are backing up data with no business value.
Companies upgrade to premium disk-based backup technology only to revert to tape for long-term retention. The result is paying for expensive offsite storage and incurring costs to manage tapes, even though the environment is nominally “tapeless.”
Disk also isn’t the most economical storage platform, and when everything is backed up to disk, including lunch requests from the past year and the intern’s iTunes library, the cost of managing this environment adds up even faster.
In both cases, organizations need to set policy on their data, clean up what is no longer required and stop simply stockpiling legacy data in offsite storage. Otherwise, abandoned, personal, duplicate and low-value data keeps building up in long-term retention, in the wrong place.
Cloud migration has been popular because of its affordability, easier management and smaller physical footprint in the data center. But there is one thing the cloud shares with disk and tape: vendor lock-in.
With cloud companies popping up daily and gearing up for enterprise business, organizations have to be wary of depending on one cloud backup provider that may go out of business – with your data. The cloud provider likely holds the data in a proprietary format, making any move to another platform a lot more frustrating.
Companies also lose physical control of the data, which leads many of them to keep everything on internal servers so they can find data when they need it. This causes a massive expansion of server capacity, much of it filled with junk.
By cleaning out servers, particularly user share servers and those belonging to high-turnover departments, both storage cost and required capacity can be reduced.
Set parameters with data profiling
Data profiling takes all forms of unstructured files and document types, creating a searchable index of what exists, where it is located, who owns it, when it was last accessed and optionally what key terms are in it so companies can make smarter decisions about data retention and platform.
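To make the idea of a metadata index concrete, here is a minimal sketch in Python. It is not how any particular profiling product works; the function name `profile_tree` and the record fields are assumptions chosen for illustration. It walks a directory tree and records where each file is, who owns it (as a numeric user ID), how big it is and when it was last accessed.

```python
import os
from datetime import datetime, timezone


def profile_tree(root):
    """Walk `root` and return one metadata record (a dict) per file.

    Hypothetical helper for illustration only; a real profiling tool
    would also handle tapes, mail stores and full-text indexing.
    """
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            records.append({
                "path": path,                                   # where it is located
                "extension": os.path.splitext(name)[1].lower(), # rough file type
                "size_bytes": st.st_size,
                "owner_uid": st.st_uid,                         # who owns it (numeric ID)
                "last_accessed": datetime.fromtimestamp(st.st_atime, tz=timezone.utc),
                "last_modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
            })
    return records
```

Note that many file systems update access times lazily or not at all, so `last_accessed` should be treated as approximate when it feeds a retention decision.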
With a rich metadata or full-text index and Active Directory integration, content can be profiled and analyzed with a single click. High-level summary reports give instant insight into enterprise storage, providing knowledge of data assets that was not available before. During this process, mystery data can be managed and classified, including content that has outlived its business value or sensitive data that poses a corporate risk or liability.
Then data profiling enables companies to build and automate policies to manage this data. The built-in disposition capabilities within the engine make constructing and enforcing information management, compliance, defensible deletion or other retention policies simple, auditable and automated.
Set parameters around what data exists, who owns it, file type, when it was last accessed and where it’s located. Disposition options include migration to cloud or lower cost storage tiers, defensible deletion, archiving, and more. Identify what has value and should be kept for long-term preservation and eliminate the save-everything strategy.
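A policy of this kind can be sketched as a small rule function over profiled file records. The Python below is only an illustration: the record fields, the keyword and file-type lists, the five-year threshold and the disposition names are all hypothetical, and a real retention policy would be defined with legal and compliance teams.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rule inputs, chosen only for illustration.
PERSONAL_MEDIA = {".mp3", ".m4a", ".mp4", ".mov"}
RETAIN_KEYWORDS = ("contract", "employee", "client")


def choose_disposition(record, now, stale_after_days=5 * 365):
    """Return a disposition for one file record.

    `record` is a dict with "path", "extension" and "last_accessed"
    (a timezone-aware datetime). Rules are checked in priority order.
    """
    name = record["path"].lower()
    # 1. Content that looks like a business or legal record is preserved.
    if any(keyword in name for keyword in RETAIN_KEYWORDS):
        return "archive"
    # 2. Personal media has no business value.
    if record["extension"] in PERSONAL_MEDIA:
        return "delete"
    # 3. Untouched for years: move to cheaper storage rather than keep on disk.
    if now - record["last_accessed"] > timedelta(days=stale_after_days):
        return "migrate_to_lower_tier"
    # 4. Everything else stays where it is.
    return "keep"


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
song = {
    "path": "/share/intern/itunes/song.mp3",
    "extension": ".mp3",
    "last_accessed": now,
}
print(choose_disposition(song, now))  # delete
```

Keeping the rules in one ordered function makes the policy auditable: each file's outcome can be traced to the first rule it matched.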
The time is now
Data has a habit of dying where it was born because no policy is set around what to do with it. With new technology and the various storage platforms already maintained in the data center, it is the right time to manage, tier, remediate, encrypt and archive that data.
Policy setting can remediate legacy tape, keep disk from turning into tape and maintain server size, saving companies cost, capacity, legal risks and maybe even a data breach story on the nightly news.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.