Nine Main Challenges in Big Data Security

Aleksandr Panchenko, A1QA

Aleksandr Panchenko is Head of the Complex Web QA Department at A1QA.

Every year the protection of private and confidential information gains more and more attention. According to the World Quality Report 2015-16, the only global report for application quality, security is the most highly ranked priority in the IT strategies used by survey respondents.

Until recently, a company’s applications were mainly internal, and their security was viewed as a low-risk concern. However, with the increased adoption of web-based, mobile and cloud-based applications, sensitive data has become accessible from different platforms. These platforms are highly vulnerable to hacking, especially if they are low-cost or free.

Nowadays, organizations are collecting and processing massive amounts of information. The more data is stored, the more vital it is to ensure its security. A lack of data security can lead to great financial losses and reputational damage for a company. As far as Big Data is concerned, losses due to poor IT security can exceed even the worst expectations.

What are the Main Challenges When it Comes to Big Data Security?

Almost all data security issues arise because antivirus software and firewalls do not provide effective protection for Big Data. These systems were developed to protect a limited scope of information stored on a hard disk, but Big Data goes beyond hard disks and isolated systems.

Nine Big Data Security Challenges

  1. Computations in most distributed systems have only a single level of protection, which is not recommended.
  2. Non-relational databases (NoSQL) are actively evolving, making it difficult for security solutions to keep up with demand.
  3. Automated data transfer requires additional security measures, which are often not available.
  4. When a system receives a large amount of information, that information should be validated so it remains trustworthy and accurate; in practice, this validation doesn’t always occur.
  5. Unethical IT specialists practicing information mining can gather personal data without asking users for permission or notifying them.
  6. Access control encryption and connections security can become dated and inaccessible to the IT specialists who rely on it.
  7. Some organizations cannot, or do not, institute access controls that separate levels of confidentiality within the company.
  8. Recommended detailed audits are not routinely performed on Big Data due to the huge amount of information involved.
  9. Due to the size of Big Data, its origins are not consistently monitored and tracked.
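
Challenges 4 and 9 in the list above (validating incoming data and tracking its origins) can be illustrated together. The following is a minimal Python sketch, not a production design: the schema fields (`user_id`, `amount`), the `source` label and all function names are hypothetical and chosen only for illustration. It rejects records that fail a basic schema check and attaches a source label, ingestion time and content hash to each accepted record.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical schema: field name -> expected Python type (an assumption for illustration).
EXPECTED_FIELDS = {"user_id": str, "amount": float}


def validate_record(record: dict) -> list[str]:
    """Return a list of validation problems; an empty list means the record passed."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def with_provenance(record: dict, source: str) -> dict:
    """Wrap a record with its origin, ingestion time, and a content hash."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return {
        "data": record,
        "provenance": {
            "source": source,
            "ingested_at": datetime.now(timezone.utc).isoformat(),
            "sha256": hashlib.sha256(payload).hexdigest(),
        },
    }


def ingest(records: list[dict], source: str) -> tuple[list[dict], list[dict]]:
    """Split incoming records into accepted (with provenance) and rejected (with reasons)."""
    accepted, rejected = [], []
    for record in records:
        problems = validate_record(record)
        if problems:
            rejected.append({"data": record, "problems": problems})
        else:
            accepted.append(with_provenance(record, source))
    return accepted, rejected


if __name__ == "__main__":
    good = {"user_id": "u-1", "amount": 9.5}
    bad = {"user_id": 42}  # wrong type and a missing field
    accepted, rejected = ingest([good, bad], source="sensor-feed-A")
    print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the rejected records together with the reasons for rejection, rather than silently dropping them, gives auditors something to review later. The content hash makes it possible to detect whether an accepted record was altered after ingestion.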

How Can Big Data Security be Improved?

Cloud computing experts believe that the most reasonable way to improve the security of Big Data is through the continual expansion of the antivirus industry. A multitude of antivirus vendors, offering a variety of solutions, provides a better defense against Big Data security threats.

Refreshingly, the antivirus industry is often touted for its openness. Antivirus software providers freely exchange information about current Big Data security threats, and industry leaders often work together to cope with new malicious software attacks, providing maximum gains in Big Data security.

Here are some additional recommendations to strengthen Big Data security:

  • Focus on application security, rather than device security.
  • Isolate devices and servers containing critical data.
  • Introduce real-time security information and event management.
  • Provide reactive and proactive protection.
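
As a rough illustration of the real-time event management recommendation above, the sketch below flags a source that produces too many failed logins within a sliding time window. It is a simplified assumption of how one such rule might look, not a description of any particular SIEM product: the class name `FailedLoginMonitor`, the thresholds and the sample IP address are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Flag a source when it exceeds max_failures within window_seconds."""

    def __init__(self, max_failures: int = 3, window_seconds: float = 60.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # source -> timestamps of recent failures

    def record_failure(self, source: str, timestamp: float) -> bool:
        """Record one failed login; return True if the source should raise an alert."""
        recent = self._events[source]
        recent.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_failures


if __name__ == "__main__":
    monitor = FailedLoginMonitor(max_failures=3, window_seconds=60.0)
    alert = False
    for t in (0.0, 10.0, 20.0, 30.0):
        alert = monitor.record_failure("10.0.0.5", t)
    print("alert raised:", alert)
```

A rule like this is reactive on its own; pairing it with proactive measures, such as temporarily blocking the flagged source, is a separate design decision.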

What’s Next for Big Data Security?

Of immediate concern to companies using Big Data is the security of cloud-based systems. Intel Security recently published the McAfee Labs Threat Predictions Report, which contains its expectations for the near future of data security. Of particular concern in this report is the supposition that legitimate cloud file hosting services, such as Dropbox, Box and Stream Nation, are at risk of being used as control servers in upcoming cyber espionage campaigns. If targeted, these popular cloud services could enable malware to transfer commands without raising suspicion.

Malicious attacks on IT systems are becoming more complex and new malware is constantly being developed. Unfortunately, companies that work with Big Data face these issues on a daily basis. Nevertheless, every problem has a solution and finding an effective and suitable answer for your organization is indeed possible.

Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.

Comments


  1. I agree that “encryption and connections security can become dated and inaccessible to the IT specialists who rely on it.” But new data-centric security technologies such as tokenization have been developed to protect data at a highly granular level, without limiting the data’s value potential in analytics and other business processes.

    I recently read the Gartner report "Big Data Needs a Data-Centric Security Focus", which concludes: "In order to avoid security chaos, Chief Information Security Officers (CISOs) need to approach big data through a data-centric approach." Gartner also stated that "the market has so far failed to offer CISOs the data-centric audit and protection (DCAP) products they need to operate across all silos with consistency."

    The good news is that Big Data distributions, like Hortonworks, recently started to include the type of advanced security features that Gartner is recommending, including dynamic masking, fine-grained encryption and data tokenization.

    I think that several different data protection options are needed to support different use cases and provide the performance and scalability we expect from Big Data. Here are several data-centric approaches that can be useful:

      • Apply data protection at database, application or file level outside Hadoop.
      • Transfer data to a staging area (edge node) and apply data protection outside Hadoop.
      • Apply volume-level encryption within Hadoop.
      • Extend HBase, Pig, Hive, Flume and Sqoop job functions using a data protection API within Hadoop.
      • Extend the MapReduce framework with a data protection API within Hadoop.
      • Apply transparent HDFS folder and file encryption.
      • Import de-identified data into Hadoop.
      • Export de-identified data for input into BI applications.
      • Export identifiable data to trusted sources.
      • Export audit data for monitoring, reporting and analysis.

    Ulf Mattsson, CTO Protegrity

  2. Joe Colby

    Good article, thanks for sharing. The problem I see with reliance on antivirus and firewalls is that they are signature-based. I believe another approach, combining machine learning (to effectively handle the massive amounts of big data) with behavior analytics, could fill the gaps that antivirus and firewalls leave. Big data will only be getting bigger, and attackers will continually evolve their attack plans.