Steve Wooledge is VP of Marketing, Arcadia Data.
The world of BI continues to change. These changes are driven by three major trends. Unfortunately, many organizations still approach BI with a traditional mindset and traditional models that just won’t work. In this article, I’ll discuss the trends shaping the future of BI, and take a look at a number of assumptions regarding traditional BI that we think organizations can do without.
Data is driving market disruption. Emerging companies today are focused more on data assets than physical assets. That’s big data disruption at work, providing competitive advantages in many different industries. Technologies like Apache Hadoop are leading the way for big data. In fact, leading analyst firms believe about 70 percent of organizations have implemented or are planning to implement this next generation of scale-out architecture. Depending on where they are in their deployment, though, many of those organizations are held up because they lack either the skill sets or the tools to get value out of this data tier once it’s in place. This means organizations will have to leverage technologies that simplify their big data implementations.
Companies want to build analytical applications that can scale and support thousands of users. The shift here is about making analytics an inherent part of daily operations. Organizations don’t want to recreate the world in terms of reporting analytics, but they want to go beyond that and build data applications that can scale and support hundreds of thousands of users in a customer-facing environment.
Organizations are getting value from their data lakes. With a traditional BI architecture, you’re moving data from one system to another to support analytics, either at the end-user desktop or on a BI server. There’s often a complex data pipeline of extraction, transformation, loading, and normalizing that puts data into the proper format. As a result, organizations often use data lakes for storage, rather than for the enterprise-wide analytics they should be using them for. A data pipeline that treats the data lake as just another data source loses much of the granularity in the data, as well as much of the performance of real-time access to the information.
Traditional Assumptions Hold Back Modern BI and Analytics
Organizations must change the way they think about BI because the shift is happening and is undeniable. Many leaders hold outdated assumptions about their BI environments that they take for granted. Given the rise of the aforementioned trends, these assumptions no longer apply and can diminish, if not destroy, a technology project’s chances of success. Here are several assumptions organizations must learn to do without when embarking upon modern BI and data analytics projects.
You have all your requirements ahead of time. One of the core assumptions in the traditional BI and data warehousing world is that you have your requirements ahead of time. The data is there; it’s collected, stored, and ready, with a schema imposed on top of it based on the business requirements. We know what’s in there, and that the data is all in one place. This presents some constraints that we just take for granted and don’t think about. For example, we often give limited consideration to feedback loops and change, especially in a traditional data warehouse used as a read-only repository. When you inevitably face a situation where the numbers don’t look right, tracking down the problem through the whole stack and fixing it involves a long cycle.
Your data is persisted. Sometimes the data isn’t persisted: it’s floating on the network, it’s streaming, and it doesn’t get recorded anywhere. Or it is recorded, but only for a very short period of time, so you don’t have a historical repository of that data. How can you deal with information needs like that? Historically, that problem has been left to some other department, rather than BI, so now you’re making two departments responsible for data, just because it happens to exist in two different locations or move at two different speeds. If you don’t know what data you need, how are you going to plan for it in advance?
Your data is static. People don’t just have static uses of data. They must do some exploratory analysis of the data in order to understand what’s going on. You then need to inform others of your discoveries. You want to identify the key metrics in the data or highlight the anomalies so you can convince someone to take action.
You have the data you need. This core assumption is flawed. In analysis, particularly when something new happens or someone has a new request, you first need to understand what data you need. But that data may exist outside the warehouse or data mart, on some external server. Depending on whether you know what data you need, and whether that data is known to be available, you may use different analysis techniques like querying, browsing, search, and exploration. Unfortunately, traditional BI addresses only querying, in which you know what data you need and you know where it’s available.
You can build bigger and bigger cubes. One model for analytics is to pre-cache query results via fixed cubing techniques à la the OLAP model. As your data grew, you could simply build more cubes. But the challenges of this model are well known, as you have to make copies of data and you have to know what questions are being asked ahead of time.
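To make the cube trade-off concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sales rows, the dimension names, and the build_cube helper are all hypothetical, and a real OLAP engine does far more than this. It shows that a pre-aggregated cube is a second copy of the data shaped by questions chosen in advance, so a question on a dimension nobody planned for means going back to the raw rows and building another cube.

```python
from collections import defaultdict

# Hypothetical raw fact rows: (region, product, channel, revenue).
SALES = [
    ("east", "widget", "web", 100),
    ("east", "gadget", "store", 150),
    ("west", "widget", "web", 200),
    ("west", "widget", "store", 50),
]

# Position of each (assumed) dimension inside a raw row.
DIMENSION_INDEX = {"region": 0, "product": 1, "channel": 2}


def build_cube(rows, dimensions):
    """Pre-aggregate revenue by the given dimensions.

    A tiny stand-in for a fixed cube: the result is a separate copy
    of the data that only knows about the dimensions chosen here.
    """
    cube = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMENSION_INDEX[d]] for d in dimensions)
        cube[key] += row[3]
    return dict(cube)


# Cube built ahead of time for the questions we expected to be asked.
region_product_cube = build_cube(SALES, ["region", "product"])

# An anticipated question is a cheap lookup against the cube.
west_widget_revenue = region_product_cube[("west", "widget")]

# An unanticipated question (revenue by channel) cannot be answered
# from that cube, because the channel detail was aggregated away.
# We have to return to the raw rows and build yet another cube.
channel_cube = build_cube(SALES, ["channel"])
web_revenue = channel_cube[("web",)]
```

Each new question of this kind adds another copy of the data to build and refresh, which is the cost the cube model carries as data volumes and the range of questions grow.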
You can bring all of your data to a dedicated BI environment outside the central repository. This approach won’t work at scale. While the intent is to gain performance in a dedicated environment, the architectural difference will prevent you from running analytics on large data sets with many concurrent users.
You can move all your data to the cloud. In this scenario, your analytics may be embedded in a database or in a server, and then you use a client tool in order to access it. But using the cloud doesn’t automatically give you the scale and performance you need. You need a platform that is architected to natively handle the scale of big data.
If these assumptions sound familiar to you, then perhaps you’ve already faced a number of challenges that keep you from delivering the BI insights your organization needs. If not, I hope that avoiding these common misconceptions will help you save your analytics projects.
Opinions expressed in the article above do not necessarily reflect the opinions of Data Center Knowledge and Informa.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.