I build data warehouses, I understand why they’re important, I make a living from them… I also see that traditional, relational data warehouses are on the way out. Their demise is coming from a few technological advances but the biggest one is the growing use of in-memory reporting technologies, like QlikView.
Attributes of New Reporting Technologies
I’ve been working with QlikView for some time now, as well as with some clients that are in the process of adopting it. Here are some attributes of QlikView, and of similar tools, that are killing the traditional data warehouse:
- They contain their own, non-relational, self-managing data stores.
- They can import data from multiple sources into a single, accessible data store.
- They join related data together, like a relational database.
- They provide predictable, blisteringly fast query performance.
- They provide easy-to-use, friendly user interfaces.
- They can contain, and rapidly summarize, atomic-level, granular data.
- They can be incrementally refreshed, enabling the storage of history.
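To make a few of these attributes concrete, here is a minimal Python sketch of the general idea: data from two sources loaded into one in-memory store, joined on a shared key, summarized from atomic rows, and refreshed incrementally so that earlier loads are kept. This is a toy illustration only, not how QlikView or any other product works internally. The `InMemoryStore` class, the `total_by` helper, the table names and the sample data are all hypothetical.

```python
from collections import defaultdict


class InMemoryStore:
    """Toy store (hypothetical): each table is a list of row dicts held in memory."""

    def __init__(self):
        self.tables = defaultdict(list)

    def refresh(self, table, rows, load_date):
        """Append rows stamped with a load date; earlier rows are never overwritten."""
        for row in rows:
            self.tables[table].append({**row, "load_date": load_date})

    def join(self, left, right, key):
        """Match rows of two tables on a shared key, like a relational inner join."""
        index = defaultdict(list)
        for row in self.tables[right]:
            index[row[key]].append(row)
        joined = []
        for left_row in self.tables[left]:
            for right_row in index.get(left_row[key], []):
                # On a column name clash (e.g. load_date), the left row's value wins.
                joined.append({**right_row, **left_row})
        return joined


def total_by(rows, group_field, value_field):
    """Summarize atomic rows: sum value_field for each distinct group_field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    return dict(totals)


store = InMemoryStore()

# Source 1: a made-up customer system.
store.refresh(
    "customers",
    [{"customer_id": 1, "region": "East"}, {"customer_id": 2, "region": "West"}],
    load_date="2024-01-01",
)

# Source 2: a made-up order system, loaded in two increments.
store.refresh("orders", [{"customer_id": 1, "amount": 100.0}], load_date="2024-01-01")
store.refresh(
    "orders",
    [{"customer_id": 2, "amount": 40.0}, {"customer_id": 1, "amount": 60.0}],
    load_date="2024-01-02",
)

sales_by_region = total_by(
    store.join("orders", "customers", "customer_id"), "region", "amount"
)
print(sales_by_region)  # {'East': 160.0, 'West': 40.0}
```

Note that the join only works because both made-up sources already share `customer_id`, which is the same common-key condition that matters for the real tools.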
Attributes of a Data Warehouse
So, how does this lead to the demise of the data warehouse? Bill Inmon originally defined a data warehouse as a “subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management decisions.” In layman’s terms, a data warehouse is a database used for reporting and analysis, containing data that’s been collected from various data sources.
More Importantly – Goals of a Data Warehouse
More important than definitions, however, are the goals of the data warehouse:
- To give business people speedy access to data for business intelligence.
- To eliminate the slowness that can be associated with reporting summary data out of complex, source databases.
- To protect the performance of the source databases by offloading compute-intensive reporting to other computers.
- To make reporting easy and user-friendly.
- To provide an integrated view of the organization; to make it appear as though its data weren’t spread across a bunch of separate systems; to make it look like the company was really operating from one, central database.
- To save and provide access to history that is frequently discarded or overwritten in source systems.
Have no doubt, a well-designed data warehouse can be great at doing these things – at great cost and with significant complexity.
Do New Reporting Technologies Meet These Goals?
So, can a tool like QlikView replace a traditional data warehouse? Well, I frankly see nothing on the above list of goals that these tools can’t do, especially if a company has a master data management program in place that ensures its systems already share common keys.
While these new reporting technologies can be pricey, the overall cost of implementing them is almost certainly going to be less than the cost of designing, building and maintaining data warehouses and then purchasing these same, or similar, tools to query from those warehouses.
Caveats – Why You May Still Need a Relational Data Warehouse
There are reasons why you might still need a traditional, relational data warehouse. These include:
- You have specific needs for specialized business intelligence tools that can only be used against SQL-based databases.
- You have a need for real-time reporting of transactions (although, in most cases, this reporting should be done out of operational systems or intermediate operational data stores anyway).
- Your data must support multiple BI tools. Right now, the databases behind tools like QlikView can only be accessed with their own, proprietary BI user interfaces. Thus you can’t, for example, access a QlikView associative database with tools like Business Objects, Cognos, MicroStrategy or Excel.
- Your source data is so massive that it will overwhelm the capabilities of your BI tool’s database. Applications like telephone call detail come to mind here.
- Your source data cannot be directly loaded into a new-generation BI tool and must be staged somewhere. An example of this is some Cloud-based systems that don’t provide strong programming interfaces for data access.
- Your source data does not share common keys and requires significant massaging to make it useful.
Is the Data Warehouse Dead – or Just Morphing?
Not all new BI technologies will kill the data warehouse. Some very powerful ones are SQL-based. SQL-based tools still need relational data sources, i.e. data warehouses.
Finally, it’s incorrect to say that the data warehouse is dead. It’s really just morphing, or better put, evolving. The definition of the data warehouse says nothing about the kind of storage technology that must be used. Thus, storing that data in an associative database, a multidimensional database, or even on punched cards doesn’t mean you don’t have a data warehouse. The trick, of course, is to make sure you’ve got something that supports your current, and future, needs at a reasonable cost.
I’d be very interested in your thoughts; please post your comments below. Thanks!
EDITOR’S NOTE: Tom Carroll was a Dataspace consultant in the late 1990s. He left for an IT job at GM which quickly morphed into a finance job for GM’s OnStar subsidiary. Tom was, effectively, the person on the user side at OnStar who was responsible for delivering financial reports to management. We were thrilled when, in May, Tom came back to Dataspace as a lead consultant. Not only does he have about 20 years of BI experience, but he now also has the perspective of a user.
I am excited to be back at Dataspace after an 11-year absence. While I was gone I learned a whole lot about what it means to be a business intelligence end user, and over the course of a few postings I’d like to share some of what I’ve learned. While much of what we read about data warehousing and business intelligence is focused on technology, it really is the end user who will determine whether your warehouse effort is successful or not.
Regardless of What IT Says, I Have a Job to Do
I guess the first dirty secret is that reporting end users really don’t care much about IT and its issues. Are you shocked or insulted by this? Don’t be. The end user has a job to do and is being evaluated on whether and how well they get that job done. If IT can help them do their job, that’s great, but if not the job still has to be done. Having trouble getting that tax data loaded to the warehouse? OLAP cube didn’t build last night due to database issues? Guess what, the books still have to be closed so as an end user I’m going to come up with some other way to get it done. It may not be 100% correct, but as an end user I don’t have time to wait for IT to figure out its problems.
The Tool I Use to Do That Job? Excel
Second, all those crazy spreadsheets and Access databases that have popped up in Finance and other departments over the years? You know, the ones you’ve tried to analyze in order to ferret out reporting and data requirements? In almost all cases those are the result of end users coming up with the best solution they can muster using the tools they know best and whatever data they have access to. Not to disparage all end users, but when it comes to designing data stores, they wouldn’t be my first stop (big surprise, huh?). It’s not that they aren’t smart people; it’s just not their area of expertise. Given easy access to well-structured, integrated and more complete data (can you say data warehouse?) that solves their problem, they (or at least their management) would get rid of those spreadsheets and Access databases in a heartbeat.
So what do you think? What is the state of the relationship between IT and end users in your organization? Are there any processes or user groups in your organization that help to foster this relationship? I look forward to hearing your thoughts.
When you look at how Business Intelligence tools are marketed, you’d think that the secret to a wildly successful operation is to simply have executives sit at their desks looking at beautifully laid out dashboards, clicking here and there on charts, graphs, and gauges, drilling down, rolling up, and slicing and dicing their data. After all, that’s what the vendors of Business Intelligence systems portray in their marketing communications (and we’re guilty of using eye candy in our own materials, too).
I’m the CEO of a Business Intelligence consultancy. Organizing and presenting data in ways that enable business decisions is all that we’ve done for the 15 years since I founded Dataspace. Before that, I did it at MicroStrategy. I’ve even co-authored three books on the topic. Of all people, you might expect me to be sitting at my desk, slicing and dicing to my heart’s content. But you know what? I have a business to run. I’ve got to spend my time attracting new clients, ensuring my team delivers flawlessly, and handling a variety of back office functions, from tracking payables and receivables to minimizing my overhead. And while we have implemented Business Intelligence tools at Dataspace to help me manage my operation, with the data collected, integrated and presented in a manner specific to my needs, I find I actually spend very little time using these systems, and typically for only two purposes: 1) to investigate a particular problem; 2) to check in once a week or so to see whether things are on track. I recently estimated how much time I spend using these systems, and found I don’t spend more than an hour a week in them.
Do successful managers spend their days clicking around in BI systems? I don’t think so. Successful managers spend their time managing: making decisions and interacting with people – customers, employees, partners, suppliers, etc. Well-designed BI systems quickly give managers a view of what’s going on – of what decisions they need to make and what conversations they need to have. Well-designed BI systems get the answer across quickly and then get out of the way.
I’m proud that I use my system less than 2% of the time. After all, a well-designed BI system lets me use that 2% to identify the decisions I need to make and the conversations I need to have during the other 98%.
Want to discuss? Feel free to contact me at firstname.lastname@example.org. – Ben