Metadata: Mundane… Yet Critical For Analytics Success
Years ago I saw Bill Inmon speak at a conference and one thing he said resonates with me to this day. He said, “Let’s face it, no self-respecting developer ever went back and documented his code.” Not only can this be applied to program documentation but it can also be applied to the other metadata that describes a business intelligence system.
But that metadata can be crucial, especially if you’re trying to get out of the business of creating reports.
The New Model of Analytics & BI
As I’ve written recently, more and more companies are finding BI success by getting IT out of the business of creating reports. They’re seeing more users jump on the BI bandwagon if the responsibility for creating reports is distributed to the departments that need those reports (frequently to reporting gurus in those departments). In this model, IT becomes the data provider and the departments become the data consumers.
Why the New Model Screams Out for Decent Metadata
As effective as this model is, however, you need to beware of some potential pitfalls. One of the biggest is in how these user departments find and interpret the data that IT provides. Put another way, if the user can choose between twenty data elements that all sound similar, how do they know which to put onto their reports?
Furthermore, how do you ensure that departments A and B are both interpreting the available data in the same ways?
As should be obvious, if you don’t provide some tool for understanding and interpreting data, reporting remains a series of departmental stovepipes that can’t be reconciled and will very likely be misinterpreted.
What You Should be Doing About Metadata
There are a few, relatively simple, things you can do to ensure that you don’t face the Tower of Babel that led you to build a data warehouse in the first place.
1) CAPTURE METADATA IN DATA MODELS: It always surprises me when I visit a mature IT department that isn’t using a competent data modeling tool. There comes a point when data models in Visio, or Excel, or notebook paper, or pure grey matter just don’t cut it. Not only will it help IT folks keep track of what’s going on but a good data modeling tool is also a great place to store metadata about your systems. For example, this screen from Embarcadero ER/Studio Data Architect shows some of the metadata you can capture about an individual field. At the bottom of this screen you’ll notice that the data modeler has entered some user-friendly information about what the field UniqueFloorKey represents.
2) GENERATE METADATA INTO YOUR DATA DICTIONARIES: A lot of folks don’t realize that most relational databases come complete with administrative tables to hold data dictionaries. The databases use these so their engines can understand the database structures. However, these tables can also hold a good deal of optional information about the databases. The best part of it is, you can write simple SQL queries to get at this information! I’ve used this capability for a few things in the past including writing queries to generate data dictionary documentation and writing reports in BI tools so users can access this information themselves. I’ve, even, written queries against these data dictionaries to generate other queries (probably not advisable for novice users).
Better yet, if you’ve stored your metadata in your data model, you can automatically have it generated into your database, along with the rest of the data model.
For more information on how to use these capabilities, look into the data dictionary in Oracle, the extended properties in SQL Server and similar items for other databases.
3) DEVELOP MECHANISMS FOR GETTING USABLE METADATA INTO END USER HANDS: Once you’ve captured this metadata, makes sure you have a way to get it into your end users’ hands. Some tools, such as Business Objects and Microstrategy, have the ability to read these data dictionaries into their tool-specific metadata. Once this is done, users can easily see your definitions, sometime by right clicking on a reporting object and, sometimes, with no extra work at all.
Other ways to expose metadata to your users might be some piece of documentation, predefined reports built into your BI environment, or a metadata tool such as IBM’s Business Glossary. Dedicated metadata tools will provide even more functionality than providing the basic column and table definitions we’re talking about here.
In any case, to successfully deliver intelligence, you need to also deliver information about the data (i.e. you need to deliver metadata). You don’t want to overdo it but you do need to make sure that all your users are working from the same script.