Monday, September 24, 2007

The Analytic Value Chain - Locate relevant data

Related posts
The Analytic Value Chain introduced
Defining the problem
Determine Data Requirements

This post overlaps with the previous Value Chain post on determining the data requirements, but is meant to be a bit more tactical. If you have followed my advice from the previous post, you have a good sense of what data you are going to need. Now you need to find it. In most organizations I have worked with, the data to solve any given business problem exists. The challenge is that the data often sits in a place that is not accessible to most of the folks in the organization. The data may not be in a production environment at all (it is sitting on a server under someone's desk), and if it is in production, the data warehouse might be so large that no one really knows what is in there. (This is not an uncommon problem in real warehouses, either: an expert in physical warehousing once told me that a really good warehouse knows where a specific pallet can be found 80% of the time.) I once had both problems at the same time. I found two data sets that, when combined, answered a critical business question, but they were sitting on different desktops in two different parts of my client's organization. I found the data by luck, and wound up doing a very impactful analysis. My value was in carrying the data sets, on floppy disks, back to my PC. Obviously, you can't analyze data that you can't find.

So, what to do? I don't have a ton of advice here, but I have found that some things work pretty well in identifying the data that is out there and making it accessible. First, treasure the staff who really know your data infrastructure, because they have hard-fought knowledge. (For those in my organization, you know who you are. And you know how much I value your contributions. You also know that I am understating.) There is no replacement for experience in your data infrastructure. Even though we rely on people power, metadata helps too: even the best staff are not going to be able to find useful data if you don't have data dictionaries for every dataset.
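For readers who have never built a data dictionary, here is a minimal sketch of what a starting point might look like. It is not how my team does it; the function name `build_data_dictionary`, the column names, and the sample rows are all made up for illustration. The idea is simply to record, for every column, what it is called, what kind of values it holds, how often it is populated, and a human-written description.

```python
def build_data_dictionary(rows, descriptions=None):
    """Summarize each column of a dataset given as a list of dicts.

    `descriptions` is an optional mapping of column name to a
    human-written explanation; columns without one are flagged.
    """
    descriptions = descriptions or {}
    entries = {}
    for row in rows:
        for column, value in row.items():
            entry = entries.setdefault(column, {
                "column": column,
                "types": set(),      # Python type names seen in this column
                "non_null": 0,       # count of populated values
                "example": None,     # first populated value encountered
                "description": descriptions.get(column, "TODO: describe"),
            })
            # Treat None and empty strings as missing values.
            if value is None or value == "":
                continue
            entry["types"].add(type(value).__name__)
            entry["non_null"] += 1
            if entry["example"] is None:
                entry["example"] = value
    return list(entries.values())


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"customer_id": 1, "region": "West", "balance": 120.5},
    {"customer_id": 2, "region": "", "balance": None},
]
dictionary = build_data_dictionary(
    sample_rows, {"customer_id": "Unique customer key"}
)
for entry in dictionary:
    print(entry["column"], sorted(entry["types"]), entry["non_null"],
          entry["description"])
```

The generated part (types, counts, examples) is cheap. The descriptions are the part that needs your experienced staff, which is why the sketch flags any column that is missing one.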

Second, create tables that aggregate your most useful data. We do this and can get our hands on useful data sets pretty quickly, in minutes. We evaluate the variables in those tables about once a year (or after any major strategy shift) to ensure that the data set remains useful. These tables have an additional value in that they can be shared with your entire organization and form a common "fact base".
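To make the idea of an aggregate table concrete, here is a small sketch using Python's built-in `sqlite3` module. Everything specific in it is an assumption for illustration: the `transactions` and `customer_summary` table names, the columns, and the sample rows are hypothetical, and a real warehouse would use its own database and a scheduled job rather than an in-memory one.

```python
import sqlite3


def build_customer_summary(conn):
    """Rebuild a one-row-per-customer summary from the detail table."""
    conn.execute("DROP TABLE IF EXISTS customer_summary")
    conn.execute("""
        CREATE TABLE customer_summary AS
        SELECT customer_id,
               COUNT(*)      AS n_transactions,
               SUM(amount)   AS total_amount,
               MAX(txn_date) AS last_txn_date
        FROM transactions
        GROUP BY customer_id
    """)
    conn.commit()


# Hypothetical detail data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(customer_id INTEGER, txn_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, "2007-08-01", 40.0),
        (1, "2007-09-15", 60.0),
        (2, "2007-09-20", 25.0),
    ],
)

build_customer_summary(conn)
summary = conn.execute(
    "SELECT * FROM customer_summary ORDER BY customer_id"
).fetchall()
for row in summary:
    print(row)
```

Analysts then query the small summary table instead of the detail table. The yearly review amounts to revisiting the column list in that `SELECT`.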

Third, try to think ahead and ask your team to be on the lookout for certain types of data. There are business questions that I know I am going to want to look at, and when I communicate those topics of interest to the team, they can make serendipitous discoveries.
