Oct 07, 2013

When taking on an analytics project or designing a reporting system (dashboard or otherwise), a core component of superior execution is properly understanding the question(s) the vehicle is expected to answer. This may seem like an obvious statement, but it is amazing how often metrics are chosen for convenience rather than impact. Additionally, dashboards and reports are often (at least initially) put together by individuals with little training in design and business reporting. A monkey can make a graph, but it takes a bit of thought and planning to make something impactful. I would argue that the state of business intelligence in general suffers from this issue: people undervalue the opportunities for using data to make great business decisions because they have learned that the data available to them is not useful for doing so. Instead of insisting that the metrics and reports be useful for business decisions, they write off the full potential of the data and go back to the inefficiencies of traditional gut-based decision making. What they fail to realize, or are not made aware of, is the wide variety of data available that is not being used. Empowering decision makers to utilize data is a core purpose of an analyst.

While assigning fault for the poor state of BI affairs isn’t particularly helpful, it’s worth noting that it is a systemic issue rooted in the past delivery of inferior metrics and reports, coupled with limited decision-making timeframes. It is often compounded by general ignorance of what data is available. The analyst’s job is to right these wrongs and retrain the organization in how data is approached, assembled, and utilized for business decisions. This retraining begins with the most basic step in the analytics chain: defining the right questions.

Question definition is key because all further analytical planning stems from it. Question definition, while the job of the analyst, requires input from the consumers of the information. Interviewing those who will use the analyst’s output is key to deciding what any given analytical product will contain, including how many metrics are needed, in what format, and how frequently they will need to be updated. The unsophisticated organization derives metrics by default, in a post-hoc manner, based on whatever data happens to have been collected. This approach is not likely to have the impact of a more carefully planned set of information tailored to the business needs.

Additionally, some decision makers will believe they know exactly what data they want and need, and it is important that the analyst probe to make sure this is correct. Finding out what a particular metric will be used for, or why a specific piece of information is useful, can uncover unintended interpretation mistakes (e.g. when a unique ID is not indicative of a unique user due to cookie expiration). It is safe to say that while business owners often do not understand the data used to create particular metrics, they often have strong opinions about what the metrics mean. It is the job of the analyst to educate and to redefine metrics around these shortcomings. Furthermore, the analyst should be aware of the myriad data sources available for creating metrics, and should guide the business owner through discovering them. This is a major reason it is critical to get the BI consumer/business decision maker to talk about the questions they are trying to answer rather than expecting them to rattle off a full list of needed metrics. The analyst defines the metrics based on the needs of the business owner. It is crucial that the analyst take an active, participatory role in the design stage rather than a passive “fast food order” style of design. You are a data educator; act like one.
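To make the cookie example above concrete, here is a minimal sketch in Python. It assumes a hypothetical page-view log in which each row carries both a cookie ID and a known user ID; real logs frequently lack the second field, which is exactly why the mistake goes unnoticed. The function name and the sample data are invented for illustration.

```python
def count_uniques(page_views):
    """Return (distinct cookie IDs, distinct user IDs) for a page-view log.

    page_views is a list of (cookie_id, user_id) pairs. Both fields are
    assumptions made for this illustration.
    """
    unique_cookies = {cookie_id for cookie_id, _ in page_views}
    unique_users = {user_id for _, user_id in page_views}
    return len(unique_cookies), len(unique_users)


# Hypothetical log: alice's first cookie expired, so she returns under a new one.
page_views = [
    ("cookie-1", "alice"),
    ("cookie-2", "alice"),  # same person, new cookie
    ("cookie-3", "bob"),
    ("cookie-3", "bob"),    # repeat visit, same cookie
]

cookies, users = count_uniques(page_views)
print(f"unique cookie IDs: {cookies}, unique users: {users}")
```

A report that labels the cookie count as "unique users" would overstate the audience here, and asking what the number will be used for is how the analyst finds that out before the metric ships.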

In closing, there are a number of additional steps in defining the mechanism necessary for answering the questions settled upon. Some questions require multiple metrics, new data collection, derivative metrics (e.g. time between purchases is derived from last purchase timestamp minus previous purchase timestamp), or a restatement of the question due to limitations in the available data. This layer of the design lies between the initial question definition and the visual display of data, but it is again key to the successful execution of an analytic output. The analyst is an artist, a teacher, a designer, a data counselor, and a storyteller, as well as the one who knows the data. You can’t design an output mechanism if you don’t know what the input question is.
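The derivative metric mentioned above, time between purchases, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample timestamps are hypothetical, and it assumes one customer's purchase timestamps are already available as `datetime` values.

```python
from datetime import datetime


def time_between_purchases(purchase_times):
    """Return the gaps between consecutive purchases, oldest gap first."""
    ordered = sorted(purchase_times)
    # Pair each purchase with the one that follows it and subtract.
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def latest_gap(purchase_times):
    """Last purchase timestamp minus previous purchase timestamp, or None."""
    gaps = time_between_purchases(purchase_times)
    return gaps[-1] if gaps else None


# Hypothetical purchase history for a single customer.
purchases = [
    datetime(2013, 9, 1, 10, 0),
    datetime(2013, 9, 15, 10, 0),
    datetime(2013, 10, 7, 10, 0),
]

print([gap.days for gap in time_between_purchases(purchases)])
print(latest_gap(purchases))
```

Note that a customer with a single purchase has no gap at all, so the sketch returns `None` for that case. Whether such customers are excluded or reported separately is itself a question to settle with the business owner.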

Question –> translates to data –> summarized by metrics –> interpreted by analysis –> converted to display –> answers or informs question