May 132013

There are a number of often unspoken jobs that the analyst must perform in order to be useful, helpful and thereby successful.  These include being a question definer, a translator, a data expert, a storyteller, and an anticipator of post analysis questions. Without approaching a business question (which should inform a specific business action) carefully, the opportunity for error or underwhelming findings is greatly increased.

Definer of THE QUESTION – When I think about analysis I always start with “what is the question we are trying to answer?” The question to be answered is never as simple as whatever my boss has asked me; data is messy, the question is complex, and more often than not, the initial question is wrong. By wrong, I mean that it is too general, makes assumptions about the answer that may be wrong, or just does not make sense from a business point of view. One way to think about whether the question is a good one or not is to come up with a fake answer and see if it would change anything. “How many women in their 40’s are tweeting about our product?” the boss asks you… the answer, by itself, is probably pretty useless – “135”. What the boss really wants to know is “what are the demographics of people tweeting about our products, I think it’s mostly demographic X?” Assuming you have access to the demographics, you can present a plethora of data – by product, by cohort, etc. That approach makes your answer to the initial (not good) question make more sense “Looks like 135, which is 32% of all folks tweeting about us at all.” Engaging in a Socratic dialogue with the data consumer (or yourself for that matter) can help you to understand the kernel, the impetus for the question and thereby redefine it in a way that extends the usefulness of the answer and guides additional analyses necessary for understanding more deeply the phenomenon you are investigating.

Rosetta Stone – The analyst must be able to translate fluidly between many entities. Business and marketing oriented groups will not speak the same language as data miners and data scientists. The audience for consuming data is constantly changing. The analyst must anticipate these differences, speak the different languages and be sensitive to the fact that her job is to increase understanding (rather than expecting their customers, the data consumers, to do that research after the data is presented). Again, if the analyst was not needed to act as a translator, she would be easily replaced by a tool at some point. Once the question to be answered has been defined, the analyst will most likely have to interact with entities to extract the raw data. One man’s “how many” is another man’s “SELECT * FROM table_name”.

Master of Data – At the end of the day, the analyst must know the data they are transforming and interpreting. They must understand the structures, the nuances, and the quality. This means a part of an analyst’s job is to investigate, on their own, the data sources they interact with. Blindly extracting data is a great way to create false conclusions due to poor quality data, null fields, strange conventions and more. I recall running an analysis whereby I found the “gender” column of a master table contained the values “male”, “female”, and “child”. Imagine if I had tried to do some sort of deductive analysis whereby I got a count of total uniques and total males, and then took a shortcut (uniques – males) to derive females. Oops. There is no universal taxonomy, there is no universal ontology. When it comes to data, you have to check and double check the sources to be sure you understand the nature of the data you are extracting value from.

Storyteller – The analyst needs to take data, in its raw form, and mold it into something that can be understood and easily retained by the audience. Answers to business questions are rarely straightforward (and if they were, the analyst would be replaced by a dashboard) and often require a contextual back story. The analyst must determine, to the best of his or her ability, the relevant context for the answers presented. This context often goes beyond the data and requires (gasp) talking to others or (shudder) using the product that generated the data. Without the appropriate data context and the ability to describe that context to the audience, the analyst is nothing more than an overpaid calculator moving numbers into tables and charts for someone else to interpret. A great interview on the storytelling aspect of data analysis was given by Cole Nussbaumer (whose blog I am adding to the blogroll on the front page) for the website

Anticipator of Questions – I believe that a successful presentation of data provides clear information about a specific set of business questions such that decisions can be made. I also believe that a successful analysis generates more questions. While that may seem counter intuitive  (if you answered the original questions why are people asking more?) in my experience, the questions asked after a successful analysis are those that build upon the insights you presented (rather than being unrelated or confrontational/doubtful). If you get no questions, I fear you have bored or confused your audience. That said, anticipating the most likely follow up questions and running the analyses on a few is generally low cost with high reward as it shows you have defined the question space well, translated it in the appropriate manner, retrieved the data to answer those questions above and beyond expectations, have mastered the data and are now weaving it into the epilogue of your story, delighting your audience and giving them valuable information. If nobody asks the questions you anticipated, you can always present them (quickly, as you have spent most of your time talking about the core question) as bonus material “I was curious about…” this also leaves your audience with the assurance that you actually care about unearthing insights and frankly, if you really don’t, you should find a new job.