Apr 202011

As a business analyst, I live and die by logging. This makes me vigilant about what products are being developed by my organization, and how they change from concept to wireframes to implementation. Rarely do these three stages look the same, and sometimes the end product is a far cry from the original beast due to time pressures, build vs. buy decisions, scope creep, and a number of other fun issues. Regardless of my vigilance, I find that logging, and thoughts around instrumentation almost always come last. I am not alone in my observations as other analyst friends have made the same comment. In fact, this was verified by a development lead at a large organization recently when he commented to me “you know, we always wait until it’s too late to add logging, if we even consider it in the first place.”

Why is it that engineers have such an aversion to extended, non-performance instrumentation, and find it so onerous or unimportant? They write unit tests. They instrument for speed of throughput, heartbeat, and error messaging but tend to ignore the basics of user behavior on the products they have built.  It is seen as extraneous, performance impacting, nonsensical even. This is unfortunate.

When I was in graduate school my dissertation focused on how individual’s beliefs about the degree to which their organization in general, and their supervisor specifically, impacted their work behaviors.  In other words, if you think your supervisor cares about you as a person, does that make you work harder? What about your overall organization – does that matter? Are there special traits of supervisors that make you more or less likely to do your job well, to help others, to protect the organization from lawsuits or other problems, to decide to stay instead of quitting?  It took me almost 2 years to collect enough data to answer this set of questions. Two years. Today, I can ask interesting, in-depth questions about the data I collect every 2 minutes. The only reason this is possible is because the damn products are instrumented like mad to tell me everything the user is doing, seeing, interacting with (and choosing to ignore). This information is powerful for understanding usability, discovery, annoying product issues like confusing pages or buttons. Predictive analytic models can be built off of this behavior (user X likes this stuff, hates that stuff, buys this stuff, ignores that stuff etc…) but only if it is logged. With both a strong BI opportunity and predictive analytics opportunities, why is logging so often ignored, perfunctory, or offloaded to companies like Google – almost as an afterthought?

My theory is that because the nuances of logging often make it fragile and complex, it isn’t easy to determine if it is accurate when in development. As the underlying systems change – whether that be schema shuffling or enumerated value redefinition (or recycling) for example and many hands are touching the code that creates the product, it makes sense to wait until things settle down to begin adding the measurement devices. Unfortunately, there are often special cases introduced – invisible to an end user, but obvious under the hood that makes straightforward logging difficult. The end result is often a pared down version of logging that is seen as “good enough” but not ideal. The classic “we’ll do this right in vNext” is my most hated phrase to hear.

The workaround to this malady, when possible, is to introduce clear, concise, standardized logging requirements that engineers can leverage across products. Often a block of specific types of values (timestamp, screen size, operating system, IP, user-id, etc) describe a majority of the values the analyst needs for pivoting, monitoring, etc. the remaining portion of a schema can then contain the pieces that are unique to the specific product (like “query string” if searching is a possible action in one product but not others).

The analyst must be vigilant, aware, engaged, and on the lookout for implementations that introduce actions or behaviors that are currently unlogged or that break expectations so that he or she can engage engineers proactively, before it’s too late, to add functionality to logging and be sure that important and essential user behavioral data does not go down the tube of the dreaded “vNext”.