Sep 26, 2011

First off, read this.

So Netflix says to the SEC that churn is not important to them. Except that they didn’t actually say that. They said “the churn metric is a less reliable measure of business performance, specifically consumer acceptance of the service,” meaning that the metric, for them, is broken and therefore should not be used to compare them to others in the marketplace. The cynic would respond “what are you hiding?” but the truth is that they are correct: in their business, churn is so different that trying to compare it across companies is a disservice to the naive public. The information would be misconstrued and therefore should not be revealed. I am generally of the mind that you let the consumer of information make the decision about the quality of that information, but here I am with Netflix – the consumer is likely to misuse it, ignorantly and accidentally comparing it to other types of “churn” (a difficult metric to define to begin with).

In the BI universe, we use KPIs to monitor the progress or success of a product, system, business, etc., but we also use them to compare and benchmark against like products. Pageviews, time on page, click-through rates, etc. are the common bellwethers of awesomeness or supreme suck in the web world. But what happens if you make a website that uses a continuous scrolling method… like, say, an image search results page? Suddenly your pageviews per user drop massively compared to industry standards! I would argue that continuous scrolling image search is superior to the tired paging image search (in fact, so superior that Google ripped off Bing to some degree… a rarity to be sure), but I have heard through the grapevine that one specific search-y company refused to drink the continuous scrolling Kool-Aid due to the impact it would have on third-party web reporting metrics. Sacrifice the user experience for the sake of the KPI. So what does this mean for Netflix vs. the SEC?

When the paradigm changes, it’s often hard to jump out of the traditional KPI rut. Those KPIs are comparable, comfortable, expected (here with NFLX we are talking about quarterly churn, but we could be talking about unique users, time on page, pageviews, or any other metric). Remember the P/E ratio arguments during the Dot Com boom? I find the same issues in my job as a BI manager – we have a product that is an Android widget, and someone asks me a question about pageviews – what the hell is a pageview on a widget? Is every time a user focuses on the widget a “pageview”? Are all actions in the widget separate pageviews, or part of the same initial pageview? Lastly (and more importantly), does a pageview count matter at all, or is it (or something similar) only useful as an internal metric?
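One internal answer to “what is a pageview on a widget?” is to stop counting raw interactions and instead group them into engagement sessions separated by an inactivity gap. This is a minimal sketch of that idea; the event data and the two-minute timeout are hypothetical illustrations, not anything from an actual product.

```python
from datetime import datetime, timedelta

# Hypothetical widget interaction timestamps for one user (made-up data):
# each entry is a moment the user focused on or tapped the widget.
events = [
    datetime(2011, 4, 20, 9, 0, 0),
    datetime(2011, 4, 20, 9, 0, 40),   # 40s later: same engagement
    datetime(2011, 4, 20, 9, 5, 0),    # 4m20s later: new engagement
    datetime(2011, 4, 20, 14, 30, 0),  # afternoon: new engagement
]

def engagement_sessions(events, timeout=timedelta(minutes=2)):
    """Group raw interactions into 'engagement sessions' -- one possible
    internal stand-in for a widget 'pageview'. Any gap longer than
    `timeout` starts a new session."""
    sessions = 0
    last = None
    for t in sorted(events):
        if last is None or t - last > timeout:
            sessions += 1
        last = t
    return sessions

print(engagement_sessions(events))  # 3 sessions from 4 raw interactions
```

The timeout is the whole definition here – change it and the “pageview” count changes with it, which is exactly why such a number only makes sense as an internal metric.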

The key, in my opinion, is the use of internal versus comparative metrics. Netflix is saying that giving out churn numbers as they are traditionally calculated is a great way to confuse and freak out their investors, since so many customers “quit” and then rejoin a few months later. The definition of churn is too narrow (“users who quit the service” / “total users”) because a user who quits in January and rejoins in March has technically “churned” in Q1 even though they are now a customer again. As an internal metric, understanding their churn from quarter to quarter makes sense. They might want to (and surely do) offset that by calculating a metric of “sticky churn”, i.e. people who, in the words of Marsellus Wallace in Pulp Fiction, “when you’re gone, you stay gone”. Or even better would be a whole suite of metrics around churn and churn-like behaviors – new never-before-seen people, returning-after-a-short-break people, returning-after-a-long-break people, totally-gone-from-the-system-as-far-as-we-know-it people. Lots of options, nothing perfect, nothing overly clear, and everything confusing to investors who only know how to compare the metrics they know and love within and across companies. I don’t blame Netflix for keeping the numbers to themselves. Of course, it would be nice for them to release a case study on all the cool and weird ways people migrate around their services – not for investors’ sakes, but for my own nerdy curiosity.
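The “suite of metrics” idea can be sketched as a simple classifier over subscription histories. Everything here is a hypothetical illustration – the bucket names, the 90-day threshold for a “long break”, and the data are my own assumptions, not Netflix’s actual definitions.

```python
def classify(history, quarter_end, long_break_days=90):
    """history: list of (start_day, end_day_or_None) subscription spells,
    as day offsets; None means still subscribed. Returns one behavioral
    bucket for the subscriber as of `quarter_end`."""
    spells = sorted(history)
    last_start, last_end = spells[-1]
    if last_end is not None and last_end <= quarter_end:
        return "gone"                      # quit and has not come back
    if len(spells) == 1:
        return "new"                       # first-ever subscription
    prev_end = spells[-2][1]
    gap = last_start - prev_end
    return "returned_short" if gap <= long_break_days else "returned_long"

# Made-up subscribers covering the four behaviors described above:
subscribers = {
    "a": [(0, None)],            # joined, never left
    "b": [(0, 30), (60, None)],  # quit, rejoined after a short break
    "c": [(0, 30), (200, None)], # quit, rejoined after a long break
    "d": [(0, 30)],              # quit and stayed gone ("sticky churn")
}
buckets = {u: classify(h, quarter_end=365) for u, h in subscribers.items()}
print(buckets)
```

Under the traditional definition, “b”, “c”, and “d” all count identically as churn, even though only “d” actually stayed gone – which is exactly the distortion the post describes.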

Apr 20, 2011

As a business analyst, I live and die by logging. This makes me vigilant about what products are being developed by my organization, and how they change from concept to wireframes to implementation. Rarely do these three stages look the same, and sometimes the end product is a far cry from the original beast due to time pressures, build-vs.-buy decisions, scope creep, and a number of other fun issues. Regardless of my vigilance, I find that logging and thoughts around instrumentation almost always come last. I am not alone in my observations, as other analyst friends have made the same comment. In fact, this was verified recently by a development lead at a large organization when he commented to me, “you know, we always wait until it’s too late to add logging, if we even consider it in the first place.”

Why is it that engineers have such an aversion to extended, non-performance instrumentation, and find it so onerous or unimportant? They write unit tests. They instrument for throughput, heartbeats, and error messaging, but tend to ignore the basics of user behavior on the products they have built. It is seen as extraneous, performance-impacting, nonsensical even. This is unfortunate.

When I was in graduate school, my dissertation focused on how individuals’ beliefs about the degree to which their organization in general, and their supervisor specifically, cared about them impacted their work behaviors. In other words, if you think your supervisor cares about you as a person, does that make you work harder? What about your overall organization – does that matter? Are there special traits of supervisors that make you more or less likely to do your job well, to help others, to protect the organization from lawsuits or other problems, to decide to stay instead of quitting? It took me almost 2 years to collect enough data to answer this set of questions. Two years. Today, I can ask interesting, in-depth questions about the data I collect every 2 minutes. The only reason this is possible is because the damn products are instrumented like mad to tell me everything the user is doing, seeing, and interacting with (and choosing to ignore). This information is powerful for understanding usability, discovery, and annoying product issues like confusing pages or buttons. Predictive analytic models can be built on this behavior (user X likes this stuff, hates that stuff, buys this stuff, ignores that stuff, etc.) – but only if it is logged. With both strong BI and predictive analytics opportunities on the table, why is logging so often ignored, perfunctory, or offloaded to companies like Google – almost as an afterthought?

My theory is that because the nuances of logging often make it fragile and complex, it isn’t easy to determine whether it is accurate while the product is still in development. As the underlying systems change – schema shuffling or enumerated-value redefinition (or recycling), for example – and many hands touch the code that creates the product, it makes sense to wait until things settle down before adding the measurement devices. Unfortunately, special cases are often introduced – invisible to an end user, but obvious under the hood – that make straightforward logging difficult. The end result is often a pared-down version of logging that is seen as “good enough” but not ideal. The classic “we’ll do this right in vNext” is my most hated phrase to hear.

The workaround to this malady, when possible, is to introduce clear, concise, standardized logging requirements that engineers can leverage across products. Often a block of specific types of values (timestamp, screen size, operating system, IP, user ID, etc.) describes the majority of what the analyst needs for pivoting, monitoring, and so on. The remaining portion of a schema can then contain the pieces that are unique to the specific product (like “query string” if searching is a possible action in one product but not others).
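A minimal sketch of that common-block-plus-product-block idea might look like the following. The field names and the `make_event` helper are illustrative assumptions of mine, not a real standard or any particular company’s schema.

```python
import json
import time

def make_event(product, action, user_id, extra=None):
    """Wrap a product-specific payload in a standardized envelope that
    every product logs identically."""
    return {
        # -- common block: identical across all products --
        "timestamp": time.time(),
        "user_id": user_id,
        "product": product,
        "os": "Android",      # in practice, detected at runtime
        "screen": "480x800",  # likewise
        "action": action,
        # -- product-specific block: unique to this product --
        "extra": extra or {},
    }

# A search product adds its own fields without touching the envelope:
event = make_event("image_search", "query", user_id=42,
                   extra={"query_string": "sunsets", "results_shown": 20})
print(json.dumps(event, indent=2))
```

The payoff is that the analyst can pivot and monitor every product on the common block alone, while each team still gets a place for the fields only their product needs.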

The analyst must be vigilant, aware, engaged, and on the lookout for implementations that introduce actions or behaviors that are currently unlogged or that break expectations. That way he or she can engage engineers proactively, before it’s too late, to add functionality to logging and make sure that important and essential user behavioral data does not go down the tube of the dreaded “vNext”.