One of the persistent questions that keeps entrepreneurs on edge is "Are we building the right thing?"
In the first web bubble, the Silly side of Silicon Valley chased vanity metrics such as "the number of eyeballs on the site", "brand awareness", and "unique visitors". Those numbers are only interesting when you can correlate them with producing value for customers and bringing in real cash in the form of revenue.
I've enjoyed the book The Lean Startup by Eric Ries because he offers a much better mechanism to track the success or failure of any attempt to produce real value for customers. While split testing (or A/B testing) is useful to see how small changes lead to different customer behaviors, Ries recommends cohort analysis, where you group customers into cohorts (for example, by the day or week they first showed up), follow the behavior of each cohort through the sales funnel, and correlate the X-axis of the resulting graph with individual changes to your business or product.
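To make the idea concrete, here's a minimal sketch of a cohort funnel tally. It's in Python only to keep it dependency-free; nothing about it is specific to that language. The stage names in `FUNNEL`, the cohort labels, and the `(cohort, user_token, stage)` event shape are all assumptions I made up for illustration, not anything from the book.

```python
from collections import defaultdict

# Hypothetical funnel stages, ordered from entry to conversion.
FUNNEL = ["visited", "registered", "paid"]


def cohort_funnel(events):
    """Summarize how far each cohort got through the funnel.

    `events` is an iterable of (cohort, user_token, stage) tuples.
    Returns {cohort: {stage: fraction of the cohort's visitors who
    reached that stage}}.
    """
    # cohort -> stage -> set of distinct user tokens seen at that stage
    reached = defaultdict(lambda: defaultdict(set))
    for cohort, user_token, stage in events:
        reached[cohort][stage].add(user_token)

    report = {}
    for cohort, stages in reached.items():
        # The first funnel stage defines the size of the cohort.
        total = len(stages[FUNNEL[0]])
        report[cohort] = {
            stage: (len(stages[stage]) / total if total else 0.0)
            for stage in FUNNEL
        }
    return report
```

Plotting one line per stage with cohorts along the X-axis gives the kind of graph where a product change should show up as a visible shift from one cohort to the next.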
That means tracking customer behavior. If you're building some sort of software as a service product, and if the mechanism of delivery of that product is primarily a web site, you probably already know the punchline.
Assume I already know how to identify and log events for each salient customer action type. (I've built that kind of system before.) Assume I don't want to collect personally identifiable information (I don't). Assume I'm using Plack and its middleware heavily, and assume I'm happy using Catalyst as a web framework.
How can I identify unique users (with and without accounts) on a daily basis, anonymize them, but group their actions across the site such that my automated daily cohort graphs correspond with reality?
So far I've identified a few points of possible contention. I can rely on browser cookies for unique identification of users if I know that user sessions have unique identifiers within a 24-hour period. (I could generate GUIDs for this, but that may be overdoing things.) I think I also have to track the transition from anonymous visitor to authenticated user, but I might be able to convince myself that either replacing the current session or simple subtraction of successful login events from the total number of unique anonymous visitors would give the right numbers.
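Here's a rough sketch of the subtraction approach, again in dependency-free Python rather than as Plack middleware. It assumes the session identifier in the cookie survives the login (the session isn't replaced), that events arrive as `(session_id, event_type)` pairs, and that a successful login is logged as `"login_success"`. Those names, the `secret` key, and the three functions are hypothetical. Whether a keyed hash of a random session identifier is anonymous enough is a judgment call this sketch doesn't settle.

```python
import hashlib
import hmac
import secrets
from datetime import date


def new_session_id() -> str:
    """Random value for the browser cookie (128 bits, hex-encoded)."""
    return secrets.token_hex(16)


def daily_token(session_id: str, day: date, secret: bytes) -> str:
    """Derive a token that is stable for one session on one day only.

    Mixing the date into the keyed hash means the same cookie yields
    unrelated tokens on different days.
    """
    message = f"{day.isoformat()}:{session_id}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def daily_counts(events, day: date, secret: bytes) -> dict:
    """Count unique visitors for `day` and split out those who logged in."""
    seen = set()
    logged_in = set()
    for session_id, event_type in events:
        token = daily_token(session_id, day, secret)
        seen.add(token)
        if event_type == "login_success":
            logged_in.add(token)
    return {
        "unique_visitors": len(seen),
        "authenticated": len(logged_in),
        # A set difference, so repeated logins in the same session
        # don't get subtracted twice.
        "anonymous_only": len(seen - logged_in),
    }
```

Working with sets of tokens instead of raw event counts is the one design choice here: subtracting the number of login events from the number of unique visitors would undercount anonymous visitors whenever someone logs in more than once in a day.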
(I also haven't dived much into how Catalyst 5.9 and Plack interact in terms of session and cookie handling. Everything's just worked, so I've ignored the details until now.)
I don't mind building such a system if necessary, but if all of the pieces are out there and available—or if someone's already built this and can give guidance—so much the better.
Have you solved this problem? If so, how did you do it? If not, how would you do it? Would you handle logging at the Plack level or the application level? Would you worry about tracking session changes? Does Catalyst need to know about this?