In my previous post, I wrote about how you can up-level your product-led growth (PLG) flywheel by drawing insights across functional areas, such as product usage data, support tickets, consumption, etc. Back in the day, each functional area or tool maintained its own isolated data repository, making cross-functional analysis a massive undertaking.
Modern cloud data warehouses, with their rich data integration ecosystem, are finally proving to be a viable alternative as the “single source of truth.” Unfortunately, legacy analytics vendors would still have you believe that data silos are acceptable as long as you can “break” them [1, 2], which is just a fancy term for copying external data into their silo. This approach leaves you with inconsistent data and insights that you cannot trust, negating the entire value prop of the analytics. In this post, I will show you a better way.
How We Got Here
First, how did we end up here? Let me rewind a bit. Product analytics grew out of tag management in the early 2010s. Tag management centered around data collection and storage, and because there was no need to integrate this data with other parts of the business, these systems essentially became data silos. First-generation product analytics tools inherited this architecture, creating their own silos for data that they collected from product or web instrumentation. But for PLG and other reasons, customers started demanding analytics across instrumentation data and other business data sources. The solution? Copy all the external data that you need into the analytics vendor’s silo. This was borderline acceptable before cloud data warehouses became the viable single source of truth that they are today, but now it’s just a bad idea to be doing this. Let’s see why.
Why Analytics Silos Are Bad
As a user of various first-generation product analytics tools earlier in my career, I’ve had my fair share of agony with data silos. This is also one of the most cited pain points with first-generation tools that we hear from our customers. When users say things like “product metrics do not match the numbers from our data warehouse,” “insights are inconsistent,” or “I do not trust the data ingested in xyz tool,” these are all signs that you no longer have a single source of truth. I’d group these pain points into four main areas.
1. Inconsistent data
Integrations for bringing external data into a product analytics tool typically operate on a preset schedule, but your users do not. As a result, product usage data that is “enriched” with external data produces different insights depending on when someone pulls the data and when it was last enriched. What’s worse, the resulting inconsistency causes confusion because the insights don’t match your data warehouse.
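To make the timing problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the nightly sync interval, the single account's plan history, and the function names are assumptions chosen for illustration, not a description of any particular tool. It compares what the warehouse would report at a given moment with what a silo refreshed on a fixed schedule would report.

```python
from datetime import datetime, timedelta

# Assumption: the silo is refreshed from the warehouse once every 24 hours.
SYNC_INTERVAL = timedelta(hours=24)
FIRST_SYNC = datetime(2023, 1, 1, 0, 0)

# Hypothetical warehouse history for one account: (effective time, plan).
plan_history = [
    (datetime(2023, 1, 1, 0, 0), "free"),
    (datetime(2023, 1, 2, 9, 30), "enterprise"),
]

def plan_as_of(ts):
    """Plan in effect at time ts according to the warehouse."""
    current = None
    for effective, plan in plan_history:
        if effective <= ts:
            current = plan
    return current

def last_sync_before(ts):
    """Most recent scheduled sync at or before ts."""
    completed_syncs = (ts - FIRST_SYNC) // SYNC_INTERVAL
    return FIRST_SYNC + completed_syncs * SYNC_INTERVAL

def plan_in_silo(ts):
    """The silo only knows what the warehouse knew at the last sync."""
    return plan_as_of(last_sync_before(ts))

query_time = datetime(2023, 1, 2, 15, 0)
warehouse_view = plan_as_of(query_time)
silo_view = plan_in_silo(query_time)
print(f"warehouse: {warehouse_view}, silo: {silo_view}")
```

In this toy setup, anyone segmenting usage by plan on the afternoon of January 2 would get one answer from the warehouse and another from the silo, and the two only agree again after the next scheduled sync.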
2. Missing data
Trick question: what do air travel and your business data sources have in common? Answer: both suffer from random delays and cancellations. Whereas the data warehouse eventually catches up with lost data getting backfilled and corrected, the same data cleansing processes aren’t scalable for each and every data silo. The result: missing data.
3. Data model
Our customers frequently report that analytical silos transcend the data itself and affect the data model as well. The result is that key metric definitions don’t match across your BI tool and your product analytics tool. For example, some of our customers track unique users or customers differently from the defaults used in first-generation product analytics tools.
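As a rough illustration of how metric definitions can drift apart, the Python sketch below counts "unique users" two ways over the same events. The event fields and both definitions are assumptions made up for this example; they are not the actual defaults of any specific product analytics tool or BI setup.

```python
# Hypothetical event log: the same person may appear on several devices,
# and some events come from visitors who never logged in.
events = [
    {"device_id": "d1", "user_id": "alice"},
    {"device_id": "d2", "user_id": "alice"},
    {"device_id": "d3", "user_id": "bob"},
    {"device_id": "d4", "user_id": None},
]

def unique_users_by_device(rows):
    """Assumed tool-style definition: one user per distinct device."""
    return len({row["device_id"] for row in rows})

def unique_users_by_login(rows):
    """Assumed warehouse-style definition: one user per logged-in identity."""
    return len({row["user_id"] for row in rows if row["user_id"] is not None})

device_count = unique_users_by_device(events)
login_count = unique_users_by_login(events)
print(f"by device: {device_count}, by login: {login_count}")
```

Neither number is wrong on its own terms, but if a dashboard in one tool uses the first definition and a report built on the warehouse uses the second, the two will never reconcile.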
4. Governance
Trick question: what’s the best way to enforce data governance on a siloed app? Answer: not having a siloed app in the first place. Product analytics tools were created to enable product teams to self-serve, without having to rely on data or BI teams. With the increasing focus on compliance and security, that promise has started fading away as data teams frequently find themselves wrangling with data governance for each siloed environment that they have to support.
The Path Forward
At NetSpring, we believe that data silos are inherently problematic and that no amount of patchwork can mitigate their consequences for your business. Our approach is to eliminate them altogether by offering product analytics directly on top of your data warehouse with zero data movement.
Your data warehouse remains your single source of truth, so NetSpring can give you insights that you can trust. See for yourself in a live demo!
Check out our other posts on why we believe this is a superior approach: