When I compared the Prezly sales pipeline data in Hubspot against what I was seeing in Mixpanel, there was a difference of about 10%. We could live with that for most marketing decisions, but it meant we could not rely on Mixpanel alone for all marketing and sales reporting.
For one period we found about 556 demos in Hubspot and only 485 in Segment/Mixpanel.
The website is set up so that our forms (the demo form at www.prezly.com/demo and every other form on the site) submit data directly to Hubspot Forms API URLs.
This is some of the code that submits the data from the browser to Hubspot:
To investigate, I exported all the Hubspot demos and all the identify() calls from Day 2 (stored in DynamoDB) for a given period into a Google Sheet.
The first thing I did was remove the duplicates from the Hubspot list using a Sheets add-on:
Since Hubspot had the most data, I added two columns to check whether the same data was in Segment. I ended up using VLOOKUP against the second table to see if the same email was there.
I then added a helper column that turns the lookup result into a 0/1 flag.
I ended up with something like this:
Alright. Something is up here. Comparing from both sides, I found that every Segment identify() call has a matching record in Hubspot, but not the other way around.
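For anyone who would rather script this comparison than do it in a spreadsheet, here is a minimal Python sketch of the same steps: de-duplicate, look up each email in the other list, and write a 0/1 flag. The sample emails are made up, and normalizing emails (trimming and lower-casing) is my assumption, not something the sheet add-on necessarily did.

```python
def normalize(email: str) -> str:
    """Lower-case and trim an email so trivially different spellings match."""
    return email.strip().lower()


def dedupe(emails: list[str]) -> list[str]:
    """Drop repeated emails, keeping the first occurrence in order."""
    seen: set[str] = set()
    unique = []
    for email in map(normalize, emails):
        if email and email not in seen:
            seen.add(email)
            unique.append(email)
    return unique


def flag_matches(source: list[str], other: list[str]) -> dict[str, int]:
    """Map each de-duplicated source email to 1 if it also appears in `other`, else 0."""
    other_set = {normalize(e) for e in other}
    return {email: int(email in other_set) for email in dedupe(source)}


# Made-up sample rows standing in for the two exports.
hubspot_demos = ["ann@example.com", "Bob@example.com ", "ann@example.com", "cy@example.com"]
segment_identifies = ["bob@example.com", "ann@example.com"]

# Compare from both sides, as in the sheet.
in_segment = flag_matches(hubspot_demos, segment_identifies)
in_hubspot = flag_matches(segment_identifies, hubspot_demos)

missing_from_segment = [email for email, hit in in_segment.items() if hit == 0]
missing_from_hubspot = [email for email, hit in in_hubspot.items() if hit == 0]

print("In Hubspot but not in Segment:", missing_from_segment)
print("In Segment but not in Hubspot:", missing_from_hubspot)
```

A set lookup plays the role of VLOOKUP here, and the dictionary of flags corresponds to the 0/1 helper column.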
The next question was whether this was a temporary thing: had an integration been broken at some point, or had there been JS errors? So I made a quick chart to see the distribution of misses over time:
This was consistent with the numbers above: around 10-12% of demos end up in Hubspot but not in Segment.
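The numbers behind a chart like that can be sketched in a few lines of Python. The first function is just the miss-rate arithmetic, applied to the counts quoted earlier (556 and 485), which works out to a little under 13%. The second groups rows by ISO week, which is my choice of bucket; the rows themselves are made up, with the 0/1 flag meaning "also found in Segment".

```python
from collections import defaultdict
from datetime import date


def miss_rate(total: int, matched: int) -> float:
    """Share of records in the larger list that have no match in the smaller one."""
    return (total - matched) / total


def weekly_miss_rate(rows: list[tuple[date, int]]) -> dict[str, float]:
    """Group (submission date, in_segment flag) rows by ISO week and return the miss share per week."""
    totals: dict[str, int] = defaultdict(int)
    misses: dict[str, int] = defaultdict(int)
    for day, in_segment in rows:
        year, week, _ = day.isocalendar()
        key = f"{year}-W{week:02d}"
        totals[key] += 1
        misses[key] += 1 - in_segment  # a flag of 0 counts as one miss
    return {key: misses[key] / totals[key] for key in sorted(totals)}


# Counts quoted in the post for one period.
print(f"Overall miss rate: {miss_rate(556, 485):.1%}")

# Made-up rows: (date of the demo request, 1 if also seen in Segment else 0).
rows = [
    (date(2020, 1, 6), 1),
    (date(2020, 1, 7), 0),
    (date(2020, 1, 8), 1),
    (date(2020, 1, 9), 1),
    (date(2020, 1, 13), 1),
    (date(2020, 1, 14), 1),
    (date(2020, 1, 15), 0),
    (date(2020, 1, 16), 1),
]
print(weekly_miss_rate(rows))
```

A roughly flat rate per week points to something structural such as blocking, while a spike in a single week would point to a broken integration.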
At this point I was pretty confident that some visitors block calls to segment.com or other integrations. The 10% rate is consistent with an article I found.
I already knew that browsers are getting smarter about blocking third-party scripts and tracking snippets. I'm a big Brave user myself and try to fall back to Firefox when a site doesn't work, but I find myself disabling Brave Shields a lot because some sites simply don't work with them on.
Reading up on the subject, I discovered that a lot has happened in this area and that blockers are getting better and stronger. Good. Here is a list of things that can potentially break my integration:
What can we do?
I'm not looking to track adblocking users or violate anyone's trust. What I am after is event tracking for attribution purposes (marketing efforts), and whether that is possible with an all-in-one Segment approach.
After reading through the Segment forums, some docs and best practices, I think we should change our Segment setup:
I did most of the things above. I am still working with Segment and our website infrastructure to also proxy the tracking calls.
Tomorrow I will try to find out whether I can backfill the missing data in any way.
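One direction I might explore for the backfill, sketched here rather than tested: replay each missing demo as an identify call against Segment's HTTP Tracking API, with the original submission time as the timestamp. The sketch only builds the request and does not send it. The write key is a placeholder, using the email as `userId` is an assumption about how our users are keyed, and whether downstream tools such as Mixpanel accept the historical timestamps would still need checking.

```python
import base64
import json
from datetime import datetime, timezone

# Segment's HTTP Tracking API endpoint for identify calls.
SEGMENT_IDENTIFY_URL = "https://api.segment.io/v1/identify"


def build_identify_request(write_key: str, email: str, submitted_at: datetime) -> dict:
    """Build (but do not send) an identify request for one demo missing from Segment."""
    # The API uses HTTP Basic auth: write key as username, empty password.
    token = base64.b64encode(f"{write_key}:".encode()).decode()
    payload = {
        "userId": email,  # assumption: the email is the user identifier
        "traits": {"email": email},
        # Original submission time, so the event is not dated at replay time.
        "timestamp": submitted_at.astimezone(timezone.utc).isoformat(),
    }
    return {
        "url": SEGMENT_IDENTIFY_URL,
        "headers": {
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload),
    }


# Hypothetical write key and a made-up missing demo.
request = build_identify_request(
    "HYPOTHETICAL_WRITE_KEY",
    "cy@example.com",
    datetime(2020, 1, 7, 9, 30, tzinfo=timezone.utc),
)
print(request["url"])
print(request["body"])
```

Keeping the request building separate from the sending makes it easy to inspect a handful of payloads by hand before replaying anything into production data.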