Remarks on Segment.com

Over the last few years I've been keeping a close watch on Segment and the capabilities of the platform. We're long-time customers and big fans. A quick list of which parts of Segment we're using:

  • Multiple projects (10+)
  • Event Filtering (on the client)
  • 14+ integrations (across multiple projects)
  • Analytics.js (provided by Segment)
  • Analytics.js (based on the open-source version)
  • Outgoing webhooks
  • Event API + batch imports
  • Historic imports (calls with old timestamps)
  • Warehouse (on Postgres)

Personally, I think Segment is awesome. We've been a paying customer since 2013, and looking back through old emails, my first exchanges were with Peter Reinhardt (CEO/co-founder) of what is now a 600+ person company. What a journey!

No company is perfect, though, and over the last months I have run into a range of issues that left me surprised or, at some points, just flat-out frustrated.

I am writing this down to keep a list, but also in the hope that some of these issues can be addressed.

Shared cookie domain across projects

We are running two analytics projects: one for marketing (www.prezly.com) and one for our customers (rock.prezly.com). By default the cookie domain for everything Segment stores is .yourdomain.com, which means that data (like identifiers or user traits) is shared between the two projects. Big problem.

cookies on www.prezly.com

There is a community question about it, but there is no way to tune this if you're using the Segment-provided analytics.js.
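
You can see the issue straight from the browser console. A minimal sketch (ajs_user_id is the cookie analytics.js uses for the user identifier; the parsing helper is just for illustration):

// Run this on www.prezly.com and on rock.prezly.com: because the cookie
// is scoped to .prezly.com, both sites read the exact same identifier.
const ajsUserId = document.cookie
    .split('; ')
    .find((cookie) => cookie.startsWith('ajs_user_id='))
    ?.split('=')[1];

console.log(ajsUserId);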

Update 1: Asked through GitHub (analytics.js repository) in 2018.

Update 2: Asked commercial support about it earlier today. Their response:

Analytics.js in async/lambda environments

If you want to track events from (async) Lambda functions, don't use the analytics.js Node library: it just doesn't work. Playing with event-flushing timeouts or manually calling flush might make it a little more stable, but it will never be 100%.

In our case only 20-30% of the calls originating from Lambda made it to Segment. This makes me feel that no one at Segment has tried using the library in a serverless/Lambda environment.
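
The most stable pattern we found is flushing explicitly and waiting for it before the handler returns. A minimal sketch, assuming the analytics-node library and an async Lambda handler (the write key and event name are made up):

import Analytics from 'analytics-node';

// flushAt: 1 sends every event immediately instead of batching.
const analytics = new Analytics('YOUR_WRITE_KEY', { flushAt: 1 });

export const handler = async (event: { userId: string }) => {
    analytics.track({
        userId: event.userId,
        event: 'Lambda Invoked', // hypothetical event name
    });

    // Wrap the node-style flush() callback so we can await it; otherwise
    // Lambda may freeze the container before the batch is actually sent.
    await new Promise<void>((resolve, reject) =>
        analytics.flush((err?: Error) => (err ? reject(err) : resolve())),
    );

    return { statusCode: 200 };
};

Even with this, delivery was not 100% reliable for us.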

Event Delivery Reports are inaccurate. Events are accepted, not delivered.

I've been replaying/importing old events, which is super easy: you provide a timestamp with any tracking call. Still, I spent 36 hours trying to debug why some events didn't arrive in Mixpanel.

It turned out we were using the API secret instead of the API key. And since Mixpanel requires an API key for old events, those events silently failed. You'd think the Event Tester or the Event Delivery reports would catch those failed events? Think again.
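
For reference, a replay is just a regular call with an explicit timestamp. A minimal sketch, assuming the analytics-node library (the write key, user id and event are made up):

import Analytics from 'analytics-node';

const analytics = new Analytics('YOUR_WRITE_KEY');

// Importing a historic event: set `timestamp` to the original event time
// so downstream tools (Mixpanel, the warehouse, ...) store it correctly.
analytics.track({
    userId: 'user_123',
    event: 'Order Completed',
    properties: { total: 49.95 },
    timestamp: new Date('2016-04-02T10:00:00Z'),
});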

Closed Source Glue Code

I am not 100% sure about this, but I suspect that Segment has its own internal analytics.js repository that can do more than the open-source version. That is expected, as paying customers get more functionality and probably need a better way to load Segment configuration (filtered events, enabled integrations, ...). Still, it would be more transparent to open-source that code as well.

Data Warehouse Indexes

The data warehouse is awesome. Segment creates a schema and syncs all calls (identify, track, page, ... everything) to the warehouse of your choice (Postgres, Redshift, ...). Your data remains yours! That data can then be used to replay old events.

Now if you start using the tracking archive for anything other than historic browsing, you'll bump into performance issues. That's because the Segment-provided and Segment-maintained schema does not have any indexes.
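
The workaround is to add your own. A minimal sketch, assuming node-postgres and Segment's default warehouse layout, where each source gets its own schema with a tracks table (the prod schema name here is hypothetical):

import { Client } from 'pg';

const client = new Client({ connectionString: process.env.DATABASE_URL });

// Adds the indexes the Segment-managed schema is missing; adjust the
// schema and table names to match your own source.
async function addIndexes() {
    await client.connect();
    // Speeds up time-range scans over the tracking archive.
    await client.query(
        'CREATE INDEX IF NOT EXISTS tracks_received_at_idx ON prod.tracks (received_at)',
    );
    // Speeds up per-user lookups.
    await client.query(
        'CREATE INDEX IF NOT EXISTS tracks_user_id_idx ON prod.tracks (user_id)',
    );
    await client.end();
}

addIndexes().catch(console.error);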

Event filtering should be free

I really believe that event filtering should be free and included in every plan. It's such a key feature/USP for getting people to adopt a tracking abstraction layer. I am convinced that feature-gating event filtering is a mistake.

In fact, we had event filtering for over a year when it was suddenly taken from us, presumably in a plans/features cleanup. We were kept on a grandfathered plan that suddenly didn't have the feature anymore. That's right: when this feature launched, even Segment was convinced it was a core feature that belongs in every plan.

We're having a lot of conversations ourselves about which features go into which plan. It's a tough exercise. I really like how Joost de Valk from Yoast explained their thinking about which features are free and which ones should be paid for in an episode of 'The Top' (Nathan Latka):

Latka: How do you decide where you put up the paywall for features?

Joost: We have a strict rule within the company: Everything we think every site on the planet needs we put on the free plan. Stuff that saves you time we add to the premium plugin.

Joost de Valk - Yoast

I try to apply somewhat the same principle: features that are expected or commonplace are available in every plan. For example, adding an image to your story is not a feature we want to put behind an upgrade gate. Other examples:

  • Export of your data
  • 2 Factor Authentication
  • Logging in with your Google Account (security)

With that comes the realisation that once you hit a certain scale, simple features like these rarely have an operational cost associated with them. I would even argue the opposite.

So for Segment we have our own analytics.js wrapper that filters events. It's 30-50 lines of JavaScript, using an API that is already there, just to specify which events go where:

import importantEvents from './importantEvents';

// Decides the `integrations` option for a Segment call: important events
// fan out to every destination, everything else goes to a shortlist only.
const getIntegrations = (eventName: string) => {
    if (importantEvents.includes(eventName)) {
        // `undefined` means "use the defaults", i.e. send to all integrations
        return undefined;
    }

    return {
        All: false,
        Mixpanel: true,
        Vitally: true,
    };
};
analytics.js wrapper to filter events
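
For completeness, this is roughly how it plugs into tracking calls (a hypothetical wrapper; analytics is assumed to be the loaded analytics.js instance):

declare const analytics: {
    track: (event: string, properties?: object, options?: object) => void;
};

// Every track() call goes through getIntegrations, so only important
// events fan out to all destinations.
export const track = (eventName: string, properties?: object) => {
    analytics.track(eventName, properties, {
        integrations: getIntegrations(eventName),
    });
};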