Our goal at Potions is to offer real-time segmentation to e-commerce websites while respecting web users' privacy.
Here is how we do it: we store all of the web user's navigation data on their device and run the segmentation model directly in their browser, to determine in real time which segment they fall into without retrieving anything about them.
One of our first capabilities lets our e-commerce clients access, in real time, the segments they have defined in Universal Analytics, for instance to target "high value customers" who have spent over $100 in the past. Google Optimize 360 offers this capability through third-party cookies and server requests; at Potions, we offer it in real time, without third-party cookies and without slow server requests.
Let's dig into what this means technically.
Universal Analytics segments and schema
Google Universal Analytics lets you create segments in the interface or through the API to filter either sessions or users with complex queries.
A segment is a kind of "pre-query" that selects the users or the sessions on which you intend to run the real query.
Note: one current limitation of the Universal Analytics query system is that the "pre-query" and the "real query" must share the same time period. Therefore, a query such as "list the June transactions from the people who clicked on this button in May" is not possible... and this is more problematic than you might think.
Here is how segments work in more detail:
Segments - Feature Reference | Analytics Core Reporting API | Google Developers
https://developers.google.com/analytics/devguides/reporting/core/v3/segments-feature-reference
The "high value customers" segment is defined through a simple condition in the interface.
Here is the schema of Google Universal Analytics data according to https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/userActivity/search#Activity
The schema for GA4 is different.
In the browser
We store each event sent to Google Analytics in an RxDB database (https://rxdb.info/) in the browser.
Each time the database is updated with a new event (or less frequently), we use RxJS "observable sequences" (https://rxjs.dev/guide/overview) to recompute the list of segments that the user falls into.
Following the Universal Analytics schema, but adapting it to be more flexible
A segment can be used as a predicate taking a user or a session as an argument and returning a boolean.
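For example, the "high value customers" segment can be read as a predicate over a user's stored events. The sketch below assumes a hypothetical event shape (`type`, `revenue`) chosen for illustration; it is not the Universal Analytics schema, and the function name is ours.

```javascript
// Hypothetical event shape: { type: string, revenue?: number }.
// A user is modeled here as the list of events stored in the browser.
function isHighValueCustomer(events) {
  const totalRevenue = events
    .filter((e) => e.type === "transaction")
    .reduce((sum, e) => sum + e.revenue, 0);
  // The user belongs to the segment once they have spent over 100.
  return totalRevenue > 100;
}

const storedEvents = [
  { type: "pageview" },
  { type: "transaction", revenue: 60 },
  { type: "transaction", revenue: 55 },
];
const inSegment = isHighValueCustomer(storedEvents);
```

Because the predicate only reads events already stored on the device, evaluating it requires no server request.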
Segments in Google Universal Analytics have the following structure:
```javascript
"sessionSegment": {
  "segmentFilters": [ // Set of segmentFilters that are combined with an AND
    {
      // A segmentFilter can either be a simpleSegment or a sequenceSegment
      // the not property of the segmentFilter is self explanatory
      "simpleSegment": {
        "orFiltersForSegment": [...]
      },
      "not": "True"
    },
    {
      "sequenceSegment": {
        "segmentSequenceSteps": [
          {
            "orFiltersForSegment": [...],
            "matchType": enum(MatchType)
          },
          {
            "orFiltersForSegment": [...],
            "matchType": enum(MatchType)
          }
        ],
        "firstStepShouldMatchFirstHit": boolean
      }
    }
  ]
},
```
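As a concrete instance of this structure, the "high value customers" segment might look roughly like the object below. This is a sketch, not our actual configuration: it assumes the Reporting API v4 field names (`segmentFilterClauses`, `metricFilter`, `scope`, `operator`, `comparisonValue`) and the `ga:transactionRevenue` metric, and it uses a `userSegment` rather than a `sessionSegment` because the condition applies to the user's whole history.

```javascript
// Sketch of a user-scoped segment: users whose total revenue exceeds 100.
// Field names follow the Reporting API v4 segment schema as we understand it;
// the segment name and threshold are illustrative.
const highValueCustomers = {
  name: "High value customers",
  userSegment: {
    segmentFilters: [
      {
        simpleSegment: {
          orFiltersForSegment: [
            {
              segmentFilterClauses: [
                {
                  metricFilter: {
                    scope: "USER",
                    metricName: "ga:transactionRevenue",
                    operator: "GREATER_THAN",
                    comparisonValue: "100",
                  },
                },
              ],
            },
          ],
        },
      },
    ],
  },
};
```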
We developed a library of operators on formulas that can be composed to create complex predicates.
The static function LTL.fromUASegment(ua_segment) returns a function that takes a sequence of events (typically the ones that are stored) and returns true or false.
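To illustrate how such operators can compose, here is a minimal sketch. It is not the actual Potions library: the names `not`, `and`, `or`, `eventually`, and `sequence`, as well as the event shape, are assumptions made for this example.

```javascript
// Hypothetical combinators. A formula is a function: (events) => boolean.
const not = (f) => (events) => !f(events);
const and = (...fs) => (events) => fs.every((f) => f(events));
const or = (...fs) => (events) => fs.some((f) => f(events));

// True if at least one event satisfies the per-event test.
const eventually = (test) => (events) => events.some(test);

// True if the tests are satisfied in order by the events.
// Matching events do not need to be adjacent.
const sequence = (...tests) => (events) => {
  let i = 0;
  for (const e of events) {
    if (i < tests.length && tests[i](e)) i += 1;
  }
  return i === tests.length;
};

// Example: viewed a product page, later bought, and never asked for a refund.
const viewedProduct = (e) => e.type === "pageview" && e.page === "/product";
const bought = (e) => e.type === "transaction";
const refunded = (e) => e.type === "refund";

const segment = and(sequence(viewedProduct, bought), not(eventually(refunded)));

const matches = segment([
  { type: "pageview", page: "/product" },
  { type: "pageview", page: "/cart" },
  { type: "transaction" },
]);
```

In this sketch, `and` and `not` play the role of the AND-combined `segmentFilters` and the `not` property, while `sequence` plays the role of a `sequenceSegment` whose steps may be separated by other hits.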