Service Worker
v1.0.1 · 1,561 LOC · The core engine that intercepts navigations, detects traffic sources, resolves unexpanded macros, persists data in IndexedDB, and injects digitalData into HTML responses — all without touching the main thread.
Lifecycle
The Service Worker uses Skip Waiting + Clients Claim for instant activation. Once registered, it intercepts every same-origin navigation with query parameters.
// Install → skip waiting immediately
self.addEventListener('install', (event) => {
  event.waitUntil(self.skipWaiting());
});
// Activate → claim all clients
self.addEventListener('activate', (event) => {
  event.waitUntil(self.clients.claim());
});
Fetch Interception
The SW only intercepts navigation requests (not assets, XHR, etc.) with query strings, from the same origin:
self.addEventListener('fetch', (event) => {
  const { request } = event;
  const url = new URL(request.url);
  if (request.mode !== 'navigate') return; // Only navigations
  if (url.origin !== self.location.origin) return; // Same-origin only
  if (url.search.length === 0) return; // Only with params
  event.respondWith(handleNavigation(event));
});
handleNavigation Flow
- Extract raw URL parameters
- Sanitize ALL input via sanitizeUrlParams()
- Generate digitalData (traffic detection + macro resolution)
- Save to IndexedDB (non-blocking, catch errors)
- Broadcast to all clients (non-blocking)
- Try Navigation Preload, fallback to fetch(request)
- If response is HTML → inject script before </head>
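The first two steps of the flow above (extract, then sanitize) can be sketched as follows. The real sanitizeUrlParams() and sanitizeValue() are not shown on this page, so the length limits and the set of stripped characters here are assumptions chosen for illustration only.

```javascript
// Hypothetical limits; the actual values are not documented on this page.
const MAX_KEY_LENGTH = 64;
const MAX_VALUE_LENGTH = 512;

// Sketch of a per-value sanitizer: drop control characters and angle
// brackets, trim whitespace, and cap the length.
function sanitizeValue(raw, maxLength = MAX_VALUE_LENGTH) {
  return String(raw)
    .replace(/[\u0000-\u001F\u007F<>]/g, '')
    .trim()
    .slice(0, maxLength);
}

// Sketch of steps 1-2: read the raw query parameters, then sanitize
// every key and value before anything else touches them.
function sanitizeUrlParams(href) {
  const url = new URL(href);
  const clean = {};
  for (const [key, value] of url.searchParams) {
    const safeKey = sanitizeValue(key, MAX_KEY_LENGTH);
    // Skip empty keys and the one key that would rewrite the prototype.
    if (safeKey === '' || safeKey === '__proto__') continue;
    clean[safeKey] = sanitizeValue(value);
  }
  return clean;
}
```

Sanitizing before detection means every later step (classification, persistence, injection) only ever sees cleaned values.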
Traffic Detection
Two main classes handle traffic classification:
TrafficDetector
Comprehensive traffic source classifier with detection for:
| Category | Platforms/Providers | Detection Signal |
|---|---|---|
| Paid Search | Google, Bing, Yahoo, Baidu, Yandex | gclid, msclkid, yclid, bdclkid |
| Paid Social | Facebook, TikTok, Twitter, LinkedIn, Snapchat, Pinterest, Reddit | fbclid, ttclid, twclid, li_fat_id, scid, pclid, rdt_cid |
| Display/Native | Criteo, Taboola, Outbrain, DoubleClick, TheTradeDesk, AdRoll | tblci, dclid, ttd_click_id, etc. |
| Affiliate | Impact, CJ, Awin, ShareASale, Rakuten, ClickBank | irclickid, cjevent, awc, sscid, etc. |
| Email | HubSpot, Mailchimp, Klaviyo, Marketo, MailerLite | mc_cid, _ke, mkt_tok, etc. |
| Mobile Attribution | Branch, Adjust, AppsFlyer, Singular, Kochava | ~click_id, af_click_id, etc. |
| AI Referral | ChatGPT, Claude, Gemini, Perplexity, Poe | Referrer pattern matching |
| Organic Search | Google, Bing, DuckDuckGo, Brave, Kagi + 10 more | Referrer domain matching |
| Social Organic | Facebook, Instagram, X, LinkedIn, TikTok, YouTube + 10 more | Referrer domain matching |
| Mobile Apps | Android package names + iOS user-agent patterns | android-app:// or UA matching |
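As a rough illustration of the table above, click-ID parameters can be checked first, with referrer domain matching as a fallback. This is a minimal sketch, not the TrafficDetector implementation: the rule list is a small subset, the function names are hypothetical, and every provider key except "googleAds" is an assumption.

```javascript
// Subset of the click-ID parameters from the table above.
// Provider keys other than "googleAds" are illustrative assumptions.
const CLICK_ID_RULES = [
  { parameter: 'gclid', provider: 'googleAds', channel: 'Paid Search' },
  { parameter: 'msclkid', provider: 'microsoftAds', channel: 'Paid Search' },
  { parameter: 'fbclid', provider: 'facebook', channel: 'Paid Social' },
  { parameter: 'ttclid', provider: 'tiktok', channel: 'Paid Social' },
  { parameter: 'dclid', provider: 'doubleClick', channel: 'Display' },
];

// Subset of organic search referrer domains.
const ORGANIC_SEARCH_DOMAINS = ['google.com', 'bing.com', 'duckduckgo.com', 'kagi.com'];

// True when hostname is the domain itself or one of its subdomains.
function hostMatches(hostname, domain) {
  return hostname === domain || hostname.endsWith('.' + domain);
}

function classifyTraffic(params, referrer) {
  // 1. A click ID is the strongest signal, so it wins over the referrer.
  for (const rule of CLICK_ID_RULES) {
    const value = params[rule.parameter];
    if (typeof value === 'string' && value !== '') {
      return {
        channel: rule.channel,
        provider: rule.provider,
        parameter: rule.parameter,
        clickId: value,
      };
    }
  }
  // 2. Otherwise fall back to the referrer's hostname.
  let hostname = '';
  try {
    hostname = new URL(referrer).hostname;
  } catch {
    // Empty or malformed referrer: treated as no referrer.
  }
  if (hostname === '') return { channel: 'Direct' };
  const isSearch = ORGANIC_SEARCH_DOMAINS.some((d) => hostMatches(hostname, d));
  return { channel: isSearch ? 'Organic Search' : 'Referral' };
}
```

Matching on the full domain suffix rather than a substring avoids misclassifying hosts such as notbing.com.example.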
Channel Assignment
Traffic is classified into GA4-compatible channels:
Paid Channels
- Paid Search
- Paid Social
- Paid Video
- Paid Shopping
- Display
- Audio
Organic Channels
- Organic Search
- Organic Social
- Organic Video
- Organic Shopping
- Referral / AI / SMS / Direct
MacroDetector
Detects and normalizes unexpanded ad platform macros in URL parameters (e.g., {{campaign.id}} when server-side expansion fails):
| Platform | Macro Format | Example |
|---|---|---|
| | {{macro}} | {{campaign.id}} |
| | {macro} | {campaignid} |
| TikTok | __MACRO__ | __CAMPAIGN_ID__ |
| | {{MACRO}} | {{CAMPAIGN_ID}} |
| URL-encoded | %7B%7Bmacro%7D%7D | Auto-decoded |
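The macro shapes in the table can be recognized with a few anchored regular expressions after URL-decoding the value. The sketch below is an assumption about how such detection might look, not the MacroDetector source: the function names and the returned fields are hypothetical, and it reports the macro format rather than a platform.

```javascript
// One pattern per macro shape from the table above.
const MACRO_PATTERNS = [
  { format: 'double-brace', regex: /^\{\{\s*([\w.]+)\s*\}\}$/ },
  { format: 'single-brace', regex: /^\{\s*([\w.]+)\s*\}$/ },
  { format: 'double-underscore', regex: /^__([A-Z0-9_]+?)__$/ },
];

// decodeURIComponent throws on malformed input such as "100%",
// so fall back to the raw value in that case.
function safeDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// Returns { detected: true, format, name } when the whole value is an
// unexpanded macro, otherwise { detected: false }.
function detectMacro(value) {
  const decoded = safeDecode(String(value));
  for (const { format, regex } of MACRO_PATTERNS) {
    const match = regex.exec(decoded);
    if (match) return { detected: true, format, name: match[1] };
  }
  return { detected: false };
}
```

For example, detectMacro('%7B%7Bcampaign.id%7D%7D') decodes the value first and then reports a double-brace macro named campaign.id.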
IndexedDB Persistence
- Database: traffic-campaign-db
- Store: preserved-data
- Key: 'latest' (single record, overwritten)
- TTL: 30 minutes (IDB_TTL_MS)
- Version: 1
Each navigation with parameters saves/overwrites the latest digitalData to IndexedDB. Records include an expiresAt timestamp — stale records are discarded on read. This ensures data survives page reloads and SPA navigations.
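The expiry rule can be illustrated independently of IndexedDB itself. In this sketch only IDB_TTL_MS, the 'latest' key, and the expiresAt field come from the description above; the other field names and both function names are assumptions.

```javascript
const IDB_TTL_MS = 30 * 60 * 1000; // 30 minutes, as stated above

// Build the single record stored under the 'latest' key.
// Field names other than expiresAt are hypothetical.
function buildRecord(digitalData, now = Date.now()) {
  return {
    key: 'latest',
    data: digitalData,
    savedAt: now,
    expiresAt: now + IDB_TTL_MS,
  };
}

// Return the stored data, or null when the record is missing or stale.
function readFresh(record, now = Date.now()) {
  if (!record || typeof record.expiresAt !== 'number') return null;
  return now < record.expiresAt ? record.data : null;
}
```

Checking expiry on read keeps the write path simple: no cleanup job is needed, because a stale record is ignored and then overwritten by the next navigation with parameters.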
HTML Script Injection
When a navigation produces HTML, the SW injects a <script> block before </head> (with fallbacks to <body> or prepend):
// Injected into HTML response:
window.__PRESERVED_DIGITALDATA__ = <sanitized JSON>;
window.__DIGITALDATA_TIMESTAMP__ = 1711234567890;
window.__DIGITALDATA_SOURCE__ = 'service-worker';
window.dispatchEvent(new CustomEvent('trafficcampaign:ready', {
  detail: window.__PRESERVED_DIGITALDATA__
}));
The JSON payload is serialized with safeJsonForScript(), which escapes <, >, $, &, ', = and Unicode line/paragraph separators to prevent context-escape attacks.
Message Protocol
The SW responds to 3 message types via MessageChannel:
| Message Type | Direction | Purpose |
|---|---|---|
| PROCESS_URL | Client → SW | Request processing of current URL (same-origin validated) |
| CONFIG | Client → SW | Update runtime config (debug mode) |
| GET_DIGITALDATA | Client → SW | Retrieve latest data from IndexedDB |
Security Validations
- Origin check: Rejects messages from foreign origins
- Cross-origin URL: PROCESS_URL rejects URLs not matching self.location.origin
- Input sanitization: All URL params passed through sanitizeUrlParams()
- Referrer sanitization: sanitizeValue() applied to referrer in message handler
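The first two checks above can be expressed as a pure function that runs before any message is acted on. This is a sketch under assumptions: the function name, the reason strings, and the message.url field are hypothetical, and in a real worker the sender origin would presumably come from the message event.

```javascript
const KNOWN_TYPES = new Set(['PROCESS_URL', 'CONFIG', 'GET_DIGITALDATA']);

// Decide whether a client message should be handled.
// selfOrigin would be self.location.origin inside the worker.
function validateMessage(message, senderOrigin, selfOrigin) {
  // Origin check: reject messages from foreign origins.
  if (senderOrigin !== selfOrigin) {
    return { ok: false, reason: 'foreign-origin' };
  }
  if (!message || !KNOWN_TYPES.has(message.type)) {
    return { ok: false, reason: 'unknown-type' };
  }
  // Cross-origin URL check applies to PROCESS_URL only.
  if (message.type === 'PROCESS_URL') {
    let target;
    try {
      target = new URL(message.url); // "url" field name is an assumption
    } catch {
      return { ok: false, reason: 'invalid-url' };
    }
    if (target.origin !== selfOrigin) {
      return { ok: false, reason: 'cross-origin-url' };
    }
  }
  return { ok: true };
}
```

Only after this returns ok would the handler go on to sanitize the URL parameters and referrer.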
digitalData Output Schema
{
  campaigns: {
    campaignProvider: {
      provider: "googleAds", // Detected provider
      parameter: "gclid", // Trigger parameter
      value: "abc123" // Click ID value
    },
    isCampaign: true
  },
  trafficDetail: {
    source: "google", // Traffic source
    medium: "cpc", // Traffic medium
    campaign: "spring_sale", // Campaign name
    channel: "Paid Search", // GA4-compatible channel
    provider: "googleAds", // Provider key
    clickId: "abc123" // Click ID (if applicable)
  },
  utm_source: "google",
  utm_medium: "cpc",
  utm_campaign: "spring_sale",
  utm_content: "banner_v2",
  utm_term: "attribution sdk",
  cp_custom_param: "value", // Custom cp_ prefix params preserved
  macroDetection: {
    detected: false // or { detected: true, platform: "...", normalized: {...} }
  }
}
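On the page side, a consumer might read this object as sketched below. The helper name onDigitalData is hypothetical; only the __PRESERVED_DIGITALDATA__ global and the trafficcampaign:ready event come from the injection described earlier. The global is checked first because the injected script runs in the document head, so the event may already have fired before a listener is attached.

```javascript
// Call `callback` with digitalData as soon as it is available.
// `target` is normally `window`; it is a parameter so the helper
// can be exercised outside a browser.
function onDigitalData(target, callback) {
  // Injected script already ran: use the global directly.
  if (target.__PRESERVED_DIGITALDATA__ !== undefined) {
    callback(target.__PRESERVED_DIGITALDATA__);
    return;
  }
  // Otherwise wait for the one-time ready event.
  target.addEventListener(
    'trafficcampaign:ready',
    (event) => callback(event.detail),
    { once: true }
  );
}
```

Typical usage would be onDigitalData(window, (dd) => console.log(dd.trafficDetail.channel)).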