Service Worker

v1.0.1 · 1,561 LOC · The core engine that intercepts navigations, detects traffic sources, resolves unexpanded macros, persists data in IndexedDB, and injects digitalData into HTML responses — all without touching the main thread.

Lifecycle

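The Service Worker calls skipWaiting() on install and clients.claim() on activate, so a new version takes control immediately instead of waiting for open tabs to close. Once active, it intercepts every same-origin navigation whose URL carries query parameters.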

// Install → skip waiting immediately
self.addEventListener('install', (event) => {
  event.waitUntil(self.skipWaiting());
});

// Activate → claim all clients
self.addEventListener('activate', (event) => {
  event.waitUntil(self.clients.claim());
});

Fetch Interception

The SW only intercepts navigation requests (not assets, XHR, etc.) with query strings, from the same origin:

self.addEventListener('fetch', (event) => {
  const { request } = event;
  const url = new URL(request.url);

  if (request.mode !== 'navigate') return;          // Only navigations
  if (url.origin !== self.location.origin) return;  // Same-origin only
  if (url.search.length === 0) return;              // Only with params

  event.respondWith(handleNavigation(event));
});
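
The three guards can also be written as a pure predicate, which makes them easy to unit-test outside a worker. This is a minimal sketch: shouldIntercept and its parameters are hypothetical names, not part of the SW source.

```javascript
// Hypothetical helper: mirrors the three guards in the fetch handler.
function shouldIntercept(mode, requestUrl, swOrigin) {
  if (mode !== 'navigate') return false;      // Only navigations
  const url = new URL(requestUrl);
  if (url.origin !== swOrigin) return false;  // Same-origin only
  return url.search.length > 0;               // Only with params
}
```

Inside the worker this would be called as shouldIntercept(request.mode, request.url, self.location.origin).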

handleNavigation Flow

  1. Extract raw URL parameters
  2. Sanitize ALL input via sanitizeUrlParams()
  3. Generate digitalData (traffic detection + macro resolution)
  4. Save to IndexedDB (non-blocking, catch errors)
  5. Broadcast to all clients (non-blocking)
  6. Try Navigation Preload, fallback to fetch(request)
  7. If response is HTML → inject script before </head>
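
The seven steps could be wired together roughly as follows. This is a sketch under stated assumptions, not the actual implementation: the collaborators are passed in through a hypothetical deps object, and every name on it except sanitizeUrlParams is invented for the example.

```javascript
// Sketch of the seven steps. Every function on `deps` except sanitizeUrlParams
// is a hypothetical name chosen for this example.
async function handleNavigation(event, deps) {
  const url = new URL(event.request.url);

  const rawParams = Object.fromEntries(url.searchParams);  // 1. Extract
  const params = deps.sanitizeUrlParams(rawParams);        // 2. Sanitize
  const digitalData = deps.generateDigitalData(params);    // 3. Generate

  // 4 + 5. Fire and forget: a storage or broadcast failure must not block the page.
  deps.saveToIndexedDB(digitalData).catch(() => {});
  deps.broadcast(digitalData).catch(() => {});

  // 6. Use the preloaded response when there is one, otherwise go to the network.
  const preloaded = await event.preloadResponse;
  const response = preloaded || (await deps.fetch(event.request));

  // 7. Only HTML responses are rewritten; everything else passes through untouched.
  const contentType = response.headers.get('content-type') || '';
  if (!contentType.includes('text/html')) return response;
  return deps.injectScript(response, digitalData);
}
```

Steps 4 and 5 are deliberately not awaited, so the navigation response is never delayed by storage or messaging. Note that Object.fromEntries keeps only the last value when a parameter is repeated.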

Traffic Detection

Two main classes handle traffic classification:

TrafficDetector

Comprehensive traffic source classifier with detection for:

Category | Platforms/Providers | Classification
Paid Search | Google, Bing, Yahoo, Baidu, Yandex | gclid, msclkid, yclid, bdclkid
Paid Social | Facebook, TikTok, Twitter, LinkedIn, Snapchat, Pinterest, Reddit | fbclid, ttclid, twclid, li_fat_id, scid, pclid, rdt_cid
Display/Native | Criteo, Taboola, Outbrain, DoubleClick, TheTradeDesk, AdRoll | tblci, dclid, ttd_click_id, etc.
Affiliate | Impact, CJ, Awin, ShareASale, Rakuten, ClickBank | irclickid, cjevent, awc, sscid, etc.
Email | HubSpot, Mailchimp, Klaviyo, Marketo, MailerLite | mc_cid, _ke, mkt_tok, etc.
Mobile Attribution | Branch, Adjust, AppsFlyer, Singular, Kochava | ~click_id, af_click_id, etc.
AI Referral | ChatGPT, Claude, Gemini, Perplexity, Poe | Referrer pattern matching
Organic Search | Google, Bing, DuckDuckGo, Brave, Kagi + 10 more | Referrer domain matching
Social Organic | Facebook, Instagram, X, LinkedIn, TikTok, YouTube + 10 more | Referrer domain matching
Mobile Apps | Android package names + iOS user-agent patterns | android-app:// or UA matching
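
Click-ID classification amounts to a lookup from parameter name to provider and channel. The sketch below covers only a handful of rows from the table. Apart from "googleAds", which appears in the output schema, the provider keys are invented for illustration, and treating rule order as priority is an assumption.

```javascript
// Illustrative subset of the click-ID table. Only "googleAds" appears in the
// documented schema; the other provider keys are made up for this sketch.
const CLICK_ID_RULES = [
  { parameter: 'gclid',     provider: 'googleAds',    channel: 'Paid Search' },
  { parameter: 'msclkid',   provider: 'microsoftAds', channel: 'Paid Search' },
  { parameter: 'fbclid',    provider: 'facebookAds',  channel: 'Paid Social' },
  { parameter: 'ttclid',    provider: 'tiktokAds',    channel: 'Paid Social' },
  { parameter: 'li_fat_id', provider: 'linkedinAds',  channel: 'Paid Social' },
  { parameter: 'dclid',     provider: 'doubleClick',  channel: 'Display' },
];

// Returns the first matching rule plus the click ID value, or null.
function detectClickId(params) {
  for (const rule of CLICK_ID_RULES) {
    const value = params[rule.parameter];
    if (typeof value === 'string' && value.length > 0) {
      return { ...rule, value };
    }
  }
  return null;
}
```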

Channel Assignment

Traffic is classified into GA4-compatible channels:

Paid Channels

  • Paid Search
  • Paid Social
  • Paid Video
  • Paid Shopping
  • Display
  • Audio

Organic and Other Channels

  • Organic Search
  • Organic Social
  • Organic Video
  • Organic Shopping
  • Email
  • Referral / AI / SMS / Direct
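
For traffic without a click ID, the referrer-based rows of the table suggest a hostname lookup. The following is a rough sketch of that idea only: the domain lists are a tiny illustrative subset, and classifyReferrer is a hypothetical name.

```javascript
// Small illustrative domain lists; the real classifier covers many more.
const REFERRER_RULES = [
  { channel: 'AI',             domains: ['chatgpt.com', 'perplexity.ai', 'poe.com'] },
  { channel: 'Organic Search', domains: ['google.com', 'bing.com', 'duckduckgo.com'] },
  { channel: 'Organic Social', domains: ['facebook.com', 'instagram.com', 'linkedin.com'] },
];

function classifyReferrer(referrer) {
  let host;
  try {
    host = new URL(referrer).hostname.toLowerCase();
  } catch {
    return 'Direct'; // Empty or unparseable referrer
  }
  for (const { channel, domains } of REFERRER_RULES) {
    // Match the domain itself or any subdomain (www.google.com, l.facebook.com).
    if (domains.some((d) => host === d || host.endsWith('.' + d))) return channel;
  }
  return 'Referral';
}
```

Matching on a leading dot avoids false positives such as notgoogle.com being treated as google.com.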

MacroDetector

Detects and normalizes unexpanded ad platform macros in URL parameters (e.g., {{campaign.id}} when server-side expansion fails):

Platform | Macro Format | Example
Facebook | {{macro}} | {{campaign.id}}
Google | {macro} | {campaignid}
TikTok | __MACRO__ | __CAMPAIGN_ID__
LinkedIn | {{MACRO}} | {{CAMPAIGN_ID}}
URL-encoded | %7B%7Bmacro%7D%7D | Auto-decoded
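
One way to recognize these formats is a small set of anchored regular expressions applied after URL-decoding. This sketch is an assumption about how such a detector could look: the patterns are derived from the table, while the matching order and the simplified return shape (macro instead of the documented normalized object) are not taken from the source.

```javascript
// Patterns follow the table above; the checking order and return shape are assumptions.
const MACRO_PATTERNS = [
  { platform: 'facebook', regex: /^\{\{([a-z][a-z0-9_.]*)\}\}$/ },
  { platform: 'linkedin', regex: /^\{\{([A-Z][A-Z0-9_]*)\}\}$/ },
  { platform: 'google',   regex: /^\{([A-Za-z][A-Za-z0-9_]*)\}$/ },
  { platform: 'tiktok',   regex: /^__([A-Z][A-Z0-9_]*)__$/ },
];

function detectMacro(rawValue) {
  let value = rawValue;
  try {
    value = decodeURIComponent(rawValue); // %7B%7Bmacro%7D%7D -> {{macro}}
  } catch {
    // Malformed percent-encoding: fall back to the raw value.
  }
  for (const { platform, regex } of MACRO_PATTERNS) {
    const match = regex.exec(value);
    if (match) return { detected: true, platform, macro: match[1] };
  }
  return { detected: false };
}
```

Facebook and LinkedIn both use double braces, so this sketch tells them apart by letter case, as the table's examples suggest.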

IndexedDB Persistence

Database: traffic-campaign-db
Store:    preserved-data
Key:      'latest' (single record, overwritten)
TTL:      30 minutes (IDB_TTL_MS)
Version:  1

Each navigation with parameters overwrites the single 'latest' record with the new digitalData. Every record carries an expiresAt timestamp, and stale records are discarded on read. Persisting the data lets it survive page reloads and SPA navigations for up to 30 minutes.
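
The expiry logic can be illustrated independently of IndexedDB itself. In this minimal sketch, IDB_TTL_MS, the 'latest' key and expiresAt come from the description above; toRecord, readFresh and the other field names are hypothetical.

```javascript
const IDB_TTL_MS = 30 * 60 * 1000; // 30 minutes, as documented

// Hypothetical helper: wrap digitalData with an expiry before writing it.
function toRecord(digitalData, now = Date.now()) {
  return { key: 'latest', data: digitalData, savedAt: now, expiresAt: now + IDB_TTL_MS };
}

// Hypothetical helper: return the stored data, or null if missing or stale.
function readFresh(record, now = Date.now()) {
  if (!record || typeof record.expiresAt !== 'number') return null;
  return now < record.expiresAt ? record.data : null;
}
```

Checking expiry on read means no background cleanup job is needed: a stale record simply sits in the store until the next write replaces it.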

HTML Script Injection

When a navigation produces HTML, the SW injects a <script> block before </head> (with fallbacks to <body> or prepend):

// Injected into HTML response:
window.__PRESERVED_DIGITALDATA__ = <sanitized JSON>;
window.__DIGITALDATA_TIMESTAMP__ = 1711234567890;
window.__DIGITALDATA_SOURCE__ = 'service-worker';
window.dispatchEvent(new CustomEvent('trafficcampaign:ready', {
  detail: window.__PRESERVED_DIGITALDATA__
}));

Security: JSON is passed through safeJsonForScript(), which escapes <, >, $, &, ', = and the Unicode line/paragraph separators (U+2028, U+2029) to prevent context-escape attacks.
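
To make the escaping and the fallback chain concrete, here is one possible shape for both pieces. It is a sketch, not the shipped code: the body of safeJsonForScript is an assumed implementation that replaces each listed character with its \uXXXX escape, and injectScript is a hypothetical helper operating on an HTML string.

```javascript
// Assumed implementation: the real safeJsonForScript() may differ in detail.
// All escaped characters can only occur inside JSON strings, so \uXXXX is safe.
function safeJsonForScript(data) {
  return JSON.stringify(data).replace(/[<>&'$=\u2028\u2029]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0')
  );
}

// Hypothetical helper showing the </head> -> <body> -> prepend fallback chain.
function injectScript(html, scriptTag) {
  const headClose = html.search(/<\/head\s*>/i);
  if (headClose !== -1) {
    return html.slice(0, headClose) + scriptTag + html.slice(headClose);
  }
  const bodyOpen = html.match(/<body[^>]*>/i);
  if (bodyOpen) {
    const at = bodyOpen.index + bodyOpen[0].length; // Right after the opening tag
    return html.slice(0, at) + scriptTag + html.slice(at);
  }
  return scriptTag + html; // Last resort: prepend
}
```

The sketch splices with slice() rather than String.prototype.replace() with a string replacement, because replacement strings give special meaning to $ sequences.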

Message Protocol

The SW responds to 3 message types via MessageChannel:

Message Type | Direction | Purpose
PROCESS_URL | Client → SW | Request processing of current URL (same-origin validated)
CONFIG | Client → SW | Update runtime config (debug mode)
GET_DIGITALDATA | Client → SW | Retrieve latest data from IndexedDB
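
From the page side, a request/reply exchange over MessageChannel could look like the sketch below. The helper name requestFromWorker, the timeout and the message payload shape { type: 'GET_DIGITALDATA' } are assumptions, since the exact payload format is not documented here.

```javascript
// Hypothetical client helper: posts a message with a reply port and resolves
// with whatever the worker sends back on that port.
function requestFromWorker(worker, message, timeoutMs = 2000) {
  return new Promise((resolve, reject) => {
    const channel = new MessageChannel();
    const timer = setTimeout(() => {
      channel.port1.close();
      reject(new Error('Service Worker did not reply in time'));
    }, timeoutMs);
    channel.port1.onmessage = (event) => {
      clearTimeout(timer);
      channel.port1.close();
      resolve(event.data);
    };
    // port2 is handed to the worker, which replies via event.ports[0].
    worker.postMessage(message, [channel.port2]);
  });
}
```

In a browser the first argument would typically be navigator.serviceWorker.controller, for example requestFromWorker(navigator.serviceWorker.controller, { type: 'GET_DIGITALDATA' }).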

Security Validations

  • Origin check: Rejects messages from foreign origins
  • Cross-origin URL: PROCESS_URL rejects URLs not matching self.location.origin
  • Input sanitization: All URL params passed through sanitizeUrlParams()
  • Referrer sanitization: sanitizeValue() applied to referrer in message handler
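
The first two checks can be sketched as a single validator for PROCESS_URL. This is illustrative only: validateProcessUrl, the { type, url } message shape and the reason strings are hypothetical, and senderOrigin stands for the origin reported on the incoming message event.

```javascript
// Hypothetical validator; the message shape { type, url } is an assumption.
function validateProcessUrl(message, senderOrigin, swOrigin) {
  if (senderOrigin !== swOrigin) return { ok: false, reason: 'foreign-origin' };
  if (!message || message.type !== 'PROCESS_URL') return { ok: false, reason: 'wrong-type' };

  let url;
  try {
    url = new URL(message.url);
  } catch {
    return { ok: false, reason: 'invalid-url' };
  }
  if (url.origin !== swOrigin) return { ok: false, reason: 'cross-origin-url' };
  return { ok: true, url };
}
```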

digitalData Output Schema

{
  campaigns: {
    campaignProvider: {
      provider: "googleAds",      // Detected provider
      parameter: "gclid",         // Trigger parameter
      value: "abc123"             // Click ID value
    },
    isCampaign: true
  },
  trafficDetail: {
    source: "google",             // Traffic source
    medium: "cpc",                // Traffic medium
    campaign: "spring_sale",      // Campaign name
    channel: "Paid Search",       // GA4-compatible channel
    provider: "googleAds",        // Provider key
    clickId: "abc123"             // Click ID (if applicable)
  },
  utm_source: "google",
  utm_medium: "cpc",
  utm_campaign: "spring_sale",
  utm_content: "banner_v2",
  utm_term: "attribution sdk",
  cp_custom_param: "value",       // Custom cp_ prefix params preserved
  macroDetection: {
    detected: false               // or { detected: true, platform: "...", normalized: {...} }
  }
}
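
Because the injected script runs in the document head, the trafficcampaign:ready event may fire before application code attaches a listener. A consumer can cover both cases by checking the global first. This is a sketch: onDigitalData is a hypothetical helper, while the global and event names are the ones injected by the SW.

```javascript
// `target` is normally `window`; it is a parameter so the helper can be tested.
function onDigitalData(target, callback) {
  if (target.__PRESERVED_DIGITALDATA__) {
    callback(target.__PRESERVED_DIGITALDATA__); // Event already fired
    return;
  }
  target.addEventListener(
    'trafficcampaign:ready',
    (event) => callback(event.detail),
    { once: true }
  );
}
```

On a page this would be used as onDigitalData(window, (data) => console.log(data.trafficDetail.channel)).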