Service Workers at Scale, Part I: CDNs and Multiple Origins
We recently got the opportunity to develop a service worker for use on Smashing Magazine, which we’ll write about in more detail soon. This is the first installment in a multi-part series examining some of the solutions we arrived at and lessons we learned.
One of the challenges we’ve encountered while developing this worker is strategic caching for external resources. Most of the service worker examples we’ve seen address handling requests for local resources on the same domain. But pages served by complex sites often need to fetch (and cache) assets from external domains as well. Things can get complicated when these various domains need special handling, so we need a manageable way to structure our service worker to evaluate requests for both local and external resources.
Special handling for CDNs and subdirectories
The host of our service worker happens to be a WordPress site with assets requested from both the local server and CDNs. We need to handle fetch events differently depending on which domain or subdirectory they relate to. Some CDN requests need to be routed to cache functions, and some local requests (such as those for pages within the admin area) need to be ignored altogether. To support this behavior, we need a way to designate rules for different URL origins.
One approach uses regular expressions. We need to match the URLs of fetch event requests against a set of base URLs, and using RegExp for this makes sense. In practice, this method works well initially, but maintenance becomes more of a concern as the patterns grow more complex.
Take for example an entry to match “content pages”:
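The original pattern isn’t reproduced here, but based on the description that follows, a plausible sketch (with an illustrative origin) might look like this:

```javascript
// Hypothetical reconstruction: match same-origin URLs whose path does
// NOT begin with wp-admin. In a real worker the origin would come from
// self.location.origin; it's hard-coded here for illustration.
const origin = 'https://example.com';
const contentPages = new RegExp(
  '^' + origin.replace(/\./g, '\\.') + '/(?!wp-admin)'
);

contentPages.test('https://example.com/2021/01/some-article/'); // true
contentPages.test('https://example.com/wp-admin/options.php');  // false
```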
This will match any local URL except those with wp-admin following the first slash. It’s not a complicated pattern as-is, but what about when we need to add another subdirectory exception? Is there another approach that caters more to maintainability and comprehension?
Comparing URLs with the URL class
Let’s trade the great power and responsibility of regular expressions for a more explicit way to classify URLs. Using a Map structure to represent the base URLs to handle fetch events for, we can “flag” some items to indicate them as subdirectories to ignore:
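The original listing isn’t shown here; a minimal sketch of the idea, with illustrative URLs and an assumed `ignore` flag name, might look like this:

```javascript
// Sketch: each key is a base URL (stored as a URL instance), and the
// value flags whether the worker should ignore requests under it.
const baseUrls = new Map([
  [new URL('https://example.com'), { ignore: false }],
  [new URL('https://example.com/wp-admin/'), { ignore: true }],
  [new URL('https://cdn.example.com'), { ignore: false }],
]);
```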
It’s more verbose, but it provides a clear interface for maintaining the list of base URLs we want to act on.
From this core list, we can derive more specific lists, each consisting of URL instances. With our subjects of interest stored as URL instances, we can then use properties like pathname in our logic:
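As a hedged sketch of what deriving those lists and comparing URL properties might look like (the Map shape and helper name are illustrative, not the original code):

```javascript
// A Map of flagged base URLs, as described above (illustrative values).
const baseUrls = new Map([
  [new URL('https://example.com'), { ignore: false }],
  [new URL('https://example.com/wp-admin/'), { ignore: true }],
  [new URL('https://cdn.example.com'), { ignore: false }],
]);

// Derive more specific lists of URL instances from the core list.
const handledUrls = [...baseUrls].filter(([, f]) => !f.ignore).map(([u]) => u);
const ignoredUrls = [...baseUrls].filter(([, f]) => f.ignore).map(([u]) => u);

// Because these are URL instances, comparisons can use explicit
// properties like origin and pathname instead of pattern matching.
function matchesBase(requestUrl, base) {
  return requestUrl.origin === base.origin &&
         requestUrl.pathname.startsWith(base.pathname);
}
```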
Should this request be handled?
With our URL classifications and helper functions in place, the fetch event handler can use them to decide if it needs to intercept a request:
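The handler itself isn’t reproduced here; as a sketch under assumed names (the origin list, prefix list, and predicate are all illustrative), the decision might be factored into a helper that the fetch listener consults:

```javascript
// Illustrative classifications derived from the base URL list.
const handledOrigins = ['https://example.com', 'https://cdn.example.com'];
const ignoredPathPrefixes = ['/wp-admin'];

// Decide whether a request falls within our handled origins and
// outside any ignored subdirectory.
function shouldHandleFetch(requestUrl) {
  const url = new URL(requestUrl);
  if (!handledOrigins.includes(url.origin)) return false;
  return !ignoredPathPrefixes.some(p => url.pathname.startsWith(p));
}

// The guard lets this sketch run outside a worker context; in the
// service worker itself, only the listener body is needed.
if (typeof self !== 'undefined' && typeof self.addEventListener === 'function') {
  self.addEventListener('fetch', event => {
    if (!shouldHandleFetch(event.request.url)) return; // fall through to network
    event.respondWith(
      caches.match(event.request).then(cached => cached || fetch(event.request))
    );
  });
}
```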
Coming up next: Using URL properties for offline fallbacks
In part two, we’ll take a look at another unique challenge this project presented: serving URL-aware offline fallbacks.