‘Another sandwich!’ said the King.

‘There’s nothing but hay left now,’ the Messenger said.

~ Lewis Carroll, Through the Looking-Glass

From a performance standpoint, SDCH, or Shared Dictionary Compression over HTTP (pronounced Sandwich), provides one of Chrome’s great party tricks. Reducing the size of the initial page load offers a major benefit, particularly on mobile connections, where minimizing the number of round trips significantly increases the perceived speed of the page. Unfortunately, implementations of SDCH and tools to work with the required dictionaries are scarce, and guides to the protocol are hard to find. In the spirit of the excellent LinkedIn blog post on SDCH, we’re writing up our guide to implementing SDCH and open sourcing a few small tools for producing the required dictionaries, including a demo of SDCH on Wikipedia pages that demonstrates an 85%+ compression ratio.

Background

GZip and friends provide worthwhile benefits for textual resources transferred via HTTP, but they can only optimize within the context of a single request. A page containing a list of elements with common HTML amongst the elements will compress quite nicely, since GZip can do a lot with the repeated content. On the other hand, an element that appears on every page of a site but only once on each page, such as a header, a footer, or common javascript and CSS, compresses very poorly, since standard compression only works within the scope of each individual request. SDCH overcomes this limitation by providing a dictionary of common elements shared across multiple page loads.
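
To illustrate the principle outside of SDCH itself, standard zlib already supports preset dictionaries. The sketch below (ours, in Python, with an invented shared_markup fragment standing in for a site-wide header and footer) shows how content that both sides already hold costs almost nothing once a shared dictionary is in play; SDCH applies the same idea across requests using VCDIFF rather than DEFLATE.

    import zlib

    # A stand-in for markup that appears on every page of a site, but only
    # once per page: header, navigation, footer, and so on.
    shared_markup = (b"<header><nav><a href='/home'>Home</a><a href='/about'>About</a>"
                     b"<a href='/contact'>Contact</a></nav></header>"
                     b"<footer>Copyright, terms of service, privacy policy</footer>")

    page = shared_markup + b"<main>content unique to this page</main>"

    # Plain compression: the shared markup must be encoded from scratch
    # on every request.
    plain = zlib.compress(page, 9)

    # Compression with a preset dictionary: the receiver must decompress
    # with the same zdict, but the shared markup is nearly free.
    enc = zlib.compressobj(level=9, zdict=shared_markup)
    with_dict = enc.compress(page) + enc.flush()

    print(len(page), len(plain), len(with_dict))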

Google added support for SDCH to Chromium, including Chrome and Android, some time ago. Unfortunately, Firefox has failed to show much interest in actually implementing SDCH. Server side SDCH implementations have proven scarce. An Apache module has gone nowhere, and a node implementation didn’t want to work properly when we tested it. On the positive side, VCDIFF, the actual compression format used for SDCH, has several implementations, including OpenVCDIFF, our chosen library.

Based on our experience, three main factors prevent wider adoption of SDCH. First, the informal and at times contradictory nature of the spec meant that we had to browse the Chromium sources to arrive at a working implementation. Second, even though VCDIFF encoders abound, the lack of an integrated server module makes adding SDCH a non-trivial task; even our implementation eventually settled on an application layer modification, rather than taking on the task of writing an NginX module. Finally, in the general case, building an effective dictionary requires solving the multiple longest common substring problem over a large corpus of pages. Numerous academic papers discuss the problem, but working implementations are scarce, and running times rapidly get out of hand.

Implementing SDCH

Every summer we spend a few weeks just working on speed. This past summer SDCH came up as one of our proposed speed improvements, and we quickly whipped up some basic working demos using information from old SDCH mailing list posts and the LinkedIn article (we strongly encourage anyone attempting to implement SDCH to study the request-response diagram in that post). The results were frankly astounding. As LinkedIn reported, compression in the neighborhood of 85% can be achieved even with dictionaries of modest size when SDCH and GZip are used together. On the promise of our proof of concept, we resolved to roll out SDCH as soon as possible.
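
For those who have not seen that diagram, the steady-state exchange looks roughly like this: the server advertises a dictionary with a Get-Dictionary header, the client fetches it in the background, and from then on each request names the dictionaries the client holds while each encoded response says so in Content-Encoding. The path and hash below are invented for illustration:

    GET /some/other/page HTTP/1.1
    Accept-Encoding: gzip, deflate, sdch
    Avail-Dictionary: beFyhmXN

    HTTP/1.1 200 OK
    Content-Encoding: sdch, gzip
    ... VCDIFF delta against the dictionary, then gzipped ...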

Despite the amazing benefits which hung just out of reach, actually implementing SDCH proved fraught with difficulties. Setting aside the dictionary creation, which can be pushed into the build step, the following problems reared their ugly heads:

  1. If Chrome detects an error with SDCH, it blacklists all SDCH responses from the origin for a significant period of time. As a result, testing requires restarting Chrome after every error. The good news is that Chrome offers an SDCH tab at chrome://net-internals/#sdch, which also includes links to open the events tab with filters for SDCH requests applied. This tab should be your constant companion while working on SDCH. In addition, clicking on an SDCH request in the events tab provides the complete debugging history of the request. When an error does occur, searching the Chromium source code for the exact error message is often the only useful source of information. Finally, chrome://histograms/Sdch provides some interesting stats on the internals of the SDCH system in Chrome.

    [Screenshot: SDCH Domains page in Chrome]
  2. According to old mailing list entries, Google encountered proxy servers that stripped the sdch token from the Content-Encoding header, among other shenanigans. As a result, Chrome assumes that every response to a request carrying an Avail-Dictionary header is, in fact, SDCH encoded. Obviously, responses for media files won’t be encoded. Any response that is not SDCH encoded must include the header X-SDCH-Encode: 0. For reference, in Apache it looks like this: <FilesMatch "\.(ico|png|jpg|css|js|woff|swf|html|eot|svg|ttf|woff2)$">Header set X-SDCH-Encode 0</FilesMatch>, and in NginX, like this: add_header "X-SDCH-Encode" "0";. This problem is exacerbated by the fact that some versions of the spec refer to the header as X-SDCH-Encoded. In reality, Chrome only respects X-SDCH-Encode.
  3. In order to ensure the integrity of the dictionary, the dictionary is advertised by parts of its SHA-256 hash. The dictionary itself must contain the host and path as a header, which goes into the hash. Confusingly, the bare dictionary without the header must be used when actually encoding the response. This means that a server side implementation either needs to strip the header each time it uses the dictionary, or else store a second version of the dictionary file, using the version with the header when responding to client requests and the bare version when encoding content. We chose the latter to avoid string manipulation on every response. In some implementations, using only a single file might work, by passing a file pointer or buffer to the encoder after reading off the header bytes (and in fact, this is exactly how Chrome handles dictionaries). In our PHP implementation, on the other hand, reading off the header every time was pointless overhead. Moreover, there’s no reason not to use the complete dictionary, including the header, when encoding; it works just as well and simplifies implementations. A sketch of the dictionary format and hash computation appears after this list.
  4. Until recently, the SDCH mailing list had been very quiet. Recent activity on the list and a seeming renewed interest in the spec provide hope, however. It also prompted us to write up these experiences and create an easy-to-use demo in the hope that more implementations emerge.
  5. Based on the LinkedIn post and observed Google usage, SDCH seems to be applied mostly to javascript and CSS. Our testing showed that the greatest benefits came from using it on the whole page and inlining many of our resources. This meant we had to modify our page rendering engine to support several different methods of handling CSS and JS, including linking to them, partial inlining, and full inlining.
  6. As a result of our decision to inline resources in SDCH environments, the size of the pages we used to train the dictionary grew significantly. Even on a small subset of pages, computing the dictionary with femtozip started taking tens of hours for each build, which significantly reduced our ability to quickly build and deploy updates. To avoid such lengthy build times, we further modified our page rendering engine to produce versions of pages without inline resources and without template variables, but with all of the HTML that each page would otherwise contain. We then train in two steps: first on CSS using a simple CSS parser, and then on the neutered HTML pages using femtozip.
  7. Actually implementing the vcdiff encoding step was one of the easiest parts of the entire implementation. We simply pass the fully rendered page to openvcdiff with the -interleaved and -checksum options enabled, and this works well in production. A sketch of the invocation also follows this list.
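
To make point 3 concrete, the following Python sketch shows how a dictionary file and its identifiers might be assembled. Per our reading of the spec, the client and server hashes are the URL-safe base64 encodings of the first and second 48 bits of the SHA-256 of the complete dictionary file, headers included; the file name and domain below are placeholders, and the real scripts live in our github repo.

    import base64
    import hashlib

    def build_dictionary(payload, domain, path="/"):
        # The host and path headers are part of the file and therefore part
        # of the hash that identifies the dictionary.
        header = "domain: {}\npath: {}\n\n".format(domain, path).encode("utf-8")
        return header + payload

    def sdch_hashes(dictionary):
        # Client hash (sent by Chrome in Avail-Dictionary) and server hash
        # (prefixed, with a null byte, to every sdch-encoded response body).
        digest = hashlib.sha256(dictionary).digest()
        client_hash = base64.urlsafe_b64encode(digest[0:6]).decode("ascii")
        server_hash = base64.urlsafe_b64encode(digest[6:12]).decode("ascii")
        return client_hash, server_hash

    with open("common.dict", "rb") as f:
        dictionary = build_dictionary(f.read(), "example.com")
    print(sdch_hashes(dictionary))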

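For point 7, here is a minimal sketch of the encoding step, assuming the vcdiff binary from OpenVCDIFF is on the PATH; our production code is PHP, so this is only an illustration of the invocation and of how the server hash is prepended to the body.

    import subprocess

    def sdch_encode(page, dict_path, server_hash):
        # Encode the rendered page against the dictionary using the
        # -interleaved and -checksum options Chrome expects.
        delta = subprocess.run(
            ["vcdiff", "encode", "-dictionary", dict_path,
             "-interleaved", "-checksum"],
            input=page, stdout=subprocess.PIPE, check=True).stdout
        # The sdch response body is the server hash, a null byte, then the
        # delta; gzip is applied on top (Content-Encoding: sdch, gzip).
        return server_hash.encode("ascii") + b"\x00" + delta
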
When all was said and done, all of these changes, occurring as they did over several product cycles, resulted in a total implementation time of nearly six months. We will readily concede that had we dedicated a single engineer to work on SDCH support full time, it would have been a matter of weeks instead of months. Nonetheless, the most effective implementation required instrumenting a number of different parts of our front end application, and was not a matter of “just” installing a plugin on our servers.

While we cannot easily produce a simple program to provide instrumentation or modify resource inclusion on other sites, we can offer the scripts which we use as part of our build process for preparing and properly formatting the dictionary. We’ve put them up on github under a permissive license. We hope that these scripts prove useful either as runnable programs or as a starting point for adding dictionary creation to your own build process.

In particular, we’d like to call out our method for building the dictionary for CSS. Rather than take the lengthy time necessary to compute the true longest common substrings for a large corpus of CSS, we parse the CSS and use a tokenized approach. We convert selectors, keys and values into a stream of tokens. After excluding those tokens which are too short to be useful (in our testing a minimum length of 7 worked out well), we create a histogram of tokens. Finally, we produce a dictionary from each token with more than a minimum number of occurrences (for our CSS, a minimum of 2 occurrences provided a nice balance of compression and dictionary size) or from unique but long tokens. We realize that this dictionary does not represent an ideal LCS of the CSS, but anecdotally, it still provides significant compression of CSS resources and builds in a matter of milliseconds.
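
A rough sketch of that approach might look like the following. The thresholds match the numbers above, while the naive tokenizer, the long-token cutoff of 40 characters, and the output ordering are simplifications of what the actual scripts in our github repo do.

    import re
    from collections import Counter

    def css_dictionary(css_paths, min_len=7, min_count=2, long_token_len=40):
        # Tokenize selectors, property names and values with a naive split,
        # then keep tokens that are long enough and either repeated or
        # unusually long even if unique.
        counts = Counter()
        for path in css_paths:
            with open(path, encoding="utf-8") as f:
                for token in re.split(r"[{}();:,\s]+", f.read()):
                    if len(token) >= min_len:
                        counts[token] += 1
        keep = [t for t, n in counts.items()
                if n >= min_count or len(t) >= long_token_len]
        # Order matters little to VCDIFF, so sort for reproducible builds.
        keep.sort()
        return "".join(keep)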

Demonstration on Wikipedia

When we’ve talked to people about SDCH at conferences and tech events, we’ve received some skepticism. In particular, it’s been suggested that because our pages contain a lot of repeated elements with heavy HTML, SDCH works better for us than in the general case. In addition, we’ve heard the criticism that SDCH “overfits” the dictionary. But SDCH works well precisely because it trains on the actual content present on a particular site rather than on an abstract dictionary of the entire internet. Our github repo includes a demonstration of SDCH on Wikipedia pages. In order to provide a fair test, our demo trains the dictionary on one randomly selected set of pages, and performs the compression test on a different randomly selected set of pages. Even with such randomization, the test typically produces compression greater than 80%.

[Screenshot: Wikipedia test results]

SDCH in Production

Our first production deploy of SDCH occurred along with several other product releases, and we did not do a proper A/B test between SDCH pages and non-SDCH pages. The release that enabled SDCH increased our average page size by about 4.3% (including the weight of CSS and javascript) due to other, unrelated changes. Nonetheless, in the three weeks following the release our average page load time on Chrome (excluding Chrome for iOS, since it doesn’t support SDCH) was 15.5% faster than during the preceding three weeks, as reported by Google Analytics. We saw a modest increase of ~8% in CPU usage with the new release, although it contained other new code in addition to SDCH, and we do not currently have an easy way of measuring CPU usage per browser. Overall, SDCH has provided a significant and worthwhile increase in page speed on Chrome. We hope that other browsers and server side implementations offer support in the future.

Pressed sandwich image from Tammy Gordon.
