ETags Revisited – configure entity tags to improve cache performance on websites

Summary:

Learn how to configure or eliminate ETags to improve website performance.

An entity tag (ETag) is an identifier that a web server assigns to a specific version of a resource. ETags are a cache validation mechanism: they let browsers make conditional requests. Caches work more efficiently because the client reuses unchanged resources and the server avoids sending a full response when the content has not changed. Efficient caching saves bandwidth and improves performance by delivering only what the client needs, not what it already has.

Anatomy of an ETag

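In Apache, the ETag for a static file is built from up to three components, selected with the FileETag directive: the file's inode number (INode), its last-modified time (MTime), and its size in bytes (Size). The full configuration is: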

FileETag INode MTime Size

An ETag looks like this:

ETag: "10690a1-4f2-40d45ae1"

Cache Conversation

When clients cache a resource, they also save its ETag. If the resource changes on the server, its ETag changes too. When a client revisits a page, it checks whether the resource has changed by sending a conditional If-None-Match header containing the cached ETag:

If-None-Match: "10690a1-4f2-40d45ae1"

On this subsequent request, the server compares the ETag sent by the client with the current ETag of the resource. If the ETags match, the resource has not changed, so the server sends a short HTTP 304 Not Modified response with no body. The "not modified" status tells the client that the cached copy is still good and can be reused. If the ETags do not match, the server sends a complete response including the modified resource.
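The exchange above can be sketched in a few lines of Python. This is a minimal illustration, not how Apache or IIS implement it: the function names are hypothetical, and splitting the header on commas assumes no ETag value contains a comma.

```python
def split_etags(header_value):
    """Split an If-None-Match value into individual entity tags."""
    return [tag.strip() for tag in header_value.split(",") if tag.strip()]


def weak_match(a, b):
    """Weak comparison: ignore a leading W/ and compare the quoted values."""
    def strip_weak(tag):
        return tag[2:] if tag.startswith("W/") else tag
    return strip_weak(a) == strip_weak(b)


def respond(if_none_match, current_etag, body):
    """Return (status, headers, body) for a GET of a single resource."""
    headers = {"ETag": current_etag}
    if if_none_match is not None:
        candidates = split_etags(if_none_match)
        # "*" matches any current version of the resource.
        if "*" in candidates or any(weak_match(t, current_etag) for t in candidates):
            return 304, headers, b""  # cached copy is still valid, send no body
    return 200, headers, body  # no match: send the full resource


current = '"10690a1-4f2-40d45ae1"'
print(respond('"10690a1-4f2-40d45ae1"', current, b"<html>...</html>")[0])  # 304
print(respond('"stale-tag"', current, b"<html>...</html>")[0])             # 200
```

The 304 branch returns the ETag header but an empty body, which is where the bandwidth saving comes from.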

The Problem with ETags

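The problem with ETags is that the default format is tied to a single server. Apache versions before 2.4 default to the full ETag configuration, including the file's inode number, and Microsoft IIS adds a server-specific change number to its ETags. Inode numbers differ from machine to machine, so on a website served from multiple servers the same unchanged file gets a different ETag on each one. When a conditional request reaches a different server than the one that supplied the cached copy, the ETags do not match, and the server sends the full resource instead of a 304, wasting bandwidth on content the client already has.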
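To make the mismatch concrete, here is a small Python sketch. It assumes the hexadecimal inode-size-mtime layout that the example ETag above appears to follow; the function name and the second server's inode number are made up for illustration.

```python
def full_etag(inode, size, mtime):
    """Build an ETag from inode, size, and mtime as hex fields (assumed layout)."""
    return '"%x-%x-%x"' % (inode, size, mtime)


# The same file deployed to two servers: identical size and modification time,
# but each filesystem assigned it a different inode number.
size, mtime = 0x4f2, 0x40d45ae1
etag_a = full_etag(0x10690a1, size, mtime)  # server A
etag_b = full_etag(0x2b7c3f0, size, mtime)  # server B (hypothetical inode)

print(etag_a)            # "10690a1-4f2-40d45ae1"
print(etag_b)            # "2b7c3f0-4f2-40d45ae1"
print(etag_a == etag_b)  # False
```

A client that cached the file from server A and is later routed to server B sends server A's ETag, which server B does not recognize, so the file is downloaded again even though nothing changed.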

The Solution: Configure ETags

The solution for sites with multiple servers is to remove the inode portion of ETags. To do this in Apache, add the following lines to your configuration file:

<Directory /usr/local/httpd/htdocs>
FileETag MTime Size
</Directory>

Response headers from a site using inode-less ETags look like this:

Server: Apache
Last-Modified: Fri, 28 Jan 2011 16:30:52 GMT
ETag: "15c3-4d42ef3c"
Accept-Ranges: bytes
Content-Length: 5571
Content-Type: image/jpeg
Date: Mon, 31 Jan 2011 18:56:41 GMT

This server-independent ETag avoids the problem of the default ETag configuration, and allows more efficient caching of the same object across different servers.
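As a sanity check, the two fields of the example ETag can be decoded with a short Python sketch. It assumes the fields are the file size and the modification time in whole seconds, both in hexadecimal, which is inferred from this example; other server versions may encode the time at a finer resolution. The function name is illustrative.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def decode_etag(etag):
    """Split a size-mtime ETag into (size in bytes, mtime as a UTC datetime)."""
    size_hex, mtime_hex = etag.strip('"').split("-")
    size = int(size_hex, 16)
    mtime = datetime.fromtimestamp(int(mtime_hex, 16), tz=timezone.utc)
    return size, mtime


size, mtime = decode_etag('"15c3-4d42ef3c"')
print(size)                                 # 5571
print(format_datetime(mtime, usegmt=True))  # Fri, 28 Jan 2011 16:30:52 GMT
```

Both values agree with the Content-Length and Last-Modified headers in the response above. Any server holding a copy of the file with the same size and timestamp would produce the same ETag.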

Remove ETags Entirely

Another option is to remove ETags entirely and rely on other validators, such as the Last-Modified timestamp. To remove ETags in Apache, add the following to your server configuration file (the Header directive requires mod_headers).

Header unset ETag
FileETag None

This has the added benefit of reducing the size of your headers, which we’ll talk about in a future tweak.
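Without ETags, validation falls back to comparing dates: the client sends If-Modified-Since with the Last-Modified value it cached. A minimal Python sketch of that server-side check follows. The function name is hypothetical, and real servers handle more edge cases.

```python
from email.utils import parsedate_to_datetime


def respond_by_date(if_modified_since, last_modified, body):
    """Return (status, headers, body) using only Last-Modified validation."""
    headers = {"Last-Modified": last_modified}
    if if_modified_since is not None:
        try:
            client_time = parsedate_to_datetime(if_modified_since)
            if parsedate_to_datetime(last_modified) <= client_time:
                return 304, headers, b""  # unchanged since the client's copy
        except (TypeError, ValueError):
            pass  # unparseable date: ignore the header and send everything
    return 200, headers, body


stamp = "Fri, 28 Jan 2011 16:30:52 GMT"
print(respond_by_date(stamp, stamp, b"...jpeg bytes...")[0])  # 304
print(respond_by_date("Thu, 27 Jan 2011 16:30:52 GMT", stamp, b"...jpeg bytes...")[0])  # 200
```

Because the timestamp depends only on the file, not on the machine serving it, this check gives the same answer on every server as long as deployments preserve modification times.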

Conclusion

Used correctly, ETags can improve cache efficiency and the performance of your website. However, the default ETags on older Apache versions and on Microsoft IIS include a server-specific component (the inode in Apache), which defeats validation and causes unnecessary full responses when an identical resource is served from multiple servers. One solution is to configure your ETags to omit the server-specific information. Another is to eliminate ETags entirely.

