
Rewrite URLs with Content Negotiation

Summary: Content negotiation can make your URLs shorter and more abstract. By rewriting URLs without file extensions to the right resources you can save bytes and migration headaches.

You can make your URLs shorter and more abstract by using content negotiation to strip file extensions from your markup and source code. You'll save a few bytes on each object reference, and avoid migration headaches when you change technologies in the future.

What Is Content Negotiation?

Content negotiation is a little-used feature of Apache and IIS that transparently delivers the best variant of the same resource to browsers. Browsers tell servers their preferences, and servers tune their responses to select the best resource. Different languages, file types, content encodings, and character sets can be automatically delivered to different browsers based on the preferences they send in request headers. You can vary the following dimensions with content negotiation:

  • Media type
  • Language
  • Content encoding
  • Charset

In this article we explore the media type dimension for URI abbreviation. We explored content encoding in a previous tweak, Use HTTP Compression.

MIME Types

Multipurpose Internet Mail Extensions, or MIME, types allow Apache to determine the type of a file from its extension. The configuration file mime.types associates MIME types with file extensions. Apache advises that you not edit the mime.types file directly, but use the AddType directive instead.

AddType Directive

You can add new and shorter variants of resources with the AddType directive. For example, if you wanted images that end with .g to be recognized as GIFs, you could add the following line to your server config or .htaccess file:

AddType image/gif .g

Of course now you'd have two different extensions for GIF files. Why not go even further and eliminate the extension entirely? That's where content negotiation comes in.
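As a rough illustration of what an extension-to-type table does, here is a minimal sketch that uses Python's standard mimetypes module rather than Apache itself. The add_type call plays a role similar to the AddType line above. The file names (logo.g, logo.gif, logo) and the media_type helper are made up for the example.

```python
import mimetypes

# A private lookup table, standing in for the server's mime.types data.
types = mimetypes.MimeTypes()

# Similar in spirit to "AddType image/gif .g" (hypothetical extension).
types.add_type("image/gif", ".g")


def media_type(filename):
    """Return the MIME type guessed from the file extension, or None."""
    guessed, _encoding = types.guess_type(filename)
    return guessed


print(media_type("logo.g"))    # image/gif, from the added mapping
print(media_type("logo.gif"))  # image/gif, from the built-in table
print(media_type("logo"))      # None: no extension, so nothing to look up
```

The last call shows why an extensionless URL needs something more than an extension table to be served correctly.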

Content Negotiation with mod_negotiation

The Apache distribution comes with mod_negotiation preinstalled. To select among different variants of a resource, the server needs to get information about each variant. mod_negotiation uses two ways to select variants:

  • Use a type map (i.e., *.var file) that explicitly names the files and their associated MIME types
  • Use a MultiViews search, which searches for filename patterns and chooses from the results

Type maps contain paths to the variants of each MIME type to express a server-side preference for resources. The MultiViews option expresses no server-side preference and effectively fakes a type map file from searching the directory for files with MIME type extensions.

Type-map Files

Type maps allow fine-grained control over the available variants. Here's an example:

URI: foo

URI: foo.jpg
Content-type: image/jpeg; qs=0.8

URI: foo.png
Content-type: image/png; qs=0.7

URI: foo.gif
Content-type: image/gif; qs=0.5

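MIME types are set with Content-type, and source quality values with the "qs" parameter. The server combines each qs value with the quality (q) value the browser assigns to that type in its Accept header, and variants with higher results take priority. For a browser with no preference among image formats, the JPEG would be served over the PNG, which in turn takes priority over the GIF.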
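To see how server-side qs values interact with browser preferences, here is a simplified sketch in Python. It assumes the server multiplies the browser's q value by the variant's qs value and picks the highest product, which is the first step of Apache's documented algorithm; the real algorithm has further tie-breaking steps that are omitted here. The best_variant function and the two client preference tables are illustrative, not part of Apache.

```python
# Server-side source quality (qs) values from the type map above.
TYPE_MAP = {
    "foo.jpg": ("image/jpeg", 0.8),
    "foo.png": ("image/png", 0.7),
    "foo.gif": ("image/gif", 0.5),
}


def best_variant(type_map, client_q):
    """Return the variant with the highest q * qs product, or None."""
    best_name, best_score = None, 0.0
    for name, (mime_type, qs) in type_map.items():
        # Types the client did not list are treated as unacceptable (q = 0).
        score = client_q.get(mime_type, 0.0) * qs
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# A client with no preference among the three formats (assumed values).
indifferent = {"image/jpeg": 1.0, "image/png": 1.0, "image/gif": 1.0}
# A client that leans toward PNG (assumed values).
prefers_png = {"image/png": 0.7, "image/jpeg": 0.6, "image/gif": 0.2}

print(best_variant(TYPE_MAP, indifferent))  # foo.jpg
print(best_variant(TYPE_MAP, prefers_png))  # foo.png
```

In the second case the PNG scores 0.7 x 0.7 = 0.49 against 0.6 x 0.8 = 0.48 for the JPEG, so a modest browser preference is enough to outweigh the server's ordering.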

MultiViews Option

Explicitly setting paths to specific variants can become tedious for larger sites. A more powerful way to map variants to URLs is the MultiViews option. By setting the MultiViews option in your server configuration or .htaccess file, you turn on content negotiation to search for variants within directories. For shaving bytes with extensionless URLs, MultiViews is your best option.

<Directory /home/www/sitename/htdocs>
Options +MultiViews
</Directory>

Now when the server receives a request for /random/gizmo/thing and /random/gizmo/thing does not exist, the server searches inside the gizmo directory looking for all files named thing.*, assigning MIME types based on the extension of each file. It then chooses the best match based on the browser's preferences and delivers that resource.

The negotiation goes something like this. Assume the "gizmo" directory consists of the following files:

thing.gif
thing.jpg
thing.png

The browser says the following with a reference to /random/gizmo/thing:

Accept: image/png;q=0.7, image/jpeg;q=0.6, image/gif;q=0.2, */*;q=0.1

With MultiViews enabled, the server would search the referenced directory and deliver the image with the highest quality value, namely thing.png. The URL in your XHTML file need not contain the filename extension, which makes maintenance easier and reduces file size.
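The negotiation just described can be approximated in a few lines of Python. This is a sketch of the idea only, not Apache's implementation: it looks for files named thing.*, guesses each MIME type from the extension, and picks the type the Accept header rates highest. The function names (parse_accept, quality_for, multiviews_lookup) are invented for the example, and a temporary directory stands in for the "gizmo" directory.

```python
import mimetypes
import tempfile
from pathlib import Path


def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    prefs = []
    for item in header.split(","):
        media_range, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # q defaults to 1 when the parameter is omitted
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_range, q))
    return prefs


def quality_for(mime_type, prefs):
    """Return the q of the most specific media range matching mime_type."""
    major = mime_type.split("/")[0]
    for candidate in (mime_type, major + "/*", "*/*"):
        for media_range, q in prefs:
            if media_range == candidate:
                return q
    return 0.0


def multiviews_lookup(directory, name, accept_header):
    """Pick the best file named name.* in directory for this Accept header."""
    prefs = parse_accept(accept_header)
    best_path, best_q = None, 0.0
    for path in sorted(Path(directory).glob(name + ".*")):
        mime_type, _encoding = mimetypes.guess_type(path.name)
        if mime_type is None:
            continue  # unknown extension: not a usable variant
        q = quality_for(mime_type, prefs)
        if q > best_q:
            best_path, best_q = path, q
    return best_path


accept = "image/png;q=0.7, image/jpeg;q=0.6, image/gif;q=0.2, */*;q=0.1"

with tempfile.TemporaryDirectory() as gizmo:
    for filename in ("thing.gif", "thing.jpg", "thing.png"):
        Path(gizmo, filename).touch()
    chosen = multiviews_lookup(gizmo, "thing", accept)

print(chosen.name)  # thing.png
```

Because MultiViews expresses no server-side preference, the sketch uses the browser's q values alone; a type map would add qs values to the comparison.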

The beautiful thing about this method is that now URLs become completely abstract, not showing the technology behind the resource. For example:

domain.com/somedir/otherthing.jpg

becomes

domain.com/somedir/otherthing

You still name your images or other resources with their normal MIME extensions (otherthing.jpg, etc.), but MultiViews-enabled content negotiation selects the best (or only) one from the bunch. So all your image references could become, say, <img src="image" alt="image of something"> instead of <img src="image.jpg" alt="image of something">.
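If you have many pages to convert, a small script can strip the extensions for you. The following Python sketch assumes double-quoted src attributes and only the three image extensions discussed here; the IMAGE_SRC pattern and strip_image_extensions function are illustrative names. Note that it would also rewrite references to images on other hosts, where negotiation may not be enabled, so review the output before publishing.

```python
import re

# Matches <img ... src="name.ext"> for gif, jpg/jpeg, and png only.
IMAGE_SRC = re.compile(
    r'(<img\b[^>]*?\bsrc=")([^"]+?)\.(?:gif|jpe?g|png)(")',
    re.IGNORECASE,
)


def strip_image_extensions(markup):
    """Remove image file extensions from img src attributes."""
    return IMAGE_SRC.sub(r"\1\2\3", markup)


before = '<img src="image.jpg" alt="image of something">'
print(strip_image_extensions(before))
# <img src="image" alt="image of something">
```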

What about Server Performance?

For all sites except extremely busy ones, the slight performance hit from content negotiation won't be noticeable to your users. For HTTP/1.0 clients, the CacheNegotiatedDocs directive allows negotiated responses to be cached by proxies. The HTTP/1.1 protocol allows caching of negotiated responses, and HTTP compression and other techniques can more than make up for the slight performance hit of content negotiation.

Rewriting URLs in IIS with PageXchanger

Microsoft's IIS server can also be used to abbreviate URLs with content negotiation. A simple way to rewrite URLs in IIS is to use an ISAPI filter designed specifically for that purpose. PageXchanger from Port80 Software automates clean URLs and content negotiation for IIS servers. With application mapping enabled, PageXchanger allows you to use clean URLs without file extensions.

"Application Mapping" allows a resource to be requested with or without a file extension. For example, with "asp" at the top of the "User Defined Extension List", a request for:

www.domain.com/products/index

will be served as the file:

www.domain.com/products/index.asp

If the "Remove File Extension/Redirect File" option is enabled, a request for:

www.domain.com/products/index

will serve the same index.asp file, but the extension will not appear in the URL. Old links or bookmarks to a different variant of this resource like:

www.domain.com/products/index.jsp

will still be served the new index.asp file, but the user won't see the filename extension. This hides the technology from users, abstracting your URLs for a longer shelf life. Once you've removed file extensions from your markup, check the "Remove File Extension/Redirect File" option. Here's an example code snippet from Port80Software.com's home page. Note the lack of file extensions.

<script src="javascript/print_css_root" language="javascript" type="text/javascript"></script></head><body>...
<img src="images/H_logo" width="220" height="64" alt="Port80 Software" title="..." border="0" />...

Caution Content Negotiators

A request for an extensionless URL that shares its name with a directory will resolve to the directory, returning a 403 response or a directory listing instead of the file you intended. Adopt a naming convention that gives each resource a unique name to avoid these namespace collisions.
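One way to catch these collisions before you strip extensions is to scan the document root for files whose name, minus the extension, matches a sibling directory. The Python sketch below is a minimal version of that check; find_collisions is a hypothetical helper, and the products and about names are sample data created in a temporary directory.

```python
import tempfile
from pathlib import Path


def find_collisions(root):
    """Return site-relative files whose stem matches a sibling directory."""
    root = Path(root)
    collisions = []
    for directory in root.rglob("*"):
        if not directory.is_dir():
            continue
        # Compare the directory name with the stems of the files beside it.
        for sibling in directory.parent.iterdir():
            if sibling.is_file() and sibling.stem == directory.name:
                collisions.append(sibling.relative_to(root).as_posix())
    return sorted(collisions)


with tempfile.TemporaryDirectory() as site:
    Path(site, "products").mkdir()
    Path(site, "products", "index.asp").touch()
    Path(site, "products.asp").touch()  # collides with the products directory
    Path(site, "about.html").touch()    # no collision
    found = find_collisions(site)

print(found)  # ['products.asp']
```

The check only compares the final extension, so a file such as products.en.html would not be flagged; extend it if your site stacks extensions.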

Conclusion

Abbreviating URLs with content negotiation is a good way to abstract your URLs and save a few bytes per resource. Available for Apache and IIS, extensionless URLs and content negotiation hide the underlying technology from users for increased security, cleaner URLs, and smaller files. Persistent URLs also streamline future migrations from one scripting environment to another.

About the Author

Andy King is the founder of five developer-related sites, and the author of Speed Up Your Site: Web Site Optimization (http://www.speedupyoursite.com) from New Riders Publishing. He publishes the monthly Bandwidth Report, the weekly Optimization Week, the weekly Speed Tweak of the Week, and the semiweekly WebReference Update.

Further Reading

Apache Core Features
Documents the Options directive, which includes MultiViews.
Apache URL Rewriting Guide
Ralf Engelschall shows how to use mod_rewrite.
Content Negotiation - Apache HTTP Server
Documents Apache's support of content negotiation detailed in the HTTP 1.1 specification.
Content Negotiation
A brief tutorial from Apache Week.
Cool URLs don't change
Details the importance of persistent URLs by Tim Berners-Lee.
HTTP 1.1 Content Negotiation
An excerpt from the HTTP 1.1 specification by Roy Fielding, et al.
ISAPI_Rewrite
URI rewriting ISAPI filter for Microsoft's IIS server, from Helicon Tech.
mod_negotiation
Apache HTTP server 1.3 documentation.
mod_rewrite for IIS
An opensource version of mod_rewrite for Microsoft's IIS server. Powered by regular expressions Mod Rewrite adds a flexible URL rewriting engine to your IIS server. Make dynamic sites look like static sites for better SEO. By Steven Liechti.
Port80 Software
Makers of PageXchanger ISAPI filter for Microsoft's IIS server. PageXchanger automates the URL rewriting process.
Towards Next Generation URLs
Thomas Powell and Joe Lima of Port80 Software show how to abbreviate and rewrite URLs.
URL as UI
The always succinct Jakob Nielsen on the importance of being URL.
URLS! URLS! URLS!
Documents URL rewriting in Apache with mod_rewrite. By Bill Humphries for A List Apart.
Use HTTP Compression
Use HTTP compression to compress your HTML, CSS, and JavaScript to speed up web page downloads and save bandwidth. From Speed Tweak of the Week.

By website optimization on 6 Jul 2004 AM

Copyright © 2002-2013 Website Optimization, LLC. All Rights Reserved.
Last modified: August 26, 2013.
