You can make your URLs shorter and more abstract by using content negotiation to strip file extensions from your markup and source code. You’ll shave a few bytes off each object reference and avoid migration headaches when you change technologies in the future.
What Is Content Negotiation?
Content negotiation is a little-used feature of Apache and IIS that transparently delivers the best variant of the same resource to browsers. Browsers tell servers their preferences, and servers tune their responses to select the best resource. Different languages, file types, content encodings, and character sets can be automatically delivered to different browsers based on browser-supplied preferences sent in request headers. You can vary the following dimensions with content negotiation:
- Media type
- Language
- Content encoding
- Charset
In this article we explore the media type dimension for URI abbreviation. We explored content encoding in HTTP Compression in a previous tweak.
MIME Types
Multipurpose Internet Mail Extensions, or MIME, types allow Apache to determine the type of a file from its extension. The configuration file mime.types associates MIME types with file extensions.
Apache advises that you not edit the mime.types file directly, but use the AddType directive instead.
AddType Directive
You can add new and shorter variants of resources with the AddType directive. For example, if you wanted files ending in .g to be recognized as GIFs, you could add the following line to your server config or .htaccess file:
AddType image/gif .g
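The extension-to-type lookup that AddType extends can be sketched outside Apache. The following is only an analogy, using Python's standard mimetypes module rather than Apache's mime.types file; the registry variable and the media_type helper are names made up for this illustration.

```python
import mimetypes

# A private registry, so the process-wide defaults stay untouched.
registry = mimetypes.MimeTypes()

# Rough analogue of "AddType image/gif .g": register one extra extension.
registry.add_type("image/gif", ".g")


def media_type(filename):
    """Return the media type implied by the file's extension, or None."""
    guessed_type, _encoding = registry.guess_type(filename)
    return guessed_type


print(media_type("logo.gif"))  # the standard extension
print(media_type("logo.g"))    # the shortened extension registered above
```

Both lookups resolve to the same media type, which is all the server needs in order to label the response correctly.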
Of course now you’d have two different extensions for GIF files. Why not go even further and eliminate the extension entirely? That’s where content negotiation comes in.
Content Negotiation with mod_negotiation
The Apache distribution comes with mod_negotiation preinstalled. To select among different variants of a resource, the server needs to get information about each variant. mod_negotiation uses two ways to select variants:
- Use a type map (i.e., a *.var file) that explicitly names the files and their associated MIME types
- Use a MultiViews search, which searches for filename patterns and chooses from the results
Type maps contain paths to the variants of each MIME type to express a server-side preference for resources. The MultiViews option expresses no server-side preference and effectively fakes a type map file by searching the directory for files with MIME type extensions.
Type-map Files
Type maps allow fine-grained control over available variants. Here’s an example:
URI: foo

URI: foo.jpg
Content-type: image/jpeg; qs=0.8

URI: foo.png
Content-type: image/png; qs=0.7

URI: foo.gif
Content-type: image/gif; qs=0.5
MIME types are set with Content-type, and source quality values with the “qs” parameter. Variants with higher qs values take priority over those with lower ones, so when a browser accepts all three formats equally, the JPEG is served ahead of the PNG, which in turn takes priority over the GIF.
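To make the role of qs concrete, here is a minimal sketch that reads a type map like the one above and picks the variant with the highest source quality. It assumes records are separated by blank lines, treats a missing qs as 1.0, and ignores the browser's own preferences, so it covers only the server-side half of the decision. The parse_type_map function is a hypothetical helper, not part of Apache.

```python
TYPE_MAP = """\
URI: foo

URI: foo.jpg
Content-type: image/jpeg; qs=0.8

URI: foo.png
Content-type: image/png; qs=0.7

URI: foo.gif
Content-type: image/gif; qs=0.5
"""


def parse_type_map(text):
    """Return (uri, media_type, qs) for every record that has a Content-type."""
    variants = []
    for record in text.strip().split("\n\n"):
        headers = {}
        for line in record.splitlines():
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        if "content-type" not in headers:
            continue  # e.g. the leading "URI: foo" record naming the map itself
        media_type, _, params = headers["content-type"].partition(";")
        qs = 1.0  # assumed default when no qs parameter is given
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip() == "qs":
                qs = float(value)
        variants.append((headers["uri"], media_type.strip(), qs))
    return variants


variants = parse_type_map(TYPE_MAP)
preferred = max(variants, key=lambda variant: variant[2])
print(preferred)
```

With the example map, the JPEG record comes out on top because its qs of 0.8 is the largest of the three.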
MultiViews Option
Explicitly setting paths to specific variants can become tedious for larger sites. A more powerful way to map variants to URLs is the MultiViews option. By setting the MultiViews option in your server configuration or .htaccess file, you turn on content negotiation to search for variants within directories. For shaving bytes with extensionless URLs, MultiViews is your best option.
<Directory /home/www/sitename/htdocs>
Options +MultiViews
</Directory>
Now when the server receives a request for /random/gizmo/thing and /random/gizmo/thing does not exist, the server searches inside the gizmo directory for all files named thing.*, assigning MIME types based on the extension of each file. It then chooses the best match based on the browser’s preferences and delivers that resource.
The negotiation goes something like this. Assume the “gizmo” directory consists of the following files:
thing.gif
thing.jpg
thing.png
The browser sends the following header with its request for /random/gizmo/thing:
Accept: image/png;q=0.7, image/jpeg;q=0.6, image/gif;q=0.2, */*;q=0.1
With MultiViews enabled, the server searches the referenced directory and delivers the image the browser rates highest, namely thing.png. The URL in your XHTML file need not contain the filename extension, which makes maintenance easier and reduces file size.
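The exchange above can be approximated in a few lines. This is a simplified sketch, not Apache's actual algorithm, which also weighs source quality, language, charset, and encoding. The function names are invented for the example, the directory is represented as a plain list of filenames, and Python's mimetypes module stands in for the server's extension table.

```python
import fnmatch
import mimetypes


def parse_accept(header):
    """Map each media range in an Accept header to its q value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        media_range, *params = [part.strip() for part in item.split(";")]
        q = 1.0
        for param in params:
            key, _, value = param.partition("=")
            if key.strip() == "q":
                q = float(value)
        prefs[media_range] = q
    return prefs


def quality(media_type, prefs):
    """Most specific match wins: exact type, then type/*, then */*."""
    main_type = media_type.split("/")[0]
    for candidate in (media_type, main_type + "/*", "*/*"):
        if candidate in prefs:
            return prefs[candidate]
    return 0.0


def negotiate(requested_name, directory_listing, accept_header):
    """Pick the file named requested_name.* that the client rates highest."""
    prefs = parse_accept(accept_header)
    best_file, best_q = None, 0.0
    for filename in fnmatch.filter(directory_listing, requested_name + ".*"):
        media_type, _encoding = mimetypes.guess_type(filename)
        if media_type is None:
            continue  # unknown extension, so no media type to negotiate on
        q = quality(media_type, prefs)
        if q > best_q:
            best_file, best_q = filename, q
    return best_file


listing = ["thing.gif", "thing.jpg", "thing.png", "other.png"]
accept = "image/png;q=0.7, image/jpeg;q=0.6, image/gif;q=0.2, */*;q=0.1"
print(negotiate("thing", listing, accept))
```

Because the sketch looks only at the browser's q values, it selects the PNG here; a browser that listed only image/gif would get thing.gif from the same directory.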
The beautiful thing about this method is that now URLs become completely abstract, not showing the technology behind the resource. For example:
domain.com/somedir/otherthing.jpg
becomes
domain.com/somedir/otherthing
You still name your images or other resources with their normal MIME extensions (otherthing.jpg, etc.), but MultiViews-enabled content negotiation selects the best (or only) one from the bunch. So all your image references could become, say, <img src="image" alt="image of something"> instead of <img src="image.jpg" alt="image of something">.
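If you have many pages to convert, the edit to your markup can be scripted. The sketch below is one hedged way to do it: the list of extensions and the strip_image_extensions function are assumptions for this example, it only touches double-quoted src attributes, and it does not check whether a URL points at your own server, so review the output before publishing it.

```python
import re

# Extensions to strip; an assumption for this example, adjust to your site.
IMAGE_EXTENSIONS = ("gif", "jpg", "jpeg", "png")

# Matches src="...<name>.<ext>" and captures everything except the extension.
SRC_PATTERN = re.compile(
    r'(src=")([^"]+?)\.(?:%s)(")' % "|".join(IMAGE_EXTENSIONS),
    re.IGNORECASE,
)


def strip_image_extensions(markup):
    """Remove known image extensions from double-quoted src attributes."""
    return SRC_PATTERN.sub(r"\1\2\3", markup)


before = '<img src="image.jpg" alt="image of something">'
print(strip_image_extensions(before))
```

References with other extensions, such as scripts, are left alone because they do not match the pattern.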
What about Server Performance?
On all but extremely busy sites, the slight performance hit from content negotiation won’t be noticeable to your users. You can allow caching of negotiated responses for HTTP/1.0 clients with the CacheNegotiatedDocs directive. The HTTP/1.1 protocol allows caching of negotiated responses, and HTTP compression and other techniques can more than make up for the slight performance hit of content negotiation.
Rewriting URLs in IIS with PageXchanger
Microsoft’s IIS server can also be used to abbreviate URLs with content negotiation. A simple way to rewrite URLs in IIS is to use an ISAPI filter designed specifically for that purpose. PageXchanger from Port80 Software automates clean URLs and content negotiation for IIS servers. With application mapping enabled, PageXchanger allows you to use clean URLs without file extensions.
“Application Mapping” allows a resource to be requested with or without a file extension. For example, with “asp” at the top of the “User Defined Extension List”, a request for:
www.domain.com/products/index
will be served as the file:
www.domain.com/products/index.asp
If the “Remove File Extension/Redirect File” option is enabled, a request for:
www.domain.com/products/index
will serve the same index.asp file, but the extension will not appear in the URL. Old links or bookmarks to a different variant of this resource like:
www.domain.com/products/index.jsp
will still be served the new index.asp file, but the user won’t see the filename extension. This hides the technology from users, abstracting your URLs for a longer shelf life. Once you’ve removed file extensions from your markup, check the “Remove File Extension/Redirect File” option. Here’s an example code snippet from Port80Software.com’s home page. Note the lack of file extensions.
<script src="javascript/print_css_root" language="javascript" type="text/javascript"></script></head><body>...
<img src="images/H_logo" width="220" height="64" alt="Port80 Software" title="..." border="0" />...
Caution Content Negotiators
A request for an extensionless file that shares its name with a directory will not reach the file: the server treats the request as one for the directory and returns a 403 response or browses the directory. Adopt a naming convention that gives each resource a unique name to avoid these namespace collisions.
Conclusion
Abbreviating URLs with content negotiation is a good way to abstract your URLs and save a few bytes per resource. Available for Apache and IIS, extensionless URLs and content negotiation hide the underlying technology from users for increased security, cleaner URLs, and smaller files. Persistent URLs also streamline future migrations from one scripting environment to another.
About the Author
Andy King is the founder of five developer-related sites, and the author of Speed Up Your Site: Web Site Optimization (http://www.speedupyoursite.com) from New Riders Publishing. He publishes the monthly Bandwidth Report, the weekly Optimization Week, the weekly Speed Tweak of the Week, and the semiweekly WebReference Update.
Further Reading
- Apache Core Features
- Documents the Options directive, which includes MultiViews.
- Apache URL Rewriting Guide
- Ralf Engelschall shows how to use mod_rewrite.
- Content Negotiation – Apache HTTP Server
- Documents Apache’s support of content negotiation detailed in the HTTP 1.1 specification.
- Content Negotiation
- A brief tutorial from Apache Week.
- Cool URLs don’t change
- Details the importance of persistent URLs by Tim Berners-Lee.
- HTTP 1.1 Content Negotiation
- An excerpt from the HTTP 1.1 specification by Roy Fielding, et al.
- ISAPI_Rewrite
- URI rewriting ISAPI filter for Microsoft’s IIS server, from Helicon Tech.
- mod_negotiation
- Apache HTTP server 1.3 documentation.
- mod_rewrite for IIS
- An open source version of mod_rewrite for Microsoft’s IIS server. Powered by regular expressions, Mod Rewrite adds a flexible URL rewriting engine to your IIS server. Make dynamic sites look like static sites for better SEO. By Steven Liechti.
- Port80 Software
- Makers of PageXchanger ISAPI filter for Microsoft’s IIS server. PageXchanger automates the URL rewriting process.
- Towards Next Generation URLs
- Thomas Powell and Joe Lima of Port80 Software show how to abbreviate and rewrite URLs.
- URL as UI
- The always succinct Jakob Nielsen on the importance of being URL.
- URLS! URLS! URLS!
- Documents URL rewriting in Apache with mod_rewrite. By Bill Humphries for A List Apart.
- Use HTTP Compression
- Use HTTP compression to compress your HTML, CSS, and JavaScript to speed up web page downloads and save bandwidth. From Speed Tweak of the Week.
You don’t technically need to use PageXchanger, at least with ASP.NET: you can specify in IIS that all extensions (.*) are handled by the aspnet_isapi DLL, then add an httpHandler in your web.config file for the specific script files that have no extension. However, it does end up being a pain during development that your files actually have no extension.