Use Lowercase Markup For Better Compression – avoid uppercase markup to improve xhtml and html compression

Summary:

You can improve HTTP compression by using all-lowercase markup. We tested the home pages of five popular sites to measure the improvement. Remember, not all characters are created equal.

Lowercase markup compresses more efficiently than uppercase markup. Along with the benefits of XHTML compatibility, lowercase markup allows HTTP compression to work more efficiently by increasing redundancy. In this article we show the benefits of using lowercase markup on five popular sites.

How Lowercase Markup Helps GZIP Compression

The GZIP compression algorithm used in mod_gzip and other HTTP compression programs works by replacing repeated strings with shorter references to an earlier occurrence. To the compressor, <TD> and <td> are different byte sequences, so markup that mixes cases offers fewer matches. By using lowercase markup consistently, especially in repetitious table and div structures, you increase the likelihood of string matches. An HTML file of all-lowercase markup is the same size as a mixed-case HTML file, but it compresses more efficiently. Even if you don’t use HTTP compression on your site, your users on dialup accelerators like Earthlink’s Accelerator and AOL’s Topspeed will benefit from your lowercase markup.
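The effect can be reproduced on a small scale. The following is a minimal sketch, assuming a synthetic table whose tag case is randomized; it is not the test setup used for this article. The helper names (`lowercase_tag_names`, `deflate_size`, `build_mixed_case_table`) are hypothetical. It uses Python's `zlib` module, which implements the same DEFLATE algorithm that gzip uses, so absolute sizes will differ slightly from real gzip output.

```python
import random
import re
import zlib

# Matches "<name" or "</name"; only the element name is captured.
TAG_NAME = re.compile(r"</?[A-Za-z][A-Za-z0-9]*")


def lowercase_tag_names(html: str) -> str:
    """Lowercase element names only; attributes and text are left untouched."""
    return TAG_NAME.sub(lambda m: m.group(0).lower(), html)


def deflate_size(html: str, level: int = 6) -> int:
    """Size in bytes after DEFLATE at the given level (6 mirrors gzip -6)."""
    return len(zlib.compress(html.encode("utf-8"), level))


def build_mixed_case_table(rows: int = 200, seed: int = 0) -> str:
    """Synthetic table (an assumption for illustration) with random tag case."""
    rng = random.Random(seed)

    def tag(name: str) -> str:
        return name.upper() if rng.random() < 0.5 else name

    lines = ["<table>"]
    for i in range(rows):
        lines.append(
            f"<{tag('tr')}><{tag('td')}>item {i}</{tag('td')}></{tag('tr')}>"
        )
    lines.append("</table>")
    return "\n".join(lines)


mixed = build_mixed_case_table()
lower = lowercase_tag_names(mixed)

mixed_size = deflate_size(mixed)
lower_size = deflate_size(lower)

# Same byte count before compression, fewer bytes after it.
print("uncompressed:", len(mixed), "vs", len(lower))
print("compressed:  ", mixed_size, "vs", lower_size)
```

The regular expression only touches element names, so attribute names, attribute values, and text content keep their original case. A real page would need a proper HTML parser to avoid altering tag-like text inside scripts or comments.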

How Much Smaller?

To test the effectiveness of lowercase markup we compressed the HTML homepages of five random sites before and after lowercasing all of their HTML markup (see Table 1).

Table 1: Lowercase versus Mixed-Case Markup Compression

| Homepage | Uncompressed HTML (bytes) | GZIP -6 compressed (bytes) | Compressed after lowercase markup (bytes) | Percent smaller |
| --- | --- | --- | --- | --- |
| ABCNews.com* | 49,959 | 11,125 | 10,785 | 3.05 |
| Guardian.co.uk | 73,772 | 14,080 | 13,808 | 1.93 |
| JCPenny.com* | 19,728 | 3,310 | 3,154 | 4.71 |
| Olympics.com* | 26,927 | 6,273 | 6,126 | 2.34 |
| Slashdot.org* | 49,291 | 12,589 | 12,434 | 1.23 |
| Average | | | | 2.65 |

*Uses HTTP compression. The homepages tested were ABCNews.com, Guardian.co.uk, JCPenny.com, Olympics.com, and Slashdot.org. Note that mod_gzip defaults to gzip -6 for compression to give the best balance between speed and size.

On average, all-lowercase markup saved an additional 2.65 percent off these compressed home pages. Savings ranged from 1.23% (Slashdot.org) to 4.71% (JCPenny.com) relative to the compressed mixed-case home pages. Four of the five sites tested with our Web Page Analyzer used HTTP compression, so those four would benefit directly from switching to lowercase markup. Guardian.co.uk does not use HTTP compression, but its visitors on dialup accelerators would also benefit.
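The "Percent Smaller" figures are the savings relative to the original compressed size. As a quick check, the sketch below recomputes them from the byte counts in Table 1; the names `TABLE_1` and `percent_smaller` are illustrative only. The results should agree with the table to within about 0.01 percentage points, since small differences can arise depending on whether values are rounded or truncated.

```python
# (gzip -6 size, gzip -6 size after lowercasing) in bytes, copied from Table 1
TABLE_1 = {
    "ABCNews.com": (11125, 10785),
    "Guardian.co.uk": (14080, 13808),
    "JCPenny.com": (3310, 3154),
    "Olympics.com": (6273, 6126),
    "Slashdot.org": (12589, 12434),
}


def percent_smaller(before: int, after: int) -> float:
    """Savings relative to the original compressed size, in percent."""
    return (before - after) / before * 100


savings = {site: percent_smaller(*sizes) for site, sizes in TABLE_1.items()}
average = sum(savings.values()) / len(savings)

for site, pct in savings.items():
    print(f"{site:<15} {pct:.2f}%")
print(f"{'Average':<15} {average:.2f}%")
```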

Conclusion

On average, using all-lowercase markup saved 2.65% off compressed HTML file size. JCPenny.com's HTML would be over 4.7% smaller after compression with all-lowercase markup. You can achieve higher compression ratios by applying the same approach to your CSS and JavaScript code to maximize the efficiency of GZIP compression. Using identical wording and repetitive markup (like tables, similarly structured divs, or class names) can improve GZIP compression even further.
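The point about repetitive class names can be illustrated the same way. This is a rough sketch under stated assumptions: the class names and the `build_divs` helper are hypothetical, and the page is synthetic. Both variants have the same uncompressed length, but one reuses a single class name while the other picks among four at random.

```python
import random
import zlib

# Hypothetical class names of equal length, so both pages are the same size.
CLASS_NAMES = ["box-a", "box-b", "box-c", "box-d"]


def build_divs(rows: int, consistent: bool, seed: int = 0) -> str:
    """Build similarly structured divs with one class name or several."""
    rng = random.Random(seed)
    lines = []
    for i in range(rows):
        # Pick one of the four names at random for the varied page.
        index = int(rng.random() * len(CLASS_NAMES))
        name = CLASS_NAMES[0] if consistent else CLASS_NAMES[index]
        lines.append(f'<div class="{name}"><p>entry {i}</p></div>')
    return "\n".join(lines)


varied = build_divs(300, consistent=False)
uniform = build_divs(300, consistent=True)

# Level 6 mirrors gzip -6; zlib adds a smaller header than gzip does.
varied_size = len(zlib.compress(varied.encode("utf-8"), 6))
uniform_size = len(zlib.compress(uniform.encode("utf-8"), 6))

print("uncompressed:", len(varied), "vs", len(uniform))
print("compressed:  ", varied_size, "vs", uniform_size)
```

The gap on a real page depends on how much of the markup is class attributes, so treat the numbers from this sketch as directional only.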

About the Author

Andy King is the founder of five developer-related sites, and the author of Speed Up Your Site: Web Site Optimization (http://www.speedupyoursite.com) from New Riders Publishing. He publishes the monthly Bandwidth Report, the weekly Optimization Week, the weekly Speed Tweak of the Week, and the semiweekly WebReference Update.

