Wednesday, January 23, 2013

Preliminary Best Practices for All Web Sites

All Sites Should:

(in the following, should be replaced with the official domain name of the site)

Have an Official Address

The site should have a single, official address: its canonical URL.

The canonical URL should be what the business uses on postcards, business cards and everywhere else, both online and offline. All of a site's other URLs and alternate spellings should permanently redirect to its canonical URL with an HTTP 301 (Moved Permanently) response. Even should use a permanent redirect, using a 301 redirect header, to

Many systems assume the www will be at the beginning of the canonical URL, but if must be used as the canonical URL, then should 301 redirect to
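The redirect rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definitive implementation: the host www.example.com, the http scheme, and the function name canonical_redirect are hypothetical stand-ins, and a real site would normally configure this in its web server instead.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values: replace with the site's real canonical scheme and host.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url):
    """Return (301, location) if url is not canonical, otherwise None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # host names are case-insensitive
    if parts.scheme == CANONICAL_SCHEME and host == CANONICAL_HOST:
        return None  # already canonical, serve the page normally
    # Keep the path and query so deep links survive the redirect.
    location = urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, parts.path or "/", parts.query, "")
    )
    return 301, location
```

For example, a request for http://example.com/about?x=1 would be answered with a 301 whose Location is http://www.example.com/about?x=1, while a request already on the canonical host returns None and is served as usual.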

Tell Robots Where Their Map Is and What to Ignore

To help machines and robots know what to index, and to let them automatically discover the XML sitemaps, there should be a robots.txt file at the root level, publicly viewable at

To allow everything on the site to be indexed, the robots.txt file should contain:

User-agent: *
Disallow:
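An allow-all file can also be checked locally with Python's standard urllib.robotparser module. The sketch below is only an illustration: the Sitemap URL on www.example.com and the helper name is_allowed are hypothetical, and the empty Disallow line is the conventional way to state explicitly that nothing is blocked.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allow-all robots.txt; the Sitemap URL is a placeholder.
ALLOW_ALL = """\
User-agent: *
Disallow:
Sitemap: http://www.example.com/sitemap.xml
"""

# For contrast: this file blocks every path for every robot.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, path, agent="*"):
    """Return True if the given robots.txt text lets agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)
```

Calling is_allowed(ALLOW_ALL, "/any/page.html") should report that the page may be fetched, while the same call with BLOCK_ALL should report that it may not.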

Before upload, robots.txt files can be validated with free online tools like

