Wednesday, January 23, 2013

Preliminary Best Practices for All Web Sites

All Sites Should:

(in the following, www.example.com should be replaced with the official domain name of the site)

Have an Official Address

The site should have a single, official address: its canonical URL.

The canonical URL is the address the business should use everywhere, online and offline: on postcards, business cards, and any other material. All of a site's other URLs and alternate spellings should permanently redirect to the canonical URL using a 301 redirect header. Even http://example.com should permanently redirect, with a 301, to http://www.example.com.

Many systems assume the canonical URL begins with www. If http://example.com must be the canonical URL instead, then http://www.example.com should 301 redirect to http://example.com.
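The redirect rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a server configuration: the function name canonical_redirect and the constant CANONICAL_HOST are hypothetical, and the sketch treats every host other than the canonical one as an alternate spelling that should be redirected.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: replace with the official domain name of the site.
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return (301, target_url) if url is not on the canonical host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == canonical_host:
        # Already at the official address: no redirect needed.
        return None
    # Keep the scheme, path, query, and fragment; swap in the canonical host.
    target = urlunsplit(
        (parts.scheme, canonical_host, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target


print(canonical_redirect("http://example.com/about?x=1"))
print(canonical_redirect("http://www.example.com/"))
```

If the bare domain is the canonical URL instead, passing canonical_host="example.com" reverses the direction of the redirect. In practice this rule usually lives in the web server or hosting configuration rather than in application code.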

Tell Robots Where Their Map Is and What to Ignore

To help robots and other automated clients know what to index, and to let them automatically discover the XML sitemaps, there should be a robots.txt file at the root level, publicly viewable at http://www.example.com/robots.txt.

To allow everything on the site to be indexed, the robots.txt file should contain:

User-agent: *
Disallow:
Sitemap: http://www.example.com/sitemap.xml

Before upload, robots.txt files can be validated with free online tools like http://tool.motoricerca.info/robots-checker.phtml.
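A local sanity check is also possible with Python's standard library. The sketch below assumes Python 3.8 or later (for site_maps()); the helper name check_robots and the sample page URL are hypothetical. It parses the file contents shown above and reports whether a generic robot may fetch a page and which sitemaps are declared. It is not a substitute for a full validator.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt contents recommended above.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: http://www.example.com/sitemap.xml
"""


def check_robots(text):
    """Parse robots.txt text; return (allowed, sitemaps) for a generic robot."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    # Hypothetical page URL, used only to test the allow/disallow rules.
    allowed = parser.can_fetch("*", "http://www.example.com/any/page.html")
    # site_maps() returns None when no Sitemap line is present.
    sitemaps = parser.site_maps() or []
    return allowed, sitemaps


allowed, sitemaps = check_robots(ROBOTS_TXT)
print(allowed, sitemaps)
```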
