Wednesday, April 14, 2010

Basics of robots.txt file generation

You can use simple directives in a robots.txt file to block or allow crawler access to different parts of your website or blog.

- Block spiders from all parts of the website
User-agent: *
Disallow: /


- Allow spiders to access all parts of the website
User-agent: *
Disallow:


- Block spiders from a specific part of the website (say the /files/ folder)
User-agent: *
Disallow: /files/


- Block a specific spider (say Googlebot) from accessing a specific file (say abc.html)
User-agent: Googlebot
Disallow: /files/abc.html



Confused? Here is what a simple final robots.txt file can look like. This one allows all spiders to access everything and points them to the sitemap:

Sitemap: http://www.websitename.com/sitemap.xml
User-agent: *
Disallow:
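
If you want to double-check how a crawler will interpret your rules before publishing them, the sketch below is one way to do it. It is a minimal example using Python's standard urllib.robotparser module; the rules and the www.websitename.com placeholder domain are taken from the /files/ folder example above.

import urllib.robotparser

# Rules from the "block the /files/ folder" example above, parsed locally
# so no network request is needed.
rules = """\
User-agent: *
Disallow: /files/
""".splitlines()

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# can_fetch() returns True if the given user-agent is allowed to crawl the URL
print(rp.can_fetch("*", "http://www.websitename.com/index.html"))      # True
print(rp.can_fetch("*", "http://www.websitename.com/files/abc.html"))  # False

The same check works against a live site: call set_url() with the robots.txt URL and read() instead of parse(), and the parser will fetch and apply the published rules.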
