Robots txt - Robots Exclusion Protocol - REP File

Robots Text File (robots.txt)

It is always good practice to create a robots.txt file and place it in your root directory. Its proper name is the Robots Exclusion Protocol (REP), but we'll stick to calling it robots txt (just because I'm too lazy to keep writing "robots exclusion protocol" throughout this article!).

There are many conflicting views on robots text files (robots.txt), and on whether a robots file should be added if it has no content. Following experimentation on various websites, we are convinced that every website should feature a robots txt file, even if it is an "empty" file.

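The robots.txt file is the first thing that a search engine robot looks for in your root directory, as it indicates which files and directories you do not want the robot to crawl. Note that it controls crawling rather than indexing: a page that robots are barred from crawling can still be listed by a search engine if other sites link to it.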

Blank robots.txt file

If a search engine spider doesn't find a robots.txt file, it assumes that it can spider the whole site. So why bother putting a blank file into the root directory at all?

The answer is twofold. First of all, it is simply good practice. If the robot is going to look for the file, for the one minute that it takes to create it, you may as well do it.

Secondly, if the robot finds a blank robots.txt file, it knows that you have nothing to hide in your site. The spider knows it has your absolute permission to look at and index every page on your site, because you have left it that message in the robots.txt file. The spider knows that you are not trying to hide or cloak any pages and, in our experience, certain robots see that as a good thing.

By a "blank" robots text file, do you mean "blank"?

It used to be the case that you could create a simple robots.txt file with nothing in it and put it into your root directory. However, to ensure that your robots.txt file validates correctly, take ten seconds more to add the text that lets the spider know that all robots are welcome and no pages are barred from being spidered.

Create a robots txt file (REP file)

To do this, simply open a new text document in Notepad (right-click on your desktop, choose "New" and then "Text Document") and type in the following:

User-agent: *
Disallow:

The user-agent line specifies the robot that the rules apply to. You can have a user-agent line for each robot if you wish, but the wildcard symbol "*" means that the rules apply to all robots. When you have completed this task, save the file as robots.txt and upload it to your root directory. It's as simple as that!

The disallow directive lets the robots know the files or directories that you do not wish to be spidered. By leaving the directive blank, in the robots.txt file, the spiders know that they can retrieve all of your website files and pages.
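As a rough check of the behaviour described above, the allow-all file can be fed to the robots.txt parser in Python's standard library. This is only a sketch: the helper name is_allowed, the robot name and the www.example.com address are made up for illustration, and real search engine robots use their own parsers.

```python
from urllib.robotparser import RobotFileParser

# The "blank" allow-all file described above.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robot name and URL, for illustration only.
print(is_allowed(ALLOW_ALL, "examplebot", "http://www.example.com/index.htm"))  # True
```

With an empty Disallow line, every path is reported as fetchable for every robot.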

Preventing Robots from Spidering Files using robots txt

If you wish to stop all robots from spidering any of your files, (if they are under construction, for example), you simply create your robots.txt file and type the following:

User-agent: *
Disallow: /
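The effect of the block-all file above can be checked in the same way with Python's standard urllib.robotparser module. Again, this is a sketch under assumptions: "examplebot" and www.example.com are placeholders, and it shows how Python's parser reads the file rather than how any particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# The block-all file: every robot, every path.
BLOCK_ALL = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(BLOCK_ALL)

# Placeholder robot names and URL, for illustration only.
for agent in ("googlebot", "examplebot"):
    print(agent, parser.can_fetch(agent, "http://www.example.com/index.htm"))
# googlebot False
# examplebot False
```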

If you want to prevent a specific robot from visiting a page of your site, place the following into your robots.txt file:

User-agent: googlebot
Disallow: /rampage.htm

Or if it was a directory, rather than a single page:

User-agent: googlebot
Disallow: /images/
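To see how robot-specific rules behave, the two googlebot examples can be combined into one record and tested with Python's standard urllib.robotparser module. This is a sketch only: the paths are written with a leading slash, the "otherbot" name and www.example.com address are invented for the example, and Python's parser is not guaranteed to match every search engine's behaviour.

```python
from urllib.robotparser import RobotFileParser

# One record for googlebot, barring a single page and a whole directory.
RULES = """\
User-agent: googlebot
Disallow: /rampage.htm
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(agent, path):
    """Check a path on a placeholder site for the given robot name."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(allowed("googlebot", "/rampage.htm"))      # False: page is barred
print(allowed("googlebot", "/images/logo.gif"))  # False: directory is barred
print(allowed("googlebot", "/index.htm"))        # True: no rule matches
print(allowed("otherbot", "/rampage.htm"))       # True: rules name googlebot only
```

Because the record names googlebot alone, any other robot is left free to fetch everything.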

For a better idea, have a look at the BBC robots.txt file. Once you have created your robots.txt file, validate it to ensure that you have made no errors in creating or uploading your REP file.
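To give a feel for what a validator looks for, here is a very small checker written in Python. It is a hypothetical sketch, not the validator referred to above: the function name check_robots_txt, the list of known fields and the wording of the messages are all assumptions, and it catches only a few common slips.

```python
# Field names this sketch accepts; anything else is reported as unknown.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def check_robots_txt(text):
    """Return a list of (line_number, message) problems found in text."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
        elif field == "user-agent":
            seen_user_agent = True
        elif field in ("disallow", "allow"):
            if not seen_user_agent:
                problems.append((number, "rule appears before any User-agent line"))
            if value and not value.startswith(("/", "*")):
                problems.append((number, "path should start with '/'"))
    return problems


print(check_robots_txt("User-agent: *\nDisallow:\n"))  # []
print(check_robots_txt("User-agent: googlebot\nDisallow: rampage.htm\n"))
# [(2, "path should start with '/'")]
```

The second call shows the most common mistake: a path written without its leading slash.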

If you have any questions regarding this article on REP, please do not hesitate to contact us; we are always happy to offer any help or advice that we can.

This table shows the current search engine standings for the Robots txt terms. SERPs correct as at 10/04/2007. Pages in bold are number one positions.
Key Phrase                        Google   Yahoo    MSN
robots text                       Page 1   Page 1   Page 1
robots text file                  Page 3   Page 1   Page 2
using robots text                 Page 1   Page 1   Page 1
robots exclusion protocol         Page 1   Page 1   Page 1
robots exclusion protocol file    Page 1   Page 1   Page 1
Kenkai - Search Engine Optimisation
Kenkai.com is owned by Kenkai Projects Ltd. Registered in Scotland, UK, Reg No: 333323