What is a robots.txt file?
If the owner of a website does not want certain pages or directories to be indexed by search engines, it can place a file named robots.txt in the site's root directory. This works only because well-behaved search engine robots voluntarily follow the Robots Exclusion Protocol and check the file before crawling; a robot that ignores the convention is not technically prevented from indexing anything. Using robots.txt, compliant robots can be allowed or disallowed from indexing particular paths. For example, to exclude all robots from two directories, the file would contain a User-agent: * record followed by the directives Disallow: /cgi-bin/ and Disallow: /misc/sitestats.
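As a sketch of how a compliant crawler interprets such a file, Python's standard-library `urllib.robotparser` can parse the rules and answer whether a given URL may be fetched. The rules string and the `example.com` URLs below are illustrative, not taken from any real site:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt matching the directives described above:
# all robots are barred from /cgi-bin/ and /misc/sitestats.
rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /misc/sitestats
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Paths under a disallowed prefix are blocked for every user agent.
print(rp.can_fetch("ExampleBot", "http://example.com/cgi-bin/stats"))   # False
print(rp.can_fetch("ExampleBot", "http://example.com/misc/sitestats"))  # False

# Everything else remains crawlable.
print(rp.can_fetch("ExampleBot", "http://example.com/index.html"))      # True
```

A real crawler would fetch the file from the site root (e.g. with `RobotFileParser.set_url(...)` and `read()`) rather than parsing a hard-coded string, but the matching logic is the same.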