Robots.txt is a plain text file of instructions that tell web crawlers which files and folders on a site they may crawl and which they may not. The convention behind it is called the Robots Exclusion Protocol.
How to Create a Robots.txt File?
It’s easy to create a robots.txt file. Any plain text editor, such as Notepad, will do.
Step 1: Open Notepad (or another plain text editor)
Step 2: Save the file under the name “robots.txt”, all in lowercase
Step 3: Write the rules you want crawlers to follow and save the file again
What does a robots.txt link look like?
Suppose you have a website at http://yourwebsite.com/ and you want to add a robots.txt file to it. You just need to upload the file you created in the steps above, under the name “robots.txt”.
Once it is uploaded to the server, the file will be reachable at http://yourwebsite.com/robots.txt
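As a small illustration of where that link comes from, the sketch below builds the robots.txt address from any page address on the same site. It uses Python's standard urllib.parse module; the function name robots_url and the blog page address are made up for this example.

```python
from urllib.parse import urljoin

def robots_url(site_url):
    # The leading slash makes urljoin resolve against the site root,
    # so any page on the site leads to the same robots.txt address.
    return urljoin(site_url, "/robots.txt")

print(robots_url("http://yourwebsite.com/"))
print(robots_url("http://yourwebsite.com/blog/post.html"))
```

Both calls print the same address, because the file always lives at the top level of the site.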
What code should you write in a robots.txt file?
The basic codes are explained below.
User-agent: * (Choosing the crawlers)
This line names the crawler that the rules after it apply to. In place of “*” we can write the name of a particular search engine’s spider that we want to allow or block.
If we use “*”, as in User-agent: *, we are writing the rules for the crawlers of all search engines.
Famous search engines and their crawler names
Google Search Engine – Googlebot
Yahoo Search Engine – Slurp (Yahoo! Slurp)
Bing Search Engine – Bingbot
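To see how the User-agent line picks out a crawler, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt text in it is a made-up example, not a recommended setup: it blocks Googlebot from the whole site and leaves every other crawler free. The Disallow lines it uses are explained in the sections that follow.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for Googlebot, one for everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "http://yourwebsite.com/page.html"  # example page, assumed to exist
googlebot_allowed = parser.can_fetch("Googlebot", page)
bingbot_allowed = parser.can_fetch("Bingbot", page)
print(googlebot_allowed, bingbot_allowed)  # False True
```

Googlebot matches its own named group, while Bingbot has no group of its own and falls back to the “*” group.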
Disallow: / (Disallowing a complete site)
This code blocks search engine crawlers from the full site. The forward slash “/” stands for the root of the site, and every page address begins with it, so the rule matches everything. Crawlers that obey robots.txt will not visit any page. Together with the User-agent line, the complete code is User-agent: * on the first line and Disallow: / on the second.
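One way to check this rule before uploading it is with Python's standard urllib.robotparser module. The sketch below assumes the two-line file for this case (User-agent: * followed by Disallow: /); about.html is only an example page name.

```python
from urllib.robotparser import RobotFileParser

# The complete file for this case: block the whole site for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Every path starts with "/", so every address on the site is refused.
home_allowed = parser.can_fetch("Googlebot", "http://yourwebsite.com/")
page_allowed = parser.can_fetch("Googlebot", "http://yourwebsite.com/about.html")
print(home_allowed, page_allowed)  # False False
```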
Disallow: (Allowing a complete site)
This code allows web crawlers to crawl the full site, because nothing is listed after Disallow: and so no file is off limits. All search engine crawlers may crawl every page. The complete code is User-agent: * on the first line and Disallow: with nothing after it on the second.
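The same kind of check, again a sketch with Python's standard urllib.robotparser module, shows that an empty Disallow: line refuses nothing. The page name about.html is an assumption for the example.

```python
from urllib.robotparser import RobotFileParser

# The complete file for this case: an empty Disallow blocks nothing.
ROBOTS_TXT = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

home_allowed = parser.can_fetch("Bingbot", "http://yourwebsite.com/")
page_allowed = parser.can_fetch("Bingbot", "http://yourwebsite.com/about.html")
print(home_allowed, page_allowed)  # True True
```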
Disallow: /filename.html (Blocking a particular file)
This code is used to block a particular file or page on your website. Suppose you have a page named “filename.html” in the root of the site and you want to block it for web crawlers. Then your full code will be User-agent: * on the first line and Disallow: /filename.html on the second.
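A quick sketch with Python's standard urllib.robotparser module can confirm that only the named page is affected. The second page, other.html, is a made-up name used for comparison.

```python
from urllib.robotparser import RobotFileParser

# The complete file for this case: block one page, leave the rest open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /filename.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

blocked_page = parser.can_fetch("Googlebot", "http://yourwebsite.com/filename.html")
other_page = parser.can_fetch("Googlebot", "http://yourwebsite.com/other.html")
print(blocked_page, other_page)  # False True
```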
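Disallow: /imagename.jpg (Blocking a particular image)
This code is used to stop crawlers from crawling a particular image, so that it is not picked up for a search engine’s image results. The path must match where the image is stored; here it sits in the root of the site. The full code is User-agent: * on the first line and Disallow: /imagename.jpg on the second.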
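Because the rule is matched from the root of the site, the location of the image matters. The sketch below, using Python's standard urllib.robotparser module, compares the image in the root with a copy in a folder; the folder name images is an assumption for this example.

```python
from urllib.robotparser import RobotFileParser

# The complete file for this case: block a single image in the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /imagename.jpg
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

root_image = parser.can_fetch("Googlebot", "http://yourwebsite.com/imagename.jpg")
# A copy stored elsewhere has a different path and is not covered by the rule.
folder_image = parser.can_fetch("Googlebot", "http://yourwebsite.com/images/imagename.jpg")
print(root_image, folder_image)  # False True
```

If the image actually lives in a folder, the Disallow line needs the full path, folder included.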
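Disallow: /foldername/ (Disallowing a folder)
This code is used to block access to a complete folder named “foldername”, including everything inside it. The complete code is User-agent: * on the first line and Disallow: /foldername/ on the second.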
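The folder rule can be checked the same way. This sketch uses Python's standard urllib.robotparser module; the page names and the second folder, foldername-other, are invented for the comparison.

```python
from urllib.robotparser import RobotFileParser

# The complete file for this case: block one folder and everything in it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /foldername/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

inside_folder = parser.can_fetch("Googlebot", "http://yourwebsite.com/foldername/page.html")
# The trailing slash keeps similarly named folders out of the rule.
similar_folder = parser.can_fetch("Googlebot", "http://yourwebsite.com/foldername-other/page.html")
home_page = parser.can_fetch("Googlebot", "http://yourwebsite.com/index.html")
print(inside_folder, similar_folder, home_page)  # False True True
```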
How to add a robots.txt file to a website?
To add a robots.txt file, upload it to the root directory of your web server, so that it sits at the top level of the site. Crawlers only look for it there.
Why should you have proper knowledge of the robots.txt file?
1. Incomplete knowledge of this file can be dangerous for your website’s search engine ranking.
2. A small mistake in this file can block pages that you never meant to block.
3. A correct robots.txt file guides web crawlers to the pages you want crawled, which can help your search engine visibility.