A robots.txt file contains instructions for bots that tell them which web pages they can and cannot access. Robots.txt files are most relevant for web crawlers from search engines like Google.
#Robots.txt file
A "robots.txt" file is a file that tells a search engine which search engine will crawl which pages of a site and which pages will not. This robots.txt file is in the root folder. Some pages on the website may need to not be shown in the search results. The reason may be that the work of those pages is not finished yet or any other reason. For this, a robots.txt file can be created to fix which pages will not be crawled by Search Engines.
If there is a subdomain and some of its pages do not need to be shown in the search results then a separate robots.txt file has to be created for it. The robots.txt file needs to be created and then uploaded to the root folder.
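For example, a main site and a subdomain would each serve their own robots.txt file from their own root (the domain names here are only placeholders) -
See example
https://www.example.com/robots.txt
https://blog.example.com/robots.txt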
#Creating the robots.txt file
With the robots.txt file, it is possible to control which pages of a site the search engine bots (crawlers and spiders) will see and which pages they will not. This control method is called the Robots Exclusion Protocol or Robots Exclusion Standard.
Before creating this file, let's take a look at some of the symbols used here.
The User-agent field names the robot(s) a rule group applies to, and * is a wildcard, so User-agent: * means the rules apply to all robots. Each rule line starts with Disallow: followed by a URL path beginning with /; the robot will then no longer crawl that path, file, or page. If you don't give a path, that is, if the Disallow value is empty, nothing is disallowed and everything is allowed. The # symbol starts a comment; it can be added so that the code can be understood later.
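As a small combined illustration of these symbols (the /draft/ folder name is only a placeholder, not taken from the text above) -
See example
# Keep all robots out of the unfinished /draft/ folder
User-agent: *
Disallow: /draft/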
A Disallow field may contain a partial or a full URL path. Any URL that begins with the given path, starting from the "/" sign, will not be visited by the robot. See an example below -
See example
Disallow: /help
# disallows both /help.html and /help/index.html, whereas
See another example below -
See example
Disallow: /help/
# would disallow /help/index.html but allow /help.html
See some examples below -
Allow all robots to visit all files (wildcard “*” indicates all robots)
See example
User-agent: *
Disallow:
To block all robots from visiting any file, see an example -
See example
User-agent: *
Disallow: /
To allow only GoogleBot to visit and block every other robot, see an example -
See example
User-agent: GoogleBot
Disallow:

User-agent: *
Disallow: /
To allow visits from both GoogleBot and Yahoo!'s Slurp bot, see an example -
See example
User-agent: GoogleBot
User-agent: Slurp
Disallow:
If you want to block a particular bot from visiting, use the following code -
See example
User-agent: Teoma
Disallow: /
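These rule groups can also be combined in one robots.txt file. As a sketch (the /private/ folder name is only a placeholder), the file below keeps every robot out of one folder and blocks the Teoma bot entirely -
See example
# Keep all robots out of /private/
User-agent: *
Disallow: /private/

# Block Teoma from the whole site
User-agent: Teoma
Disallow: /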
Even if you block crawling of a URL or page on your site with this file, those pages may still show up somewhere. For example, referral logs may reveal the URLs. Moreover, some search engines have less advanced algorithms, so when they send spiders/bots to crawl, they may ignore the instructions in the robots.txt file and crawl all of your URLs.
A better way to avoid these problems is to password-protect all of this content with a .htaccess file.
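As a minimal sketch of that idea, assuming an Apache server and a hypothetical password file at /path/to/.htpasswd (neither is mentioned above), the .htaccess file in the protected folder could look like this -
See example
# Ask for a username and password before serving this folder
AuthType Basic
AuthName "Restricted content"
AuthUserFile /path/to/.htpasswd
Require valid-user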
#Beware of rel="nofollow"
You can tell Google or other search engines not to follow a link by setting "nofollow" in the rel attribute of that link. If your site is a blog or forum where comments can be made, you can mark the links in the comment section as nofollow. This prevents others from using the reputation of your blog or forum to raise the rank of their own sites. Also, people may post addresses of offensive sites that you do not want on your pages, or links to sites that Google considers spam, which can damage your site's reputation.
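For example, a single comment link can be marked like this (the URL is only a placeholder) -
See example
<a href="http://www.example.com/" rel="nofollow">Example link</a>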
If you set nofollow in the robots meta tag of a page, it will do the same thing for every link on that page.
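The page-wide version, which applies nofollow to every link on the page, looks like this -
See example
<meta name="robots" content="nofollow">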