Robots.txt

Case-1

User-agent: *

Disallow: /

  • Comment: Blocks all search engine bots from crawling any page on the website.

 

Case-2

User-agent: *

Disallow: 

  • Comment: An empty Disallow value blocks nothing, so search engine bots can crawl and index everything on the website.
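
Case-1 and Case-2 can be checked locally with Python's standard-library urllib.robotparser. This is a minimal sketch: the can_crawl helper, the ExampleBot agent name, and the example.com URL are illustrative assumptions, not part of the original examples.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


BLOCK_ALL = "User-agent: *\nDisallow: /"   # Case-1
ALLOW_ALL = "User-agent: *\nDisallow:"     # Case-2

URL = "https://example.com/any/page.html"  # hypothetical URL
print(can_crawl(BLOCK_ALL, URL))  # False
print(can_crawl(ALLOW_ALL, URL))  # True
```

The only difference between the two files is the single "/" after Disallow, which is why this line deserves a careful check before deploying.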

Case-3

User-agent: *

Disallow: /courier.html

  • Comment: Blocks a specific page that you do not want crawled. The path must start with a / so that it is matched from the root of the site.

 

Case-4

User-agent: Googlebot-Image

Disallow: /images/usa-shipping.jpg

  • Comment: Blocks a specific image from appearing in Google Images.
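
A rule group only applies to the user agent it names. The sketch below uses Python's standard-library urllib.robotparser to show that the Case-4 rule affects Googlebot-Image and nothing else; the example.com URLs, the second image name, and the ExampleBot agent are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /images/usa-shipping.jpg
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

IMAGE = "https://example.com/images/usa-shipping.jpg"  # hypothetical host
OTHER = "https://example.com/images/uk-shipping.jpg"   # hypothetical image

# The named agent is blocked from the listed image only.
image_bot_blocked = not parser.can_fetch("Googlebot-Image", IMAGE)
image_bot_other_ok = parser.can_fetch("Googlebot-Image", OTHER)
# An agent with no matching group is not restricted at all.
other_bot_ok = parser.can_fetch("ExampleBot", IMAGE)

print(image_bot_blocked, image_bot_other_ok, other_bot_ok)  # True True True
```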

 

Case-5

User-agent: *

Disallow: /ebooks/*.pdf

Disallow: /staging/

  • Disallow: /ebooks/*.pdf - Together with the User-agent line, this line asks all web crawlers not to crawl any PDF files in the ebooks folder of this website. Search engines that honor it will not fetch these direct PDF links.
  • Disallow: /staging/ - Together with the User-agent line, this line asks all crawlers not to crawl anything in the staging folder of the website. This can be helpful if you are running a test and do not want the staged content to appear in the search results.



Case-6

User-agent: *

Disallow: /*?utm=*

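  • Disallow: /*?utm=* - Together with the User-agent line, this line asks all web crawlers not to crawl any URL whose query string starts with utm=. Note that real UTM parameters are named utm_source, utm_medium and so on, so a pattern such as /*?utm_ would be needed to catch those.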

 

Case-7

User-agent: *

Disallow: /*?

  • Blocks all crawlers from accessing any URL that has a ? in it, that is, any URL with a query string.

 

Case-8

User-agent: *

Disallow: /*.php$

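  • The $ character is used for “end of URL” matches. This example blocks all crawlers (User-agent: *) from crawling URLs that end with “.php”. A URL such as /index.php?id=1 does not end with “.php” and is therefore not blocked.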
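
To see which URLs the wildcard patterns in Case-5 through Case-8 actually catch, a pattern can be translated into a regular expression. The sketch below is a hand-rolled matcher (the rule_matches name and the sample paths are assumptions), written on the understanding that * matches any run of characters, a trailing $ anchors the end of the URL, and everything else is a prefix match. It is not the implementation used by any particular crawler.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then let each '*' stand for "any characters".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start; without '$' the rule is a prefix match.
    return re.match(regex, path) is not None


print(rule_matches("/ebooks/*.pdf", "/ebooks/guide.pdf"))   # True
print(rule_matches("/*?utm=*", "/page?utm_source=news"))    # False
print(rule_matches("/*?", "/search?q=shoes"))               # True
print(rule_matches("/*.php$", "/index.php?id=1"))           # False
```

The second result is the one worth noticing: a pattern written for ?utm= does not match the usual utm_source style parameters.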

 

Case-9

User-agent: *

Disallow: /search?s=*

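  • Stops all crawlers from crawling internal search result pages, i.e. URLs that start with /search?s=. The trailing * is optional because rules are prefix matches.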

 

Case-10

User-agent: *

Disallow: /search?s=*

Disallow: /query?kw=*

  • Stops all crawlers from crawling search result pages, covering both the /search?s= and the /query?kw= URL formats.

 

Case-11

User-agent: Googlebot-Image

Disallow: /*.gif$

  • By specifying Googlebot-Image as the User-agent, all images whose URL ends in .gif are excluded from Google Image Search.

 

Case-13

User-agent: Googlebot-Image

Disallow: /*?color

Allow: /*?color=blue

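  • Blocks Googlebot-Image from crawling any URL whose query string starts with color, except those whose query string starts with color=blue. The longer, more specific Allow rule takes precedence over the shorter Disallow rule.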
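
The way Allow and Disallow interact in Case-13 can be sketched as "longest matching pattern wins, and Allow wins a tie". The code below is an illustration under that assumption; the to_regex and is_allowed helpers and the /shirts paths are hypothetical, and a given crawler may resolve conflicts differently.

```python
import re


def to_regex(pattern: str) -> str:
    """Translate a robots.txt pattern ('*' wildcard, optional trailing '$')."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return regex + ("$" if anchored else "")


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Longest matching pattern decides; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not re.match(to_regex(pattern), path):
            continue
        is_allow = directive.lower() == "allow"
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len, allowed = len(pattern), is_allow
    return allowed


RULES = [("Disallow", "/*?color"), ("Allow", "/*?color=blue")]

print(is_allowed(RULES, "/shirts?color=red"))   # False
print(is_allowed(RULES, "/shirts?color=blue"))  # True
print(is_allowed(RULES, "/shirts"))             # True
```

Because both patterns require the ? immediately before color, a URL such as /shirts?size=m&color=red is matched by neither rule and stays crawlable.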

 

Case-14

User-agent: Googlebot-Image

Disallow: /blog/*/page/

  • Because the * matches any category name, URI paths such as /blog/category-name/page/3 are blocked from crawling for the named user agent, without having to specify each category and each pagination.