inetspec.com
robots.txt

Robots Exclusion Standard data for inetspec.com

Resource Scan

Scan Details

Site Domain inetspec.com
Base Domain inetspec.com
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2024-09-27T06:14:04+00:00
Next Scan 2024-10-27T06:14:04+00:00

Last Successful Scan

Scanned 2024-08-06T03:06:37+00:00
URL https://inetspec.com/robots.txt
Domain IPs 66.228.138.150
Response IP 66.228.138.150
Found Yes
Hash 8cfd95f889ccbb6c3a66115c018020b305805f94a7e28cb85e9653dc38be8a2f
SimHash 241679c747f4

Groups

*

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

copyright sheriff

Rule Path
Disallow /

dealgates bot

Rule Path
Disallow /

gaisbot

Rule Path
Disallow /

gingercrawler

Rule Path
Disallow /

nutch

Rule Path
Disallow /

renlifangbot

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yeti

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

butterfly
charlotte
exabot
envolk
gigabot
scoutjet
speedy
teoma
turnitinbot
twiceler
yowedobot
mj12bot

Rule Path
Disallow
Disallow /adm
Disallow /php
Disallow /tmp
Disallow /upl
Disallow /index.php

Other Records

Field Value
crawl-delay 20

ia_archiver

Rule Path
Disallow
Disallow /adm
Disallow /php
Disallow /tmp
Disallow /upl
Disallow /index.php

adsbot-google
googlebot

Rule Path
Disallow /files
Disallow /adm
Disallow /php
Disallow /tmp
Disallow /uploads
Disallow /index.php
Allow /
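
As a minimal sketch of how the adsbot-google / googlebot group above might be evaluated, the following Python uses the standard library's `urllib.robotparser`. The `ROBOTS_FRAGMENT` text is reconstructed from this listing, so its exact layout is an assumption, and `build_parser` is a hypothetical helper name. Python's parser applies the first matching rule in file order, while Google documents most-specific-match semantics; for the paths checked here both approaches should agree.

```python
from urllib.robotparser import RobotFileParser

# Assumption: fragment rebuilt from the listing above; the real file's
# ordering, casing, and comments are not known from this report.
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /

User-agent: adsbot-google
User-agent: googlebot
Disallow: /files
Disallow: /adm
Disallow: /php
Disallow: /tmp
Disallow: /uploads
Disallow: /index.php
Allow: /
"""


def build_parser(text):
    """Parse robots.txt text into a RobotFileParser (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_FRAGMENT)

# Check a few sample paths for googlebot and for an unlisted crawler.
for agent, path in [
    ("googlebot", "/about"),
    ("googlebot", "/files/report.pdf"),
    ("googlebot", "/administrator"),
    ("somebot", "/about"),
]:
    print(agent, path, parser.can_fetch(agent, path))
```

Because rules are path prefixes, `Disallow /adm` also covers paths such as `/administrator`, and a crawler with no group of its own falls back to the `*` group.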

googlebot-image
mediapartners-google

Rule Path
Disallow /

slurp

Rule Path
Disallow /

bingbot
msnbot
msnbot
msnbot-products
msnbot-newsblogs

Rule Path
Disallow
Disallow /adm
Disallow /php
Disallow /tmp
Disallow /uploads
Disallow /index.php

Other Records

Field Value
crawl-delay 30

bingbot-media
msnbot-media

Rule Path
Disallow /

Other Records

Field Value
sitemap http://www.inetspec.com/sitemap
sitemap http://www.inetspec.com/sitemap.xml
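
The crawl-delay and sitemap records listed above can be read with the same standard-library parser. This is a sketch only: `RECORDS_FRAGMENT` is an assumed reconstruction containing just two of the Bing user agents and one rule, not the original file, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reduced fragment rebuilt from the listing; the position of
# the Crawl-delay line inside the real group is not known.
RECORDS_FRAGMENT = """\
User-agent: bingbot
User-agent: msnbot
Crawl-delay: 30
Disallow: /adm

Sitemap: http://www.inetspec.com/sitemap
Sitemap: http://www.inetspec.com/sitemap.xml
"""

records = RobotFileParser()
records.parse(RECORDS_FRAGMENT.splitlines())

# Crawl-delay is stored per group; sitemap records apply to the whole file.
delay = records.crawl_delay("bingbot")
sitemaps = records.site_maps()

print("bingbot crawl-delay:", delay)
print("sitemaps:", sitemaps)
```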

Comments

  • iNet Specialists
  • Robots TXT File
  • Good reference pages:
  • http://antezeta.com/news/avoid-search-engine-indexing
  • http://www.youtube.com/user/GoogleWebmasterHelp
  • http://support.google.com/webmasters/bin/answer.py?hl=en&answer=182072
  • http://www.bing.com/community/site_blogs/b/webmaster/archive/2012/05/03/to-crawl-or-not-to-crawl-that-is-bingbot-s-question.aspx
  • Base directive for unknown and mis-behaving spiders
  • More specific general directives
  • added to prevent indexing of certain pages
  • Alexa web archiver directives
  • added to prevent indexing of certain pages
  • Googlebot(s) specific directives
  • Files that SHOULD NOT be crawled (and have been)
  • Disallow: /pathname.ext
  • Directories that SHOULD NOT be crawled (and have been)
  • added Dec 2012 to prevent indexing of certain pages
  • Allow everything else (according to Google robots.txt translation)
  • NO Image/AdSense Bots
  • Yahoo! Slurp(s) specific directives
  • The Slurp! is DEAD, Long Live the Bing
  • MSNbot(s) specific directives for BING
  • Files that SHOULD NOT be crawled (and have been)
  • Disallow: /pathname.ext
  • Directories that SHOULD NOT be crawled (and have been)
  • added Dec 2012 to prevent indexing of certain pages
  • NO Media
  • html version
  • xml version