bestvaluecolleges.org
robots.txt

Robots Exclusion Standard data for bestvaluecolleges.org

Resource Scan

Scan Details

Site Domain bestvaluecolleges.org
Base Domain bestvaluecolleges.org
Scan Status Ok
Last Scan 2024-09-18T21:27:45+00:00
Next Scan 2024-10-18T21:27:45+00:00

Last Scan

Scanned 2024-09-18T21:27:45+00:00
URL https://bestvaluecolleges.org/robots.txt
Domain IPs 104.21.36.103, 172.67.192.83, 2606:4700:3030::ac43:c053, 2606:4700:3037::6815:2467
Response IP 104.21.36.103
Found Yes
Hash 293a75dc91c39c744679055b0d19e7ac042b6a6db4d564587ceb289f1327c621
SimHash 041893f0c3a2

Groups

*

Rule Path
Disallow /cgi-bin
Disallow /wp-admin/
Disallow /?
Disallow *?s=
Disallow *%26s%3D
Disallow /search
Disallow /author/
Disallow */embed
Disallow */page/
Disallow */xmlrpc.php
Disallow *utm*%3D
Disallow *openstat%3D
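
Several of the `*` group's paths use the `*` wildcard, which plain prefix matching does not handle. The following is a minimal sketch of how such patterns could be evaluated, assuming RFC 9309 style matching (`*` matches any run of characters, a trailing `$` anchors the end). The helper names `rule_matches` and `is_disallowed` are hypothetical, and the sketch compares strings literally without any percent-encoding normalization, so a real crawler may behave differently.

```python
import re

# Disallow paths copied from the `*` group listed above.
DISALLOW = [
    "/cgi-bin", "/wp-admin/", "/?", "*?s=", "*%26s%3D", "/search",
    "/author/", "*/embed", "*/page/", "*/xmlrpc.php", "*utm*%3D",
    "*openstat%3D",
]

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the start of `path`.

    Hypothetical helper: '*' matches any run of characters and a trailing
    '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the beginning, which gives prefix semantics.
    return re.match(regex, path) is not None

def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern of the `*` group matches `path`."""
    return any(rule_matches(pattern, path) for pattern in DISALLOW)

if __name__ == "__main__":
    for candidate in ["/wp-admin/options.php", "/blog/page/2/", "/rankings/"]:
        print(candidate, is_disallowed(candidate))
```

Because this group contains only Disallow rules, no Allow precedence needs to be resolved here; a path is blocked as soon as any pattern matches.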

ccbot

Rule Path
Disallow /

ccbot/2.0

Rule Path
Disallow /

ccbot/2.0 (http://commoncrawl.org/faq/)

Rule Path
Disallow /

wikido

Rule Path
Disallow /

fr_crawler

Rule Path
Disallow /

yandex

Rule Path
Disallow /

baiduspider-favo

Rule Path
Disallow /

maxpointcrawler

Rule Path
Disallow /

admantx

Rule Path
Disallow /

baiduspider-news

Rule Path
Disallow /

baiduspider-cpro

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider-image

Rule Path
Disallow /

baiduspider-video

Rule Path
Disallow /

baiduspider-ads

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

bitvorebot

Rule Path
Disallow /

blp_bbot

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

kraken

Rule Path
Disallow /

synthesio

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

brandonbot

Rule Path
Disallow /

germcrawler

Rule Path
Disallow /

sogou

Rule Path
Disallow /

moatbot

Rule Path
Disallow /

bhcbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /
Allow /
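
The exabot group is the only one that lists both `Disallow /` and `Allow /`. Under RFC 9309 the longest matching rule wins, and when an Allow and a Disallow rule are equivalent the Allow rule should be used, so a compliant parser would likely treat exabot as allowed. The sketch below illustrates that tie-break with plain prefix matching only (no wildcards); the function name `is_allowed` and the rule-tuple layout are assumptions made for illustration, and individual crawlers may resolve the conflict differently.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (kind, path) where kind is "allow" or "disallow"

def is_allowed(rules: List[Rule], path: str) -> bool:
    """Resolve Allow/Disallow conflicts: longest match wins, ties go to Allow.

    Hypothetical helper using simple prefix matching.
    """
    best_key = None
    best_kind = "allow"  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if path.startswith(pattern):
            # Compare by pattern length first, then prefer "allow" on ties.
            key = (len(pattern), kind == "allow")
            if best_key is None or key > best_key:
                best_key, best_kind = key, kind
    return best_kind == "allow"

# Rules as listed in the groups above.
EXABOT = [("disallow", "/"), ("allow", "/")]
SEMRUSHBOT = [("disallow", "/")]

if __name__ == "__main__":
    print("exabot:", is_allowed(EXABOT, "/rankings/"))
    print("semrushbot:", is_allowed(SEMRUSHBOT, "/rankings/"))
```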

Other Records

Field Value
sitemap https://bestvaluecolleges.org/sitemap_index.xml
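
As a rough illustration of how a client could read the sitemap record, the sketch below feeds a small excerpt to Python's standard `urllib.robotparser` without any network access. The excerpt in `SAMPLE_LINES` is an assumption reconstructed from the fields in this report, not the literal file contents, and `site_maps()` requires Python 3.8 or later. Note that this standard-library parser uses prefix matching and does not interpret `*` wildcards inside paths.

```python
from urllib.robotparser import RobotFileParser

# Assumed excerpt rebuilt from the report above; the real file is longer.
SAMPLE_LINES = [
    "User-agent: *",
    "Disallow: /search",
    "Disallow: /author/",
    "Sitemap: https://bestvaluecolleges.org/sitemap_index.xml",
]

def load_sample() -> RobotFileParser:
    """Parse the sample lines in memory and return the populated parser."""
    parser = RobotFileParser()
    parser.parse(SAMPLE_LINES)
    return parser

if __name__ == "__main__":
    parser = load_sample()
    # List of sitemap URLs declared in the file, or None if there are none.
    print(parser.site_maps())
    # Check a URL against the generic `*` group.
    print(parser.can_fetch("*", "https://bestvaluecolleges.org/search"))
```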

Warnings

  • `host` is not a known field.