aircconline.com
robots.txt

Robots Exclusion Standard data for aircconline.com

Resource Scan

Scan Details

Site Domain aircconline.com
Base Domain aircconline.com
Scan Status Ok
Last Scan 2024-05-02T18:36:15+00:00
Next Scan 2024-06-01T18:36:15+00:00
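
The two timestamps above are exactly 30 days apart. As a small worked check, assuming the scanner uses a fixed 30-day rescan interval (an inference from these two values, not something the report states), the next scan time can be derived from the last one:

```python
from datetime import datetime, timedelta

# Last scan time as reported above (UTC offset included).
last_scan = datetime.fromisoformat("2024-05-02T18:36:15+00:00")

# Assumption: a fixed 30-day interval, inferred from the two timestamps.
RESCAN_INTERVAL = timedelta(days=30)

next_scan = last_scan + RESCAN_INTERVAL
print(next_scan.isoformat())
```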

Last Scan

Scanned 2024-05-02T18:36:15+00:00
URL https://aircconline.com/robots.txt
Domain IPs 209.59.138.194
Response IP 209.59.138.194
Found Yes
Hash 7b280d076bfbb5fdc2cf2cc28e917ffc792d1783da0a879e29f394786b5d48d4
SimHash 01319052c696

Groups

googlebot

Rule Path
Allow /

mediapartners-google

Rule Path
Allow /

adsbot-google

Rule Path
Allow /

slurp

Rule Path
Allow /

openfind

Rule Path
Allow /

scooter

Rule Path
Allow /

bingbot

Rule Path
Allow /

twiceler

Rule Path
Allow /

rogerbot

Rule Path
Allow /

teoma

Rule Path
Allow /

mantraagent

Rule Path
Allow /

semanticscholarbot

Rule Path
Allow /

lycos_spider_(t-rex)

Rule Path
Allow /

robozilla

Rule Path
Allow /

zyborg

Rule Path
Allow /

ia_archiver

Rule Path
Allow /

gulliver

Rule Path
Allow /

echo2

Rule Path
Allow /

scoutjet

Rule Path
Allow /

yahoofeedseeker

Rule Path
Allow /

bloglines

Rule Path
Allow /

blogstreetbot

Rule Path
Allow /

fastbuzz.com

Rule Path
Allow /

syndic8

Rule Path
Allow /

nif/1.1

Rule Path
Allow /

newsgatoronline

Rule Path
Allow /

mywireservicebot

Rule Path
Allow /

feedster

Rule Path
Allow /

feedfetcher

Rule Path
Allow /

yandexbot

Rule Path
Allow /
Disallow /sgw/
Disallow /covers/
Disallow /*checkval
Disallow /*wicket%3Ainterface

Other Records

Field Value
crawl-delay 2
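
The yandexbot group mixes a blanket Allow with narrower Disallow patterns, so the outcome for a given path depends on how a crawler resolves the overlap. The sketch below assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie) and treats `*` as a wildcard. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, paths are compared literally without percent-encoding normalization, and a real crawler such as Yandex's may behave differently.

```python
import re

# Rules of the yandexbot group, copied from the listing above.
YANDEXBOT_RULES = [
    ("allow", "/"),
    ("disallow", "/sgw/"),
    ("disallow", "/covers/"),
    ("disallow", "/*checkval"),
    ("disallow", "/*wicket%3Ainterface"),
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    body = "".join(".*" if ch == "*" else re.escape(ch) for ch in core)
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path):
    """Apply longest-match precedence; no matching rule means allowed."""
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            # Longer pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow

# Illustrative paths (hypothetical, not taken from the site).
for path in ["/index.html", "/sgw/page", "/covers/a.jpg", "/list?checkval=1"]:
    print(path, is_allowed(YANDEXBOT_RULES, path))
```

Under this reading, only the four Disallow patterns restrict yandexbot, and everything else falls through to `Allow /`. The `crawl-delay` record is not part of RFC 9309, so it is left out of the matching logic here.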

ahrefsbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

*

Rule Path
Allow /

Other Records

Field Value
sitemap http://airccse.org/sitemap.xml
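
The `*` group acts as the fallback for any crawler that has no group of its own. A minimal sketch of that selection step follows, assuming a case-insensitive substring match of the group name against the crawler's User-Agent string (a simplification of product-token matching). The function name `select_group` and the sample User-Agent strings are hypothetical, and only a few of the groups listed above are included.

```python
# A subset of the groups listed above, mapped to their rules.
GROUPS = {
    "googlebot": [("allow", "/")],
    "bingbot": [("allow", "/")],
    "ahrefsbot": [("disallow", "/")],
    "baiduspider": [("disallow", "/")],
    "ezooms": [("disallow", "/")],
    "mj12bot": [("disallow", "/")],
    "*": [("allow", "/")],
}

def select_group(groups, user_agent):
    """Return the name of the group that applies to this user agent.

    Named groups are tried first; '*' is used only when none match.
    """
    ua = user_agent.lower()
    matches = [name for name in groups if name != "*" and name in ua]
    if matches:
        # Prefer the most specific (longest) matching name.
        return max(matches, key=len)
    return "*"

# Hypothetical User-Agent strings for illustration.
print(select_group(GROUPS, "AhrefsBot/7.0"))       # named group: blocked site-wide
print(select_group(GROUPS, "ExampleCrawler/1.0"))  # no named group: falls back to '*'
```

With this file, a crawler that falls back to `*` is allowed everywhere, while the four named groups with `Disallow /` are excluded from the whole site.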

Comments

  • all others