connectu.it
robots.txt

Robots Exclusion Standard data for connectu.it

Resource Scan

Scan Details

Site Domain connectu.it
Base Domain connectu.it
Scan Status Ok
Last Scan 2024-11-10T22:33:19+00:00
Next Scan 2024-11-17T22:33:19+00:00

Last Scan

Scanned 2024-11-10T22:33:19+00:00
URL https://connectu.it/robots.txt
Domain IPs 46.105.204.11
Response IP 46.105.204.11
Found Yes
Hash 7a95499fffdbdd771830a2ae8f3bd9cfc2397b668ddf6fe015fff6bfc6e98b83
SimHash 6d56dc12ce53
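
The 64 hexadecimal digits of the Hash value are consistent with a SHA-256 digest, although the scan does not name the algorithm. Assuming SHA-256 over the raw response body, a fetched copy could be compared against the recorded value roughly as follows. The helper names body_digest and matches_scan are hypothetical.

```python
import hashlib

# Hash value recorded by the scan above.
SCAN_HASH = "7a95499fffdbdd771830a2ae8f3bd9cfc2397b668ddf6fe015fff6bfc6e98b83"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (algorithm assumed)."""
    return hashlib.sha256(body).hexdigest()


def matches_scan(body: bytes) -> bool:
    """True if the body hashes to the value recorded by the scan."""
    return body_digest(body) == SCAN_HASH
```

A mismatch would not by itself prove the file changed, since the scanner may hash a normalized form of the body or use a different algorithm.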

Groups

*

Rule Path
Disallow /pages/revision
Disallow /friends
Disallow /thewire
Disallow /videos/rawvideo
Disallow /videos/loadrelatedvideos
Disallow /profile
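
The rules of the "*" group can be checked with Python's standard urllib.robotparser. This is a minimal sketch: the robots.txt text below is reconstructed from the listing above rather than copied from the site, and the agent name "examplebot" and the helper is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group listed above; the original file's exact
# layout and comments are not shown in the scan.
STAR_GROUP = """\
User-agent: *
Disallow: /pages/revision
Disallow: /friends
Disallow: /thewire
Disallow: /videos/rawvideo
Disallow: /videos/loadrelatedvideos
Disallow: /profile
"""


def is_allowed(path: str, agent: str = "examplebot") -> bool:
    """Check a path on connectu.it against the reconstructed "*" group."""
    parser = RobotFileParser()
    parser.parse(STAR_GROUP.splitlines())
    return parser.can_fetch(agent, "https://connectu.it" + path)
```

Matching is by path prefix, so is_allowed("/friends/123") is False while is_allowed("/videos/123") is True: only the two listed /videos subpaths are blocked.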

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

yandex

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
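
The crawl-delay records for mj12bot and yandex can be read back with urllib.robotparser as well. The sketch below assumes the two agents appear as separate groups, as the scan lists them. The helper delay_for and the agent name "examplebot" are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the mj12bot and yandex groups above (layout assumed).
DELAY_GROUPS = """\
User-agent: mj12bot
Crawl-delay: 10

User-agent: yandex
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(DELAY_GROUPS.splitlines())


def delay_for(agent: str):
    """Return the crawl delay in seconds for an agent, or None if unset."""
    return parser.crawl_delay(agent)
```

Here delay_for("MJ12bot") and delay_for("Yandex") both return 10, and an agent with no matching group gets None. Because neither group defines Allow or Disallow rules, can_fetch returns True for any path for these two agents.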

twitterbot
facebookexternalhit/1.1

Rule Path
Allow /
Allow /blog
Allow /groups
Allow /videos
Allow /photos
Allow /bookmarks
Allow /discussion

Other Records

Field Value
sitemap sitemap.xml.gz
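
A crawler that matches the twitterbot group uses that group's rules instead of the "*" group, so a path such as /profile is open to it. The sketch below illustrates this with urllib.robotparser and also resolves the relative sitemap value against the robots.txt URL. The file text is a reconstruction with an assumed layout, only the /profile rule of the "*" group is repeated for contrast, and resolving a relative sitemap this way is an assumption, since the sitemap protocol expects a full URL.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

ROBOTS_URL = "https://connectu.it/robots.txt"

# Reconstructed from the groups above; layout and ordering are assumed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /profile

User-agent: twitterbot
User-agent: facebookexternalhit/1.1
Allow: /
Allow: /blog
Allow: /groups
Allow: /videos
Allow: /photos
Allow: /bookmarks
Allow: /discussion
Sitemap: sitemap.xml.gz
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Twitterbot matches its own group; other agents fall back to "*".
twitter_ok = parser.can_fetch("Twitterbot", "https://connectu.it/profile")
generic_ok = parser.can_fetch("examplebot", "https://connectu.it/profile")

# Resolve each listed sitemap against the robots.txt location (assumption).
sitemaps = [urljoin(ROBOTS_URL, s) for s in (parser.site_maps() or [])]
```

With this input twitter_ok is True, generic_ok is False, and sitemaps holds the single entry https://connectu.it/sitemap.xml.gz. Since "Allow: /" already matches every path, the remaining Allow lines in the group are redundant.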

Comments

  • ALL
  • ROBOTS
  • whitelist