gcolle.net
robots.txt

Robots Exclusion Standard data for gcolle.net

Resource Scan

Scan Details

Site Domain gcolle.net
Base Domain gcolle.net
Scan Status Ok
Last Scan 2024-07-01T09:03:26+00:00
Next Scan 2024-07-31T09:03:26+00:00

Last Scan

Scanned 2024-07-01T09:03:26+00:00
URL https://gcolle.net/robots.txt
Domain IPs 210.188.203.241
Response IP 210.188.203.241
Found Yes
Hash 162d4003269d3797051568e9d716b9340b8fab93c9af3014aafb6d53d89a6b09
SimHash 330cfd008795

Groups

*

Rule Path
Allow /
Disallow /*/action/
Disallow /admin-lite/
Disallow /admin/
Disallow /download/
Disallow /images/
Disallow /includes/
Disallow /phpmyadmin/
Disallow /pub/
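
The "*" group pairs a blanket "Allow /" with several narrower Disallow paths, so the result for a given URL depends on which rule takes precedence. The sketch below assumes the longest-match precedence described in RFC 9309 (the rule with the longest matching pattern wins, and Allow wins a tie). The rule list is copied from this report; the helper names pattern_to_regex and is_allowed are hypothetical and are not part of any scanner or library.

```python
import re

# Rules copied from the "*" group in this report: (directive, path pattern).
RULES = [
    ("allow", "/"),
    ("disallow", "/*/action/"),
    ("disallow", "/admin-lite/"),
    ("disallow", "/admin/"),
    ("disallow", "/download/"),
    ("disallow", "/images/"),
    ("disallow", "/includes/"),
    ("disallow", "/phpmyadmin/"),
    ("disallow", "/pub/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            # Longer pattern wins; on equal length, prefer Allow.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed
```

Under these assumptions, is_allowed("/admin/users") is False because "/admin/" is longer than "/", while is_allowed("/administrator") is True because only "Allow /" matches. Parsers that use first-match ordering instead of longest-match can give different answers for the same file.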

baiduspider

Rule Path
Disallow /

baiduspider+

Rule Path
Disallow /

baiduimagespider

Rule Path
Disallow /

baidumobaider

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

icc-crawler

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600
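
The slurp, msnbot and bingbot groups carry only a crawl-delay and no path rules. A crawler that matches a named group uses that group instead of "*", which is why the report lists all paths as allowed for them. As a rough illustration, the fragment below is reconstructed from the values in this report (the original file's exact layout and casing are assumptions) and read with Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from this report; the original file's exact layout is an assumption.
ROBOTS_FRAGMENT = """\
User-agent: slurp
Crawl-delay: 10

User-agent: msnbot
Crawl-delay: 600

User-agent: bingbot
Crawl-delay: 600
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# Crawl delay in seconds for each named group.
delays = {
    agent: parser.crawl_delay(agent)
    for agent in ("slurp", "msnbot", "bingbot")
}

# With no Allow/Disallow lines in its group, bingbot may fetch any path.
bingbot_can_fetch_admin = parser.can_fetch("bingbot", "https://gcolle.net/admin/")
```

With this fragment, delays maps slurp to 10 and both msnbot and bingbot to 600. Crawl-delay is not part of RFC 9309, so whether and how a crawler honors it varies by crawler.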

Other Records

Field Value
sitemap https://gcolle.net/sitemap.xml

Warnings

  • 2 invalid lines.