webcomm.in
robots.txt

Robots Exclusion Standard data for webcomm.in

Resource Scan

Scan Details

Site Domain webcomm.in
Base Domain webcomm.in
Scan Status Ok
Last Scan 2025-09-26T23:03:41+00:00
Next Scan 2025-10-03T23:03:41+00:00
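
The two timestamps above imply a fixed rescan interval. As a quick check, assuming nothing beyond the two ISO 8601 values shown in this report, the gap can be computed with the Python standard library (the variable names are illustrative only):

```python
from datetime import datetime

# Timestamps copied from the "Last Scan" and "Next Scan" rows of this report.
last_scan = datetime.fromisoformat("2025-09-26T23:03:41+00:00")
next_scan = datetime.fromisoformat("2025-10-03T23:03:41+00:00")

# Difference between the two scans as a timedelta.
interval = next_scan - last_scan
print(interval.days)
```

For these two values the interval works out to seven days; the report itself does not state whether that schedule is fixed.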

Last Scan

Scanned 2025-09-26T23:03:41+00:00
URL https://webcomm.in/robots.txt
Domain IPs 104.21.40.76, 172.67.181.57, 2606:4700:3035::6815:284c, 2606:4700:3035::ac43:b539
Response IP 104.21.40.76
Found Yes
Hash 8e5e2cd3875cf95304b57d8a50fad499c661ea0f201a3ca8cf042bd4ee17cfc2
SimHash c90559208e83

Groups

*

Rule Path
Disallow /wp-admin/
Disallow /wp-login/
Disallow /wp-includes/
Disallow /Extra/

adsbot-google

Rule Path
Allow /

Other Records

Field Value
sitemap https://webcomm.in/sitemap_index.xml
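
The groups and records above can be exercised with a small sketch. The robots.txt text below is a reconstruction from the rules listed in this report, not the file as served: the 2 invalid lines are not shown here and are left out. `ExampleBot` is a hypothetical crawler name used only for illustration, and the matching logic is that of Python's `urllib.robotparser`, which may differ in detail from a given crawler's.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups in this report (assumption: the served file
# may differ, since its 2 invalid lines are not reproduced here).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login/
Disallow: /wp-includes/
Disallow: /Extra/

User-agent: adsbot-google
Allow: /

Sitemap: https://webcomm.in/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on webcomm.in under these rules."""
    return parser.can_fetch(agent, "https://webcomm.in" + path)


# "ExampleBot" is hypothetical; it falls under the "*" group.
for agent, path in [
    ("ExampleBot", "/"),
    ("ExampleBot", "/wp-admin/"),
    ("ExampleBot", "/Extra/file.html"),
    ("adsbot-google", "/wp-admin/"),
]:
    print(agent, path, allowed(agent, path))

# Sitemap records are not tied to any group (site_maps() needs Python 3.8+).
print(parser.site_maps())
```

Under this parser, a crawler covered by the `*` group is blocked from the four listed path prefixes, while `adsbot-google` matches its own group and is allowed everywhere.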

Warnings

  • 2 invalid lines.