webcomm.in
robots.txt
Robots Exclusion Standard data for webcomm.in
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | webcomm.in |
Base Domain | webcomm.in |
Scan Status | Ok |
Last Scan | 2025-09-26T23:03:41+00:00 |
Next Scan | 2025-10-03T23:03:41+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2025-09-26T23:03:41+00:00 |
URL | https://webcomm.in/robots.txt |
Domain IPs | 104.21.40.76, 172.67.181.57, 2606:4700:3035::6815:284c, 2606:4700:3035::ac43:b539 |
Response IP | 104.21.40.76 |
Found | Yes |
Hash | 8e5e2cd3875cf95304b57d8a50fad499c661ea0f201a3ca8cf042bd4ee17cfc2 |
SimHash | c90559208e83 |
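The scan does not say which algorithm produced the Hash value. Its length (64 hexadecimal characters) is consistent with SHA-256, so the sketch below assumes SHA-256 over the raw response body. Both that assumption and the helper names `sha256_hex` and `matches_report` are illustrative and not part of the scan output.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "8e5e2cd3875cf95304b57d8a50fad499c661ea0f201a3ca8cf042bd4ee17cfc2"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """Check a fetched robots.txt body against the reported hash.

    ASSUMPTION: the scanner hashes the raw body with SHA-256.
    A mismatch may only mean the scanner uses a different scheme,
    or that the file changed after the scan.
    """
    return sha256_hex(body) == REPORTED_HASH
```

To compare against the live file, pass the bytes returned from fetching https://webcomm.in/robots.txt to `matches_report`.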
Groups
*
Rule | Path |
---|---|
Disallow | /wp-admin/ |
Disallow | /wp-login/ |
Disallow | /wp-includes/ |
Disallow | /Extra/ |
Other Records
Field | Value |
---|---|
sitemap | https://webcomm.in/sitemap_index.xml |
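The `*` group and the sitemap record above can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch: the robots.txt text is reconstructed from the parsed rules in this report, not copied from the live file (which also contains the 2 invalid lines noted under Warnings), and the agent name `ExampleBot` and helper `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: reconstructed from the scan's parsed rules, not the raw file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login/
Disallow: /wp-includes/
Disallow: /Extra/

Sitemap: https://webcomm.in/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given path may be fetched by the agent."""
    return parser.can_fetch(agent, "https://webcomm.in" + path)


print(is_allowed("/"))                      # site root
print(is_allowed("/wp-admin/options.php"))  # under a disallowed prefix
print(is_allowed("/wp-login.php"))          # not under /wp-login/
print(parser.site_maps())                   # sitemap URLs (Python 3.8+)
```

Rules are matched as case-sensitive path prefixes by this parser, so `/Extra/` does not cover `/extra/`, and `/wp-login/` does not cover a path such as `/wp-login.php`.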
Warnings
- 2 invalid lines.