whlsm.com
robots.txt
Robots Exclusion Standard data for whlsm.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | whlsm.com |
Base Domain | whlsm.com |
Scan Status | Ok |
Last Scan | 2024-09-20T13:47:14+00:00 |
Next Scan | 2024-09-27T13:47:14+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-09-20T13:47:14+00:00 |
URL | https://whlsm.com/robots.txt |
Domain IPs | 38.134.113.246 |
Response IP | 38.134.113.246 |
Found | Yes |
Hash | 53123771c07fa0f807088e4345252cb3270f27d893baef0d0e124590ebef4579 |
SimHash | cc10cae2ec11 |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /logout |
Disallow | /oauth/ |
Disallow | /account/ |
Disallow | /search/ |
Disallow | /app/ |
Disallow | /docs/ |
Disallow | /en/docs/ |
Disallow | /email/ |
Disallow | /password/ |
Disallow | /r/sms |
Disallow | /fun/ |
Disallow | /subfeeds/ |
Disallow | /*?commentId=* |
Disallow | /*?filter=* |
Allow | /.well-known/ |
Allow | /*.js |
Allow | /*.css |
Allow | /app-ads.txt |
Allow | /ads.txt |
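
As a rough illustration of how a crawler might evaluate the rules in this group, the Python sketch below applies the longest-match precedence and the `*` / `$` wildcards described in RFC 9309. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical and introduced only for this example; real crawlers may differ in details such as percent-encoding and tie-breaking.

```python
import re

# Rules copied from the "*" group listed above, in the order shown.
RULES = [
    ("disallow", "/logout"),
    ("disallow", "/oauth/"),
    ("disallow", "/account/"),
    ("disallow", "/search/"),
    ("disallow", "/app/"),
    ("disallow", "/docs/"),
    ("disallow", "/en/docs/"),
    ("disallow", "/email/"),
    ("disallow", "/password/"),
    ("disallow", "/r/sms"),
    ("disallow", "/fun/"),
    ("disallow", "/subfeeds/"),
    ("disallow", "/*?commentId=*"),
    ("disallow", "/*?filter=*"),
    ("allow", "/.well-known/"),
    ("allow", "/*.js"),
    ("allow", "/*.css"),
    ("allow", "/app-ads.txt"),
    ("allow", "/ads.txt"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex for re.match."""
    anchored = pattern.endswith("$")  # a trailing "$" pins the end of the URL
    if anchored:
        pattern = pattern[:-1]
    # "*" matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` (path plus optional query) may be crawled.

    The longest matching pattern wins; on a tie, Allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under these assumptions, `is_allowed("/account/settings")` is false and `is_allowed("/static/site.css")` is true. A path such as `/app/main.js` matches both `/app/` and `/*.js` at equal length, so this sketch resolves the tie in favor of Allow; other implementations may decide differently.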
Other Records
Field | Value |
---|---|
sitemap | https://whlsm.com/sitemap.xml.gz |
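
The sitemap record points at a gzip-compressed file. Assuming it follows the standard sitemaps.org XML schema (the scan does not confirm its contents), a minimal Python sketch for extracting the listed locations from the downloaded bytes might look like this. The function name `sitemap_locations` is hypothetical, and fetching the file is left to the caller.

```python
import gzip
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locations(gz_bytes: bytes) -> List[str]:
    """Decompress a gzipped sitemap and return every <loc> value.

    Works for both <urlset> files and <sitemapindex> files, since
    both put their URLs in <loc> elements.
    """
    xml_bytes = gzip.decompress(gz_bytes)
    root = ET.fromstring(xml_bytes)
    return [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
```

The bytes passed in would be the raw response body of `https://whlsm.com/sitemap.xml.gz`, retrieved with whatever HTTP client is already in use.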