skuce.com
robots.txt
Robots Exclusion Standard data for skuce.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | skuce.com |
Base Domain | skuce.com |
Scan Status | Ok |
Last Scan | 2024-11-14T04:05:48+00:00 |
Next Scan | 2024-12-14T04:05:48+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-14T04:05:48+00:00 |
URL | http://skuce.com/robots.txt |
Domain IPs | 107.180.46.218 |
Response IP | 107.180.46.218 |
Found | Yes |
Hash | 2ede6b463df82b1358063137fd5eed8b9a5d5b1e7278830823cd5eb0e3d8432b |
SimHash | a50e4e803342 |
Groups
*
Rule | Path |
---|---|
Disallow | /images/ |
Disallow | /photos/ |
Disallow | /stats/ |
Disallow | /work/ |
Disallow | /thetoque/ |
Disallow | /genealogy/data/ |
Disallow | /genealogy/data2/ |
Disallow | /genealogy/bdm/ |
Disallow | /genealogy/burial/ |
Disallow | /genealogy/immigration/ |
Disallow | /genealogy/gen-ireland.html |
Disallow | /genealogy/gen-datadump.html |
Disallow | /genealogy/dna.html |
Disallow | /about.html |
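
As an illustration of how a crawler might apply the rules listed for the `*` group, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the table, since the report does not show the original file (its exact wording, ordering, and the one invalid line are unknown). The agent name `ExampleBot` and the helper `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the rule table in this report,
# not copied from the live file at http://skuce.com/robots.txt.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /images/",
    "Disallow: /photos/",
    "Disallow: /stats/",
    "Disallow: /work/",
    "Disallow: /thetoque/",
    "Disallow: /genealogy/data/",
    "Disallow: /genealogy/data2/",
    "Disallow: /genealogy/bdm/",
    "Disallow: /genealogy/burial/",
    "Disallow: /genealogy/immigration/",
    "Disallow: /genealogy/gen-ireland.html",
    "Disallow: /genealogy/gen-datadump.html",
    "Disallow: /genealogy/dna.html",
    "Disallow: /about.html",
]

# Parse the reconstructed rules once; no network request is made.
_parser = RobotFileParser()
_parser.parse(ROBOTS_LINES)


def is_allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the rules above."""
    return _parser.can_fetch(agent, "http://skuce.com" + path)


if __name__ == "__main__":
    for path in ["/", "/images/logo.png", "/genealogy/index.html",
                 "/genealogy/dna.html"]:
        print(path, is_allowed(path))
```

Each `Disallow` path is treated as a prefix, so `/genealogy/dna.html` is blocked while other pages under `/genealogy/` that are not listed remain fetchable.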
Other Records
Field | Value |
---|---|
crawl-delay | 5 |
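
The `crawl-delay` record is a non-standard extension that some crawlers interpret as the number of seconds to wait between requests. The sketch below shows one way a client could read and honor it with Python's `urllib.robotparser`. It assumes the record belongs to the `*` group, which the report does not state; the agent name `ExampleBot` and the helpers `crawl_delay_for` and `schedule` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the crawl-delay record sits inside the "*" group.
ROBOTS_LINES = [
    "User-agent: *",
    "Crawl-delay: 5",
]


def crawl_delay_for(agent="ExampleBot"):
    """Return the crawl delay in seconds for `agent`, or None if unset."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)
    return parser.crawl_delay(agent)


def schedule(paths, delay):
    """Pair each path with its start offset in seconds, spaced by `delay`."""
    return [(index * delay, path) for index, path in enumerate(paths)]


if __name__ == "__main__":
    delay = crawl_delay_for()
    for offset, path in schedule(["/", "/genealogy/", "/contact.html"], delay):
        print(f"t+{offset}s  GET {path}")
```

A real crawler would sleep between requests instead of only computing offsets; the schedule form is used here so the spacing is easy to inspect.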
Warnings
- 1 invalid line.