archives.nicematin.com
robots.txt
Robots Exclusion Standard data for archives.nicematin.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | archives.nicematin.com |
Base Domain | nicematin.com |
Scan Status | Ok |
Last Scan | 2024-11-10T04:00:46+00:00 |
Next Scan | 2024-11-17T04:00:46+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-10T04:00:46+00:00 |
URL | http://archives.nicematin.com/robots.txt |
Redirect | https://www.nicematin.com/robots.txt |
Redirect Domain | www.nicematin.com |
Redirect Base | nicematin.com |
Domain IPs | 80.94.98.229, 80.94.98.231 |
Redirect IPs | 80.94.98.229, 80.94.98.231 |
Response IP | 80.94.98.229 |
Found | Yes |
Hash | 68d3d5f2705591107f5b20daac45ba936bb26583827a2027e77495b5faa8bff9 |
SimHash | 0a8658250921 |
Groups
*
Rule | Path |
---|---|
Disallow | /recherche?search=* |
Disallow | /oa |
Disallow | /user* |
Disallow | /a/ |
Disallow | /edition-du-jour/lire |
Disallow | /auth/ |
Disallow | /index.php/* |
Disallow | /*/get-token* |
Disallow | /*/oaToken/* |
Disallow | /newspapers/read/* |
Disallow | /carnet-avis-deces* |
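
Several of the paths above use `*` wildcards. The sketch below shows one way a crawler might test a URL path against them, assuming the wildcard semantics described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match). The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, and how any particular crawler actually interprets these rules may differ.

```python
import re

# Disallow paths copied from the "*" group in the table above.
DISALLOW = [
    "/recherche?search=*",
    "/oa",
    "/user*",
    "/a/",
    "/edition-du-jour/lire",
    "/auth/",
    "/index.php/*",
    "/*/get-token*",
    "/*/oaToken/*",
    "/newspapers/read/*",
    "/carnet-avis-deces*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (hypothetical helper).

    Assumes RFC 9309 style matching: "*" is a wildcard, a trailing "$"
    anchors the end of the path, and matching starts at the path's first
    character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" where a "*" stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)


if __name__ == "__main__":
    for candidate in ["/recherche?search=nice", "/fr/get-token?x=1", "/actualites"]:
        print(candidate, is_disallowed(candidate))
```

Because the group contains no Allow rules, this sketch does not implement longest-match precedence between Allow and Disallow; a fuller matcher would need it.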
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
Other Records
Field | Value |
---|---|
sitemap | https://www.nicematin.com/sitemap.xml |
sitemap | https://www.nicematin.com/googlenews.xml |
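
As a rough illustration of how the crawl-delay and sitemap records could be consumed, the sketch below feeds a reconstructed robots.txt to Python's standard `urllib.robotparser`. The text in `ROBOTS_TXT` is an assumption rebuilt from the tables above (the scan does not show the raw file, so line order and casing may differ), and only a few of the Disallow rules are repeated. Note that `urllib.robotparser` does plain prefix matching and does not interpret `*` wildcards, so it is used here only for the non-path records.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the scanned file; the real file may be laid out
# differently. Only a subset of the Disallow rules is included here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /oa
Disallow: /a/
Disallow: /auth/
Crawl-delay: 10

Sitemap: https://www.nicematin.com/sitemap.xml
Sitemap: https://www.nicematin.com/googlenews.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Seconds a polite crawler should wait between requests, per the "*" group.
delay = parser.crawl_delay("*")

# Sitemap URLs declared in the file (site_maps() requires Python 3.8+).
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print("crawl delay:", delay)
    print("sitemaps:", sitemaps)
```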