archives.nicematin.com
robots.txt

Robots Exclusion Standard data for archives.nicematin.com

Resource Scan

Scan Details

Site Domain archives.nicematin.com
Base Domain nicematin.com
Scan Status Ok
Last Scan 2024-11-10T04:00:46+00:00
Next Scan 2024-11-17T04:00:46+00:00
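
The two timestamps above are exactly seven days apart, which suggests a weekly rescan. That cadence is inferred from this single pair of values and is not stated by the scanner. A minimal check of the arithmetic in Python (the variable names are illustrative):

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details block (ISO 8601, UTC offset +00:00).
last_scan = datetime.fromisoformat("2024-11-10T04:00:46+00:00")
next_scan = datetime.fromisoformat("2024-11-17T04:00:46+00:00")

# Difference between the scheduled next scan and the last completed scan.
interval = next_scan - last_scan
print(interval)
```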

Last Scan

Scanned 2024-11-10T04:00:46+00:00
URL http://archives.nicematin.com/robots.txt
Redirect https://www.nicematin.com/robots.txt
Redirect Domain www.nicematin.com
Redirect Base nicematin.com
Domain IPs 80.94.98.229, 80.94.98.231
Redirect IPs 80.94.98.229, 80.94.98.231
Response IP 80.94.98.229
Found Yes
Hash 68d3d5f2705591107f5b20daac45ba936bb26583827a2027e77495b5faa8bff9
SimHash 0a8658250921

Groups

*

Rule Path
Disallow /recherche?search=*
Disallow /oa
Disallow /user*
Disallow /a/
Disallow /edition-du-jour/lire
Disallow /auth/
Disallow /index.php/*
Disallow /*/get-token*
Disallow /*/oaToken/*
Disallow /newspapers/read/*
Disallow /carnet-avis-deces*

Other Records

Field Value
crawl-delay 10
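
Several rules in the `*` group use wildcards. A rough sketch of how a crawler might test a URL path against them, assuming the RFC 9309 conventions (each rule is matched from the start of the path, `*` matches any run of characters, and a trailing `$` anchors the end). The names `STAR_GROUP_DISALLOW`, `pattern_to_regex` and `is_disallowed` are hypothetical. Because this group has no Allow rules, longest-match precedence is not modelled here.

```python
import re

# Disallow paths of the "*" group, copied from the listing above.
STAR_GROUP_DISALLOW = [
    "/recherche?search=*",
    "/oa",
    "/user*",
    "/a/",
    "/edition-du-jour/lire",
    "/auth/",
    "/index.php/*",
    "/*/get-token*",
    "/*/oaToken/*",
    "/newspapers/read/*",
    "/carnet-avis-deces*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, rules=STAR_GROUP_DISALLOW) -> bool:
    """Return True if any rule matches from the start of the path."""
    return any(pattern_to_regex(rule).match(path) for rule in rules)


for sample in ("/recherche?search=nice", "/fr/get-token", "/sport/football"):
    print(sample, is_disallowed(sample))
```

Note that rules without a trailing slash, such as `/oa`, are prefix matches, so under these assumptions a path like `/oauth` would also be blocked.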

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.nicematin.com/sitemap.xml
sitemap https://www.nicematin.com/googlenews.xml
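
The groups and records above can be cross-checked with Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text below is a reconstruction from this scan report, not the file as served: the grouping, ordering and capitalisation are assumptions. The standard-library parser does plain prefix matching and does not interpret `*` inside paths, so this sketch only checks things that do not depend on wildcards (the blanket `Disallow: /` groups, the crawl delay and the sitemap list). `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report; the real file's layout may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /recherche?search=*
Disallow: /oa
Disallow: /user*
Disallow: /a/
Disallow: /edition-du-jour/lire
Disallow: /auth/
Disallow: /index.php/*
Disallow: /*/get-token*
Disallow: /*/oaToken/*
Disallow: /newspapers/read/*
Disallow: /carnet-avis-deces*
Crawl-delay: 10

User-agent: CCBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

Sitemap: https://www.nicematin.com/sitemap.xml
Sitemap: https://www.nicematin.com/googlenews.xml
"""

HOME = "https://www.nicematin.com/"
AI_AGENTS = ["CCBot", "ChatGPT-User", "GPTBot", "Google-Extended"]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Each named agent has its own group with "Disallow: /".
blocked = {agent: not parser.can_fetch(agent, HOME) for agent in AI_AGENTS}
print(blocked)

# "SomeOtherBot" is a made-up agent that falls through to the "*" group.
print(parser.can_fetch("SomeOtherBot", HOME))
print(parser.crawl_delay("SomeOtherBot"))
print(parser.site_maps())
```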

Comments

  • Sitemap