thewrexhaminsider.com
robots.txt

Robots Exclusion Standard data for thewrexhaminsider.com

Resource Scan

Scan Details

Site Domain thewrexhaminsider.com
Base Domain thewrexhaminsider.com
Scan Status Ok
Last Scan 2024-06-10T04:17:57+00:00
Next Scan 2024-06-17T04:17:57+00:00

Last Scan

Scanned 2024-06-10T04:17:57+00:00
URL https://thewrexhaminsider.com/robots.txt
Redirect https://www.thewrexhaminsider.com/robots.txt
Redirect Domain www.thewrexhaminsider.com
Redirect Base thewrexhaminsider.com
Domain IPs 104.21.37.10, 172.67.202.44, 2606:4700:3033::6815:250a, 2606:4700:3034::ac43:ca2c
Redirect IPs 104.21.37.10, 172.67.202.44, 2606:4700:3033::6815:250a, 2606:4700:3034::ac43:ca2c
Response IP 104.21.37.10
Found Yes
Hash 062d071b5cf99b6b37e1c2d204b90a4450b7a8eb4151918153725d4cb6d5dc30
SimHash 3b309a042430
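
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file, although the scan record does not name the algorithm. As a minimal sketch under that assumption, a content digest of a robots.txt body could be computed as follows. The function name `body_digest` and the sample body are hypothetical and are not the real file contents, so the result will not match the Hash shown above.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a fetched robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The scan record does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Illustrative body only; NOT the actual file served by the site.
sample = b"User-agent: *\nDisallow: /core/wp-admin/\n"
digest = body_digest(sample)
print(len(digest))  # 64 hex digits, the same length as the Hash field
```

Comparing digests between two scans is a cheap way to detect that the file changed at all; a similarity hash such as the SimHash field is typically used to judge how much it changed.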

Groups

*

Rule Path
Disallow /core/wp-admin/
Allow /core/wp-admin/admin-ajax.php
Disallow /?s=

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /
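
The groups above can be read as a small lookup: a crawler uses the group named after it if one exists and otherwise falls back to `*`, and within a group the longest matching path prefix decides, with Allow winning a tie. The following is a minimal sketch of that evaluation, assuming plain prefix matching (no `*` or `$` wildcards) and exact, case-insensitive agent names. `GROUPS` and `is_allowed` are hypothetical names introduced for illustration, not part of any scanner or crawler API.

```python
# Rules transcribed from the Groups tables above, keyed by lowercase agent name.
GROUPS = {
    "*": [
        ("disallow", "/core/wp-admin/"),
        ("allow", "/core/wp-admin/admin-ajax.php"),
        ("disallow", "/?s="),
    ],
    "ccbot": [("disallow", "/")],
    "google-extended": [("disallow", "/")],
    "gptbot": [("disallow", "/")],
    "chatgpt-user": [("disallow", "/")],
}


def is_allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: longest matching prefix wins, ties go to Allow."""
    # Use the agent's own group if present, else the catch-all group.
    rules = GROUPS.get(agent.lower(), GROUPS["*"])
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len, allowed = len(prefix), kind == "allow"
    return allowed


print(is_allowed("Googlebot", "/core/wp-admin/"))                # False
print(is_allowed("Googlebot", "/core/wp-admin/admin-ajax.php"))  # True
print(is_allowed("Googlebot", "/?s=wrexham"))                    # False
print(is_allowed("GPTBot", "/news/"))                            # False
```

Under this reading, general crawlers are kept out of the admin area and internal search results but may still reach `admin-ajax.php`, while the four named agents are blocked from the whole site. Parsers that apply rules in file order rather than by longest match may treat the `admin-ajax.php` exception differently.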

Other Records

Field Value
sitemap https://www.thewrexhaminsider.com/sitemap_index.xml

Comments

  • XML Sitemap & Google News version 5.3.6 - https://status301.net/wordpress-plugins/xml-sitemap-feed/
  • No XML Sitemaps are enabled on this site.
  • Block Common Crawl
  • Block Google Bard AI
  • Block Open AI