newformat.info
robots.txt

Robots Exclusion Standard data for newformat.info

Resource Scan

Scan Details

Site Domain newformat.info
Base Domain newformat.info
Scan Status Ok
Last Scan 2024-09-21T02:58:58+00:00
Next Scan 2024-09-28T02:58:58+00:00

Last Scan

Scanned 2024-09-21T02:58:58+00:00
URL https://newformat.info/robots.txt
Domain IPs 104.21.81.86, 172.67.158.177, 2606:4700:3035::6815:5156, 2606:4700:3035::ac43:9eb1
Response IP 104.21.81.86
Found Yes
Hash 3fb52312d702a3313e2a9ac9caa8f9e6225071bee176d59cf59f2588ab67e6a2
SimHash bff4e02987f3
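
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say exactly which bytes are hashed. Assuming SHA-256 over the raw response body, a change check between two scans could be sketched as follows. The names `fingerprint` and `has_changed` are hypothetical and not part of any scanner API.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Compare a freshly fetched body against the hash stored by the last scan."""
    return fingerprint(body) != previous_hash.lower()
```

If the assumption holds, passing the fetched file and the Hash value listed above to `has_changed` would return `False` for as long as the file stays byte-for-byte identical.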

Groups

*

Rule      Path           Comment
Disallow  /wp-json/      -
Disallow  */xmlrpc.php   WordPress API file
Disallow  /?             All query parameters on the home page.
Disallow  *?s=           -
Disallow  *%26s%3D       Search.
Disallow  /search/*      -
Disallow  *utm*%3D       Links with utm tags
Disallow  *openstat%3D   Links with openstat tags
Disallow  *refid%3D      ref link
Disallow  /id_date       archives by date
Disallow  /wp-admin/     -
Disallow  /readme.html   -
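
Several of the paths above use `*` as a wildcard. A minimal sketch of how such patterns can be matched, assuming the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a literal prefix match from the start of the path. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the sketch compares raw strings without the percent-encoding normalization a real crawler would apply.

```python
import re

# Disallow paths for the "*" group, copied from the table above.
DISALLOW = [
    "/wp-json/", "*/xmlrpc.php", "/?", "*?s=", "*%26s%3D", "/search/*",
    "*utm*%3D", "*openstat%3D", "*refid%3D", "/id_date", "/wp-admin/",
    "/readme.html",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and put ".*" wherever the pattern had "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """True if the path (including any query string) matches a Disallow rule."""
    return any(rule_to_regex(p).match(path) for p in DISALLOW)
```

Under these assumptions, `is_disallowed("/blog/?s=cats")` is true because of the `*?s=` rule, while `is_disallowed("/about/")` is false because no rule matches it.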

nuclei
wikido
riddler
petalbot
zoominfobot
go-http-client
node/simplecrawler
cazoodlebot
dotbot/1.0
gigabot
barkrowler
blexbot
magpie-crawler

Rule Path
Disallow /
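
The effect of this group can be checked with Python's standard `urllib.robotparser`. The snippet below is a sketch built on a hand-written fragment, not the site's actual file: it includes only two of the listed agents and two literal-prefix rules from the `*` group, because the standard-library parser does plain prefix matching and does not interpret `*` inside paths. The agent name `examplebot` and the helper `allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hand-written fragment modelled on the groups listed in this report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /readme.html

User-agent: petalbot
User-agent: blexbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Ask the parser whether the agent may fetch the path on this site."""
    return parser.can_fetch(agent, "https://newformat.info" + path)


print(allowed("petalbot", "/about/"))       # blocked by "Disallow: /"
print(allowed("examplebot", "/about/"))     # falls back to the "*" group
print(allowed("examplebot", "/wp-admin/"))  # blocked by the "*" group
```

An agent named in the second group is refused every path, while any other agent falls back to the `*` group and is refused only the paths listed there.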

Other Records

Field Value
sitemap https://newformat.info/sitemap_index.xml

Comments

  • Allow: /
  • Host: https://newformat.info
  • Ban bots that don't benefit us.
  • --------------------------------