Robots Exclusion Standard data for sciencenewstoday.org

Resource Scan

Scan Details

Site Domain sciencenewstoday.org
Base Domain sciencenewstoday.org
Scan Status Ok
Last Scan 2025-10-20T09:12:18+00:00
Next Scan 2025-10-27T09:12:18+00:00

Last Scan

Scanned 2025-10-20T09:12:18+00:00
URL https://sciencenewstoday.org/robots.txt
Domain IPs 104.21.41.188, 172.67.166.156, 2606:4700:3035::ac43:a69c, 2606:4700:3037::6815:29bc
Response IP 172.67.166.156
Found Yes
Hash 48006eb53c9e9b06833c556011204daee3429eecf729d61430ae5c222ca4dc86
SimHash a088c84b659a

Groups

googlebot

Rule Path
Allow /

googlebot-image

Rule Path
Allow /wp-content/uploads/

adsbot-google

Rule Path
Allow /

*

Rule Path Comment
Disallow /wp-admin/ -
Disallow /wp-includes/ -
Disallow /cgi-bin/ -
Disallow /trackback/ -
Disallow /feed/ -
Disallow /comments/ -
Disallow /search/ Fixed search disallow rule
Disallow /*?replytocom= -
Disallow /*?s= -
Disallow /*%26sort%3D -
Disallow /*?utm_= -
Disallow /*?fbclid= -
Disallow /*?gclid= -
Disallow /tag/*?paged= -
Disallow /category/*?paged= -
Disallow /author/*?paged= -
Disallow /page/*?paged= -
Disallow /private-research.pdf -
Disallow /*?orderby= -
Disallow /*?filter= -
Disallow /*?preview= -
Disallow /*?mode= -
Disallow /*?dir= -
Disallow /archives/ -
Disallow /?attachment_id= -
Allow /wp-content/uploads/ -
Allow /wp-content/themes/ -
Allow /wp-content/plugins/ -
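
The rules in the "*" group can be checked against sample URL paths with a small matcher. The following is a minimal sketch, assuming the matching behavior described in RFC 9309: "*" matches any run of characters, a trailing "$" anchors the end, the longest matching pattern wins, and Allow wins a tie. The names RULES, pattern_to_regex, and is_allowed are hypothetical, only a subset of the listed rules is included, and the agent-specific groups (googlebot, googlebot-image, adsbot-google) are not modeled.

```python
import re

# Subset of the "*" group listed above, as (rule, path pattern) pairs.
RULES = [
    ("disallow", "/wp-admin/"),
    ("disallow", "/feed/"),
    ("disallow", "/search/"),
    ("disallow", "/*?s="),
    ("disallow", "/*?utm_="),
    ("disallow", "/*?fbclid="),
    ("disallow", "/tag/*?paged="),
    ("disallow", "/private-research.pdf"),
    ("allow", "/wp-content/uploads/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece and join the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path (including any query string) may be crawled."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed by default
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt rules do.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Longest pattern wins; on equal length, Allow takes precedence.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for sample in ("/wp-admin/options.php", "/?s=quantum",
                   "/tag/physics/?paged=2", "/wp-content/uploads/a.png"):
        print(sample, is_allowed(sample))
```

Under this literal matching, a path such as /story/?utm_source=newsletter is not caught by /*?utm_=, because the pattern requires the exact text "?utm_=" to appear in the URL. Crawlers that implement the standard differently may behave otherwise.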

Other Records

Field Value
sitemap https://www.sciencenewstoday.org/sitemap_index.xml

Comments

  • Block duplicate content from common URL parameters
  • Prevent unnecessary archive indexing
  • Allow important directories for proper site rendering
  • Sitemap for better indexing