waytoidea.com
robots.txt

Robots Exclusion Standard data for waytoidea.com

Resource Scan

Scan Details

Site Domain waytoidea.com
Base Domain waytoidea.com
Scan Status Ok
Last Scan 2025-07-08T16:05:17+00:00
Next Scan 2025-07-15T16:05:17+00:00

Last Scan

Scanned 2025-07-08T16:05:17+00:00
URL https://waytoidea.com/robots.txt
Redirect https://www.waytoidea.com/robots.txt
Redirect Domain www.waytoidea.com
Redirect Base waytoidea.com
Domain IPs 104.21.15.8, 172.67.160.250, 2606:4700:3031::6815:f08, 2606:4700:3036::ac43:a0fa
Redirect IPs 104.21.15.8, 172.67.160.250, 2606:4700:3031::6815:f08, 2606:4700:3036::ac43:a0fa
Response IP 172.67.160.250
Found Yes
Hash 625981dcfe80b154a2846edcbb495f10ee26956f25e6b726803bae34e6b70756
SimHash 2114f7d173f7
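
The report does not say which algorithm produced the Hash value. Its 64 hexadecimal digits are consistent with SHA-256, so the sketch below assumes SHA-256 over the raw response body; both the algorithm and the exact bytes hashed are assumptions, and the helper names are hypothetical.

```python
import hashlib

# Hash value copied from the scan record above.
REPORTED_HASH = "625981dcfe80b154a2846edcbb495f10ee26956f25e6b726803bae34e6b70756"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a fetched robots.txt body against the reported hash.

    Assumption: the scanner hashed the unmodified response body with SHA-256.
    """
    return sha256_hex(body) == reported.lower()
```

A mismatch would not prove the file changed; it may only mean the scanner hashes something other than the raw body.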

Groups

*

Rule Path Comment
Allow / -
Allow /blog$ Only allow the main blog page
Disallow /blog?page= Prevent pagination with query parameters
Disallow /blog/page Prevent pagination with path segments
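
As a rough illustration of how the four rules in this group interact, the sketch below applies the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The function names are hypothetical, and a real crawler may differ in details such as percent-encoding normalization, which this sketch ignores.

```python
import re

# Rules copied from the "*" group table above: (directive, path pattern).
RULES = [
    ("allow", "/"),
    ("allow", "/blog$"),
    ("disallow", "/blog?page="),
    ("disallow", "/blog/page"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters and a trailing '$' anchors the end.
    Every other character, including '?', is treated literally.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if the path may be crawled under longest-match precedence."""
    best_len, best_allow = -1, True  # no matching rule means allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = directive == "allow"
            # A longer pattern wins; on equal length, allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow
```

Under these assumptions, `/blog` and `/blog/some-post` are allowed, while `/blog?page=2` and `/blog/page/3` are disallowed. Because matching is by prefix, `/blog/page` would also block any other path that begins with that string.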

googlebot-image

Rule Path
Allow /images/
Allow /images/blog/
Allow /images/blog/featured-images/
Allow /images/author/vishal-meena.webp
Allow /images/author/vishal-meena.png
Allow /favicons/

mediapartners-google

Rule Path
Allow /
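
The three groups above are alternatives, not layers: under RFC 9309 a crawler follows the group whose user-agent line matches its product token and falls back to `*` only when no named group matches. A minimal sketch of that selection follows, with the rule data copied from the tables above and hypothetical names (`GROUPS`, `select_group`); it assumes exact, case-insensitive token matching.

```python
# Groups copied from the tables above: user-agent token -> (directive, path) rules.
GROUPS = {
    "*": [
        ("allow", "/"),
        ("allow", "/blog$"),
        ("disallow", "/blog?page="),
        ("disallow", "/blog/page"),
    ],
    "googlebot-image": [
        ("allow", "/images/"),
        ("allow", "/images/blog/"),
        ("allow", "/images/blog/featured-images/"),
        ("allow", "/images/author/vishal-meena.webp"),
        ("allow", "/images/author/vishal-meena.png"),
        ("allow", "/favicons/"),
    ],
    "mediapartners-google": [("allow", "/")],
}


def select_group(product_token: str, groups=GROUPS):
    """Return the rule list a crawler with this product token should follow.

    Assumption: tokens are compared case-insensitively and must match exactly;
    the "*" group is used only when no named group matches.
    """
    return groups.get(product_token.lower(), groups.get("*", []))
```

One consequence, if the crawler follows this selection: `select_group("Googlebot-Image")` contains only Allow rules, so the pagination Disallow rules in the `*` group would not apply to that crawler.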

Other Records

Field Value
crawl-delay 10

Other Records

Field Value
sitemap https://www.waytoidea.com/sitemap.xml

Comments

  • Allow Google Image bot to access image files
  • Allow Google AdSense bot
  • Crawl-delay for all bots
  • Sitemap