topteachingtasks.com
robots.txt

Robots Exclusion Standard data for topteachingtasks.com

Resource Scan

Scan Details

Site Domain topteachingtasks.com
Base Domain topteachingtasks.com
Scan Status Ok
Last Scan 2025-10-31T16:13:41+00:00
Next Scan 2025-11-07T16:13:41+00:00

Last Scan

Scanned 2025-10-31T16:13:41+00:00
URL https://topteachingtasks.com/robots.txt
Domain IPs 104.21.14.223, 172.67.160.160, 2606:4700:3031::ac43:a0a0, 2606:4700:3034::6815:edf
Response IP 172.67.160.160
Found Yes
Hash aeeadf75f0622b4787c8cc93646e9225ecbdf6b70fad933a88a4173e035fb0ea
SimHash 090958d2e3d3

Groups

scrapy

Rule Path
Allow /

*

Rule Path
Disallow /*blackhole
Disallow /?blackhole

*

Rule Path
Disallow /cart/
Disallow /wishlist/
Disallow /checkout/
Disallow /my-account/
Disallow /*add-to-cart%3D*
Disallow /*?filter
Disallow /*?orderby=*
Disallow /*?add-to-wishlist=*

*

Rule Path
Disallow /search/
Disallow /*?s=*
Disallow /*%26p%3D*
Disallow /%26preview%3D*
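
To illustrate how the groups above might be evaluated for a given crawler and path, here is a minimal Python sketch. It rests on several assumptions that are not part of the scan data: the three "*" groups are merged into one rule list, "*" in a path matches any run of characters and a trailing "$" anchors the end, the longest matching pattern wins with Allow winning a tie, and paths are compared literally with no percent-decoding. The names GROUPS, pattern_to_regex and is_allowed are hypothetical, chosen only for this example.

```python
import re

# Rules transcribed from the groups listed above. Merging the three "*"
# groups into a single list is an assumption of this sketch.
GROUPS = {
    "scrapy": [("allow", "/")],
    "*": [
        ("disallow", "/*blackhole"),
        ("disallow", "/?blackhole"),
        ("disallow", "/cart/"),
        ("disallow", "/wishlist/"),
        ("disallow", "/checkout/"),
        ("disallow", "/my-account/"),
        ("disallow", "/*add-to-cart%3D*"),
        ("disallow", "/*?filter"),
        ("disallow", "/*?orderby=*"),
        ("disallow", "/*?add-to-wishlist=*"),
        ("disallow", "/search/"),
        ("disallow", "/*?s=*"),
        ("disallow", "/*%26p%3D*"),
        ("disallow", "/%26preview%3D*"),
    ],
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally (no percent-decoding).
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if this sketch considers `path` crawlable for `user_agent`."""
    agent = user_agent.lower()
    # Use a named group if its token appears in the user agent, else "*".
    rules = next(
        (r for name, r in GROUPS.items() if name != "*" and name in agent),
        GROUPS["*"],
    )
    # Longest matching pattern wins; on equal length, Allow (True) wins.
    # With no matching rule at all, the path is allowed.
    best = max(
        ((len(p), kind == "allow") for kind, p in rules
         if pattern_to_regex(p).match(path)),
        default=(0, True),
    )
    return best[1]


if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/cart/"),
        ("Googlebot", "/shop/?orderby=price"),
        ("Googlebot", "/blog/"),
        ("Scrapy/2.11", "/cart/"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions, a generic crawler is kept out of the cart, checkout, filter and search URLs, while a user agent containing "scrapy" falls into its own group and is allowed everywhere. A production crawler would more likely rely on an established robots.txt parser than on this hand-rolled matcher.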

Other Records

Field Value
sitemap https://topteachingtasks.com/sitemap_index.xml

Comments

  • Block Bad Bots
  • Block Woocommerce assets
  • Block Search Assets