goutu.org
robots.txt

Robots Exclusion Standard data for goutu.org

Resource Scan

Scan Details

Site Domain goutu.org
Base Domain goutu.org
Scan Status Ok
Last Scan 2024-09-23T07:50:57+00:00
Next Scan 2024-09-30T07:50:57+00:00

Last Scan

Scanned 2024-09-23T07:50:57+00:00
URL https://goutu.org/robots.txt
Domain IPs 2001:41d0:1:1b00:213:186:33:24, 213.186.33.24
Response IP 213.186.33.24
Found Yes
Hash 4b3bebd7ebed5696da72c6907c5853ebc1a39c12ed29ad873509fc69b25e18f0
SimHash 00bcc9e0943a

Groups

*

Rule Path
Disallow /cgi-bin
Disallow /wp-
Disallow /?s=
Disallow *%26s%3D
Disallow /search
Disallow /author/
Disallow *?attachment_id=
Disallow */feed
Disallow */rss
Disallow */embed
Allow /wp-content/uploads/
Allow /wp-content/themes/
Allow /*/*.js
Allow /*/*.css
Allow /wp-*.png
Allow /wp-*.jpg
Allow /wp-*.jpeg
Allow /wp-*.gif
Allow /wp-*.svg
Allow /wp-*.pdf

*

Rule Path
Disallow /?s=
Disallow /page/*/?s=
Disallow /search/
Disallow /wp-json/
Disallow /?rest_route=

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /
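The groups above can be evaluated with a small path matcher. The following is a minimal sketch, not the scanner's own logic. It assumes the two "*" groups are merged into one rule set (as RFC 9309 describes for repeated user agents), that "*" matches any run of characters, that a trailing "$" anchors the end of the path, and that the longest matching pattern wins with Allow winning ties. The names STAR_RULES, GPTBOT_RULES, pattern_to_regex and is_allowed are hypothetical, and percent-encoding normalization is left out.

```python
import re

# Rules copied from the two "*" groups listed above, merged into one list.
STAR_RULES = [
    ("disallow", "/cgi-bin"), ("disallow", "/wp-"), ("disallow", "/?s="),
    ("disallow", "*%26s%3D"), ("disallow", "/search"), ("disallow", "/author/"),
    ("disallow", "*?attachment_id="), ("disallow", "*/feed"),
    ("disallow", "*/rss"), ("disallow", "*/embed"),
    ("allow", "/wp-content/uploads/"), ("allow", "/wp-content/themes/"),
    ("allow", "/*/*.js"), ("allow", "/*/*.css"),
    ("allow", "/wp-*.png"), ("allow", "/wp-*.jpg"), ("allow", "/wp-*.jpeg"),
    ("allow", "/wp-*.gif"), ("allow", "/wp-*.svg"), ("allow", "/wp-*.pdf"),
    ("disallow", "/page/*/?s="), ("disallow", "/search/"),
    ("disallow", "/wp-json/"), ("disallow", "/?rest_route="),
]

# The ccbot, google-extended and gptbot groups each contain this single rule.
GPTBOT_RULES = [("disallow", "/")]


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.compile(regex)


def is_allowed(path, rules):
    """Return True if path may be crawled under the given (kind, pattern) rules."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            # Longest pattern wins; on equal length, prefer Allow.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len = len(pattern)
                best_allow = allow
    return best_allow


if __name__ == "__main__":
    for sample in ["/wp-admin/", "/wp-content/uploads/2024/photo.jpg", "/some-post/feed"]:
        print(sample, is_allowed(sample, STAR_RULES))
```

Under these assumptions, a path such as /wp-content/uploads/2024/photo.jpg is crawlable because the Allow pattern is longer than Disallow /wp-, while every path is blocked for the three named crawlers.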

Other Records

Field Value
sitemap https://goutu.org/sitemap_index.xml
sitemap https://goutu.org/sitemap_index.xml

Comments

  • START YOAST BLOCK
  • ---------------------------
  • ---------------------------
  • END YOAST BLOCK