grplca.com
robots.txt

Robots Exclusion Standard data for grplca.com

Resource Scan

Scan Details

Site Domain grplca.com
Base Domain grplca.com
Scan Status Ok
Last Scan 2026-01-09T08:30:48+00:00
Next Scan 2026-01-16T08:30:48+00:00

Last Scan

Scanned 2026-01-09T08:30:48+00:00
URL https://grplca.com/robots.txt
Redirect https://swtk.grplca.com/robots.txt
Redirect Domain swtk.grplca.com
Redirect Base grplca.com
Domain IPs 217.70.184.55
Redirect IPs 2001:4b98:dc2:55:216:3eff:fead:1129, 213.167.243.94
Response IP 213.167.243.94
Found Yes
Hash 3407db4d3703d0494b630079e9b53c125edb6e0cc13100154bbefce44afae777
SimHash ae51ab554c61

Groups

*

Rule Path
Allow /
Allow /sitemap.xml.gz
Disallow /out/
Disallow /login
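
How the `*` group above is applied depends on the parser. The sketch below feeds the four rules to Python's standard-library `urllib.robotparser`, which applies the first rule whose path prefix matches. It assumes the rules appear in the file in the order listed here; the `REORDERED` variant and the example URLs are hypothetical, added only for comparison.

```python
from urllib.robotparser import RobotFileParser


def build_parser(lines):
    """Parse robots.txt lines from memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


# Assumption: the file lists the rules in the same order as the report.
AS_SCANNED = [
    "User-agent: *",
    "Allow: /",
    "Allow: /sitemap.xml.gz",
    "Disallow: /out/",
    "Disallow: /login",
]

# Hypothetical reordering with the most specific rules first.
REORDERED = [
    "User-agent: *",
    "Disallow: /out/",
    "Disallow: /login",
    "Allow: /sitemap.xml.gz",
    "Allow: /",
]

as_scanned = build_parser(AS_SCANNED)
reordered = build_parser(REORDERED)

for url in ("https://grplca.com/out/x", "https://grplca.com/about"):
    print(url, as_scanned.can_fetch("*", url), reordered.can_fetch("*", url))
```

With a first-match parser like this one, the leading `Allow /` matches every path, so the later `Disallow` lines never take effect in the as-scanned order. Parsers that pick the longest matching rule instead (the behaviour described in RFC 9309) would block `/out/` and `/login` regardless of order.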

googlebot

Rule Path
Allow /
Allow /sitemap.xml.gz
Disallow /*?
Disallow /*atct_album_view$
Disallow /*folder_factories$
Disallow /*folder_summary_view$
Disallow /*login_form$
Disallow /*mail_password_form$
Disallow /*search
Disallow /*search_rss$
Disallow /*sendto_form$
Disallow /*summary_view$
Disallow /*thumbnail_view$
Disallow /*view$
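
The googlebot rules above rely on two pattern extensions: `*` matches any run of characters and a trailing `$` anchors the pattern to the end of the URL. The following is a minimal sketch of how such rules could be evaluated, assuming longest-pattern-wins precedence with ties going to Allow. The helper names `rule_to_regex` and `is_allowed` are hypothetical, and this is not Googlebot's actual implementation.

```python
import re

# Rules copied from the googlebot group in this report.
GOOGLEBOT_RULES = [
    ("allow", "/"),
    ("allow", "/sitemap.xml.gz"),
    ("disallow", "/*?"),
    ("disallow", "/*atct_album_view$"),
    ("disallow", "/*folder_factories$"),
    ("disallow", "/*folder_summary_view$"),
    ("disallow", "/*login_form$"),
    ("disallow", "/*mail_password_form$"),
    ("disallow", "/*search"),
    ("disallow", "/*search_rss$"),
    ("disallow", "/*sendto_form$"),
    ("disallow", "/*summary_view$"),
    ("disallow", "/*thumbnail_view$"),
    ("disallow", "/*view$"),
]


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with '.*' where '*' appeared.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Return True if the longest matching rule is an Allow (or none match)."""
    best = None  # (pattern length, is_allow); True sorts above False on ties
    for verdict, pattern in rules:
        if rule_to_regex(pattern).match(path):
            candidate = (len(pattern), verdict == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


for path in ("/news/item", "/news/item/view", "/search?q=x", "/overview"):
    print(path, is_allowed(path, GOOGLEBOT_RULES))
```

Under these assumptions `/news/item` is allowed, while `/news/item/view` and `/search?q=x` are blocked. Because `/*view$` matches any path ending in `view`, a page such as `/overview` would be blocked as well.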

Other Records

Field Value
sitemap https://swtk.grplca.com/sitemap.xml.gz

Comments

  • Define access-restrictions for robots/spiders
  • http://www.robotstxt.org/wc/norobots.html
  • By default we allow robots to access all areas of our site
  • already accessible to anonymous users
  • Add Googlebot-specific syntax extension to exclude forms
  • that are repeated for each piece of content in the site
  • the wildcard is only supported by Googlebot
  • http://www.google.com/support/webmasters/bin/answer.py?answer=40367&ctx=sibling