attendanceguru.com
robots.txt

Robots Exclusion Standard data for attendanceguru.com

Resource Scan

Scan Details

Site Domain attendanceguru.com
Base Domain attendanceguru.com
Scan Status Ok
Last Scan 2025-09-23T22:03:46+00:00
Next Scan 2025-10-23T22:03:46+00:00

Last Scan

Scanned 2025-09-23T22:03:46+00:00
URL https://attendanceguru.com/robots.txt
Domain IPs 3.165.102.115, 3.165.102.122, 3.165.102.123, 3.165.102.96
Response IP 3.165.102.122
Found Yes
Hash 5ce0bb647793433a4e115108c2bd7c32fdb528f65954ad02e5200cba3f03b3d9
SimHash a9a21559a456
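
The 64-hex-character Hash has the shape of a SHA-256 digest of the fetched file (an assumption about this scanner; the SimHash is a separate similarity fingerprint). A quick Python sketch to reproduce that kind of digest:

    import hashlib
    import urllib.request

    # Fetch the same resource the scanner retrieved.
    with urllib.request.urlopen("https://attendanceguru.com/robots.txt") as resp:
        body = resp.read()

    # Assumption: the report's Hash field is SHA-256 over the raw response body.
    # The value only matches the report while the file is unchanged since the scan.
    print(hashlib.sha256(body).hexdigest())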

Groups

*

Rule      Path                           Comment
Allow     /                              -
Disallow  /login/                        -
Disallow  /                              /edzon/
Disallow  /signinupextended              -
Allow     /*.css                         -
Allow     /*.js                          -
Allow     /*.png                         -
Allow     /*.jpg                         -
Allow     /*.gif                         -
Allow     /*.svg                         -
Allow     /ads.txt                       -
Allow     /ads/preferences/              -
Allow     /gpt/                          -
Allow     /pagead/show_ads.js            -
Allow     /pagead/js/adsbygoogle.js      -
Allow     /pagead/js/*/show_ads_impl.js  -
Allow     /static/glade.js               -
Allow     /static/glade/                 -
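
How these rules interact depends on precedence. Under Google's documented matching, the rule with the longest matching path wins, and a tie goes to Allow. Below is a minimal Python sketch of that precedence for this group; it models path matching only, not user-agent selection, and transcribes just a subset of the rules above:

    import re

    # (verdict, path) pairs transcribed from the * group above (abbreviated).
    RULES = [
        ("allow", "/"),
        ("disallow", "/login/"),
        ("disallow", "/signinupextended"),
        ("allow", "/*.css"),
        ("allow", "/*.js"),
        ("allow", "/*.png"),
        ("allow", "/ads.txt"),
    ]

    def _pattern(path):
        # robots.txt paths are prefixes; '*' matches any run of characters
        # and a trailing '$' anchors the end of the URL path.
        regex = re.escape(path).replace(r"\*", ".*")
        if regex.endswith(r"\$"):
            regex = regex[:-2] + "$"
        return re.compile("^" + regex)

    def is_allowed(url_path, rules=RULES):
        # Longest matching path wins; on a tie, allow beats disallow
        # (True sorts above False in the tuple comparison).
        best = None
        for verdict, path in rules:
            if _pattern(path).match(url_path):
                candidate = (len(path), verdict == "allow")
                if best is None or candidate > best:
                    best = candidate
        return True if best is None else best[1]

    for p in ["/", "/login/student", "/signinupextended", "/static/app.js"]:
        print(p, "->", "allowed" if is_allowed(p) else "blocked")

Note that Python's bundled urllib.robotparser applies rules in file order (first match wins) and does not expand wildcards, so for a file that opens with Allow: / it can return different verdicts than the longest-match model sketched here.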

Other Records

Field Value
sitemap https://attendanceguru.com/subdomain.xml
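
If you only need the sitemap declarations, Python's urllib.robotparser can pull them straight from the live file (site_maps() is available from Python 3.8 and returns None when no Sitemap records exist):

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("https://attendanceguru.com/robots.txt")
    rp.read()  # fetches and parses the live file

    # A list of the declared Sitemap URLs, or None if the file declares none.
    print(rp.site_maps())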

Comments

  • Best Practices robots.txt Example
  • 1. Sitemap Declaration(s)
  • Always declare your sitemap(s) to help search engines discover your important pages.
  • Use the full URL to your sitemap(s). If you have multiple, list them all.
  • Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/subdomain.xml
  • Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/conversationseo.xml
  • Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/sitemap.xml
  • 2. User-Agent Directives
  • Apply directives to all crawlers unless a specific crawler needs different rules.
  • 3. General Allowance (often implicit or good for clarity)
  • Allow crawling of the entire site by default. More specific Disallow rules will override this for specific paths.
  • 4. Disallow Directives (Commonly Blocked Areas)
  • Block areas that are not intended for public search results or are purely functional.
  • - Administrative areas (e.g., login, admin dashboards)
  • - User-specific pages (e.g., user profiles, settings) that are not public
  • - Internal search result pages (can create infinite crawl loops and low-value content)
  • - Shopping cart/checkout processes (once the user starts them)
  • - Development/staging environments
  • Specific disallows from your original list (adjust as needed based on intent)
  • Disallow: /#/edzon/attendanceguru/ # Only if this path is truly not meant for indexing
  • 5. Handling of CSS, JavaScript, and Images (CRITICAL FOR RENDERING)
  • Google explicitly recommends *not* blocking CSS, JavaScript, or images that are essential for rendering the page's content or understanding its layout.
  • Blocking them can lead to "degraded" or "incomplete" rendering by Googlebot.
  • If you have non-essential JS/CSS (e.g., very large analytics files that don't affect content), you *could* disallow them, but it's often not necessary.
  • ALLOW all CSS and JS for proper rendering.
  • 6. Specific Allowances for Third-Party Scripts (like AdSense, Google Analytics)
  • These are often allowed even if there's a broader disallow that might accidentally catch them.
  • Your original file had good examples of these. (A consolidated sketch combining points 1-6 follows this list.)
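
Putting points 1-6 together, a consolidated robots.txt along the lines these comments describe might look like the sketch below. The sitemap URLs and paths are taken from this report; treat it as an illustration, not a drop-in replacement.

    # 1. Sitemap declarations
    Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/subdomain.xml
    Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/conversationseo.xml
    Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/sitemap.xml

    # 2. One group covering all crawlers
    User-agent: *

    # 3. Crawl everything by default; longer Disallow rules below override it
    Allow: /

    # 4. Block functional, non-public areas
    Disallow: /login/
    Disallow: /signinupextended

    # 5. Keep render-critical assets crawlable
    Allow: /*.css
    Allow: /*.js
    Allow: /*.png
    Allow: /*.jpg
    Allow: /*.gif
    Allow: /*.svg

    # 6. Third-party ad scripts stay explicitly allowed
    Allow: /ads.txt
    Allow: /pagead/show_ads.js
    Allow: /pagead/js/adsbygoogle.js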