trashmail.in
robots.txt

Robots Exclusion Standard data for trashmail.in

Resource Scan

Scan Details

Site Domain trashmail.in
Base Domain trashmail.in
Scan Status Ok
Last Scan 2026-01-25T08:32:52+00:00
Next Scan 2026-02-01T08:32:52+00:00

Last Scan

Scanned 2026-01-25T08:32:52+00:00
URL https://trashmail.in/robots.txt
Domain IPs 95.217.58.48
Response IP 95.217.58.48
Found Yes
Hash fad802868dfca3235da2722e1a8859b8fb6654bc131cac93f1f338f40219c200
SimHash 7d08bad0c493

Groups

*

Rule Path
Allow /
Allow /blog
Allow /post/
Allow /page/
Allow /category/
Disallow /login
Disallow /register
Disallow /password/
Disallow /auth/
Disallow /admin/
Disallow /api/
Disallow /private/
Disallow /logs/
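
The * group mixes a broad Allow / with narrower Disallow paths, so the outcome for a given URL depends on how the rules are combined. One common reading is the longest-match convention described in RFC 9309: the matching rule with the longest path wins, and Allow wins a tie. The block below is a minimal sketch of that evaluation, assuming plain prefix matching only (no `*` or `$` wildcards, no percent-encoding normalization), which is sufficient here because none of the listed paths use wildcards. The names STAR_GROUP and is_allowed are hypothetical and not part of the scan, and a specific crawler may evaluate the rules differently.

```python
# Rules of the "*" group, copied from the table above as (directive, path).
STAR_GROUP = [
    ("allow", "/"),
    ("allow", "/blog"),
    ("allow", "/post/"),
    ("allow", "/page/"),
    ("allow", "/category/"),
    ("disallow", "/login"),
    ("disallow", "/register"),
    ("disallow", "/password/"),
    ("disallow", "/auth/"),
    ("disallow", "/admin/"),
    ("disallow", "/api/"),
    ("disallow", "/private/"),
    ("disallow", "/logs/"),
]


def is_allowed(path, rules=STAR_GROUP):
    """Longest-match evaluation (assumed convention); ties go to allow.

    Only simple prefix matching is implemented. Wildcards are ignored.
    """
    best_len = -1
    best_allow = True  # a path matched by no rule is treated as allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        allow = directive == "allow"
        # Prefer the longer rule path; on equal length prefer allow.
        if len(prefix) > best_len or (len(prefix) == best_len and allow):
            best_len = len(prefix)
            best_allow = allow
    return best_allow


if __name__ == "__main__":
    for sample in ["/", "/blog/post", "/login", "/api/v1", "/password"]:
        print(sample, is_allowed(sample))
```

Under this reading, /login and /api/v1 are blocked while /blog/post stays crawlable. /password (without the trailing slash) is not covered by Disallow /password/ and falls back to Allow /.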

mediapartners-google

Rule Path
Allow /

gptbot

Rule Path
Allow /

google-extended

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

claudebot

Rule Path
Allow /
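
Each named group above carries only Allow /. Under RFC 9309, a crawler follows the group whose user-agent matches its product token (case-insensitively) and falls back to the * group only when no named group matches. A crawler that matches its own group would therefore not inherit the Disallow rules listed under *. The block below is a small sketch of that selection step; GROUP_NAMES and select_group are hypothetical names, and exact string equality on the product token is a simplifying assumption.

```python
# Group names as listed in the scan above.
GROUP_NAMES = [
    "*",
    "mediapartners-google",
    "gptbot",
    "google-extended",
    "perplexitybot",
    "claudebot",
]


def select_group(product_token, group_names=GROUP_NAMES):
    """Return the group a crawler would follow (simplified sketch).

    A named group is chosen on a case-insensitive exact match;
    otherwise the "*" group is used if it exists, else None.
    """
    token = product_token.lower()
    for name in group_names:
        if name != "*" and name.lower() == token:
            return name
    return "*" if "*" in group_names else None


if __name__ == "__main__":
    for agent in ["GPTBot", "ClaudeBot", "Bingbot"]:
        print(agent, "->", select_group(agent))
```

With this selection, GPTBot and ClaudeBot resolve to their own groups, while an unlisted crawler such as Bingbot resolves to *.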

Other Records

Field Value
sitemap https://trashmail.in/sitemap.xml

Comments

  • TrashMail.in - Search Engine & AdSense Governance
  • Block thin/utility pages from indexing to prevent "Low Value" flags
  • Specifically Allow AdSense Crawler (Crucial for Approval)
  • Explicit AI & LLM Bot Permissions
  • Sitemap Direction