tweecampus.com
robots.txt

Robots Exclusion Standard data for tweecampus.com

Resource Scan

Scan Details

Site Domain tweecampus.com
Base Domain tweecampus.com
Scan Status Ok
Last Scan 2025-12-12T14:50:02+00:00
Next Scan 2026-01-11T14:50:02+00:00

Last Scan

Scanned 2025-12-12T14:50:02+00:00
URL https://tweecampus.com/robots.txt
Domain IPs 104.21.38.187, 172.67.137.94, 2606:4700:3032::ac43:895e, 2606:4700:3037::6815:26bb
Response IP 104.21.38.187
Found Yes
Hash 1da3cef70c61248c901740e6ff82af75049ba7240bd74981fbb2ec8b95b122e5
SimHash 6686c631e5a3

Groups

*

Rule Path
Allow /

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

slurp

Rule Path
Allow /

duckduckbot

Rule Path
Allow /

baiduspider

Rule Path
Allow /

yandexbot

Rule Path
Allow /

facebookexternalhit

Rule Path
Allow /

twitterbot

Rule Path
Allow /

linkedinbot

Rule Path
Allow /

whatsapp

Rule Path
Allow /

applebot

Rule Path
Allow /
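
Every group above carries the same single rule, Allow /, so any listed crawler (and any unlisted one, via the * group) may fetch any path. The sketch below checks that reading with Python's standard urllib.robotparser. The file text is rebuilt from the group list only; the real file's layout, ordering, and comments are not shown in this report, and the helper names build_robots_lines and is_allowed are illustrative, not part of any tool.

```python
from urllib.robotparser import RobotFileParser

# User agents taken from the Groups section of this report.
AGENTS = [
    "*", "googlebot", "bingbot", "slurp", "duckduckbot", "baiduspider",
    "yandexbot", "facebookexternalhit", "twitterbot", "linkedinbot",
    "whatsapp", "applebot",
]


def build_robots_lines(agents):
    """Rebuild one 'Allow: /' group per agent (assumed layout, not the real file)."""
    lines = []
    for agent in agents:
        lines.append(f"User-agent: {agent}")
        lines.append("Allow: /")
        lines.append("")  # a blank line closes the group
    return lines


parser = RobotFileParser()
parser.parse(build_robots_lines(AGENTS))


def is_allowed(agent, url):
    """Return True if the rebuilt rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("googlebot", "some-unlisted-bot"):
        print(agent, is_allowed(agent, "https://tweecampus.com/any/page"))
```

Because the * group already allows everything, the per-crawler groups do not change the outcome; they only make the intent explicit.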

Other Records

Field Value
crawl-delay 1
sitemap https://tweecampus.com/sitemap.xml

Comments

  • Allow all search engines to crawl the site
  • Sitemap location
  • Crawl-delay for respectful crawling
  • Host directive

Warnings

  • `host` is not a known field.
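
A warning like the one above can be reproduced with a small field-name check. This is a minimal sketch, not the scanner's actual logic: the KNOWN_FIELDS set is an assumption (the scanner's real list is not given in this report), and the sample text uses a placeholder Host value because the real value is not shown here.

```python
# Assumed set of recognized field names; the scanner's own list is unknown.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_text):
    """Return distinct unrecognized field names, in order of first appearance."""
    seen = []
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed line
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Hypothetical sample; the Host value is a placeholder, not from the real file.
SAMPLE = """\
# Allow all search engines to crawl the site
User-agent: *
Allow: /
# Sitemap location
Sitemap: https://tweecampus.com/sitemap.xml
# Crawl-delay for respectful crawling
Crawl-delay: 1
# Host directive
Host: example.invalid
"""

if __name__ == "__main__":
    for name in unknown_fields(SAMPLE):
        print(f"`{name}` is not a known field.")
```

Splitting on the first colon only keeps URL values such as the sitemap address intact, and stripping comments first prevents a comment that mentions a field from being counted.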