wisma138c.net
robots.txt

Robots Exclusion Standard data for wisma138c.net

Resource Scan

Scan Details

Site Domain wisma138c.net
Base Domain wisma138c.net
Scan Status Ok
Last Scan 2025-11-29T15:02:02+00:00
Next Scan 2025-12-29T15:02:02+00:00

Last Scan

Scanned 2025-11-29T15:02:02+00:00
URL https://wisma138c.net/robots.txt
Domain IPs 104.21.37.98, 172.67.206.222, 2606:4700:3031::6815:2562, 2606:4700:3035::ac43:cede
Response IP 172.67.206.222
Found Yes
Hash ef5a859839056a230d1e0ad1f7fee937f3afb7aaa1e6ae99f215614af85b56c3
SimHash a9044f1167a1
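
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not say which algorithm or which bytes were hashed, so the following is only a sketch under the assumption that it is SHA-256 over the raw response body. The function name matches_reported_hash is hypothetical.

```python
import hashlib

# Hash value copied from the scan details above.
REPORTED_HASH = "ef5a859839056a230d1e0ad1f7fee937f3afb7aaa1e6ae99f215614af85b56c3"


def matches_reported_hash(body: bytes, expected: str = REPORTED_HASH) -> bool:
    """Return True if SHA-256 of `body` equals `expected`.

    ASSUMPTION: the report's Hash is SHA-256 of the raw robots.txt bytes.
    The report itself does not state this.
    """
    return hashlib.sha256(body).hexdigest() == expected.lower()
```

If a freshly downloaded robots.txt does not match, either the file changed since the scan or the assumption about the hashing scheme is wrong.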

Groups

*

Rule Path
Disallow /admin/
Disallow /login/
Disallow /checkout/
Disallow /cart/
Disallow /private/
Disallow /user/
Disallow /register/

googlebot

Rule Path
Disallow /admin/
Disallow /login/
Disallow /checkout/
Disallow /cart/
Disallow /private/
Disallow /user/
Disallow /register/

badbot

Rule Path
Disallow /
Disallow /*.pdf$
Disallow /*.zip$
Disallow /*.tar$

googlebot-image

Rule Path
Allow /images/

Other Records

Field Value
sitemap https://wisma138c.net/sitemap.xml
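
The groups and sitemap record above can be checked programmatically. The sketch below rebuilds a robots.txt from the rules listed in this report and queries it with Python's standard urllib.robotparser. The group order, casing, and layout of the reconstructed file are assumptions, since the report shows parsed rules rather than the original text. The helper names build_parser and is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: reconstructed from the parsed groups in this report; the
# original file's ordering, casing, and comments may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /login/
Disallow: /checkout/
Disallow: /cart/
Disallow: /private/
Disallow: /user/
Disallow: /register/

User-agent: googlebot
Disallow: /admin/
Disallow: /login/
Disallow: /checkout/
Disallow: /cart/
Disallow: /private/
Disallow: /user/
Disallow: /register/

User-agent: badbot
Disallow: /
Disallow: /*.pdf$
Disallow: /*.zip$
Disallow: /*.tar$

User-agent: googlebot-image
Allow: /images/

Sitemap: https://wisma138c.net/sitemap.xml
"""


def build_parser(text: str = ROBOTS_TXT) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on this site."""
    return build_parser().can_fetch(agent, "https://wisma138c.net" + path)


if __name__ == "__main__":
    for agent, path in [("SomeBot", "/products/"), ("Googlebot", "/cart/items"), ("badbot", "/")]:
        print(agent, path, is_allowed(agent, path))
    print(build_parser().site_maps())
```

Note that urllib.robotparser applies the first matching rule in a group and matches paths by prefix, so its handling of the `*` and `$` patterns in the badbot group may differ from crawlers that implement wildcard matching. For badbot the outcome is the same either way, because `Disallow /` already covers every path.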

Comments

  • robots.txt for best Google crawling
  • Allow all search engines to crawl all content
  • Allow Googlebot to crawl everything except restricted areas
  • Block a specific bot from crawling the site
  • Sitemap location (helps crawlers find your sitemap easily)
  • Block bots from indexing certain file types (like PDFs or temporary files)
  • Enable Googlebot to crawl your images