henitan.com
robots.txt

Robots Exclusion Standard data for henitan.com

Resource Scan

Scan Details

Site Domain	henitan.com
Base Domain	henitan.com
Scan Status	Ok
Last Scan	5/19/2025, 9:02:31 AM
Next Scan	5/26/2025, 9:02:31 AM

Last Scan

Scanned	5/19/2025, 9:02:31 AM
URL	https://henitan.com/robots.txt
Domain IPs	104.21.79.172, 172.67.146.160, 2606:4700:3033::ac43:92a0, 2606:4700:3037::6815:4fac
Response IP	104.21.79.172
Found	Yes
Hash	768bde52814028f14dcdbe0b883d87f1ee8a5d8e05d53ff1c70ddd0e60b20051
SimHash	6635d85367e0

Groups

*

Rule	Path
Disallow	/wp-admin/
Disallow	/wp-includes/
Disallow	/wp-content/plugins/
Disallow	/wp-content/cache/
Disallow	/tmp/
Disallow	/private/
Disallow	/backup/
Disallow	/scripts/

googlebot

Rule	Path
Allow	/

bingbot

Rule	Path
Allow	/

badbot

Rule	Path
Disallow	/

adsbot-google

Rule	Path
Allow	/
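
The groups listed so far can be checked against sample URLs with Python's standard `urllib.robotparser`. This is a minimal sketch, not the scanner's own method: the `ROBOTS_LINES` text is reconstructed from the tables in this report (the original file's exact layout is not shown here), and `examplebot` is a hypothetical crawler name used only to exercise the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's tables; the real file may differ in
# ordering, comments, and whitespace (assumption).
ROBOTS_LINES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /tmp/
Disallow: /private/
Disallow: /backup/
Disallow: /scripts/

User-agent: googlebot
Allow: /

User-agent: bingbot
Allow: /

User-agent: badbot
Disallow: /

User-agent: adsbot-google
Allow: /
""".splitlines()


def build_parser(lines):
    """Return a RobotFileParser loaded with the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(ROBOTS_LINES)

if __name__ == "__main__":
    base = "https://henitan.com"
    for agent, path in [
        ("googlebot", "/wp-admin/"),
        ("badbot", "/"),
        ("examplebot", "/wp-admin/options.php"),  # hypothetical crawler
        ("examplebot", "/about"),
    ]:
        print(agent, path, parser.can_fetch(agent, base + path))
```

A crawler that matches a named group uses only that group, so `googlebot` is not bound by the `/wp-admin/` rule in the `*` group. Crawlers without a named group fall back to `*`.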

*

Rule	Path
Disallow	/112924tpgealegtw-.html
Disallow	/145460tpgetokyo/aleqtg-.htm
Disallow	/41446tpgealecgs-el.html
Disallow	/*.html$
Disallow	/*.htm$
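
The last two rules in this group use the `*` wildcard and the `$` end anchor, which Python's `urllib.robotparser` does not interpret (it does plain prefix matching). The sketch below shows one way to evaluate such patterns, assuming the common reading in which `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names `robots_pattern_to_regex` and `is_blocked` are hypothetical, and the sketch deliberately ignores `Allow` precedence.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the URL. Everything else is matched literally, as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def is_blocked(path: str, disallow_patterns) -> bool:
    """Return True if any Disallow pattern matches the given URL path."""
    return any(
        robots_pattern_to_regex(p).match(path) is not None
        for p in disallow_patterns
    )


# The two wildcard rules from the second '*' group in this report.
DISALLOW = ["/*.html$", "/*.htm$"]

if __name__ == "__main__":
    for path in [
        "/112924tpgealegtw-.html",
        "/145460tpgetokyo/aleqtg-.htm",
        "/page.html?x=1",
        "/about/",
    ]:
        print(path, is_blocked(path, DISALLOW))
```

Under this reading, the three explicit paths in the group are already covered by the two wildcard rules, while a URL with a query string after `.html` is not, because `$` requires the match to end the URL.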

Other Records

Field	Value
sitemap	https://henitan.com/sitemap_index.xml

Comments

robots.txt for https://henitan.com
Allow all user agents to crawl the entire site
Allow Googlebot and other major search engines to crawl
Block specific bots that may harm your site
Sitemap file for better indexing
Allow crawling of AdSense-related content
Crawl-delay settings for less aggressive bots (optional)
User-agent: *
Crawl-delay:
Block any URLs that end with .html or .htm
