tafsirhulm.com
robots.txt

Robots Exclusion Standard data for tafsirhulm.com

Resource Scan

Scan Details

Site Domain	tafsirhulm.com
Base Domain	tafsirhulm.com
Scan Status	Ok
Last Scan	2026-02-20T17:05:39+00:00
Next Scan	2026-02-27T17:05:39+00:00

Last Scan

Scanned	2026-02-20T17:05:39+00:00
URL	https://tafsirhulm.com/robots.txt
Domain IPs	104.21.28.12, 172.67.170.37, 2606:4700:3033::6815:1c0c, 2606:4700:3037::ac43:aa25
Response IP	172.67.170.37
Found	Yes
Hash	244e323362838b9c2aa6e6cf50d5cb1fbd26b69f37d409274053ceb7f2917d80
SimHash	bfaf5d7364e7

Groups

*

Rule	Path
Allow	/

googlebot

Rule	Path
Allow	/

bingbot

Rule	Path
Allow	/

slurp

Rule	Path
Allow	/

duckduckbot

Rule	Path
Allow	/

baiduspider

Rule	Path
Allow	/

yandexbot

Rule	Path
Allow	/
Disallow	/admin/
Disallow	/wp-admin/
Disallow	/administrator/
Disallow	/cpanel/
Disallow	/phpmyadmin/
Disallow	/blogs?page=
Disallow	/tmp/
Disallow	/temp/
Disallow	/cache/
Disallow	/logs/
Disallow	/private/
Disallow	/includes/
Disallow	/config/
Allow	/css/
Allow	/js/
Allow	/images/
Allow	/img/
Allow	/assets/
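
As an illustration of how the yandexbot rules above might be evaluated, here is a minimal sketch that applies longest-prefix matching, with Allow winning a tie, in the style of RFC 9309. The rule list is copied from the table above; the function name `is_allowed` and the sample paths are hypothetical, and the sketch ignores wildcards because none appear in these rules. It is not the scanner's own implementation.

```python
# Rules as listed for the yandexbot group in the table above.
YANDEXBOT_RULES = [
    ("allow", "/"),
    ("disallow", "/admin/"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/administrator/"),
    ("disallow", "/cpanel/"),
    ("disallow", "/phpmyadmin/"),
    ("disallow", "/blogs?page="),
    ("disallow", "/tmp/"),
    ("disallow", "/temp/"),
    ("disallow", "/cache/"),
    ("disallow", "/logs/"),
    ("disallow", "/private/"),
    ("disallow", "/includes/"),
    ("disallow", "/config/"),
    ("allow", "/css/"),
    ("allow", "/js/"),
    ("allow", "/images/"),
    ("allow", "/img/"),
    ("allow", "/assets/"),
]


def is_allowed(path, rules):
    """Return True if `path` (including any query string) may be crawled.

    The longest matching prefix decides; when an allow and a disallow
    rule of equal length both match, allow wins. No match means allowed.
    """
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for sample in ["/blogs", "/blogs?page=2", "/admin/login", "/css/site.css"]:
        print(sample, is_allowed(sample, YANDEXBOT_RULES))
```

Longest-match evaluation matters here because the group starts with `Allow /`: a parser that stops at the first matching rule would treat every path as allowed and never reach the Disallow entries.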

Other Records

Field	Value
crawl-delay	1

Other Records

Field	Value
sitemap	https://tafsirhulm.com/sitemap.xml

Comments

Allow all search engines to crawl the site
Disallow crawling of admin areas (if any)
Disallow paginated blog listing pages (they're duplicates - canonical tags handle indexing)
Only /blogs (page 1) should be indexed, paginated pages are for navigation only
Disallow crawling of temporary files
Disallow crawling of private files
Allow crawling of important directories
Main sitemap index - references all sitemaps (static pages + all blog posts)
This is the standard entry point that search engines check first
Crawl delay (optional - be respectful to server resources)
Host directive (specify the preferred domain)

Warnings

`host` is not a known field.
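
A warning like this could be produced by checking each field name against a list of recognised fields. The sketch below assumes a known-field set of `user-agent`, `allow`, `disallow`, `sitemap` and `crawl-delay`; that set, the function name `unknown_fields`, and the sample file (including the `Host` value) are hypothetical and are not taken from the scanned robots.txt or from the scanner itself.

```python
# Assumed set of recognised fields; the scanner's real list is not known.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_txt):
    """Return unrecognised field names, lowercased, in order of first appearance."""
    found = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so URL values stay intact.
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical sample; the Host value is a placeholder.
SAMPLE = """\
# Allow all search engines to crawl the site
User-agent: *
Allow: /
Crawl-delay: 1
Sitemap: https://tafsirhulm.com/sitemap.xml
Host: example.com
"""

if __name__ == "__main__":
    for name in unknown_fields(SAMPLE):
        print(f"`{name}` is not a known field.")
```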
