epaper.pknewspapers.com
robots.txt

Robots Exclusion Standard data for epaper.pknewspapers.com

Resource Scan

Scan Details

Site Domain	epaper.pknewspapers.com
Base Domain	pknewspapers.com
Scan Status	Ok
Last Scan	2025-04-26T10:35:52+00:00
Next Scan	2025-05-10T10:35:52+00:00

Last Scan

Scanned	2025-04-26T10:35:52+00:00
URL	https://epaper.pknewspapers.com/robots.txt
Redirect	https://newspaperspk.com/robots.txt
Redirect Domain	newspaperspk.com
Redirect Base	newspaperspk.com
Domain IPs	104.21.24.146, 172.67.219.64, 2606:4700:3036::ac43:db40, 2606:4700:3037::6815:1892
Redirect IPs	104.21.84.199, 172.67.196.149, 2606:4700:3031::6815:54c7, 2606:4700:3037::ac43:c495
Response IP	172.67.196.149
Found	Yes
Hash	3b5f9a5ce12d3f8711217ab46aaa1f9fa14ba089f03c4deb1f0bc315cf1f32cf
SimHash	01d4556169bc
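
The 64-character hexadecimal Hash value above is consistent with a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a scanner could fingerprint the fetched robots.txt body as follows and compare digests between scans to detect changes. The function name `content_fingerprint` is hypothetical, and the SimHash field is not reproduced here because its construction is not described.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the report's Hash field is a SHA-256 digest of the raw
    response bytes; the source does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest against a freshly fetched body."""
    return content_fingerprint(body) != previous_digest
```

Any byte-level difference (including whitespace) yields a different digest, which is why a report like this one may also carry a similarity hash alongside the exact hash.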

Groups

*

Rule	Path
Allow	/*.js*
Allow	/*.css*
Allow	/*.png*
Allow	/*.jpg*
Allow	/*.gif*
Disallow	/administrator/
Disallow	/api/
Disallow	/bin/
Disallow	/cache/
Disallow	/cli/
Disallow	/includes/
Disallow	/installation/
Disallow	/language/
Disallow	/layouts/
Disallow	/libraries/
Disallow	/logs/
Disallow	/tmp/
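
To illustrate how the Allow and Disallow rules in the `*` group interact, here is a minimal sketch of a path matcher. It assumes the precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie) and wildcard semantics where `*` matches any run of characters. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and individual crawlers may resolve conflicts differently.

```python
import re

# Rules transcribed from the "*" group: (directive, path pattern).
RULES = [
    ("allow", "/*.js*"),
    ("allow", "/*.css*"),
    ("allow", "/*.png*"),
    ("allow", "/*.jpg*"),
    ("allow", "/*.gif*"),
    ("disallow", "/administrator/"),
    ("disallow", "/api/"),
    ("disallow", "/bin/"),
    ("disallow", "/cache/"),
    ("disallow", "/cli/"),
    ("disallow", "/includes/"),
    ("disallow", "/installation/"),
    ("disallow", "/language/"),
    ("disallow", "/layouts/"),
    ("disallow", "/libraries/"),
    ("disallow", "/logs/"),
    ("disallow", "/tmp/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Patterns are matched from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; Allow wins ties (assumed per RFC 9309)."""
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed
```

Under these assumptions, `is_allowed("/tmp/site.css")` is true because `/*.css*` (7 characters) is longer than `/tmp/` (5), whereas `is_allowed("/cache/app.js")` is false because `/cache/` (7) is longer than `/*.js*` (6). The asset Allow rules therefore do not uniformly override every Disallow rule.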


Other Records

Field	Value
sitemap	https://newspaperspk.com/sitemap.xml
sitemap	https://newspaperspk.com/sitemap_articles_pakistani_newspapers.xml
sitemap	https://newspaperspk.com/sitemap_articles_indian_newspapers.xml
sitemap	https://newspaperspk.com/sitemap_articles_world_newspapers.xml
sitemap	https://newspaperspk.com/sitemap_images.xml
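
As a sketch of how these sitemap records can be read programmatically, the standard library's `urllib.robotparser` exposes them through `RobotFileParser.site_maps()` (Python 3.8+). The `ROBOTS_LINES` list below is a hypothetical reconstruction from this report, not the verbatim file, and only one Disallow rule is included for brevity.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the relevant robots.txt lines.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /tmp/",
    "Sitemap: https://newspaperspk.com/sitemap.xml",
    "Sitemap: https://newspaperspk.com/sitemap_articles_pakistani_newspapers.xml",
    "Sitemap: https://newspaperspk.com/sitemap_articles_indian_newspapers.xml",
    "Sitemap: https://newspaperspk.com/sitemap_articles_world_newspapers.xml",
    "Sitemap: https://newspaperspk.com/sitemap_images.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in-memory lines; no network access

# site_maps() returns the Sitemap URLs in file order, or None if absent.
sitemaps = parser.site_maps()
```

Note that `urllib.robotparser` does not interpret `*` wildcards in paths, so it is used here only for the sitemap records, not for evaluating the Allow patterns.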


Comments

robots.txt for https://newspaperspk.com
This file is used to guide web crawlers on how to interact with the site.
It allows crawlers to access certain resources and disallows others to ensure optimal indexing.
Allowing crawlers to access common web resources like JavaScript, CSS, and image files
Disallowing crawlers from accessing sensitive or backend directories
Sitemap entries to guide crawlers to the correct sitemap locations for better indexing

