newspaperarchive.com
robots.txt

Robots Exclusion Standard data for newspaperarchive.com

Resource Scan

Scan Details

Site Domain	newspaperarchive.com
Base Domain	newspaperarchive.com
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2024-09-05T08:34:59+00:00
Next Scan	2024-12-04T08:34:59+00:00

Last Successful Scan

Scanned	2023-04-18T13:14:26+00:00
URL	https://newspaperarchive.com/robots.txt
Domain IPs	172.66.40.104, 172.66.43.152, 2606:4700:3108::ac42:2868, 2606:4700:3108::ac42:2b98
Response IP	172.66.40.104
Found	Yes
Hash	8d9f1bd834c05e450698a97eebf5d4a7e6e2cf98c1d29cbc4d09f6a30c47d1ad
SimHash	c89a49d3c6f0

Groups

*

Rule	Path
Disallow	*qa.newspaperarchive.com
Disallow	*access.newspaperarchive.com
Disallow	/tags/*
Disallow	/serverstatus/*
Disallow	/cache/*
Disallow	/IIPViewerWeb/*
Disallow	/?
Disallow	/profile/*
Disallow	/Pubjpgimages/
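
The rules in the `*` group can be checked against a path with a small matcher. The following is a minimal sketch, assuming the wildcard and longest-match semantics described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie). The names `DISALLOW`, `pattern_to_regex`, and `is_allowed` are hypothetical and are not part of the scan output.

```python
import re

# Disallow patterns copied from the "*" group in the table above.
DISALLOW = [
    "*qa.newspaperarchive.com",
    "*access.newspaperarchive.com",
    "/tags/*",
    "/serverstatus/*",
    "/cache/*",
    "/IIPViewerWeb/*",
    "/?",
    "/profile/*",
    "/Pubjpgimages/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece and join the pieces with ".*" for every "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, disallow=DISALLOW, allow=()) -> bool:
    """Assumed RFC 9309 behavior: longest match wins, Allow wins a tie."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for rules, verdict in ((disallow, False), (allow, True)):
        for pattern in rules:
            if pattern_to_regex(pattern).match(path):
                longer = len(pattern) > best_len
                tie_for_allow = len(pattern) == best_len and verdict
                if longer or tie_for_allow:
                    best_len, allowed = len(pattern), verdict
    return allowed
```

Under these assumptions, `is_allowed("/tags/1900")` and `is_allowed("/?page=2")` return `False`, while `is_allowed("/")` returns `True`. Because matching is done on the URL path only, the first two patterns would apply only to paths that contain those host names as literal text.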

googlebot

Rule	Path
Allow	/Pubjpgimages/

googlebot

Rule	Path
Allow	/Pubjpgimages/

archive.is
sitecheck.internetseer.com
zealbot
sitesnagger
webstripper
webcopier
fetch
offline explorer
teleport
teleportpro
webzip
linko
httrack
xenu
larbin
libwww
zyborg
download ninja
myfamilybot
ia_archiver
yandex
ccbot
voltron
blexbot
googlebot-image

No rules defined. All paths allowed.
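
How a crawler ends up in one of the groups above can be sketched as a lookup on its product token. This is a simplified illustration, assuming exact, case-insensitive token matching with a fallback to `*` when no named group matches (as RFC 9309 describes); real crawlers may match tokens differently. The function name `select_group` and the label `"no-rules"` are hypothetical.

```python
# User agents that the scan lists in the group with no rules defined.
NO_RULE_AGENTS = {
    "archive.is", "sitecheck.internetseer.com", "zealbot", "sitesnagger",
    "webstripper", "webcopier", "fetch", "offline explorer", "teleport",
    "teleportpro", "webzip", "linko", "httrack", "xenu", "larbin", "libwww",
    "zyborg", "download ninja", "myfamilybot", "ia_archiver", "yandex",
    "ccbot", "voltron", "blexbot", "googlebot-image",
}


def select_group(product_token: str) -> str:
    """Return the group that applies to a crawler (hypothetical labels)."""
    token = product_token.strip().lower()
    if token == "googlebot":
        return "googlebot"   # only rule: Allow /Pubjpgimages/
    if token in NO_RULE_AGENTS:
        return "no-rules"    # no rules defined, all paths allowed
    return "*"               # fallback group with the Disallow rules
```

For example, `select_group("CCBot")` returns `"no-rules"`, `select_group("Googlebot")` returns `"googlebot"`, and an unlisted token such as `"bingbot"` falls back to `"*"`.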

Other Records

Field	Value
sitemap	https://newspaperarchive.com/sitemap.xml
