archive.triblive.com
robots.txt

Robots Exclusion Standard data for archive.triblive.com

Resource Scan

Scan Details

Site Domain	archive.triblive.com
Base Domain	triblive.com
Scan Status	Ok
Last Scan	2024-10-29T12:11:34+00:00
Next Scan	2024-11-28T12:11:34+00:00

Last Scan

Scanned	2024-10-29T12:11:34+00:00
URL	https://archive.triblive.com/robots.txt
Domain IPs	18.117.15.191, 18.223.36.122
Response IP	18.117.15.191
Found	Yes
Hash	616841ae9443288d71c79e948f786fb1d9945e5ee7a02e99ec8316ca23c5955e
SimHash	c369f355cf07

Groups

*
mediapartners-google

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	20

twitterbot

Rule	Path
Disallow	(empty)
Allow	*

Other Records

Field	Value
crawl-delay	10

facebook

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	10

googlebot

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	20

bingbot

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	60

slurp

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	60

yahoo! slurp

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	60

addthis
ahrefsbot
amazonadbot
archivebot
awariosmartbot
baiduspider
blackwidow
blexbot
ccbot
coccocbot
chinaclaw
clickagy
clickagy intelligence bot
cliqzbot
custo
dotbot
demandbasepublisheranalyzer
disco
download\ demon
ecatch
exabot
eirgrabber
emailsiphon
emailwolf
express\ webpictures
extractorpro
eyenetie
flashget
getright
getweb!
gigabot
go!zilla
go-ahead-got-it
grapeshot
grapeshotcrawler
grabnet
grafula
grammarly
hmview
httrack
image\ stripper
image\ sucker
indy\ library
interget
internet\ ninja
jetcar
joc\ web\ spider
larbin
leechftp
linkisbot
mass\ downloader
mediawords
midown\ tool
mister\ pix
monsidobot
mj12bot
navroad
nearsite
netants
netspider
net\ vampire
netzip
ntentbot
octopus
offline\ explorer
offline\ navigator
pagegrabber
papa\ foto
pavuk
pcbrowser
proximic
pu_in
qwantify
realdownload
reget
riddler
rogerbot
scrapy
serpstatbot
semrushbot
semrushbot-sa
sitesnagger
sogou
smartdownload
superbot
superhttp
surfbot
takeout
teleport\ pro
the\ knowledge\ ai
trendictionbot
turnitinbot
tweetmemebot
voideye
velenpublicwebcrawler
web\ image\ collector
web\ sucker
webauto
webcopier
webfetch
webgo\ is
webleacher
webreaper
websauger
website\ extractor
website\ quester
webstripper
webwhacker
webzip
wget
widow
wwwoffle
xaldon\ webspider
yandex
zeus

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	https://archive.triblive.com/sitemap.xml

Comments

Updated 06/04/20

Warnings

1 invalid line.
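
Example

The groups listed above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT string is a hypothetical reconstruction of a few groups from this report (the scan does not show the file verbatim), and the page URL is an assumed example path, not one taken from the site. Python's parser matches user-agent names by substring and applies the first matching rule, so its answers may differ from crawlers that use other matching strategies.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of a few groups from the report above.
# The real file's exact wording and ordering are not shown in the scan.
ROBOTS_TXT = """\
User-agent: *
User-agent: mediapartners-google
Allow: /
Crawl-delay: 20

User-agent: bingbot
Allow: /
Crawl-delay: 60

User-agent: wget
User-agent: scrapy
Disallow: /

Sitemap: https://archive.triblive.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Assumed example URL, used only to exercise the rules.
page = "https://archive.triblive.com/news/"

for agent in ("bingbot", "wget", "scrapy"):
    allowed = rules.can_fetch(agent, page)
    delay = rules.crawl_delay(agent)  # None when the group sets no delay
    print(f"{agent}: allowed={allowed}, crawl-delay={delay}")

# site_maps() requires Python 3.8 or later.
print(rules.site_maps())
```

Under these assumptions, agents in the blocked group are refused for every path, while bingbot is allowed and reports its own crawl-delay. To check the live file instead, set_url() followed by read() could replace the parse() call.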
