newsunzip.com
robots.txt

Robots Exclusion Standard data for newsunzip.com

Resource Scan

Scan Details

Site Domain	newsunzip.com
Base Domain	newsunzip.com
Scan Status	Ok
Last Scan	2024-11-08T23:45:38+00:00
Next Scan	2024-11-15T23:45:38+00:00

Last Scan

Scanned	2024-11-08T23:45:38+00:00
URL	https://newsunzip.com/robots.txt
Domain IPs	104.21.74.227, 172.67.164.15, 2606:4700:3030::ac43:a40f, 2606:4700:3034::6815:4ae3
Response IP	104.21.74.227
Found	Yes
Hash	6ea1c6cf3ea10ad8e3a00801b130fd3dea3a2576f1d1ceccb807c1dd35c5c4f8
SimHash	5ba85540e903

Groups

*

Rule	Path
Disallow	/wp-admin/
Allow	/wp-admin/admin-ajax.php
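
As a rough illustration of how the `*` group above is usually evaluated: under the longest-match convention described in RFC 9309, the more specific Allow line overrides the broader Disallow for that one path. The sketch below is a minimal, hypothetical helper (`is_allowed` and `STAR_GROUP` are names invented here, not part of the scan output). It handles plain path prefixes only and ignores the `*` and `$` wildcards.

```python
def is_allowed(path, rules):
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of (kind, prefix) pairs, kind being "allow" or
    "disallow". Assumption: longest matching prefix wins, and a tie
    between an allow and a disallow goes to allow.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is not restricted
    for kind, prefix in rules:
        if prefix and path.startswith(prefix):
            longer = len(prefix) > best_len
            tie_for_allow = len(prefix) == best_len and kind == "allow"
            if longer or tie_for_allow:
                best_len = len(prefix)
                allowed = (kind == "allow")
    return allowed


# Rules copied from the `*` group of this report.
STAR_GROUP = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]

print(is_allowed("/wp-admin/options.php", STAR_GROUP))     # False
print(is_allowed("/wp-admin/admin-ajax.php", STAR_GROUP))  # True
print(is_allowed("/2024/11/some-article/", STAR_GROUP))    # True
```

Rule order does not matter in this sketch. A parser that applies the first matching line instead would block admin-ajax.php here, because the Disallow line comes first.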

ninjabot

Rule	Path
Allow	/

mediapartners-google*

Rule	Path
Allow	/

googlebot-image

Rule	Path
Allow	/wp-content/uploads/

adsbot-google

Rule	Path
Allow	/

googlebot-mobile

Rule	Path
Allow	/

applebot
googlebot
googlebot-image
googlebot-news
googlebot-video
googlebot-mobile
adsbot-google

Rule	Path
Allow	/
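
Some user agents in this report appear in more than one group (googlebot-image has its own group and is also named in the multi-agent group above). RFC 9309 says groups matching the same user agent are combined, that matching is case-insensitive, and that `*` applies only when no named group matches. A minimal sketch of that lookup, using a hypothetical `rules_for` helper and a hand-copied subset of the groups in this report:

```python
from collections import defaultdict

# (user agents, rules) pairs copied from a few groups in this report.
GROUPS = [
    (["*"], [("disallow", "/wp-admin/"),
             ("allow", "/wp-admin/admin-ajax.php")]),
    (["googlebot-image"], [("allow", "/wp-content/uploads/")]),
    (["applebot", "googlebot", "googlebot-image", "googlebot-news",
      "googlebot-video", "googlebot-mobile", "adsbot-google"],
     [("allow", "/")]),
    (["ahrefsbot"], [("disallow", "/")]),
]


def rules_for(product_token, groups):
    """Collect the rules that apply to one crawler product token.

    Assumption: exact, case-insensitive token match; every group that
    names the token is merged; `*` is used only as a fallback.
    """
    merged = defaultdict(list)
    for agents, rules in groups:
        for agent in agents:
            merged[agent.lower()].extend(rules)
    token = product_token.lower()
    if token in merged:
        return merged[token]
    return merged.get("*", [])


print(rules_for("Googlebot-Image", GROUPS))
print(rules_for("AhrefsBot", GROUPS))
print(rules_for("SomeOtherBot", GROUPS))
```

Individual crawlers may match tokens differently (for example by substring), so this is an approximation and not a description of any particular bot.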

openaibot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

serpstatbot

Rule	Path
Disallow	/

majesticseo

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

moz

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

ahrefssiteaudit

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

semrushbot-sa

Rule	Path
Disallow	/

semrushbot-ba

Rule	Path
Disallow	/

semrushbot-si

Rule	Path
Disallow	/

semrushbot-swa

Rule	Path
Disallow	/

semrushbot-ct

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

alexibot

Rule	Path
Disallow	/

surveybot

Rule	Path
Disallow	/

xenu’s

Rule	Path
Disallow	/

xenu’s link sleuth 1.1c

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

nextgensearchbot

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

archive.org_bot

Rule	Path
Disallow	/

archive.org bot

Rule	Path
Disallow	/

linkwalker

Rule	Path
Disallow	/

gigablast spider

Rule	Path
Disallow	/

ia_archiver-web.archive.org

Rule	Path
Disallow	/

picscout

Rule	Path
Disallow	/

blexbot crawler

Rule	Path
Disallow	/

tineye

Rule	Path
Disallow	/

seokicks-robot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

sistrix crawler

Rule	Path
Disallow	/

uptimerobot/2.0

Rule	Path
Disallow	/

ezooms robot

Rule	Path
Disallow	/

netestate ne crawler (+http://www.website-datenbank.de/)

Rule	Path
Disallow	/

wiseguys robot

Rule	Path
Disallow	/

turnitin robot

Rule	Path
Disallow	/

heritrix

Rule	Path
Disallow	/

pimonster

Rule	Path
Disallow	/

pimonster

Rule	Path
Disallow	/

pi-monster

Rule	Path
Disallow	/

eccp/1.0 (search@eniro.com)

Rule	Path
Disallow	/

psbot

Rule	Path
Disallow	/

youdaobot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

naverbot
yeti

Rule	Path
Disallow	/

zbot

Rule	Path
Disallow	/

vagabondo

Rule	Path
Disallow	/

linkwalker

Rule	Path
Disallow	/

simplepie

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

pixray-seeker

Rule	Path
Disallow	/

boardreader

Rule	Path
Disallow	/

quantify

Rule	Path
Disallow	/

plukkie

Rule	Path
Disallow	/

cuam

Rule	Path
Disallow	/

megaindex.ru

Rule	Path
Disallow	/

megaindex.com

Rule	Path
Disallow	/

megaindex.ru/2.0

Rule	Path
Disallow	/

megaindex.ru

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

piplbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

meltwater

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	https://www.newsunzip.com/sitemap.xml
sitemap	https://www.newsunzip.com/sitemap-news.xml

Comments

Block NextGenSearchBot
Block ia-archiver from crawling site
Block archive.org_bot from crawling site
Block Archive.org Bot from crawling site
Block LinkWalker from crawling site
Block GigaBlast Spider from crawling site
Block ia_archiver-web.archive.org_bot from crawling site
Block PicScout Crawler from crawling site
Block BLEXBot Crawler from crawling site
Block TinEye from crawling site
Block SEOkicks
Block BlexBot
Block SISTRIX
Block Uptime robot
Block Ezooms Robot
Block netEstate NE Crawler (+http://www.website-datenbank.de/)
Block WiseGuys Robot
Block Turnitin Robot
Block Heritrix
Block pricepi
Block Eniro
Block Psbot
Block Youdao
BLEXBot
Block NaverBot
Block ZBot
Block Vagabondo
Block LinkWalker
Block SimplePie
Block Wget
Block Pixray-Seeker
Block BoardReader
Block Quantify
Block Plukkie
Block Cuam
https://megaindex.com/crawler

Warnings

2 invalid lines.
