newstral.com
robots.txt

Robots Exclusion Standard data for newstral.com

Resource Scan

Scan Details

Site Domain	newstral.com
Base Domain	newstral.com
Scan Status	Ok
Last Scan	2024-11-18T10:49:52+00:00
Next Scan	2024-11-25T10:49:52+00:00

Last Scan

Scanned	2024-11-18T10:49:52+00:00
URL	https://newstral.com/robots.txt
Domain IPs	138.201.137.196
Response IP	138.201.137.196
Found	Yes
Hash	7367b52450c03c802d1294e44791ab156683faa08ea62f0a662ce2616aea76e6
SimHash	321c7d6de572
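
The 64-hex-digit Hash value above has the length of a SHA-256 digest, although the scan does not name the algorithm. Assuming SHA-256 over the raw response body, change detection between two scans could be sketched as follows. The function names are hypothetical, and this sketch is not guaranteed to reproduce the value shown above.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Hex digest of a robots.txt body (assumption: SHA-256 over the raw bytes)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """True if the freshly fetched body no longer matches the stored hash."""
    return content_hash(body) != previous_hash


# Illustrative bodies only, not the real file contents.
old_body = b"User-agent: *\nDisallow: /nl/maps\n"
new_body = b"User-agent: *\nDisallow: /nl/maps\nDisallow: /nl/regions\n"

stored = content_hash(old_body)
unchanged = has_changed(stored, old_body)
changed = has_changed(stored, new_body)
```

An exact hash only says whether any byte differs. The separate SimHash field suggests the scanner also tracks near-duplicate similarity, which this sketch does not attempt.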

Groups

*

Rule	Path
Disallow	/nl/maps
Disallow	/nl/regions
Disallow	/nl/people
Disallow	/nl/organisations
Disallow	/en/cars
Disallow	/es/cars
Disallow	/nl/cars
Disallow	/de/article/en
Disallow	/de/article/es
Disallow	/en/article/de
Disallow	/en/article/es
Disallow	/en/article/nl
Disallow	/es/article/en
Disallow	/es/article/de
Disallow	/es/article/nl
Disallow	/nl/article/en
Disallow	/nl/article/es
Disallow	/nl/article/de
Disallow	/sources/1
Disallow	/sources/2
Disallow	/sources/3
Disallow	/sources/4
Disallow	/sources/5
Disallow	/sources/6
Disallow	/sources/7
Disallow	/sources/8
Disallow	/sources/9
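
Disallow paths are matched as prefixes, so a rule such as /sources/4 also covers /sources/42. The sketch below checks a few paths against a hand-picked subset of the rules above using Python's standard `urllib.robotparser`. The crawler name `examplebot` and the sample paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# A subset of the "*" group above, written back out as robots.txt text.
ROBOTS_TXT = """\
User-agent: *
Disallow: /nl/maps
Disallow: /en/cars
Disallow: /de/article/en
Disallow: /sources/1
Disallow: /sources/4
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "examplebot" is a hypothetical crawler with no group of its own,
# so the "*" group applies to it.
sample_paths = ("/en/cars", "/de/cars", "/sources/42", "/sources/abc", "/en/article/fr")
checks = {path: parser.can_fetch("examplebot", path) for path in sample_paths}

for path, allowed in checks.items():
    print(f"{path}: {'allowed' if allowed else 'disallowed'}")
```

In this subset, /de/cars stays allowed because only the /en, /es and /nl car sections are listed, and /sources/abc stays allowed because the rules only cover paths whose next character is a digit from 1 to 9.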

msnbot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	10
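
A crawler that matches a named group uses that group instead of the `*` group, which is why msnbot has no path restrictions here and only a crawl delay. A minimal sketch with `urllib.robotparser`, assuming a reduced robots.txt that keeps just the msnbot group and one `*` rule:

```python
from urllib.robotparser import RobotFileParser

# Reduced sample: the msnbot group plus one rule from the "*" group.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 10

User-agent: *
Disallow: /nl/maps
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# msnbot matches its own group: a 10 second delay and no Disallow rules.
msnbot_delay = parser.crawl_delay("msnbot")
msnbot_can_fetch_maps = parser.can_fetch("msnbot", "/nl/maps")

# "examplebot" is a hypothetical crawler that falls back to the "*" group,
# which sets no crawl delay but does disallow /nl/maps.
other_delay = parser.crawl_delay("examplebot")
other_can_fetch_maps = parser.can_fetch("examplebot", "/nl/maps")
```

Crawl-delay is a non-standard field, so how a given crawler interprets the value is up to that crawler. A polite client would typically sleep that many seconds between requests.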

turnitinbot

Rule	Path
Disallow	/

seobility

Rule	Path
Disallow	/

james bot

Rule	Path
Disallow	/

seostats

Rule	Path
Disallow	/

baiduspider

Rule	Path
Disallow	/

waybackarchive.org

Rule	Path
Disallow	/

easouspider

Rule	Path
Disallow	/

youdaobot

Rule	Path
Disallow	/

yisouspider

Rule	Path
Disallow	/

http://www.baidu.com/search/spider.html

Rule	Path
Disallow	/

sogou web spider

Rule	Path
Disallow	/

proximic

Rule	Path
Disallow	/

yandexbot

Rule	Path
Disallow	/

yandeximages

Rule	Path
Disallow	/

ezooms

Rule	Path
Disallow	/

unisterbot

Rule	Path
Disallow	/

katbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

archive.org_bot

Rule	Path
Disallow	/

exabot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

crystalsemanticsbot

Rule	Path
Disallow	/

fastbot

Rule	Path
Disallow	/

mail.ru_bot

Rule	Path
Disallow	/

meanpathbot

Rule	Path
Disallow	/

yeti

Rule	Path
Disallow	/

ssearch

Rule	Path
Disallow	/

surveybot

Rule	Path
Disallow	/

spbot

Rule	Path
Disallow	/

careerbot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

niki-bot

Rule	Path
Disallow	/

bot-pge.chlooe.com

Rule	Path
Disallow	/

sistrix

Rule	Path
Disallow	/

grapeshotcrawler

Rule	Path
Disallow	/

cms crawler

Rule	Path
Disallow	/

searchmetricsbot

Rule	Path
Disallow	/
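
Every named group from turnitinbot through searchmetricsbot carries the single rule Disallow /, which blocks that crawler from the whole site. The sketch below rebuilds three of those groups as robots.txt text and checks them with `urllib.robotparser`. The helper `render_blocklist`, the path /en/news and the crawler name `examplebot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Three of the fully blocked groups listed above.
BLOCKED_AGENTS = ["ahrefsbot", "mj12bot", "sistrix"]


def render_blocklist(agents: list[str]) -> str:
    """Emit one 'Disallow: /' group per agent, separated by blank lines."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in agents]
    return "\n".join(groups)


# Append one rule from the "*" group so unlisted crawlers have something to match.
robots_txt = render_blocklist(BLOCKED_AGENTS) + "\nUser-agent: *\nDisallow: /nl/maps\n"

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Agent matching in urllib.robotparser is case-insensitive.
blocked_everywhere = parser.can_fetch("AhrefsBot", "/en/news")
blocked_root = parser.can_fetch("mj12bot", "/")

# A crawler without its own group falls through to "*".
unlisted_allowed = parser.can_fetch("examplebot", "/en/news")
```

robots.txt is advisory: these rules only affect crawlers that choose to honor them.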

Other Records

Field	Value
sitemap	https://newstral.com/sitemap_index.xml
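
The sitemap record is independent of any user-agent group. As a rough sketch, `urllib.robotparser` (Python 3.8 or later) exposes such records through `site_maps()`. The sample text below is a reduced stand-in for the real file.

```python
from urllib.robotparser import RobotFileParser

# Reduced sample containing the sitemap record reported by the scan.
ROBOTS_TXT = """\
User-agent: *
Disallow: /nl/maps

Sitemap: https://newstral.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of sitemap URLs, or None if the file declares none.
sitemaps = parser.site_maps() or []

for url in sitemaps:
    print(url)
```

The file name suggests a sitemap index, so a crawler would probably need to fetch it and follow the child sitemaps it lists. That step is not shown here.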

Comments

See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
To ban all spiders from the entire site uncomment the next two lines:

Warnings

2 invalid lines.
