newstral.com
robots.txt

Robots Exclusion Standard data for newstral.com

Resource Scan

Scan Details

Site Domain newstral.com
Base Domain newstral.com
Scan Status Ok
Last Scan 2024-10-07T03:16:29+00:00
Next Scan 2024-10-14T03:16:29+00:00

Last Scan

Scanned 2024-10-07T03:16:29+00:00
URL https://newstral.com/robots.txt
Domain IPs 138.201.137.196
Response IP 138.201.137.196
Found Yes
Hash 7367b52450c03c802d1294e44791ab156683faa08ea62f0a662ce2616aea76e6
SimHash 321c7d6de572

Groups

*

Rule Path
Disallow /nl/maps
Disallow /nl/regions
Disallow /nl/people
Disallow /nl/organisations
Disallow /en/cars
Disallow /es/cars
Disallow /nl/cars
Disallow /de/article/en
Disallow /de/article/es
Disallow /en/article/de
Disallow /en/article/es
Disallow /en/article/nl
Disallow /es/article/en
Disallow /es/article/de
Disallow /es/article/nl
Disallow /nl/article/en
Disallow /nl/article/es
Disallow /nl/article/de
Disallow /sources/1
Disallow /sources/2
Disallow /sources/3
Disallow /sources/4
Disallow /sources/5
Disallow /sources/6
Disallow /sources/7
Disallow /sources/8
Disallow /sources/9
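
Disallow paths are matched as prefixes, so a rule such as /sources/1 also covers /sources/10 and /sources/123. The following is a minimal sketch of how a crawler that falls under the * group might evaluate these rules, using Python's standard urllib.robotparser. Only a subset of the rules is reproduced, and the user-agent name ExampleBot and the tested paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Subset of the "*" group from the scan (not the full file).
ROBOTS_SUBSET = """\
User-agent: *
Disallow: /nl/maps
Disallow: /en/cars
Disallow: /en/article/de
Disallow: /sources/1
"""


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_SUBSET.

    `ExampleBot` is a hypothetical crawler name with no group of its own,
    so it is evaluated against the "*" group.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_SUBSET.splitlines())
    return parser.can_fetch(agent, "https://newstral.com" + path)


if __name__ == "__main__":
    for path in ("/nl/maps", "/en/article/de/123", "/sources/10", "/en/article/en/123"):
        print(path, is_allowed(path))
```

Other parsers may differ in details such as wildcard handling, so this shows one plausible reading of the rules rather than how any particular crawler behaves.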

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
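
Because msnbot has its own group, a parser that selects the most specific matching group would apply only that group to msnbot and not the * rules. The sketch below illustrates this with Python's urllib.robotparser. The fragment is a reconstruction from the scan data, so its exact formatting is an assumption, and ExampleBot is a hypothetical crawler name. Note that crawl-delay is a non-standard field, which some crawlers ignore.

```python
from urllib.robotparser import RobotFileParser

# Reconstruction of two groups as described by the scan;
# the layout of the original file is an assumption.
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /nl/maps

User-agent: msnbot
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# Delay (in seconds) requested from msnbot between requests.
msnbot_delay = parser.crawl_delay("msnbot")

# msnbot matches its own (rule-free) group, so the "*" rules do not apply to it.
msnbot_maps = parser.can_fetch("msnbot", "https://newstral.com/nl/maps")

# A crawler without its own group (hypothetical name) falls back to "*".
other_maps = parser.can_fetch("ExampleBot", "https://newstral.com/nl/maps")

if __name__ == "__main__":
    print(msnbot_delay, msnbot_maps, other_maps)
```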

turnitinbot

Rule Path
Disallow /

seobility

Rule Path
Disallow /

james bot

Rule Path
Disallow /

seostats

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

waybackarchive.org

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

yisouspider

Rule Path
Disallow /

http://www.baidu.com/search/spider.html

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

proximic

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

unisterbot

Rule Path
Disallow /

katbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

crystalsemanticsbot

Rule Path
Disallow /

fastbot

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

ssearch

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

spbot

Rule Path
Disallow /

careerbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

niki-bot

Rule Path
Disallow /

bot-pge.chlooe.com

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

grapeshotcrawler

Rule Path
Disallow /

cms crawler

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://newstral.com/sitemap_index.xml
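
The sitemap record applies to the whole file rather than to a single group. As a minimal sketch, it could be read with urllib.robotparser as follows (site_maps() requires Python 3.8 or later). The fragment reproduces one group from the scan plus the sitemap line, and its formatting is an assumption.

```python
from urllib.robotparser import RobotFileParser

# One group plus the file-level sitemap record, reconstructed from the scan.
ROBOTS_FRAGMENT = """\
User-agent: turnitinbot
Disallow: /

Sitemap: https://newstral.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# List of sitemap URLs, or None if the file contains no Sitemap line.
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print(sitemaps)
```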

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:

Warnings

  • 2 invalid lines.