news-addict.com
robots.txt

Robots Exclusion Standard data for news-addict.com

Resource Scan

Scan Details

Site Domain news-addict.com
Base Domain news-addict.com
Scan Status Ok
Last Scan 2024-11-15T16:27:17+00:00
Next Scan 2024-12-15T16:27:17+00:00

Last Scan

Scanned 2024-11-15T16:27:17+00:00
URL https://news-addict.com/robots.txt
Domain IPs 178.32.151.192
Response IP 178.32.151.192
Found Yes
Hash e547a81379bb8fa8f0e68f91964e96a7e385a9cb4e2e7002a28e2edefdc219aa
SimHash ade79b7e4331

Groups

baiduspider
yisouspider
petalbot
bytespider
sogou web spider
sogou inst spider

Rule Path
Disallow /

facebookbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5

*

Rule Path
Disallow /scores*
Disallow /home*
Disallow /*flux*
Disallow /*favorite*
Disallow /*widget*
Disallow /*clubByName*
Disallow /*clubByFiltre*
Disallow /*modalite
Disallow /*fbshare
Disallow /*CacheiPhone
Disallow /*.php
Disallow /*newsFlux*
Disallow /*photosFlux*
Disallow /*videosFlux*
Disallow /*news_*
Disallow /*photos_*
Disallow /*videos_*
Disallow /*news*%20*
Disallow /*photos*%20*
Disallow /*videos*%20*
Disallow /*mraid*
Disallow /*images*
Disallow /*archives
Allow /$
Allow /*article*
Allow /photos$
Allow /videos$
Allow /news*
Allow /photos*
Allow /videos*
Allow /*routing*
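
Several of the Allow and Disallow patterns above overlap (for example /news* and /*news_*), so the outcome for a given path depends on how a crawler resolves conflicts. The following is a minimal sketch, assuming the longest-match rule described in RFC 9309: the matching pattern with the most characters wins, and Allow wins a tie. The function names pattern_to_regex and is_allowed are hypothetical, only a subset of the rules is included, and real crawlers may resolve conflicts differently.

```python
import re

# A subset of the "*" group rules listed above, as (kind, pattern) pairs.
RULES = [
    ("disallow", "/scores*"),
    ("disallow", "/*.php"),
    ("disallow", "/*news_*"),
    ("disallow", "/*images*"),
    ("allow", "/$"),
    ("allow", "/*article*"),
    ("allow", "/news*"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if `path` may be crawled under the assumed longest-match rule."""
    best_len = -1
    best_allow = True  # Paths that match no rule are allowed by default.
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_prefers_allow = len(pattern) == best_len and allow
            if longer or tie_prefers_allow:
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    for p in ["/", "/scores/live", "/index.php", "/news/today", "/news_123"]:
        print(p, is_allowed(p))
```

Under these assumptions, /news/today is allowed by /news*, while /news_123 is blocked because the longer pattern /*news_* also matches and takes precedence.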

Warnings

  • 1 invalid line.