soirmag.be
robots.txt

Robots Exclusion Standard data for soirmag.be

Resource Scan

Scan Details

Site Domain	soirmag.be
Base Domain	soirmag.be
Scan Status	Ok
Last Scan	2024-10-26T05:58:08+00:00
Next Scan	2024-11-25T05:58:08+00:00
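
The two timestamps above are exactly 30 days apart, which suggests a 30-day rescan interval, although the report does not state one. A minimal Python sketch of that arithmetic, where `RESCAN_INTERVAL` and `next_scan` are hypothetical names and the interval itself is an assumption:

```python
from datetime import datetime, timedelta

# Assumption: the scanner reschedules 30 days after the last scan.
# The interval is inferred from the two timestamps, not documented in the report.
RESCAN_INTERVAL = timedelta(days=30)


def next_scan(last_scan_iso: str) -> str:
    """Return the next scan time as an ISO 8601 string."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + RESCAN_INTERVAL).isoformat()


print(next_scan("2024-10-26T05:58:08+00:00"))  # 2024-11-25T05:58:08+00:00
```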

Last Scan

Scanned	2024-10-26T05:58:08+00:00
URL	https://soirmag.be/robots.txt
Redirect	https://soirmag.lesoir.be/robots.txt
Redirect Domain	soirmag.lesoir.be
Redirect Base	lesoir.be
Domain IPs	109.7.16.204, 90.83.65.204
Redirect IPs	23.192.150.25, 23.192.150.27, 2600:1413:b000:6::17d5:2bc5, 2600:1413:b000:6::17d5:2bca
Response IP	23.48.107.24
Found	Yes
Hash	b81f6defba883422311a2b88a6b4f4b0b7f8a21ca8a5bd49ba4d841a78fb4a16
SimHash	30945d08c674
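
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not name the algorithm or say exactly which bytes are hashed, so the sketch below assumes SHA-256 over the raw response body. `body_digest` and the sample body are hypothetical, and the SimHash field is not reproduced because its construction is not described.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Hex SHA-256 digest of a robots.txt body (assumed hashing scheme)."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical body, not the real file served by the site.
sample = b"User-agent: *\nDisallow: /admin/\n"
digest = body_digest(sample)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing digests between scans is a cheap way to detect that the file changed at all; it says nothing about how much it changed.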

Groups

*

Rule	Path
Allow	/misc/*.css$
Allow	/misc/*.css?
Allow	/misc/*.js$
Allow	/misc/*.js?
Allow	/misc/*.gif
Allow	/misc/*.jpg
Allow	/misc/*.jpeg
Allow	/misc/*.png
Allow	/modules/*.css$
Allow	/modules/*.css?
Allow	/modules/*.js$
Allow	/modules/*.js?
Allow	/modules/*.gif
Allow	/modules/*.jpg
Allow	/modules/*.jpeg
Allow	/modules/*.png
Allow	/profiles/*.css$
Allow	/profiles/*.css?
Allow	/profiles/*.js$
Allow	/profiles/*.js?
Allow	/profiles/*.gif
Allow	/profiles/*.jpg
Allow	/profiles/*.jpeg
Allow	/profiles/*.png
Allow	/themes/*.css$
Allow	/themes/*.css?
Allow	/themes/*.js$
Allow	/themes/*.js?
Allow	/themes/*.gif
Allow	/themes/*.jpg
Allow	/themes/*.jpeg
Allow	/themes/*.png
Disallow	/includes/
Disallow	/misc/
Disallow	/modules/
Disallow	/profiles/
Disallow	/scripts/
Disallow	/themes/
Disallow	/CHANGELOG.txt
Disallow	/cron.php
Disallow	/INSTALL.mysql.txt
Disallow	/INSTALL.pgsql.txt
Disallow	/INSTALL.sqlite.txt
Disallow	/install.php
Disallow	/INSTALL.txt
Disallow	/LICENSE.txt
Disallow	/MAINTAINERS.txt
Disallow	/update.php
Disallow	/UPGRADE.txt
Disallow	/xmlrpc.php
Disallow	/admin/
Disallow	/comment/reply/
Disallow	/filter/tips/
Disallow	/node/add/
Disallow	/search/
Disallow	/user/register/
Disallow	/user/password/
Disallow	/user/login/
Disallow	/user/logout/
Disallow	/?q=admin%2F
Disallow	/?q=comment%2Freply%2F
Disallow	/?q=filter%2Ftips%2F
Disallow	/?q=node%2Fadd%2F
Disallow	/?q=search%2F
Disallow	/?q=user%2Fpassword%2F
Disallow	/?q=user%2Fregister%2F
Disallow	/?q=user%2Flogin%2F
Disallow	/?q=user%2Flogout%2F
Disallow	/81985301/LESOIR/
Disallow	/Sections/Soir.be/
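
The `*` group mixes broad Disallow rules (for example `/misc/`) with narrower Allow rules (for example `/misc/*.css$`). Under the matching rules of RFC 9309, `*` matches any run of characters, a trailing `$` anchors the end of the path, `?` is a literal character, and the longest matching pattern wins, with Allow preferred on a tie. The Python sketch below illustrates that logic on a subset of the rules above. `pattern_to_regex` and `is_allowed` are hypothetical helpers, and percent-encoding normalization is deliberately left out.

```python
import re

# A subset of the "*" group from the table above.
RULES = [
    ("Allow", "/misc/*.css$"),
    ("Allow", "/misc/*.css?"),
    ("Allow", "/themes/*.png"),
    ("Disallow", "/misc/"),
    ("Disallow", "/themes/"),
    ("Disallow", "/admin/"),
    ("Disallow", "/?q=admin%2F"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # "*" becomes ".*"; everything else (including "?") is matched literally.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


print(is_allowed("/misc/print.css"))      # True
print(is_allowed("/misc/print.css?v=2"))  # True
print(is_allowed("/misc/README.txt"))     # False
print(is_allowed("/admin/config"))        # False
```

This is why stylesheets, scripts, and images under the blocked directories stay crawlable: their Allow patterns are longer than the directory-level Disallow patterns.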

adequat

Rule	Path
Disallow	/

adequat-systems

Rule	Path
Disallow	/

amisoftware

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

argus

Rule	Path
Disallow	/

ask n read

Rule	Path
Disallow	/

asknread.com

Rule	Path
Disallow	/

augure

Rule	Path
Disallow	/

auramundi

Rule	Path
Disallow	/

bloodhound

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

cision

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

coexel

Rule	Path
Disallow	/

converacrawler

Rule	Path
Disallow	/

corporama

Rule	Path
Disallow	/

cydralspider

Rule	Path
Disallow	/

digimind

Rule	Path
Disallow	/

download ninja

Rule	Path
Disallow	/

downloadexpress

Rule	Path
Disallow	/

edd

Rule	Path
Disallow	/

ellisphere

Rule	Path
Disallow	/

eureka

Rule	Path
Disallow	/

europresse

Rule	Path
Disallow	/

explore

Rule	Path
Disallow	/

factiva

Rule	Path
Disallow	/

fasterfox

Rule	Path
Disallow	/

fetch

Rule	Path
Disallow	/

gammaspider

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

grub-client

Rule	Path
Disallow	/

httrack

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

ia_archiver-web.archive.org

Rule	Path
Disallow	/

indexer

Rule	Path
Disallow	/

infoseek

Rule	Path
Disallow	/

jetbot

Rule	Path
Disallow	/

k2spider

Rule	Path
Disallow	/

kantar

Rule	Path
Disallow	/

kbcrawl

Rule	Path
Disallow	/

knowings

Rule	Path
Disallow	/

larbin

Rule	Path
Disallow	/

leadbox

Rule	Path
Disallow	/

libwww

Rule	Path
Disallow	/

linkfluence

Rule	Path
Disallow	/

linko

Rule

Path

Disallow

manageo

Rule

Path

Disallow

mediacompil

Rule

Path

Disallow

meltwater

Rule

Path

Disallow

mention

Rule

Path

Disallow

moreover

Rule

Path

Disallow

msiecrawler

Rule

Path

Disallow

mytwip

Rule

Path

Disallow

newscan-online

Rule

Path

Disallow

newsnow

Rule

Path

Disallow

newzbin

Rule

Path

Disallow

npbot

Rule

Path

Disallow

objectssearch

Rule

Path

Disallow

offline explorer

Rule

Path

Disallow

opinion-tracker

Rule

Path

Disallow

pimptrain

Rule

Path

Disallow

proxem

Rule

Path

Disallow

quepasacreep

Rule

Path

Disallow

qwam content intelligence

Rule

Path

Disallow

raven

Rule

Path

Disallow

readability.com

Rule

Path

Disallow

scoop.it

Rule

Path

Disallow

score3

Rule

Path

Disallow

sindup

Rule

Path

Disallow

sitecheck.internetseer.com

Rule

Path

Disallow

sitesnagger

Rule

Path

Disallow

spotter

Rule

Path

Disallow

synthesio

Rule

Path

Disallow

talkwater

Rule

Path

Disallow

teleport

Rule

Path

Disallow

teleportpro

Rule

Path

Disallow

trendeo

Rule

Path

Disallow

trendybuzz

Rule

Path

Disallow

tunitinbot

Rule

Path

Disallow

turnitinbot

Rule

Path

Disallow

up2news

Rule

Path

Disallow

vecteurplus

Rule

Path

Disallow

verif

Rule

Path

Disallow

verticalsearch

Rule

Path

Disallow

vsw

Rule

Path

Disallow

wapspider

Rule

Path

Disallow

webcopier

Rule

Path

Disallow

webreaper

Rule

Path

Disallow

webstripper

Rule

Path

Disallow

webzinger

Rule

Path

Disallow

webzip

Rule

Path

Disallow

wget

Rule

Path

Disallow

winello

Rule

Path

Disallow

youmag

Rule

Path

Disallow

zealbot

Rule

Path

Disallow

zite

Rule

Path

Disallow

zyborg

Rule

Path

Disallow
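
Each group above is keyed by a user-agent product token. Under RFC 9309 a crawler obeys the group whose token matches its own name, compared case-insensitively, and falls back to the `*` group otherwise. The Python sketch below illustrates that selection using a few of the groups whose `Disallow /` rule is shown in full. `GROUPS`, `product_token`, and `select_group` are hypothetical names, and the `*` entry is abbreviated to a single rule.

```python
import re

# A few groups from the report; the "*" entry is abbreviated for illustration.
GROUPS = {
    "*": [("Disallow", "/admin/")],
    "anthropic-ai": [("Disallow", "/")],
    "ccbot": [("Disallow", "/")],
    "claude-web": [("Disallow", "/")],
    "gptbot": [("Disallow", "/")],
}


def product_token(user_agent: str) -> str:
    """Lower-cased text before the first "/" or whitespace in a User-Agent."""
    return re.split(r"[/\s]", user_agent.strip(), maxsplit=1)[0].lower()


def select_group(user_agent: str, groups=GROUPS) -> str:
    """Return the name of the group a crawler should obey."""
    token = product_token(user_agent)
    return token if token in groups else "*"


print(select_group("GPTBot/1.0"))     # gptbot
print(select_group("Googlebot/2.1"))  # *
```

A crawler that lands in one of the named groups is barred from the whole site, while any crawler not listed inherits only the path-level restrictions of the `*` group.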

Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/robotstxt.html
Crawl-delay: 10
CSS, JS, Images
Directories
Files
Paths (clean URLs)
Paths (no clean URLs)
Other paths

Warnings

4 invalid lines.
