aisnenouvelle.fr
robots.txt

Robots Exclusion Standard data for aisnenouvelle.fr

Resource Scan

Scan Details

Site Domain aisnenouvelle.fr
Base Domain aisnenouvelle.fr
Scan Status Ok
Last Scan 2025-07-12T00:32:11+00:00
Next Scan 2025-07-19T00:32:11+00:00

Last Scan

Scanned 2025-07-12T00:32:11+00:00
URL https://aisnenouvelle.fr/robots.txt
Redirect https://www.aisnenouvelle.fr/robots.txt
Redirect Domain www.aisnenouvelle.fr
Redirect Base aisnenouvelle.fr
Domain IPs 109.7.16.62, 90.83.65.62
Redirect IPs 23.59.168.131, 2600:1413:5000:12::1737:27f1, 2600:1413:5000:12::1737:27f8
Response IP 104.88.70.112
Found Yes
Hash 6e22b6a3d8fdc6256157c2fcdf30d20b1e5789880e8e39a286d483a0d7130faa
SimHash 243f51f1c677

Groups

mediapartners-google
googlebot
googlebot-image
googlebot-mobile
googlebot-news
googlebot-video
adsbot-google
googlebot_nauxeo
twitterbot
applebot
bingbot
echoboxbot
publication-access-for-facebook
grapeshot
proximic
weborama-fetcher
upday
facebookexternalhit
flipboard
flipboardproxy
siteauditbot

Rule Path
Disallow /simplesaml
Disallow /simplsamlphp_auth/
Disallow /wallyextra/contenttypesajax
Disallow /esi/
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /admin
Disallow /admin/
Disallow /comment/reply/
Disallow /logout
Disallow /search/
Disallow /user/register
Disallow /user/password
Disallow /user/login
Disallow /cgi-bin/
Disallow /bears
Disallow /archives/recherche*
Disallow /agenda-evenements/*
Disallow */modules/*
Disallow */advertising_script.js
Disallow /node*
Disallow */www.ultimedia.com/js/common/smart.js
Disallow */video/MMV
Disallow /package-type/
Disallow /*?amp
Disallow /*?referer=%2Farchives%2F
Disallow /atom/
Disallow /*arthttps
Allow /misc/*.js
Allow /misc/*.css
Allow /modules/*.js
Allow /modules/*.css
Allow /profiles/*.js
Allow /profiles/*.css
Allow /themes/*.js
Allow /themes/*.css
Allow /.well-known/
Allow /apple-app-site-association
Allow */modules/*.js
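
The group above mixes broad Disallow rules (for example /misc/) with narrower Allow rules (for example /misc/*.js). The following is a minimal sketch of how a crawler following RFC 9309 might resolve such overlaps: the longest matching pattern wins, and Allow wins a tie. It is not the scanner's or any crawler's actual implementation. The helper names (pattern_to_regex, is_allowed), the reduced rule list, and the sample paths such as /misc/jquery.js are assumptions for illustration only.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be fetched under `rules`.

    The longest matching pattern wins; on equal length, allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                best_allow = (kind == "allow")
    return best_allow

# A reduced subset of the rules listed above (not the full group).
RULES = [
    ("disallow", "/misc/"),
    ("disallow", "/modules/"),
    ("disallow", "/node*"),
    ("disallow", "/*?amp"),
    ("allow", "/misc/*.js"),
    ("allow", "/modules/*.css"),
]

# Hypothetical sample paths, chosen only to exercise the rules.
for sample in ["/misc/jquery.js", "/misc/secret.txt", "/node/123",
               "/some-article?amp", "/some-article"]:
    print(sample, is_allowed(sample, RULES))
```

Under this sketch, a script under /misc/ stays fetchable because /misc/*.js is a longer pattern than /misc/, while other files in that directory remain blocked.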

adequat
adequat-systems
amisoftware
awariorssbot
awariosmartbot
argus
ask n read
asknread.com
augure
auramundi
bloodhound
cision
coexel
converacrawler
corporama
cydralspider
digimind
download ninja
downloadexpress
edd
ellisphere
eureka
europresse
explore
factiva
fasterfox
fetch
gammaspider
grub-client
httrack
ia_archiver
ia_archiver-web.archive.org
indexer
infoseek
jetbot
k2spider
kantar
kbcrawl
knowings
larbin
leadbox
libwww
linkfluence
linko
manageo
mediacompil
meltwater
mention
moreover
msiecrawler
mytwip
newscan-online
newsnow
newzbin
npbot
objectssearch
offline explorer
opinion-tracker
pimptrain
proxem
quepasacreep
qwam content intelligence
raven
readability.com
scoop.it
score3
sindup
sitecheck.internetseer.com
sitesnagger
spotter
synthesio
talkwater
teleport
teleportpro
trendeo
trendybuzz
tunitinbot
turnitinbot
up2news
vecteurplus
verif
verticalsearch
vsw
wapspider
webcopier
webreaper
webstripper
webzinger
webzip
wget
winello
youmag
zealbot
zite
zyborg

Rule Path
Disallow /

ai2bot
amazonbot
applebot-extended
anthropic-ai
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
diffbot
duckassistbot
facebookbot
google-extended
gptbot
meta-externalagent
meta-externalfetcher
oai-searchbot
perplexitybot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.aisnenouvelle.fr/sitemap.xml
sitemap https://www.aisnenouvelle.fr/sites/default/files/sitemaps/abonne_aisnenouvelle_fr/sitemapnews-0.xml
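
The groups and sitemap records above can be checked programmatically. The sketch below uses Python's standard urllib.robotparser on a hand-written excerpt. The excerpt is an assumption reconstructed from this report, not the raw file, and it keeps only a few agents and prefix rules because urllib.robotparser does not interpret the * wildcard. The agent name examplebot is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reduced excerpt reconstructed from the report (assumption: the real
# file is longer and may be laid out differently).
ROBOTS_EXCERPT = """\
User-agent: googlebot
User-agent: bingbot
Disallow: /admin
Disallow: /search/

User-agent: wget
User-agent: httrack
Disallow: /

User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /

Sitemap: https://www.aisnenouvelle.fr/sitemap.xml
Sitemap: https://www.aisnenouvelle.fr/sites/default/files/sitemaps/abonne_aisnenouvelle_fr/sitemapnews-0.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

BASE = "https://www.aisnenouvelle.fr"

# Search crawlers in the first group: everything except the listed paths.
print(parser.can_fetch("Googlebot", BASE + "/"))       # allowed
print(parser.can_fetch("Googlebot", BASE + "/admin"))  # blocked

# Agents in the blocked groups are denied the whole site.
print(parser.can_fetch("wget", BASE + "/"))
print(parser.can_fetch("GPTBot", BASE + "/"))

# An agent that no group names (hypothetical) falls through to "allowed",
# since this excerpt has no "User-agent: *" group.
print(parser.can_fetch("examplebot", BASE + "/"))

# Sitemap records are exposed separately (Python 3.8+).
print(parser.site_maps())
```

For production checks against the live file, a parser with wildcard support would be needed to honor rules such as /*?amp.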

Comments

  • Agent Specific Allowed Sections
  • Sitemaps
  • General Disallowed Paths
  • General Allowed Paths
  • Not allowed bots
  • AI Data Scrapers

Warnings

  • 2 invalid lines.