paperblog.com
robots.txt

Robots Exclusion Standard data for paperblog.com

Resource Scan

Scan Details

Site Domain paperblog.com
Base Domain paperblog.com
Scan Status Ok
Last Scan 2024-11-12T18:48:40+00:00
Next Scan 2024-11-19T18:48:40+00:00

Last Scan

Scanned 2024-11-12T18:48:40+00:00
URL https://paperblog.com/robots.txt
Domain IPs 104.21.19.19, 172.67.184.119, 2606:4700:3032::ac43:b877, 2606:4700:3036::6815:1313
Response IP 104.21.19.19
Found Yes
Hash d4980a0f409536b671afecd22b95091826653e4ab83d1b702f16a3273821bff2
SimHash 235cd4705411

Groups

mediapartners-google

Rule Path
Disallow (empty value; all paths allowed)

*

Rule Path
Disallow /admin/
Disallow /accounts/activate/
Disallow /accounts/logout/
Disallow /accounts/unregister/
Disallow /articles/
Disallow /users/order/
Disallow /users/authors/
Disallow /users/search/
Disallow /users/password_reset/
Disallow /users/actions/
Disallow /espaces/
Disallow /flux/
Disallow /alaune/
Disallow /votes/
Disallow /forum/new-fb-comment/
Disallow /forum/create/
Disallow /forum/
Disallow /r/*
Disallow /i/
Disallow /plugins/feedback.php

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

ahrefsbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

www.integromedb.org/crawler

Rule Path
Disallow /

Other Records

Field Value
sitemap http://www.paperblog.fr/sitemap.xml
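
Example: querying these groups

The groups above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later). The ROBOTS_TXT string is a hand-written reconstruction of a subset of the rules listed in this report, not the verbatim file, and the helper name is_allowed and the bot name "SomeBot" are hypothetical. The /r/* rule is left out because the standard-library parser matches path prefixes only and does not expand wildcards.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the groups in this report (subset only);
# the real file's ordering and layout may differ.
ROBOTS_TXT = """\
User-agent: mediapartners-google
Disallow:

User-agent: *
Disallow: /admin/
Disallow: /articles/
Disallow: /forum/

User-agent: msnbot
Crawl-delay: 1

User-agent: ahrefsbot
Disallow: /

Sitemap: http://www.paperblog.fr/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: True if `agent` may fetch `path` on paperblog.com."""
    return parser.can_fetch(agent, "https://paperblog.com" + path)


if __name__ == "__main__":
    print(is_allowed("SomeBot", "/articles/x"))               # False: falls under the * group
    print(is_allowed("mediapartners-google", "/articles/x"))  # True: empty Disallow
    print(is_allowed("msnbot", "/forum/"))                    # True: own group, no rules
    print(is_allowed("ahrefsbot", "/"))                       # False: Disallow /
    print(parser.crawl_delay("msnbot"))                       # 1
    print(parser.site_maps())
```

A crawler with its own group (msnbot, mediapartners-google) is matched against that group only, so the Disallow rules under * do not apply to it.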