pdffiles.org
robots.txt

Robots Exclusion Standard data for pdffiles.org

Resource Scan

Scan Details

Site Domain pdffiles.org
Base Domain pdffiles.org
Scan Status Ok
Last Scan 2025-11-05T19:39:45+00:00
Next Scan 2025-11-12T19:39:45+00:00

Last Scan

Scanned 2025-11-05T19:39:45+00:00
URL https://pdffiles.org/robots.txt
Domain IPs 104.21.72.217, 172.67.155.159, 2606:4700:3034::6815:48d9, 2606:4700:3035::ac43:9b9f
Response IP 172.67.155.159
Found Yes
Hash 11d6f0bb9ce950fbd596c149f273a6fc1aedba404d39e5c9402ec8f10fa834bc
SimHash 653559ccf693

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /page/
Disallow /blog/page/*
Disallow /amp/page/*
Disallow /dgd_scrollbox/
Disallow /?s=*
Disallow /go/
Disallow /recommended/
Disallow /comments/feed/
Disallow /trackback/
Disallow /index.php
Disallow /xmlrpc.php
Disallow /search?
Disallow /?p=*
Disallow *?replytocom
Disallow */trackback
Disallow */feed
Disallow */comments
Disallow /tag/
Disallow /wp-admin/
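
The Disallow rules above mix plain path prefixes with `*` wildcards. The following is a minimal sketch of how a crawler might test a URL path against them, assuming the wildcard semantics described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end of the path). The names `DISALLOW`, `pattern_to_regex`, and `is_disallowed` are hypothetical helpers introduced for illustration and are not part of the scan report. Because this group contains no Allow rules, the sketch omits longest-match precedence between Allow and Disallow.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/cgi-bin/", "/page/", "/blog/page/*", "/amp/page/*", "/dgd_scrollbox/",
    "/?s=*", "/go/", "/recommended/", "/comments/feed/", "/trackback/",
    "/index.php", "/xmlrpc.php", "/search?", "/?p=*", "*?replytocom",
    "*/trackback", "*/feed", "*/comments", "/tag/", "/wp-admin/",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, and a
    trailing '$' anchors the pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path):
    """Return True if the path (including any query string) matches a rule."""
    # re.match anchors at the start of the path, giving prefix matching.
    return any(rule.match(path) for rule in RULES)


if __name__ == "__main__":
    for candidate in ["/tag/python/", "/about/", "/blog/my-post/feed", "/?s=manual"]:
        print(candidate, is_disallowed(candidate))
```

Matching is done on the path plus query string, since several rules (`/?s=*`, `/?p=*`, `*?replytocom`) target query parameters rather than directories.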

Other Records

Field Value
sitemap https://pdffiles.org/sitemap_index.xml

Warnings

  • 1 invalid line.