Robots Exclusion Standard data for pdf.directindustry.com

Resource Scan

Scan Details

Site Domain pdf.directindustry.com
Base Domain directindustry.com
Scan Status Ok
Last Scan 2025-12-31T17:30:57+00:00
Next Scan 2026-01-30T17:30:57+00:00

Last Scan

Scanned 2025-12-31T17:30:57+00:00
URL https://pdf.directindustry.com/robots.txt
Domain IPs 104.18.4.208, 104.18.5.208
Response IP 104.18.5.208
Found Yes
Hash 668bb5500c250a35f2976659debb8307d011b006754822192720ceac594a221e
SimHash 6906bd02efbb

Groups

ocelli

Rule Path
Disallow /

psbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

*

Rule Path
Disallow /images_*/2ai/
Disallow /restricted/
Disallow /*/restricted/
Disallow /r/
Disallow /*/r/
Disallow /scripts/
Disallow /*/scripts/
Disallow /tab/
Disallow /*/tab/
Disallow /pdf/tab/
Disallow /*/pdf/tab/
Disallow /*/pdf-en/
Disallow /cache_*/
Disallow /pdf/*/Show/
Disallow /*/pdf/*/Show/
Disallow /pdf/incat/
Disallow /*/pdf/incat/
Disallow /pdf/incatsoc/
Disallow /*/pdf/incatsoc/
Disallow /*favicon.ico
Disallow /*.pdf$
Disallow /pdf-en/
Disallow /ajax/
Disallow /*/ajax/
Disallow /static/ressources/
Disallow /*/static/ressources/
Disallow /*.json$
Disallow /request*$
Disallow /*/request*$
Disallow /images/*$
Disallow /localization/country/list.html$
Disallow /*/localization/country/list.html
Disallow /*?*
Disallow /myspace/
Disallow /*/myspace/
Disallow /tracking/*
Disallow /*/images_*/2ai/
Disallow /*/images/*
Disallow /*/tracking/*
Disallow /pdf/*-_*.html
Disallow /*/pdf/*-_*.html
Disallow /discover-us/thank-you.html
Disallow /newsletter/
Disallow /jsErrorHandler
Disallow /compare.html
Disallow /*/compare.html
Disallow /prod2/
Disallow /*/prod2/
Disallow /rfq/
Disallow /mailing/*
Disallow /*/mailing/*
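
Many of the rules in the `*` group rely on two pattern characters: `*`, which matches any run of characters, and a trailing `$`, which anchors the pattern to the end of the URL path. The following is a minimal sketch of how a crawler might evaluate such rules, assuming that wildcard convention and prefix matching from the start of the path. The helper names `rule_to_regex` and `is_disallowed` and the sample paths are hypothetical, chosen only for illustration; the rule strings are a subset copied from the list above.

```python
import re
from typing import Iterable, Pattern


def rule_to_regex(pattern: str) -> Pattern[str]:
    """Translate one robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the match to the end of the path.
    Everything else is treated as a literal character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and rejoin them with '.*' where '*' stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_rules: Iterable[str]) -> bool:
    """Return True if any Disallow pattern matches the path from its start."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# A subset of the Disallow rules listed for the '*' group.
RULES = ["/restricted/", "/*/restricted/", "/*.pdf$", "/*?*", "/cache_*/", "/request*$"]

# Hypothetical paths, for illustration only.
print(is_disallowed("/pdf/acme/catalog.pdf", RULES))         # True  (matches /*.pdf$)
print(is_disallowed("/pdf/acme/catalog.pdf?x=1", RULES))     # True  (matches /*?*)
print(is_disallowed("/en/restricted/page.html", RULES))      # True  (matches /*/restricted/)
print(is_disallowed("/pdf/acme/catalog-12345.html", RULES))  # False (no rule matches)
```

The sketch covers only Disallow rules, since the groups above contain no Allow rules; a full implementation would also pick the most specific user-agent group (for example `psbot` before `*`) and resolve Allow/Disallow conflicts by longest match.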