pacer.org
robots.txt

Robots Exclusion Standard data for pacer.org

Resource Scan

Scan Details

Site Domain pacer.org
Base Domain pacer.org
Scan Status Ok
Last Scan 2024-05-21T08:11:18+00:00
Next Scan 2024-06-20T08:11:18+00:00

Last Scan

Scanned 2024-05-21T08:11:18+00:00
URL https://pacer.org/robots.txt
Redirect https://www.pacer.org/robots.txt
Redirect Domain www.pacer.org
Redirect Base pacer.org
Domain IPs 104.26.14.49, 104.26.15.49, 172.67.70.162, 2606:4700:20::681a:e31, 2606:4700:20::681a:f31, 2606:4700:20::ac43:46a2
Redirect IPs 104.26.14.49, 104.26.15.49, 172.67.70.162, 2606:4700:20::681a:e31, 2606:4700:20::681a:f31, 2606:4700:20::ac43:46a2
Response IP 104.26.15.49
Found Yes
Hash 13886e1e29b526974728c699818427549709e2f699929066f198e5b039d00e41
SimHash 7850ca08c912

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /script/
Disallow /Connections/
Disallow /mpc/list/
Disallow /mpc/piweek/
Disallow /publications/internal/
Disallow /housing/emails/
Disallow /newsletters/eblast/
Disallow /forms/workshops.asp
Disallow /workshops/flyer/
Disallow /workshops/emails/
Disallow /help/pdf/
Disallow /international/india/
Disallow /bullying/resources/activities/toolkits/spookley/pdf/
Disallow /bullying/newsletter/edition/
Disallow /stc/atfinder/
Disallow /bullying/wewillgen/
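
The rules above can be checked against individual paths with a short sketch using Python's standard urllib.robotparser module. The robots.txt text below is reconstructed from the table, so the exact `User-agent: *` and `Disallow:` lines are an assumption about the original file's layout. The agent name "ExampleBot" and the helper is_allowed are hypothetical names used only for illustration, and the host www.pacer.org is taken from the redirect recorded in the scan.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "Groups" table; the real file may differ in
# comments, ordering, or extra directives that the scan does not list.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /script/
Disallow: /Connections/
Disallow: /mpc/list/
Disallow: /mpc/piweek/
Disallow: /publications/internal/
Disallow: /housing/emails/
Disallow: /newsletters/eblast/
Disallow: /forms/workshops.asp
Disallow: /workshops/flyer/
Disallow: /workshops/emails/
Disallow: /help/pdf/
Disallow: /international/india/
Disallow: /bullying/resources/activities/toolkits/spookley/pdf/
Disallow: /bullying/newsletter/edition/
Disallow: /stc/atfinder/
Disallow: /bullying/wewillgen/
"""

# Parse the rules from text instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the rules above.

    `agent` is a hypothetical crawler name; since only the `*` group
    exists, any agent name is matched against the same rules.
    """
    return parser.can_fetch(agent, "https://www.pacer.org" + path)


if __name__ == "__main__":
    for path in ("/cgi-bin/test.cgi", "/workshops/", "/workshops/flyer/a.pdf"):
        print(path, is_allowed(path))
```

Rules are matched as path prefixes, so /workshops/flyer/ blocks everything beneath it while /workshops/ itself stays crawlable. The prefix comparison is case-sensitive, so /Connections/ and /connections/ are treated as different paths.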