dl.acm.org
robots.txt

Robots Exclusion Standard data for dl.acm.org

Resource Scan

Scan Details

Site Domain dl.acm.org
Base Domain acm.org
Scan Status Ok
Last Scan 2025-11-06T13:57:10+00:00
Next Scan 2025-12-06T13:57:10+00:00

Last Scan

Scanned 2025-11-06T13:57:10+00:00
URL https://dl.acm.org/robots.txt
Domain IPs 104.18.42.89, 172.64.145.167
Response IP 172.64.145.167
Found Yes
Hash b1000dd8b3ad0e41baba0e6638787999a0c27daad176000f8bab50061896882d
SimHash 6b1ed87087b1

Groups

*

Rule Path
Disallow /action/
Disallow /topic/
Disallow /keyword/
Disallow /author/
Disallow /search/
Disallow /web/
Disallow /pb/widgets/
Disallow /servlet/linkout
Disallow /na101/
Disallow /na101v1/
Disallow /na102/
Disallow /*.cfm
Disallow /doi/metrics/
Disallow /authored-by/
Disallow /history/
Allow /action/showBmPdf
Allow /action/showFmPdf
Allow /action/showAltPdf
Allow /action/showFmHtml
Allow /action/showBmHtml
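The Allow rules above sit under the broader Disallow /action/ rule. Under the longest-match precedence described in RFC 9309, the more specific Allow paths win. The following is a minimal sketch of that evaluation, not the scanner's or any crawler's actual code; the names RULES, pattern_to_regex, and is_allowed are hypothetical, and the rule list is copied from the "*" group above.

```python
import re

# Rules copied from the "*" group in the scan above.
RULES = [
    ("disallow", "/action/"), ("disallow", "/topic/"), ("disallow", "/keyword/"),
    ("disallow", "/author/"), ("disallow", "/search/"), ("disallow", "/web/"),
    ("disallow", "/pb/widgets/"), ("disallow", "/servlet/linkout"),
    ("disallow", "/na101/"), ("disallow", "/na101v1/"), ("disallow", "/na102/"),
    ("disallow", "/*.cfm"), ("disallow", "/doi/metrics/"),
    ("disallow", "/authored-by/"), ("disallow", "/history/"),
    ("allow", "/action/showBmPdf"), ("allow", "/action/showFmPdf"),
    ("allow", "/action/showAltPdf"), ("allow", "/action/showFmHtml"),
    ("allow", "/action/showBmHtml"),
]


def pattern_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path, rules=RULES):
    """Return True if url_path may be fetched under longest-match precedence."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, path in rules:
        if pattern_to_regex(path).match(url_path):
            allow = kind == "allow"
            # The longest pattern wins; on a tie, Allow is preferred.
            if len(path) > best_len or (len(path) == best_len and allow):
                best_len, best_allow = len(path), allow
    return best_allow
```

With these rules, is_allowed("/action/showBmPdf") is True while is_allowed("/action/doSearch") is False. Parsers that apply the first matching rule in file order instead of the longest match can reach a different answer for the /action/show... paths.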

facebookexternalhit
linkedinbot
twitterbot

Rule Path
Allow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
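The named groups above each carry a single site-wide rule, so the outcome for a crawler depends only on which group its product token selects. A minimal sketch of that selection follows, assuming case-insensitive matching of the product token as described in RFC 9309; SITE_WIDE, group_for, and site_wide_policy are hypothetical names, and the table is copied from the groups listed above.

```python
# True means the group is "Allow /", False means "Disallow /".
SITE_WIDE = {
    "facebookexternalhit": True,
    "linkedinbot": True,
    "twitterbot": True,
    "gptbot": False,
    "chatgpt-user": False,
    "ccbot": False,
    "google-extended": False,
}


def group_for(product_token):
    """Return the group name a crawler falls into, or "*" as the fallback."""
    token = product_token.strip().lower()
    return token if token in SITE_WIDE else "*"


def site_wide_policy(product_token):
    """Summarize what the selected group means for the whole site."""
    group = group_for(product_token)
    if group == "*":
        return "path rules from the * group apply"
    return "allow all" if SITE_WIDE[group] else "disallow all"
```

For example, site_wide_policy("GPTBot") gives "disallow all", while an unlisted crawler such as "bingbot" falls back to the path rules in the "*" group.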

Other Records

Field Value
crawl-delay 1

sitemap https://dl.acm.org/sitemap-index-1.txt
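The crawl-delay record is a non-standard field (it is not defined in RFC 9309), and the scan does not show which group it belongs to. A polite crawler might still read it as "wait at least 1 second between requests". The sketch below shows one way to honor that interpretation; the Throttle class and its parameters are hypothetical, and the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time

CRAWL_DELAY_SECONDS = 1.0  # assumed reading of the "crawl-delay 1" record


class Throttle:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self.clock = clock
        self.sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Pause until `delay` seconds have passed since the last call.

        Returns the number of seconds actually slept.
        """
        pause = 0.0
        if self._last is not None:
            elapsed = self.clock() - self._last
            pause = max(0.0, self.delay - elapsed)
            if pause > 0:
                self.sleep(pause)
        self._last = self.clock()
        return pause
```

A crawler would call Throttle(CRAWL_DELAY_SECONDS).wait() before each request, including requests for the sitemap URL listed above.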