Robots Exclusion Standard data for centurylink.net

Scan Details

Site Domain	centurylink.net
Base Domain	centurylink.net
Scan Status	Ok
Last Scan	2025-12-19T07:27:12+00:00
Next Scan	2025-12-26T07:27:12+00:00

Last Scan

Scanned	2025-12-19T07:27:12+00:00
URL	https://centurylink.net/robots.txt
Domain IPs	129.159.71.219
Response IP	129.159.71.219
Found	Yes
Hash	747ba5e62e3e943b7b991cd04184402a14ab61f27d409236abad1434f7735d0b
SimHash	d4157b52c481

Groups

*

Rule	Path
Disallow	/google/
Disallow	/search/
Disallow	/provisioning/
Disallow	/library/
Disallow	/files/
Disallow	/*?*u_d=
Disallow	/*?*email=
Disallow	/*?*e-mail=
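
The last three rules in the * group contain * wildcards. The sketch below shows one way such patterns can be matched, assuming the conventions described in RFC 9309: * matches any run of characters, a trailing $ anchors the end, and everything else is a prefix match against the path plus query string. The function names rule_to_regex and rule_matches and the sample paths are hypothetical; this is not the scanner's or any particular crawler's implementation.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes RFC 9309-style semantics: '*' is a wildcard and a trailing
    '$' anchors the end of the URL; anything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.compile(regex)


def rule_matches(pattern: str, path_and_query: str) -> bool:
    """Return True if the pattern matches from the start of the path."""
    # re.match anchors at the start, which gives prefix-match behaviour.
    return rule_to_regex(pattern).match(path_and_query) is not None


if __name__ == "__main__":
    # Hypothetical sample URLs, checked against rules from the table above.
    for rule, url in [
        ("/*?*email=", "/account?lang=en&email=user@example.com"),
        ("/*?*email=", "/account"),
        ("/search/", "/search/results"),
    ]:
        print(rule, url, rule_matches(rule, url))
```

Under these assumptions, any URL whose query string contains u_d=, email=, or e-mail= falls under a Disallow rule, regardless of the path in front of it.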

admantx
alphabot
anthropic-ai
awariorssbot
awariosmartbot
blexbot
buzzbot
bytespider
ccbot
chatgpt-user
claritybot
claude-web
claudebot
cohere-ai
diffbot
facebookbot
friendlycrawler
google-extended
gptbot
huggingface
imagesiftbot
img2dataset
magpie-crawler
meltwater
neevabot
news-please
newsnow
nutch
omgili
omgilibot
panscient.com
perplexity-ai
perplexitybot
petalbot
piplbot
scoop.it
scrapy
seekr
sentibot
seznambot
turnitinbot
youbot
zumbot

Rule	Path
Disallow	/
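
The two groups above can be exercised with Python's standard urllib.robotparser module. The sketch below is an illustration only: the robots.txt text is reconstructed from the tables in this report rather than copied from the live file, only three of the listed user agents are included for brevity, and the wildcard rules are left out because urllib.robotparser does plain prefix matching. The agent name ExampleBot is hypothetical and stands in for any crawler not named in the second group.

```python
from urllib.robotparser import RobotFileParser

# Reconstruction from the scan report (assumption: not the verbatim file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /google/
Disallow: /search/
Disallow: /provisioning/
Disallow: /library/
Disallow: /files/

User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    base = "https://centurylink.net"
    # A named agent falls into the second group; others use the '*' group.
    for agent, path in [
        ("GPTBot", "/"),
        ("ExampleBot", "/search/results"),
        ("ExampleBot", "/about"),
    ]:
        print(agent, path, parser.can_fetch(agent, base + path))
```

With this reconstruction, an agent named in the second group is refused every path, while any other agent is refused only the paths listed in the * group.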