d1dagyb1ctngx1.cloudfront.net
robots.txt

Robots Exclusion Standard data for d1dagyb1ctngx1.cloudfront.net

Resource Scan

Scan Details

Site Domain d1dagyb1ctngx1.cloudfront.net
Base Domain d1dagyb1ctngx1.cloudfront.net
Scan Status Ok
Last Scan 2024-11-14T16:39:34+00:00
Next Scan 2024-11-21T16:39:34+00:00

Last Scan

Scanned 2024-11-14T16:39:34+00:00
URL https://d1dagyb1ctngx1.cloudfront.net/robots.txt
Domain IPs 13.33.88.125, 13.33.88.2, 13.33.88.25, 13.33.88.51, 2600:9000:223b:0:e:515c:9940:93a1, 2600:9000:223b:2600:e:515c:9940:93a1, 2600:9000:223b:400:e:515c:9940:93a1, 2600:9000:223b:4600:e:515c:9940:93a1, 2600:9000:223b:4800:e:515c:9940:93a1, 2600:9000:223b:ae00:e:515c:9940:93a1, 2600:9000:223b:b000:e:515c:9940:93a1, 2600:9000:223b:f000:e:515c:9940:93a1
Response IP 13.33.88.2
Found Yes
Hash d136d1732070b900578aa48e7e950fc17d3f39d5fad321e9ddc8d7128224d1a9
SimHash 78166940e283

Groups

*

Rule Path Comment
Disallow /myexpress/ -
Disallow /printer/ We'll keep the print version for our newspaper
Disallow /users/ -
Disallow /sponsored/ Advertorials
Disallow /trackings/ Adserving
Disallow /34722903/ Adserving
Disallow /search?* -
Disallow /videos/get_video_by_uid/ -
Disallow /videos/viewmeta/ -
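
To illustrate how the `*` group above would be applied by a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the parsed rules in this report (the original file's exact formatting and comments are an assumption), and `examplebot` is a hypothetical user agent that has no group of its own. Note that the standard-library parser does plain prefix matching, so it does not interpret the `*` in `/search?*` the way wildcard-aware crawlers do; that rule is included only for completeness.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in the report; formatting is assumed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /myexpress/
Disallow: /printer/
Disallow: /users/
Disallow: /sponsored/
Disallow: /trackings/
Disallow: /34722903/
Disallow: /search?*
Disallow: /videos/get_video_by_uid/
Disallow: /videos/viewmeta/
"""

BASE = "https://d1dagyb1ctngx1.cloudfront.net"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "examplebot" is hypothetical; with no dedicated group it falls back to "*".
allowed_news = parser.can_fetch("examplebot", BASE + "/news/")
allowed_users = parser.can_fetch("examplebot", BASE + "/users/profile")

print("news allowed:", allowed_news)
print("users allowed:", allowed_users)
```

A path is blocked when it starts with any listed `Disallow` prefix, and anything else is allowed by default.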

grapeshot

Rule Path
Disallow
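
The empty `Disallow` path in the grapeshot group conventionally means "nothing is disallowed", which exempts that agent from the `*` restrictions. The sketch below shows this with Python's `urllib.robotparser`. The robots.txt fragment is reconstructed from this report and shortened to a single `*` rule for brevity, and `examplebot` is a hypothetical agent used only for comparison.

```python
from urllib.robotparser import RobotFileParser

# Shortened reconstruction: one "*" rule plus the grapeshot group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /users/

User-agent: grapeshot
Disallow:
"""

BASE = "https://d1dagyb1ctngx1.cloudfront.net"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# grapeshot matches its own group, whose empty Disallow allows everything.
grapeshot_ok = parser.can_fetch("grapeshot", BASE + "/users/profile")

# A hypothetical agent without its own group is bound by the "*" rules.
other_ok = parser.can_fetch("examplebot", BASE + "/users/profile")

print("grapeshot:", grapeshot_ok)
print("examplebot:", other_ok)
```

A crawler that matches a named group uses that group's rules instead of the `*` group, so an empty `Disallow` there acts as a full allow.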

googlebot-news

Rule Path Comment
Disallow /myexpress/ -
Disallow /printer/ We'll keep the print version for our newspaper
Disallow /users/ -
Disallow /fun/ -
Disallow /sponsored/ Advertorials
Disallow /trackings/ Adserving
Disallow /34722903/ Adserving
Disallow /sponsoredfeatures -
Disallow /search?* -
Disallow /videos/get_video_by_uid/ -
Disallow /videos/viewmeta/ -

ia_archiver

Rule Path
Disallow /

nutch

Rule Path
Disallow /

ias_crawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10.0

ias_wombles

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10.0

ias-ie

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10.0

daumoa

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

semetrical

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

perplexity-ai

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.express.co.uk/sitemap.xml
sitemap https://www.express.co.uk/googlenews.xml
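
As a small illustration of how the sitemap records above can be read programmatically, the following sketch uses `RobotFileParser.site_maps()` (available in Python 3.8 and later). The robots.txt fragment is an assumed reconstruction containing one group from this report plus the two sitemap lines.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction: one blocked agent and the two sitemap records.
ROBOTS_TXT = """\
User-agent: claude-web
Disallow: /

Sitemap: https://www.express.co.uk/sitemap.xml
Sitemap: https://www.express.co.uk/googlenews.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap lines are present.
sitemaps = parser.site_maps() or []

for url in sitemaps:
    print(url)
```

Sitemap records are independent of user-agent groups, so they apply regardless of which crawler reads the file.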

Comments

  • 170820-DXD-6728