theatlantic.com
robots.txt

Robots Exclusion Standard data for theatlantic.com

Resource Scan

Scan Details

Site Domain theatlantic.com
Base Domain theatlantic.com
Scan Status Ok
Last Scan 2024-10-29T12:02:55+00:00
Next Scan 2024-11-05T12:02:55+00:00

Last Scan

Scanned 2024-10-29T12:02:55+00:00
URL https://theatlantic.com/robots.txt
Redirect https://www.theatlantic.com/robots.txt
Redirect Domain www.theatlantic.com
Redirect Base theatlantic.com
Domain IPs 151.101.130.133, 151.101.194.133, 151.101.2.133, 151.101.66.133
Redirect IPs 199.232.194.133, 199.232.198.133
Response IP 151.101.42.133
Found Yes
Hash 0998d6ae476aed39c7388e9175174b3041f59324d6792b7af4c9ae0a2c159014
SimHash 7108d953e480

Groups

*

Rule Path
Disallow /4624/TheAtlanticOnline/*
Disallow /magazine/archive/2010/11/letters-to-the-editor/308258/
Disallow /magazine/archive/2010/11/letters-to-the-editor/308258/*
Disallow /ab/*
Disallow /video/embed/
Disallow /video/iframe/*
Disallow /search/?*q=*
Allow /magazine/archive/2001/02/bill-clinton-and-his-consequences/303383/$
Disallow /magazine/archive/2001/02/bill-clinton-and-his-consequences/303383/*
Allow /

Other Records

Field Value
crawl-delay 1
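
The `*` group above mixes Allow and Disallow rules that overlap, for example the two rules for the 2001 Bill Clinton article. The report does not say how a crawler resolves such overlaps. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific rule wins, and Allow wins a tie) and the common `*` and `$` wildcard extensions. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and percent-encoding normalization is left out.

```python
import re

# Rules copied from the `*` group in this report, in the order listed.
RULES = [
    ("disallow", "/4624/TheAtlanticOnline/*"),
    ("disallow", "/magazine/archive/2010/11/letters-to-the-editor/308258/"),
    ("disallow", "/magazine/archive/2010/11/letters-to-the-editor/308258/*"),
    ("disallow", "/ab/*"),
    ("disallow", "/video/embed/"),
    ("disallow", "/video/iframe/*"),
    ("disallow", "/search/?*q=*"),
    ("allow", "/magazine/archive/2001/02/bill-clinton-and-his-consequences/303383/$"),
    ("disallow", "/magazine/archive/2001/02/bill-clinton-and-his-consequences/303383/*"),
    ("allow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    `*` matches any run of characters; a trailing `$` anchors the end
    of the path. Everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under `rules`.

    Assumption: the longest matching pattern wins, and an Allow rule
    beats a Disallow rule of equal length. No match means allowed.
    """
    best = None  # (pattern length, is_allow); tuples compare in that order
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]
```

Under these assumptions the article URL ending in `/303383/` stays crawlable, because the `$`-anchored Allow rule ties the wildcard Disallow rule in length and Allow wins the tie, while any longer path beneath it falls to the Disallow rule.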

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

awariorssbot
awariosmartbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

newsnow

Rule Path
Disallow /

news-please

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

quora-bot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.theatlantic.com/sitemap.xml
sitemap https://www.theatlantic.com/sponsored/sitemap.xml
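
The per-agent groups, the crawl-delay record, and the sitemap records listed in this report can be queried with Python's standard `urllib.robotparser`. The following is a minimal sketch: `ROBOTS_FRAGMENT` is a hand-written reconstruction of a few groups from the report, not the actual file, and `ExampleBot` is a hypothetical user agent that has no group of its own. Note that `urllib.robotparser` does plain prefix matching and does not interpret the `*` and `$` wildcards used in the `*` group, so only simple rules are included here. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a reconstructed fragment based on the groups in this report.
ROBOTS_FRAGMENT = """\
User-agent: *
Allow: /
Crawl-delay: 1

User-agent: claudebot
Disallow: /

User-agent: ccbot
Disallow: /

Sitemap: https://www.theatlantic.com/sitemap.xml
Sitemap: https://www.theatlantic.com/sponsored/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

url = "https://www.theatlantic.com/politics/"

# A named agent with its own "Disallow: /" group is blocked everywhere.
ccbot_allowed = parser.can_fetch("ccbot", url)

# An agent with no dedicated group falls back to the `*` group.
other_allowed = parser.can_fetch("ExampleBot", url)
other_delay = parser.crawl_delay("ExampleBot")

# Sitemap records are global and not tied to any user-agent group.
sitemaps = parser.site_maps() or []

print(ccbot_allowed, other_allowed, other_delay, sitemaps)
```

A crawler that honors the file would check `can_fetch` before each request and wait at least `crawl_delay` seconds between requests when a delay is set.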