ncnewsline.com
robots.txt

Robots Exclusion Standard data for ncnewsline.com

Resource Scan

Scan Details

Site Domain ncnewsline.com
Base Domain ncnewsline.com
Scan Status Ok
Last Scan 2024-05-19T12:22:33+00:00
Next Scan 2024-06-18T12:22:33+00:00

Last Scan

Scanned 2024-05-19T12:22:33+00:00
URL https://ncnewsline.com/robots.txt
Domain IPs 104.26.14.193, 104.26.15.193, 172.67.69.213, 2606:4700:20::681a:ec1, 2606:4700:20::681a:fc1, 2606:4700:20::ac43:45d5
Response IP 104.26.14.193
Found Yes
Hash 820632aafaefc56ef41482618ead236ce736b4ed7c0d1f58c93aea51a254f8fe
SimHash 741c6940a4a2
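
The Hash value above is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. As a minimal sketch, assuming the scanner hashes the raw response body with SHA-256 (the report does not say so), a comparable fingerprint could be computed as follows. The function name content_hash is hypothetical.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the scan report's "Hash" field is a SHA-256 digest of the
    raw bytes. The report itself does not confirm the algorithm or input.
    """
    return hashlib.sha256(body).hexdigest()


if __name__ == "__main__":
    sample = b"User-agent: *\nDisallow:\n"
    print(content_hash(sample))
```

Comparing the digest of a freshly fetched file against the stored value is one way to detect that the file changed between scans. The SimHash field is a separate similarity fingerprint and is not reproduced here.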

Groups

*

Rule Path
Disallow (empty)

adsbot-google

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

awariorssbot

Rule Path
Disallow /

awariosmartbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

peer39_crawler

Rule Path
Disallow /

peer39_crawler/1.0

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /
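
The groups above can be checked with Python's standard urllib.robotparser module. The sketch below rebuilds an approximate robots.txt from the listed groups (it is a reconstruction, not the original file) and asks whether a given agent may fetch the site root. The names BLOCKED_AGENTS, build_robots_txt, and is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Agents listed with "Disallow /" in the Groups section of the report.
BLOCKED_AGENTS = [
    "adsbot-google", "amazonbot", "anthropic-ai", "awariorssbot",
    "awariosmartbot", "bytespider", "ccbot", "chatgpt-user", "claudebot",
    "claude-web", "cohere-ai", "dataforseobot", "facebookbot",
    "google-extended", "gptbot", "imagesiftbot", "magpie-crawler",
    "omgili", "omgilibot", "peer39_crawler", "peer39_crawler/1.0",
    "perplexitybot", "youbot",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the report's groups."""
    # The "*" group has an empty Disallow path, which permits everything.
    lines = ["User-agent: *", "Disallow:", ""]
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    return "\n".join(lines)


def is_allowed(agent: str, url: str = "https://ncnewsline.com/") -> bool:
    """Return True if the reconstructed rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "GPTBot", "ClaudeBot"):
        print(agent, is_allowed(agent))
```

Under these rules, an agent with no group of its own falls back to the "*" group and is allowed, while each named agent is blocked from the entire site. Note that urllib.robotparser matches agent names case-insensitively by substring, which may differ from how individual crawlers interpret the file.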

Other Records

Field Value
sitemap https://ncnewsline.com/sitemap_index.xml
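
The sitemap record can be read with the same standard-library parser. This is a minimal sketch, assuming Python 3.9 or later; RobotFileParser.site_maps() was added in Python 3.8, and the helper name sitemaps_from is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from(robots_txt: str) -> list[str]:
    """Return the Sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    body = "Sitemap: https://ncnewsline.com/sitemap_index.xml\n"
    print(sitemaps_from(body))
```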

Comments

  • START YOAST BLOCK
  • ---------------------------
  • ---------------------------
  • END YOAST BLOCK