classified.singtao.ca
robots.txt

Robots Exclusion Standard data for classified.singtao.ca

Resource Scan

Scan Details

Site Domain classified.singtao.ca
Base Domain singtao.ca
Scan Status Ok
Last Scan 2025-08-01T02:50:03+00:00
Next Scan 2025-08-31T02:50:03+00:00
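
The two timestamps above are exactly 30 days apart, which suggests a fixed rescan interval. A minimal sketch of that arithmetic follows; the 30-day interval is inferred from this one record, and `SCAN_INTERVAL` and `next_scan` are hypothetical names, not part of the scanner.

```python
from datetime import datetime, timedelta

# Assumption: the interval is inferred from the Last Scan / Next Scan values
# in this report; the scanner's real scheduling policy is not documented here.
SCAN_INTERVAL = timedelta(days=30)


def next_scan(last_scan_iso: str) -> str:
    """Return the next scan time as an ISO 8601 string."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + SCAN_INTERVAL).isoformat()


print(next_scan("2025-08-01T02:50:03+00:00"))
```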

Last Scan

Scanned 2025-08-01T02:50:03+00:00
URL https://classified.singtao.ca/robots.txt
Domain IPs 52.71.4.49
Response IP 52.71.4.49
Found Yes
Hash 627d1d3949fdf87185cc446050a778fc459b8843f728d4d87fe9f50102b9f862
SimHash 5d7eda11c0be
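
The Hash value is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. The sketch below shows how such a fingerprint could be computed and compared to detect a changed file. It assumes the hash is SHA-256 over the raw response body, which this report does not confirm; `robots_fingerprint` and `has_changed` are hypothetical helper names.

```python
import hashlib

# Hash value copied from the scan record above.
REPORTED_HASH = "627d1d3949fdf87185cc446050a778fc459b8843f728d4d87fe9f50102b9f862"


def robots_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str = REPORTED_HASH) -> bool:
    """True when the body's digest differs from a previously stored digest."""
    return robots_fingerprint(body) != previous_hash.lower()


# Illustrative body only; this is not the site's actual robots.txt.
sample_body = b"User-agent: *\nAllow: /\n"
print(len(robots_fingerprint(sample_body)), has_changed(sample_body))
```

An exact digest only answers whether the file changed at all; the SimHash field presumably serves the separate purpose of estimating how similar two versions are.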

Groups

*

Rule Path
Allow /

Other Records

Field Value
crawl-delay 60

mediapartners-google

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

googlebot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

googlebot-news

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

bingbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

msnbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

slurp

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

yandex

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

baiduspider

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

alexabot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300

ia_archiver

Rule Path
Allow /

Other Records

Field Value
crawl-delay 300
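
To put the crawl-delay values above in perspective, the following sketch converts each delay into an upper bound on requests per day. It assumes a crawler treats crawl-delay as the minimum number of seconds between consecutive requests. That is a common interpretation, but the field is not part of the standardized Robots Exclusion Protocol, and some crawlers (Google's among them) do not honor it. `max_requests_per_day` is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Delays (in seconds) taken from the groups listed above.
CRAWL_DELAYS = {
    "*": 60,
    "mediapartners-google": 2,
    "googlebot": 300,
    "bingbot": 300,
}


def max_requests_per_day(delay_seconds: int) -> int:
    """Upper bound on daily requests if one request is sent per delay window."""
    return SECONDS_PER_DAY // delay_seconds


for agent, delay in CRAWL_DELAYS.items():
    print(f"{agent}: at most {max_requests_per_day(delay)} requests/day")
```

Under this reading, a 300-second delay caps a compliant crawler at 288 requests per day, while the 2-second delay for mediapartners-google allows up to 43,200.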

ai2bot
ai2bot-dolma
aihitbot
amazonbot
andibot
anthropic-ai
applebot
applebot-extended
awario
bedrockbot
brightbot 1.0
bytespider
ccbot
chatgpt-user
claude-searchbot
claude-user
claude-web
claudebot
cohere-ai
cohere-training-data-crawler
cotoyogi
crawlspace
datenbank crawler
devin
diffbot
duckassistbot
echobot bot
echoboxbot
facebookbot
facebookexternalhit
factset_spyderbot
firecrawlagent
friendlycrawler
gemini-deep-research
google-cloudvertexbot
google-extended
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalfetcher
mistralai-user
mistralai-user/1.0
mycentralaiscraperbot
netestate imprint crawler
novaact
oai-searchbot
omgili
omgilibot
operator
pangubot
panscient
panscient.com
perplexity-user
perplexitybot
petalbot
phindbot
poseidon research crawler
qualifiedbot
quillbot
quillbot.com
sbintuitionsbot
scrapy
semrushbot-ocob
semrushbot-swa
sidetrade indexer bot
summalybot
thinkbot
tiktokspider
timpibot
velenpublicwebcrawler
wardbot
webzio-extended
wpbot
yandexadditional
yandexadditionalbot
youbot
piplbot
sosospider

Rule Path
Disallow /
Disallow /admin/
Disallow /ajax/
Disallow /assets/
Disallow /css/
Disallow /js/
Disallow /vendor/
Disallow /main.php
Disallow /index.php
Disallow /mix-manifest.json
Disallow /locale/en
Disallow /locale/zh_Hant
Disallow /locale/zh
Disallow /auth/facebook
Disallow /auth/linkedin
Disallow /auth/twitter
Disallow /auth/google

Other Records

Field Value
sitemap https://classified.singtao.ca/ca/sitemaps.xml
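
The groups above can be checked with Python's standard `urllib.robotparser`. The sketch below parses a shortened excerpt reconstructed from this report (an assumption: the site's actual file layout may differ, and only a few agents are included) and queries access and crawl delay per agent. `ROBOTS_EXCERPT`, `SITE`, and `summarize` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Assumption: excerpt reconstructed from the groups in this report,
# not copied from the live robots.txt.
ROBOTS_EXCERPT = """\
User-agent: *
Allow: /
Crawl-delay: 60

User-agent: mediapartners-google
Allow: /
Crawl-delay: 2

User-agent: googlebot
Allow: /
Crawl-delay: 300

User-agent: gptbot
User-agent: ccbot
Disallow: /
Disallow: /admin/
"""

SITE = "https://classified.singtao.ca"

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())


def summarize(agent: str) -> tuple:
    """Return (may fetch the site root, crawl delay or None) for an agent."""
    return parser.can_fetch(agent, SITE + "/"), parser.crawl_delay(agent)


for agent in ("examplebot", "mediapartners-google", "googlebot", "gptbot"):
    print(agent, summarize(agent))

# An agent with no group of its own falls back to the "*" group.
print(parser.can_fetch("examplebot", SITE + "/admin/"))
```

As reported, the path-specific Disallow rules sit in the same group as the blocked agents. Under this reading they would not restrict agents matched by `*`, whose only rule is Allow /, so the final check returns True for `/admin/`.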

Comments

  • 1. Default Rules (All Bots)
  • 2. Google Ad Manager (FASTER Crawl)
  • 3. Other Google & Bing Bots (Standard Crawl)
  • 4. Blocked Bots (AI, Scrapers, Spam)
  • 5. Disallowed Paths (Security & SEO)
  • 6. Sitemaps