mainpost.de
robots.txt

Robots Exclusion Standard data for mainpost.de

Resource Scan

Scan Details

Site Domain mainpost.de
Base Domain mainpost.de
Scan Status Ok
Last Scan 2024-12-20T18:03:46+00:00
Next Scan 2024-12-27T18:03:46+00:00

Last Scan

Scanned 2024-12-20T18:03:46+00:00
URL https://mainpost.de/robots.txt
Redirect https://www.mainpost.de/robots.txt
Redirect Domain www.mainpost.de
Redirect Base mainpost.de
Domain IPs 82.211.32.210
Redirect IPs 82.211.32.210
Response IP 82.211.32.210
Found Yes
Hash ccf37e24589dabcaac2451381488bc22e3f9442834d8f3c7903a894d386f4202
SimHash ba661f182d67

Groups

*

Rule Path
Disallow /member/
Disallow /specials/geschichte/
Disallow /freizeit/
Disallow /_/tools/diaview.html?*
Disallow /_/tools/picview.html?*
Disallow /socialbookmark/
Disallow /_/tools/tedstat.html*
Disallow /admin/
Disallow /alt/*
Disallow /archiv/*
Disallow /leoevtman/
Disallow /mm/
Disallow /tgs-videos/
Disallow /tagesspiegel/
Disallow /mpnlneu/
Disallow /leoevtadrkino/
Disallow /leoevtadr/
Disallow /leoevtart/
Disallow /abfall/*
Disallow /abfall2/*
Disallow /dpa/
Disallow /*?_FRAME=*
Disallow /*?_FRAME=33$
Disallow /*?_FRAME=64$
Disallow /*?po_id=*
Disallow /*?SID*
Disallow /*?fcms=*
Disallow /*?link2=*
Disallow /*?select1=*
Disallow /*?pin_type=*
Disallow /*%26list%3D1$
Disallow /*xmv*%2C*%2C*
Disallow /*?page*
Disallow /register/
Disallow /fcms/
Disallow /xmedia/
Disallow /hos_test/
Disallow /login/
Disallow /wap/
Disallow /topnews-feed/
Disallow /termine2/
Disallow /toplisten/
Disallow /netzwerk/
Disallow /lesezeichen/
Disallow /intern/
Disallow /mainde/
Disallow /sport/hos_toplisten_test_sport/
Disallow /sport/tabellen/*
Disallow /sport/ueberregional/dpa/
Disallow /nachrichten/wirtschaft/afxline/
Disallow /werbung_alt/
Disallow /tools_cms1/
Disallow /tools/
Disallow /*?_CMFUNC*
Disallow /werbung/
Disallow /werbung2009/
Disallow /werbung2008/
Disallow /werbung2007/
Disallow /freizeit/hos_termine/
Disallow /freizeit/suche2/
Disallow /anzeigen/antz2/
Disallow /anzeigen/alt/
Disallow /werbung/adserver-check/
Disallow /demo/
Disallow /user/
Disallow /intern/kantine/
Disallow /contentdiaserienartikel/
Disallow /boxenartikel/
Disallow /*?_FRAME=*
Disallow /_/tools/diaview.html?prev=true*
Disallow /_/tools/pdfpage.html
Disallow /_/tools/pdfpage.html*
Disallow /_/tools/pdfpage.html?arid=*
Disallow /*admin*/
Disallow /_/sendmail.html*
Disallow /_/tools/bb_redirect.html*
Disallow /storage/mct/*
Disallow /storage/epapage/*
Disallow /storage/med/archiv/*
Disallow /archiv-ueberregional/*
Disallow /frames/*
Disallow /*?wt_mc=*
Disallow /*?pk_campaign=RSS
Disallow /_fWS/*
Disallow /_CPiX/*
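
Several of the paths above use the `*` wildcard and the trailing `$` end anchor. The following is a minimal sketch of how such patterns are commonly matched against a URL path plus query string, assuming the usual interpretation (`*` matches any run of characters, a trailing `$` pins the match to the end, everything else is a literal prefix). The function names `pattern_to_regex` and `is_disallowed` are hypothetical, and the sketch skips percent-encoding normalization and Allow/longest-match precedence.

```python
import re
from typing import Iterable


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate one robots.txt path pattern into a compiled regex.

    Assumption: '*' means "any run of characters" and a trailing '$'
    anchors the pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without '$' the pattern is only a prefix, so nothing is appended.
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path: str, disallow_patterns: Iterable[str]) -> bool:
    """Return True if any Disallow pattern matches the start of `path`."""
    # re.match anchors at the beginning of the string, like a prefix rule.
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)


# A few patterns copied from the listing above.
RULES = ["/member/", "/*?_FRAME=33$", "/*admin*/", "/archiv/*"]
```

Under these assumptions, `is_disallowed("/news/?_FRAME=33", RULES)` is true while `is_disallowed("/news/?_FRAME=334", RULES)` is false, because `$` requires the path to end right after `33`. The broader rule `/*?_FRAME=*` in the listing would still cover the second URL.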

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

ccbot

Rule Path
Disallow /
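
The groups above can be checked with Python's standard `urllib.robotparser`. This is a sketch only: the `EXCERPT` text is reconstructed from this listing and is an assumption about how the original file is written, and the user agent name `ExampleBot` is made up for illustration. Note that `urllib.robotparser` treats paths as plain prefixes and does not interpret `*` or `$` inside a path, so only wildcard-free rules are included here.

```python
from urllib.robotparser import RobotFileParser

# Assumed excerpt, rebuilt from the groups listed above (not the verbatim file).
EXCERPT = """\
User-agent: *
Disallow: /member/
Disallow: /admin/

User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(EXCERPT.splitlines())

BASE = "https://www.mainpost.de"

# The dedicated groups block these crawlers from the whole site.
gptbot_allowed = parser.can_fetch("GPTBot", BASE + "/")
ccbot_allowed = parser.can_fetch("CCBot", BASE + "/regional/")

# A crawler without its own group falls back to the '*' group.
other_member = parser.can_fetch("ExampleBot", BASE + "/member/login")
other_regional = parser.can_fetch("ExampleBot", BASE + "/regional/")

print(gptbot_allowed, ccbot_allowed, other_member, other_regional)
```

User agent matching is case-insensitive in this parser, so the lowercase group names shown in the listing also apply to `GPTBot` and `CCBot`.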

Other Records

Field Value
sitemap https://www.mainpost.de/index-sitemap.sitemap.xml
sitemap https://www.mainpost.de/portal-sitemap.sitemap.xml

Comments

  • ID: 1
  • Legal notice: mainpost.de expressly reserves the right to use its content for commercial text and data mining (§ 44b UrhG).
  • The use of robots or other automated means to access mainpost.de or collect or mine data without the express permission of mainpost.de is strictly prohibited.
  • If you would like to apply for permission to crawl mainpost.de, collect or use data, please contact syndication@mainpost.de
  • OpenAI
  • Google Bard
  • Common Crawl Foundation