matchinmedia.nl
robots.txt
Robots Exclusion Standard data for matchinmedia.nl
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | matchinmedia.nl |
Base Domain | matchinmedia.nl |
Scan Status | Ok |
Last Scan | 2024-11-11T13:24:30+00:00 |
Next Scan | 2024-11-25T13:24:30+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-11T13:24:30+00:00 |
URL | https://matchinmedia.nl/robots.txt |
Domain IPs | 162.159.140.127 |
Response IP | 162.159.140.127 |
Found | Yes |
Hash | 7e10f2919ebf554feef512870a8c99a79353f644988bb7a44ffb82bb95dedd56 |
SimHash | 610c2d638f04 |
Groups
*
Rule | Path |
---|---|
Disallow | /overig/instellingen/generiek |
Disallow | /overig/instellingen/vacaturebank |
Disallow | /aspnet_client/ |
Disallow | /bin/ |
Disallow | /config/ |
Disallow | /data/ |
Disallow | /install/ |
Disallow | /macroScripts/ |
Disallow | /masterpages/ |
Disallow | /umbraco/ |
Disallow | /umbraco_client/ |
Disallow | /usercontrols/ |
Disallow | /xslt/ |
Disallow | /*?* |
Allow | /*?from= |
Allow | /*?page= |
Allow | /*?v= |
Allow | /*?currentpage= |
Allow | /media/* |
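The group above mixes a blanket `Disallow | /*?*` with narrower `Allow` rules, so the outcome for a given URL depends on which rule takes precedence. The sketch below is one way to evaluate the rules, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and `Allow` wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative and not part of the scan data, and individual crawlers may resolve conflicts differently.

```python
import re

# Rules transcribed from the "*" group in the table above.
RULES = [
    ("disallow", "/overig/instellingen/generiek"),
    ("disallow", "/overig/instellingen/vacaturebank"),
    ("disallow", "/aspnet_client/"),
    ("disallow", "/bin/"),
    ("disallow", "/config/"),
    ("disallow", "/data/"),
    ("disallow", "/install/"),
    ("disallow", "/macroScripts/"),
    ("disallow", "/masterpages/"),
    ("disallow", "/umbraco/"),
    ("disallow", "/umbraco_client/"),
    ("disallow", "/usercontrols/"),
    ("disallow", "/xslt/"),
    ("disallow", "/*?*"),
    ("allow", "/*?from="),
    ("allow", "/*?page="),
    ("allow", "/*?v="),
    ("allow", "/*?currentpage="),
    ("allow", "/media/*"),
]


def pattern_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally as a prefix.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path, rules=RULES):
    """Return True if url_path (path plus query string) may be crawled.

    Assumed precedence: the longest matching pattern wins, and an
    Allow rule beats a Disallow rule of equal length.
    """
    best = None  # (pattern length, is_allow)
    for kind, path in rules:
        if pattern_to_regex(path).match(url_path):
            candidate = (len(path), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is allowed by default.
    return True if best is None else best[1]


if __name__ == "__main__":
    for sample in ["/vacatures?page=2", "/vacatures?sort=asc",
                   "/umbraco/login", "/media/logo.png?width=100"]:
        print(sample, "->", "allowed" if is_allowed(sample) else "blocked")
```

Under these assumptions, a query string that begins with `from=`, `page=`, `v=`, or `currentpage=` is crawlable because the `Allow` pattern is longer than `/*?*`, while any other query string is blocked. The sample paths are hypothetical and were not taken from the site.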
Other Records
Field | Value |
---|---|
sitemap | https://3d606fef-9d64-4796-b908-13885cc20929.azurewebsites.net/sitemap.xml |