tintasperu.com
robots.txt

Robots Exclusion Standard data for tintasperu.com

Resource Scan

Scan Details

Site Domain tintasperu.com
Base Domain tintasperu.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a server error.
Last Scan 2025-07-01T06:36:08+00:00
Next Scan 2025-09-29T06:36:08+00:00

Last Successful Scan

Scanned 2022-12-13T02:48:41+00:00
URL http://tintasperu.com/robots.txt
Domain IPs 185.151.30.196, 2a07:7800::196
Response IP 185.151.30.196
Found Yes
Hash eab1f7e4f69d241c930c4bfb085d9484b9e7f5854e24a45a9589d7c668943584
SimHash a0765953c7d5

Groups

mediapartners-google*

Rule Path
Disallow

googlebot-image

Rule Path
Disallow /

googlebot

Rule Path
Disallow /*.jpg$

*

Rule Path
Disallow /

googlebot

Rule Path
Disallow
Disallow /admin
Disallow /account.php
Disallow /advanced_search.php
Disallow /checkout_shipping.php
Disallow /create_account.php
Disallow /login.php
Disallow /login.php
Disallow /password_forgotten.php
Disallow /popup_image.php
Disallow /shopping_cart.php

googlebot

Rule Path
Disallow /MercadoLibre
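
The groups above can be hard to read at a glance because `googlebot` appears three times. The following is a minimal sketch, not the scanner's own logic, of how a crawler following RFC 9309 conventions might evaluate them: groups naming the same user agent are merged, `*` in a path is a wildcard, a trailing `$` anchors the end of the path, and a `Disallow` with no path blocks nothing. The names `GROUPS`, `pattern_to_regex`, `rules_for`, and `is_allowed` are hypothetical, and matching the user-agent token by exact string is a simplifying assumption. Since the file contains only `Disallow` rules, any matching rule is treated as a block.

```python
import re

# Groups transcribed from the scan above: (user-agent token, Disallow paths).
# An empty string stands for a bare "Disallow" line with no path.
GROUPS = [
    ("mediapartners-google*", [""]),
    ("googlebot-image", ["/"]),
    ("googlebot", ["/*.jpg$"]),
    ("*", ["/"]),
    ("googlebot", [
        "", "/admin", "/account.php", "/advanced_search.php",
        "/checkout_shipping.php", "/create_account.php", "/login.php",
        "/login.php", "/password_forgotten.php", "/popup_image.php",
        "/shopping_cart.php",
    ]),
    ("googlebot", ["/MercadoLibre"]),
]


def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex anchored at the path start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rules_for(agent):
    """Merge all groups for this agent; fall back to the '*' group if none match."""
    agent = agent.lower()
    merged = [p for ua, paths in GROUPS if ua == agent for p in paths]
    if not merged:
        merged = [p for ua, paths in GROUPS if ua == "*" for p in paths]
    # An empty Disallow path blocks nothing, so it is dropped here.
    return [p for p in merged if p]


def is_allowed(agent, path):
    """Return True if no Disallow pattern for this agent matches the path."""
    return not any(pattern_to_regex(p).match(path) for p in rules_for(agent))


if __name__ == "__main__":
    checks = [
        ("googlebot", "/index.php"),
        ("googlebot", "/images/foo.jpg"),
        ("googlebot", "/admin/login"),
        ("googlebot-image", "/index.php"),
        ("bingbot", "/"),
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions, `googlebot` may fetch `/index.php` but not `.jpg` files or the listed account and checkout pages, while any crawler without its own group falls under `*` and is blocked from the whole site. The sample paths such as `/images/foo.jpg` are illustrative and do not come from the scan.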

Other Records

Field Value
sitemap http://cdn.attracta.com/sitemap/3283994.xml.gz

Comments

  • Sample robots.txt file (make sure the filename is ALL LOWERCASE on Linux/Unix systems)
  • This file should go in your web site's ROOT directory
  • The root directory is where your site's main /index.html file would be found
  • It is usually found in /yourhomedir/public_html/ or /yourhomedir/httpdocs
  • Where "yourhomedir" is your user account's name
  • We invite you to also check out our popular contribution: Simple Template System (STS)
  • It lets you layout or change your OSC look-and-feel by modifying a single HTML file
  • http://www.oscommerce.com/community/contributions,1524 or SimpleTemplateSystem.com
  • Enjoy! - Brian Gallagher @ DiamondSea.com
  • This says to apply these settings to ALL search engine spiders/crawlers
  • User-agent: *
  • These settings will keep spiders from indexing your unwanted pages
  • This assumes that your OSC install is in your web site's ROOT directory
  • ie: http://www.yoursite.com/index.php <- Use if this brings up your OSC main page
  • User-agent: Mediapartners-Google*
  • Disallow:
  • Feel free to add any other pages on your site that you don't want to be indexed by
  • the search engines.
  • PLEASE NOTE: Any pages that you list here should be secured by other means if you
  • don't want people to be able to view them, as some malicious users will look at a
  • robots.txt file to try to find "hidden" or "secret" areas of web sites to find
  • confidential information.
  • Just Uncomment a line or add new ones as you see fit.
  • Disallow: /private
  • Disallow: /hidden
  • IF YOU DO NOT WISH TO HAVE THE GOOGLE IMAGE BOT SCAN YOUR DOMAIN FOR IMAGES
  • THEN YOU CAN INCLUDE THE FOLLOWING IN YOUR ROBOTS FILE.
  • I FOUND THAT MY BANDWIDTH USAGE DROPPED BY A MASSIVE AMOUNT AFTER I GOT RID
  • OF THE GOOGLE IMAGE BOT. ALL I HAD WAS IMAGE HUNTERS STEALING PRODUCT SHOTS
  • AND NOT EVEN BROWSING THE SITE.
  • User-agent: Googlebot-Image
  • Disallow: /
  • Begin Attracta SEO Tools Sitemap. Do not remove
  • End Attracta SEO Tools Sitemap. Do not remove