mpf.mp.br
robots.txt

Robots Exclusion Standard data for mpf.mp.br

Resource Scan

Scan Details

Site Domain	mpf.mp.br
Base Domain	mpf.mp.br
Scan Status	Ok
Last Scan	2025-06-12T14:51:31+00:00
Next Scan	2025-07-12T14:51:31+00:00
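The two timestamps above are exactly 30 days apart, which suggests a fixed rescan interval. The report does not state the scheduling rule, so the following is only a sketch under that assumption; `RESCAN_INTERVAL` is a hypothetical name, not a field of the report.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details table.
last_scan = datetime.fromisoformat("2025-06-12T14:51:31+00:00")
next_scan = datetime.fromisoformat("2025-07-12T14:51:31+00:00")

# Assumption: the scanner reschedules a fixed number of days after each scan.
RESCAN_INTERVAL = timedelta(days=30)

interval = next_scan - last_scan
print(interval.days)                           # 30
print(last_scan + RESCAN_INTERVAL == next_scan)  # True
```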

Last Scan

Scanned	2025-06-12T14:51:31+00:00
URL	https://www.mpf.mp.br/robots.txt
Domain IPs	200.142.30.42
Response IP	200.142.30.42
Found	Yes
Hash	18c39c85791823cc3e99cbf4099cea503f3696ac3b0007debdae19153cf148ac
SimHash	2d1581554d41
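The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. A minimal sketch of how such a fingerprint could be computed, assuming SHA-256 over the raw response body (`body_digest` and the sample body are hypothetical, and the sample is not expected to reproduce the reported hash):

```python
import hashlib

# Hash value copied from the Last Scan table.
REPORTED_HASH = "18c39c85791823cc3e99cbf4099cea503f3696ac3b0007debdae19153cf148ac"

def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes; the report does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()

# Hypothetical body, used only to show the call; it is not the real file.
sample_body = b"User-agent: *\nDisallow: /*?\n"
digest = body_digest(sample_body)

# Same length as the reported value, so the format is at least compatible.
print(len(digest) == len(REPORTED_HASH))
```

Comparing the digest of a freshly fetched body against the stored value is a cheap way to detect whether the file changed between scans.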

Groups

*

Rule	Path
Disallow	/*?
Disallow	/*atct_album_view$
Disallow	/*folder_factories$
Disallow	/*folder_summary_view$
Disallow	/*login_form$
Disallow	/*mail_password_form$
Disallow	/*search
Disallow	/*search_rss
Disallow	/*searchRSS
Disallow	/*updated_search
Disallow	/*sendto_form$
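The rules above rely on two special characters: `*` matches any sequence of characters and a trailing `$` anchors the match at the end of the URL. A rough sketch of how a crawler might evaluate them, assuming the wildcard semantics described by Google and RFC 9309 and ignoring Allow rules and longest-match precedence (this group has only Disallow rules); `rule_to_regex` and `is_disallowed` are hypothetical helper names:

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/*?",
    "/*atct_album_view$",
    "/*folder_factories$",
    "/*folder_summary_view$",
    "/*login_form$",
    "/*mail_password_form$",
    "/*search",
    "/*search_rss",
    "/*searchRSS",
    "/*updated_search",
    "/*sendto_form$",
]

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with ".*" in place of each "*".
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_disallowed(path: str) -> bool:
    """True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(p).match(path) for p in DISALLOW)

print(is_disallowed("/pgr/noticias?set_language=en"))  # True: query string
print(is_disallowed("/portal/login_form"))             # True: ends in login_form
print(is_disallowed("/portal/login_form/help"))        # False: "$" anchors the end
print(is_disallowed("/research"))                      # True: contains "search"
print(is_disallowed("/pgr/noticias"))                  # False
```

The fourth call shows a side effect of the unanchored `/*search` rule under these semantics: any path containing the substring "search" is blocked, not only search pages. The example paths are made up for illustration.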

Other Records

Field	Value
sitemap	/sitemap.xml.gz
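The sitemap value is a relative path, whereas the sitemaps protocol expects a full URL in this field. A consumer that chooses to accept it anyway could resolve it against the scanned robots.txt URL; this is a sketch of that one possible interpretation, not a statement of how any particular crawler behaves.

```python
from urllib.parse import urljoin

# URL of the scanned file and the sitemap value, both copied from the report.
ROBOTS_URL = "https://www.mpf.mp.br/robots.txt"
SITEMAP_VALUE = "/sitemap.xml.gz"

# Assumption: a relative sitemap value is resolved against the robots.txt URL.
sitemap_url = urljoin(ROBOTS_URL, SITEMAP_VALUE)
print(sitemap_url)  # https://www.mpf.mp.br/sitemap.xml.gz
```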

Comments

Define access restrictions for robots/spiders:
http://www.robotstxt.org/wc/norobots.html

By default we allow robots to access all areas of our site accessible to anonymous users, except for search, which burns our CPU for no reason.

Block all URLs that include query strings (the ? pattern): contentish objects expose query strings only for actions or status reports, which might confuse search results. This will also block ?set_language.

Add a Googlebot-specific syntax extension to exclude forms that are repeated for each piece of content in the site. The wildcard is only supported by Googlebot:
http://www.google.com/support/webmasters/bin/answer.py?answer=40367&ctx=sibling
