integrate.io
robots.txt

Robots Exclusion Standard data for integrate.io

Resource Scan

Scan Details

Site Domain integrate.io
Base Domain integrate.io
Scan Status Ok
Last Scan 2024-07-03T18:02:14+00:00
Next Scan 2024-07-17T18:02:14+00:00

Last Scan

Scanned 2024-07-03T18:02:14+00:00
URL https://integrate.io/robots.txt
Redirect https://www.integrate.io/robots.txt
Redirect Domain www.integrate.io
Redirect Base integrate.io
Domain IPs 13.227.254.104, 13.227.254.107, 13.227.254.2, 13.227.254.59
Redirect IPs 52.84.229.115, 52.84.229.124, 52.84.229.5, 52.84.229.52
Response IP 52.84.229.115
Found Yes
Hash 81ae8a0d67e6d9b772303a5b0a5259998685f3033a1cce4dfa05f7f8a17ba7c8
SimHash ba955d052440

Groups

*

Rule Path
Disallow /signup/welcome
Disallow /signup/thanks
Disallow /contact/thanks
Disallow /login
Disallow /*q%3D
Disallow /*.atom
Disallow /blog/tag/
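
Two of the rules above (/*q%3D and /*.atom) use the * wildcard. Python's standard urllib.robotparser compares paths by plain prefix, so the following is a minimal sketch of a wildcard-aware check instead. It assumes RFC 9309 matching semantics (* matches any run of characters, a trailing $ anchors the end of the path), and it compares paths literally without normalizing percent-encoding. The helper names pattern_to_regex and is_disallowed are hypothetical, chosen for illustration only. Because this group has no Allow rules, longest-match precedence is not modeled.

```python
import re

# Disallow patterns copied from the "*" group listed above.
DISALLOW = [
    "/signup/welcome",
    "/signup/thanks",
    "/contact/thanks",
    "/login",
    "/*q%3D",
    "/*.atom",
    "/blog/tag/",
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex (hypothetical helper).

    Assumes RFC 9309 semantics: '*' matches any run of characters and a
    trailing '$' anchors the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    # Example paths are illustrative, not taken from the scanned site.
    for path in ["/login", "/blog/feed.atom", "/blog/tag", "/blog/tag/etl", "/pricing"]:
        print(path, "disallowed" if is_disallowed(path) else "allowed")
```

Note that /blog/tag/ carries a trailing slash, so under these assumptions /blog/tag itself stays crawlable while anything beneath it does not.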

Other Records

Field Value
sitemap https://www.integrate.io/sitemap.xml
sitemap https://www.integrate.io/blog-sitemap.xml
sitemap https://www.integrate.io/glossary-sitemap.xml
sitemap https://www.integrate.io/xplenty-docs-sitemap.xml
sitemap https://www.integrate.io/flydata-docs-sitemap.xml
sitemap https://www.integrate.io/dreamfactory-docs-sitemap.xml
sitemap https://www.integrate.io/webinars-sitemap.xml
sitemap https://www.integrate.io/webinars-japanese-sitemap.xml
sitemap https://www.integrate.io/customers-sitemap.xml
sitemap https://www.integrate.io/books-and-guides-sitemap.xml
sitemap https://www.integrate.io/careers-sitemap.xml
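
As a rough illustration of how the sitemap records above can be read programmatically, the sketch below feeds a few lines to Python's standard urllib.robotparser and asks it for the sitemap URLs. The ROBOTS_LINES list is a hypothetical, partial reconstruction built from the tables in this report; the real file's ordering, comments, and remaining entries may differ. RobotFileParser.site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical partial reconstruction from the tables in this report;
# the actual robots.txt may order or annotate these lines differently.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /login",
    "Disallow: /blog/tag/",
    "Sitemap: https://www.integrate.io/sitemap.xml",
    "Sitemap: https://www.integrate.io/blog-sitemap.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# site_maps() returns None when no Sitemap records are present,
# so fall back to an empty list.
sitemaps = parser.site_maps() or []

if __name__ == "__main__":
    for url in sitemaps:
        print(url)
```

To parse the live file rather than a local list, the same parser can be pointed at the redirect target with set_url() followed by read(), which performs a network request.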

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • https://developers.google.com/webmasters/control-crawl-index/
  • To ban all spiders from the entire site uncomment the next two lines: