joinwagroups.link
robots.txt

Robots Exclusion Standard data for joinwagroups.link

Resource Scan

Scan Details

Site Domain joinwagroups.link
Base Domain joinwagroups.link
Scan Status Ok
Last Scan 2025-08-14T05:22:18+00:00
Next Scan 2025-09-13T05:22:18+00:00

Last Scan

Scanned 2025-08-14T05:22:18+00:00
URL https://joinwagroups.link/robots.txt
Domain IPs 2a02:4780:38:97b7:50fd:10db:12e8:98a5, 2a02:4780:39:9ccc:215b:b1bb:95cb:f7ac, 84.32.84.169, 84.32.84.56
Response IP 77.37.75.124
Found Yes
Hash 9fd5708962018403cbc2a33633a16e0837fc377916c3532af29e4deb5a148b2e
SimHash 250b194934d6
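
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched file, although the scan does not name the algorithm. As a minimal sketch under that assumption, a content fingerprint of this kind could be used to detect changes between scans. The function names `content_hash` and `has_changed` are hypothetical and not part of any scanner API.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return a hex SHA-256 digest of the raw robots.txt bytes.

    Assumption: the scanner hashes the unmodified response body.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Compare a freshly fetched body against a stored digest."""
    return content_hash(body) != previous_hash.lower()


if __name__ == "__main__":
    sample = b"User-agent: *\nDisallow: /admin/\n"
    stored = content_hash(sample)
    print(has_changed(sample, stored))         # unchanged content
    print(has_changed(sample + b"\n", stored)) # one extra byte changes the digest
```

An exact digest changes on any single-byte edit, which is presumably why the report also carries a SimHash field for near-duplicate comparison.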

Groups

*

Rule Path Comment
Disallow /admin.php -
Disallow /admin/ -
Disallow /bulk_upload.php -
Disallow /login.php If added
Disallow /db.php -
Disallow /config.php If exists
Disallow /submit_process.php -
Disallow /delete.php -
Disallow /bulk_delete.php -
Disallow /get_groups.php API/AJAX handlers
Disallow /getcountry_groups.php API/AJAX handlers
Disallow /api/ General API directory
Disallow /search.php -
Disallow /search If using clean URLs
Allow /css/ -
Allow /js/ -
Allow /assets/ If exists
Allow /uploads/ -
Allow /uploads/groups/ -
Allow /uploads/group_images/ -
Allow /images/ -

googlebot-image

Rule Path
Allow /uploads/
Allow /uploads/groups/
Allow /uploads/group_images/
Allow /images/

bingbot

Rule Path
Allow /uploads/
Allow /uploads/groups/
Allow /uploads/group_images/
Allow /images/

Other Records

Field Value
sitemap https://joinwagroups.link/sitemap.xml
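
The groups and the sitemap record listed above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds a robots.txt from the scan tables; the exact layout of the live file is an assumption, and `ExampleBot` is a hypothetical crawler name used to exercise the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan tables; the live file's exact layout is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin.php
Disallow: /admin/
Disallow: /bulk_upload.php
Disallow: /login.php
Disallow: /db.php
Disallow: /config.php
Disallow: /submit_process.php
Disallow: /delete.php
Disallow: /bulk_delete.php
Disallow: /get_groups.php
Disallow: /getcountry_groups.php
Disallow: /api/
Disallow: /search.php
Disallow: /search
Allow: /css/
Allow: /js/
Allow: /assets/
Allow: /uploads/
Allow: /uploads/groups/
Allow: /uploads/group_images/
Allow: /images/

User-agent: googlebot-image
Allow: /uploads/
Allow: /uploads/groups/
Allow: /uploads/group_images/
Allow: /images/

User-agent: bingbot
Allow: /uploads/
Allow: /uploads/groups/
Allow: /uploads/group_images/
Allow: /images/

Sitemap: https://joinwagroups.link/sitemap.xml
"""

BASE = "https://joinwagroups.link"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for agent, path in [
        ("ExampleBot", "/admin/"),
        ("ExampleBot", "/uploads/groups/a.png"),
        ("bingbot", "/admin/"),
    ]:
        print(agent, path, allowed(agent, path))
    print(parser.site_maps())
```

If the live file is laid out as the tables suggest, the `googlebot-image` and `bingbot` groups contain only Allow rules. A crawler follows the most specific group that names it and ignores the `*` group, so those two crawlers would not inherit the Disallow rules for `/admin/`, `/api/`, and the other backend paths. Repeating the Disallow rules inside each named group would close that gap.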

Comments

  • ===================================================================
  • robots.txt for JoinWAGroups (joinwagroups.link)
  • Optimized for SEO and Security - Corrected Structure
  • ===================================================================
  • Define rules for all ethical crawlers first
  • --- Disallow rules for ALL crawlers ---
  • Admin / Backend / Sensitive Areas
  • Search results (often low SEO value)
  • Parameter blocking (use with caution - canonicals preferred)
  • Disallow: /*?search=
  • Disallow: /*?tab=
  • Disallow: /*?offset=
  • Disallow: /*?limit= # Blocking pagination parameters
  • --- Allow rules for ALL crawlers ---
  • Allow essential resources for rendering
  • Allow crawling of user-uploaded images (vital for Image Search SEO)
  • Ensure server prevents directory listing (e.g., .htaccess Options -Indexes)
  • Allow main site images
  • --- Specific Bot Rules ---
  • Google Image Bot (Explicitly allow image paths)
  • Bing Bot (Includes image crawling)
  • Example: Block GPTBot if desired (Optional)
  • User-agent: GPTBot
  • Disallow: /
  • ===================================================================
  • Sitemap Location
  • ===================================================================
  • IMPORTANT: Ensure these point to your ACTUAL sitemap URLs
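
The commented-out parameter rules in the list above (`/*?search=`, `/*?tab=`, `/*?offset=`, `/*?limit=`) rely on wildcard matching, where `*` stands for any run of characters and a trailing `$` anchors the end of the URL. Python's `urllib.robotparser` treats these characters literally, so the sketch below uses a small hypothetical helper, `matches`, to show what such patterns would and would not catch if they were enabled.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern with '*' and '$' into a regex.

    '*' matches zero or more characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern: str, path_and_query: str) -> bool:
    """Return True if the pattern matches from the start of the path."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


if __name__ == "__main__":
    # The example paths are illustrative, not taken from the scanned site.
    print(matches("/*?search=", "/groups?search=india"))
    print(matches("/*?search=", "/groups?tab=new&search=india"))
    print(matches("/*?limit=", "/groups?limit=20"))
```

The second call illustrates the "use with caution" note: a pattern written as `?search=` only matches when `search` is the first query parameter, so reordered parameters slip through. That is one reason canonical URLs are preferred over parameter blocking.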