Robots.txt Files for B2C Commerce
When a robot visits a website, such as http://www.yourshophere.com/, it first checks for the robots.txt file and analyzes it to determine whether it can index the site. For Salesforce B2C Commerce to use the robots.txt file, you must set up your hostname alias.
A robots.txt file tells crawlers which parts of your site they can access. B2C Commerce serves this file to any crawler that requests it, including traditional search engine crawlers such as Googlebot and AI-powered crawlers such as GPTBot, ClaudeBot, and PerplexityBot. Major crawlers honor robots.txt directives, though there can be minor differences in how specific directives are interpreted.
A default robots.txt file is included in the storefront reference application cartridge, in the cartridge/static/default directory. PWA Kit storefronts serve a default robots.txt through their Express server configuration. The robots.txt file is delivered as UTF-8 encoded text.
Here is a sample robots.txt file that prevents all robots from visiting the entire site. This is appropriate for non-production instances such as Staging or Sandbox, where you don't want crawlers indexing your content:
User-agent: *
Disallow: /
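As a rough way to confirm what the sample above does, the following Python sketch parses the same two lines with the standard library's urllib.robotparser module and asks whether a given crawler may fetch a URL. The is_allowed helper and the product path are illustrative assumptions, not part of B2C Commerce, and real crawlers can interpret directives slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The sample file from above: block every robot from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration; it parses the text locally
    and makes no network request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The product path below is an assumed example URL.
    for agent in ("Googlebot", "GPTBot", "ClaudeBot"):
        url = "http://www.yourshophere.com/product/123"
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

Because the file has a single wildcard record, every crawler name falls through to the same rule, so the helper reports each of them as blocked.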
The robot simply looks for a /robots.txt URI on your site, where a site is
defined as an HTTP server running on a particular host and port number.
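To illustrate the host-and-port definition of a site, here is a minimal Python sketch that derives the robots.txt location a crawler would request for any page URL. The robots_url function name and the example paths and port are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the /robots.txt URL for the site that serves page_url.

    The scheme, host, and port are kept; the path, query string, and
    fragment are dropped, since robots.txt always sits at the site root.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Example page URLs are hypothetical.
    print(robots_url("http://www.yourshophere.com/womens/shoes?page=2"))
    print(robots_url("https://www.yourshophere.com:8443/cart"))
```

Two URLs that differ only in port number map to different robots.txt locations, which is why each host and port combination is treated as a separate site.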
You can also generate a robots.txt file using Business Manager, Google Search Console, or a third-party tool.
Storefront Password Protection and the Robots.txt File
If Storefront Password Protection is enabled, a robots.txt file is automatically generated and denies access to all static resources for a site. If Storefront Password Protection is disabled, the robots.txt file determines whether content is crawled. Because Storefront Password Protection automatically generates a robots.txt file, it must be disabled before you can specify another type of robots.txt file.
Creating Robots.txt for Single or Multiple Sites
To create a single robots.txt file that can be used for multiple sites, you can use Google Search Console, which requires a Google account. If you choose not to use Google, you can use another third-party tool to create the file. After you create the file, upload it to your cartridge. You must also invalidate the Static Content Cache for a new or different robots.txt file to be generated or served.
Understanding Caching and the Robots.txt File
If caching isn't enabled on your Staging instance, any changes to the robots.txt file are detected immediately. However, if caching is enabled for your Staging instance, you must invalidate the Static Content Cache for a new or different robots.txt file to be generated or served. Invalidating the cache requires permissions in the Administration module. The following instructions include information on invalidating the cache; you can skip those steps if caching isn't enabled.
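One way to check whether a cached copy is still being served is to compare the file you uploaded with what the site returns. The Python sketch below assumes you have the expected file contents on hand; the function names are hypothetical, and the comparison ignores line-ending and trailing-whitespace differences.

```python
import urllib.request


def normalize(text: str) -> str:
    """Unify line endings and strip trailing whitespace for comparison."""
    lines = [line.rstrip() for line in text.replace("\r\n", "\n").split("\n")]
    return "\n".join(lines).strip()


def same_robots(expected: str, served: str) -> bool:
    """Return True if the served robots.txt matches the expected text."""
    return normalize(expected) == normalize(served)


def fetch_robots(url: str, timeout: float = 10.0) -> str:
    """Download a robots.txt file and decode it as UTF-8."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    expected = "User-agent: *\nDisallow: /\n"
    # Assumed URL; replace it with your own hostname alias.
    served = fetch_robots("http://www.yourshophere.com/robots.txt")
    print("up to date" if same_robots(expected, served) else "stale or different")
```

If the comparison reports a difference right after an upload, a cached copy might still be in use, and invalidating the Static Content Cache is the likely next step.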

