          Generate a Robots.txt File with Business Manager for B2C Commerce


          To create a robots.txt file for one or more sites individually, use Business Manager. The application server serves this robots.txt file to any crawler that requests it. The file is stored as a site preference and can be replicated from one instance to another.

          Required Editions

          Available in: B2C Commerce

          In addition to traditional search engine crawlers such as Googlebot, AI-powered crawlers now request your robots.txt file. These include bots used for large-language-model training (GPTBot, ClaudeBot), AI-powered search results (OAI-SearchBot, PerplexityBot, Claude-SearchBot), and user-initiated AI queries (ChatGPT-User, Claude-User, Perplexity-User).

          When you select Custom robots.txt definition, the sample text includes User-agent entries for these crawlers. Major crawlers honor robots.txt directives, though there can be minor differences in how specific directives are interpreted. For the most current list of AI crawler user-agent tokens, see the documentation published by each provider.

          • You can write up to 50,000 characters to this file in Business Manager.
          • URIs are case-sensitive, and the "/robots.txt" string must be all lowercase.
          • Blank lines aren’t permitted within a single record in the "robots.txt" file.
          • There must be exactly one User-agent field per record. Robots are expected to interpret this field liberally; a case-insensitive substring match of the name, without version information, is recommended.
          • If the User-agent value is "*", the record describes the default access policy for any robot that hasn’t matched any of the other records. Only one such record is allowed in the "/robots.txt" file.
          • The "Disallow" field specifies a partial URI that isn't to be visited. This can be a full path or a partial path; any URI that starts with this value won’t be retrieved. For example,
            Disallow: /help
            disallows both /help.html and /help/index.html, whereas
            Disallow: /help/
            disallows /help/index.html but allows /help.html. An empty value for Disallow indicates that all URIs can be retrieved.
          • At least one Disallow field must be present in the robots.txt file.
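The Disallow prefix rule described in this list can be illustrated with a short Python sketch. The function name is_disallowed is hypothetical, and the sketch models only the plain "starts with" comparison described above; it does not handle wildcards, Allow fields, or User-agent selection.

```python
def is_disallowed(path, disallow_values):
    """Return True if path starts with any non-empty Disallow value."""
    # An empty Disallow value blocks nothing, so it is skipped.
    return any(value and path.startswith(value) for value in disallow_values)


# "Disallow: /help" blocks both example URIs.
print(is_disallowed("/help.html", ["/help"]))
print(is_disallowed("/help/index.html", ["/help"]))

# "Disallow: /help/" blocks only URIs under the /help/ directory.
print(is_disallowed("/help.html", ["/help/"]))
print(is_disallowed("/help/index.html", ["/help/"]))

# An empty Disallow value allows every URI.
print(is_disallowed("/help.html", [""]))
```

Real crawlers apply additional rules, so treat this only as a way to reason about the trailing-slash difference.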
          1. Click App Launcher, and then select Merchant Tools | Site | SEO & Discoverability | Robots.
          2. Select the instance type for which you want to create a robots.txt file. To create a robots.txt file for a Production instance, you can create it on a Staging instance and then replicate the site preferences, where the robots.txt file definition is stored, from the Staging instance to the Production instance.
          3. Select one of these options:
            1. Use the robots.txt file from a deployed cartridge: Use Google Search Console or another third-party tool to generate your robots.txt file, and then add the file to a cartridge on your site path. There can be only one robots.txt file per site. This option is most useful if you want to use the same robots.txt file for multiple sites. However, it isn't recommended, because you usually want different settings for different instance types. For example, you don't want your sandbox or staging sites to be crawled, but you do want your production sites to be crawled. Sharing one file across instance types can cause issues when you replicate code to production. Typically, select this option only before a site goes live, to test the robots.txt file.
            2. Define an instance type-specific robots.txt (recommended): Use this option to have B2C Commerce generate a robots.txt file for you or specify a custom robots.txt file for each of your instances.
          4. If you selected Define an instance type-specific robots.txt, select one of these options:
            1. All spiders are allowed to access any static resources (recommended for Production): Use this if you want your storefront to be crawled and available to external search engines, such as Google. This generates a site-specific robots.txt file that indicates spiders can crawl the static resources for the site.
            2. All spiders are disallowed to access any static resources (recommended for Staging): Use this if you don't want your storefront to be crawled and available to external search engines, such as Google. This generates a site-specific robots.txt file that indicates to spiders that they shouldn't crawl the static resources for the site.
            3. Custom robots.txt definition: Use this option if you want to control which parts of your storefront are crawled and available to search engines and AI crawlers. When you select this option, Business Manager populates the text area with a sample robots.txt file. All entries in the sample are commented out. Uncomment and customize the entries for your site.

              The following sample text was introduced in B2C Commerce 26.1 and includes user-agent entries for AI crawlers:

              
              ################################################################################
              ## SAMPLE robots.txt - Uncomment and customize as needed for your site
              ################################################################################
              ## To allow access to all crawlers, including AI crawlers, use a general rule
              # User-agent: *
              ## To allow or deny access to specific crawlers, enable them individually
              # User-agent: Googlebot
              # User-agent: OAI-SearchBot
              # User-agent: GPTBot
              # User-Agent: ChatGPT-User
              # User-Agent: ClaudeBot
              # User-Agent: Claude-SearchBot
              # User-Agent: Claude-User
              # User-Agent: Perplexity-User
              # User-Agent: PerplexityBot
              
              ## Example rules to allow access to content pages, but not customer
              ## specific or functional pages:
              # Allow: /
              # Disallow: */account/
              # Disallow: */cart/
              # Disallow: */login/
              # Disallow: */wishlist/
              # Disallow: */Login-Show/
              # Disallow: */Registration-Shopper/
              # Disallow: */Checkout/
              # Disallow: */Order/
              # Disallow: */Order-History/
              # Disallow: */Order-Confirmation/
              # Disallow: */Order-Track/
              
              ## A Crawl-delay can help throttle server impact. AI bots may not honor
              ## this, but this is good practice.
              # Crawl-delay: 10
              
              ## To help search engines find your content, add a pointer to your sitemap.
              # Sitemap: https://www.yourdomain.com/sitemap.xml

              If your storefront is built on SFRA or SiteGenesis, we recommend also adding to your custom file the search refinement URL parameter directives shown after this procedure.

          5. Click Apply.
          6. Select Administration | Sites | Manage Sites | site | Cache.
          7. In the Static Content and Page Caches section, click Invalidate.
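If you maintain the custom definition from step 4 outside Business Manager, a small script can assemble the text before you paste it into the text area. The following Python sketch is one possible approach, not a B2C Commerce tool. The function name build_robots_txt and the constant MAX_CHARS are hypothetical; the 50,000-character limit, the sample paths, and the sitemap URL placeholder come from this article.

```python
MAX_CHARS = 50_000  # Business Manager character limit stated in this article


def build_robots_txt(user_agents, disallow_paths, sitemap_url):
    """Build one record per user agent, with no blank lines inside a record."""
    records = []
    for agent in user_agents:
        lines = [f"User-agent: {agent}", "Allow: /"]
        lines += [f"Disallow: {path}" for path in disallow_paths]
        records.append("\n".join(lines))
    records.append(f"Sitemap: {sitemap_url}")
    # Blank lines appear only between records, never within one.
    text = "\n\n".join(records) + "\n"
    if len(text) > MAX_CHARS:
        raise ValueError(f"robots.txt has {len(text)} characters; limit is {MAX_CHARS}")
    return text


robots_txt = build_robots_txt(
    ["Googlebot", "GPTBot"],
    ["*/account/", "*/cart/"],
    "https://www.yourdomain.com/sitemap.xml",
)
print(robots_txt)
```

The sketch writes a separate record for each crawler so that every record has exactly one User-agent field, which matches the rule listed earlier in this article.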

          For information on where to upload your robots.txt file, see Upload a Robots.txt File.

          If your storefront is built on SFRA or SiteGenesis, we recommend adding the following lines to your robots.txt. These directives prevent crawlers from indexing filtered or sorted search result pages, which helps avoid duplicate content and keeps search indexes focused on your primary product and category pages. These parameters are specific to the SFRA and SiteGenesis URL format and don't apply to PWA Kit storefronts, which use SCAPI query parameters such as refine and sort.

          # Search refinement URL parameters (SFRA and SiteGenesis)
          Disallow: /*pmin*
          Disallow: /*pmax*
          Disallow: /*prefn1*
          Disallow: /*prefn2*
          Disallow: /*prefn3*
          Disallow: /*prefn4*
          Disallow: /*prefv1*
          Disallow: /*prefv2*
          Disallow: /*prefv3*
          Disallow: /*prefv4*
          Disallow: /*srule*

          If your storefront is built on PWA Kit (Composable Storefront), we recommend adding the following lines instead. PWA Kit uses SCAPI query parameters (refine, sort, and offset) in its /search and /category/ routes.

          # Search refinement, sort, and pagination URL parameters (PWA Kit)
          Disallow: /search?*refine*
          Disallow: /category/*?*refine*
          Disallow: /search?*sort*
          Disallow: /category/*?*sort*
          Disallow: /search?*offset*
          Disallow: /category/*?*offset*
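To check which storefront URLs the wildcard directives above would cover, you can model the matching in a few lines of Python. This is a minimal sketch that assumes "*" matches any run of characters and that a pattern is compared from the start of the URL path and query string; it ignores the "$" end anchor and Allow precedence, and individual crawlers can differ. The names matches and is_blocked and the test URLs are hypothetical.

```python
import re

SFRA_PATTERNS = ["/*pmin*", "/*pmax*", "/*prefn1*", "/*prefv1*", "/*srule*"]
PWA_KIT_PATTERNS = [
    "/search?*refine*", "/category/*?*refine*",
    "/search?*sort*", "/category/*?*sort*",
    "/search?*offset*", "/category/*?*offset*",
]


def matches(pattern, url):
    """Return True if url matches pattern, treating * as any run of characters."""
    # Escape the literal parts so that "?" and "." are not regex operators.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, url) is not None


def is_blocked(url, patterns):
    """Return True if any Disallow pattern matches the URL."""
    return any(matches(pattern, url) for pattern in patterns)


print(is_blocked("/s/Site/mens?prefn1=color&prefv1=blue", SFRA_PATTERNS))
print(is_blocked("/s/Site/mens", SFRA_PATTERNS))
print(is_blocked("/category/mens?offset=25", PWA_KIT_PATTERNS))
print(is_blocked("/search?q=shirt", PWA_KIT_PATTERNS))
```

Under this model the patterns match substrings anywhere after the fixed prefix, so a URL such as /search?q=assorted would also match /search?*sort*. Test your own URLs before relying on a pattern.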

          Google ignores the Crawl-delay directive in robots.txt, as outlined in https://support.google.com/webmasters/answer/48620?hl=en. To limit Googlebot, set its crawl rate to Low through Google Search Console instead. Most AI crawlers also ignore the Crawl-delay directive. To manage crawl rates for AI bots, consult each provider's documentation.

           