          Prepare Your Data and AWS S3 Environment for Archive Import

          To successfully import offloaded Salesforce data into the Archive managed package, you must prepare your AWS S3 environment and format your data files correctly.

          Note: This content relates to Archive. For Salesforce Archive, see Store Data Externally with Salesforce Archive.

          Configure AWS S3 Bucket Permissions

          To establish a secure connection, the service needs specific permissions to access your AWS S3 bucket. Create a dedicated IAM policy that grants the necessary permissions, and attach it to the IAM user that the connection uses.

          Summary of Required Permissions

          • List Bucket Contents

            • Action: s3:ListBucket
            • Resource: arn:aws:s3:::my-bucket
            • Condition: Only for objects with the prefix path/to/my/data/*
          • Read Objects

            • Action: s3:GetObject
            • Resource: arn:aws:s3:::my-bucket/path/to/my/data/*

            Use this JSON policy template to grant the required permissions.

            {
              "Version": "2012-10-17",
              "Statement": [
                {
                  "Sid": "AllowListFolder",
                  "Effect": "Allow",
                  "Action": "s3:ListBucket",
                  "Resource": "arn:aws:s3:::my-bucket",
                  "Condition": {
                    "StringLike": {
                      "s3:prefix": "path/to/my/data/*"
                    }
                  }
                },
                {
                  "Sid": "AllowReadObjects",
                  "Effect": "Allow",
                  "Action": "s3:GetObject",
                  "Resource": "arn:aws:s3:::my-bucket/path/to/my/data/*"
                }
              ]
            }
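
          To sanity-check the policy before you connect, you can verify that the IAM user can list and read objects under the allowed prefix. This is a minimal sketch using the boto3 SDK; the bucket name and prefix match the policy template above, so replace them with your own values.

            # Pre-flight permission check (pip install boto3). Assumes the IAM
            # user's access key is configured in your environment.
            import boto3

            s3 = boto3.client("s3")

            # s3:ListBucket -- list a few objects under the allowed prefix
            resp = s3.list_objects_v2(
                Bucket="my-bucket", Prefix="path/to/my/data/", MaxKeys=5
            )
            for obj in resp.get("Contents", []):
                print(obj["Key"])

            # s3:GetObject -- HeadObject requires read permission on the object
            if resp.get("Contents"):
                s3.head_object(Bucket="my-bucket", Key=resp["Contents"][0]["Key"])
                print("List and read permissions look correct.")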

          Meet Data Import Prerequisites

          Your imported data files must meet these requirements.

          • Data Validation Requirements

            • Make sure each CSV file's header includes at least 80% of the fields defined for that object in your current schema, as shown in the sketch after this list.
            • The total size of all data in the AWS S3 bucket can't exceed 3 TB.
          • File Format

            • Make sure each file is in CSV format.
            • Make sure each CSV file contains records for only one Salesforce object type, such as all Case records in Case.csv.
            • Make sure each CSV file name exactly matches the object's API name in your current schema.
          • File Headers

            • The first row of each CSV must be a header row.
            • The headers must contain the exact API names of the fields for that object, such as FirstName.
            • For objects with physical files, you must include the field that contains the file name. Specifically:
              • For Attachment, CSV header must include Name.
              • For ContentVersion, CSV header must include PathOnClient.
          • Location
            • Place all imported data files in a single AWS S3 bucket.
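
          You can check the 80% header-coverage requirement locally before upload. This sketch is illustrative only: the schema field list is a hypothetical placeholder, so substitute the real field lists from your org's schema.

            # Sketch: verify that a CSV header covers at least 80% of an
            # object's schema fields. The field list below is hypothetical.
            import csv

            schema_fields = {
                "Case": ["Id", "Subject", "Status", "Priority", "Origin"],
            }

            def header_coverage(csv_path, object_name):
                with open(csv_path, newline="", encoding="utf-8") as f:
                    header = next(csv.reader(f))
                fields = schema_fields[object_name]
                return sum(1 for field in fields if field in header) / len(fields)

            coverage = header_coverage("Case.csv", "Case")
            print(f"Case.csv covers {coverage:.0%} of the Case schema fields")
            assert coverage >= 0.8, "Header must cover at least 80% of fields"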

          When your AWS S3 environment is configured and your data is correctly formatted, initiate the Data Load from AWS S3 to Archive.

          • Structure Files and Attachments for Archive Import
            To import ContentVersion files and Attachment records into the Archive managed package, prepare your physical files and the corresponding metadata, stored in CSV files, so that Archive can correctly process and link your data.
          • Initiate the Data Load from AWS S3 to Archive
            Start the process of loading your prepared data from an AWS S3 bucket into the secure staging area. This action is the first active step in the import process and prepares your data for the final archival step in the Archive managed package.
          • Create and Run an Archive Import Policy
            After your offloaded Salesforce data has been successfully loaded into the Archive staging area, you must create and run a special import policy to complete the process. Configure, execute, and monitor this policy to move your staged data permanently into the Archive managed package.
          • Archive Import FAQs
            Frequently asked questions about Archive Import.

          Structure Files and Attachments for Archive Import

          To import ContentVersion files and Attachment records into the Archive managed package, prepare your physical files and the corresponding metadata, stored in CSV files, so that Archive can correctly process and link your data.

          Note: This content relates to Archive. For Salesforce Archive, see Store Data Externally with Salesforce Archive.

          Make sure that you meet the general data and AWS S3 environment requirements for Archive Import.

          General CSV Requirements

          • Each CSV file in the import process must include an Id column that contains the record's 18-character Salesforce ID.
          • Each CSV header must include the Id field and all other relevant standard and custom fields defined for the object in your Salesforce schema.

          Import ContentVersion Files

          To import ContentVersion files:

          1. In your S3 bucket, create a directory named ContentDocuments.
          2. Place all your physical files, such as PDF and PNG, inside the ContentDocuments directory.
            Note: It's critical that each file name matches the original file name first uploaded into Salesforce. If you have two different files with the same name, this process fails.
            Note: If the same file is associated with multiple records in Salesforce, insert only one copy of that file in the ContentDocuments directory. The Archive managed package attaches that single file to all records that reference it.
          3. Provide these CSV files. You import these files together, so there's no required import order.
            • ContentDocument.csv: This file creates the parent ContentDocument records, which act as containers for file versions. These CSV fields are mandatory.
              • Id: The file's unique 18-character ContentDocument ID.
              • All standard fields like Description and all custom fields like My_Custom__c from your ContentDocument schema.
            • ContentVersion.csv: This file links the file version to its root ContentDocument record. These CSV fields are mandatory.
              • Id: The file's unique 18-character ContentVersion ID.
                Note: This ID must match the file name in your S3 bucket.
              • ContentDocumentId: The ID from your ContentDocument.csv. This ID links the version to its container.
              • PathOnClient: The exact original file name of the corresponding physical file in the ContentDocuments directory.
              • VersionData: This field must be empty.
              • All standard and custom fields from your ContentVersion schema.
            • ContentDocumentLink.csv: This file links the ContentDocument record and all its versions to its associated root Salesforce records, such as Accounts and Cases. These CSV fields are mandatory.
              • Id: The file's unique 18-character ContentDocumentLink ID.
              • ContentDocumentId: The ID from your ContentDocument.csv. This ID links the file version to its container.
              • LinkedEntityId: The Salesforce ID of the record associated with the file.
              • All standard and custom fields from your ContentDocumentLink schema.
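
          To make the linkage concrete, here's a hypothetical excerpt of the three CSVs; all IDs and file names are invented for illustration. The ContentDocumentId column ties each version and each link back to its ContentDocument container, and the trailing comma in ContentVersion.csv leaves VersionData empty.

            ContentDocument.csv:
              Id,Description
              0695g000001DocAAAQ,Quarterly report container

            ContentVersion.csv:
              Id,ContentDocumentId,PathOnClient,VersionData
              0685g000001VerAAAQ,0695g000001DocAAAQ,report.pdf,

            ContentDocumentLink.csv:
              Id,ContentDocumentId,LinkedEntityId
              06A5g000001LnkAAAQ,0695g000001DocAAAQ,0015g000001AcctAAQ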

          Import Attachment Records

          The process for importing standard attachments follows a simpler logic than the ContentVersion process.

          1. In your S3 bucket, create a directory named Attachments.
          2. Place all your attachment files inside the Attachments directory.
            Note: Each file name must match the original file name uploaded to Salesforce. If you have two different files with the same name, this process fails.
          3. Create a CSV file named Attachment.csv. This file contains the metadata and the root record link for every attachment file in the directory. These CSV fields are mandatory.
            • Id: The attachment's unique 18-character Attachment ID.
            • ParentId: The Salesforce ID of the record associated with this attachment.
            • Name: This field must contain the exact original file name of the corresponding physical file in the Attachments directory.
            • All standard and custom fields from your Attachment schema.
            Note: If you have a large volume of attachments, you can split the metadata into multiple indexed files, such as Attachment_1.csv and Attachment_2.csv.
            Note: All CSV files must remain in the root folder. Don't place them inside the ContentDocuments or Attachments subdirectories.
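
          Put together, a correctly structured bucket looks like this. The bucket name, path, and file names are examples only.

            s3://my-bucket/path/to/my/data/
                Case.csv
                Attachment.csv
                ContentDocument.csv
                ContentVersion.csv
                ContentDocumentLink.csv
                Attachments/
                    invoice.pdf
                ContentDocuments/
                    report.pdf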

          Solve File Name Collisions

          A file name collision happens when two different physical files share the same original name. You can't place both files in the same S3 directory. To resolve the conflict:

          1. Rename the physical file in your S3 bucket from its original name to its unique 18-character Salesforce ID followed by the file suffix. For example:
            • Attachment record invoice.pdf becomes 00P5g000002aBCDEFGA.pdf
            • ContentVersion file report.pdf becomes 0685g000001ABCDEFE.pdf
          2. Add this new ID-based file name in the appropriate CSV field so that the Archive managed package can find the file.
            • For ContentVersion files, update the PathOnClient field.
            • For Attachment records, update the Name field.
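
          You can detect collisions and compute the ID-based replacement names before you upload. This sketch uses hypothetical record IDs and file names; adapt the input to however you export your Attachment metadata.

            # Sketch: find file-name collisions among attachments and print the
            # ID-based names to use instead. IDs and names are hypothetical.
            from collections import defaultdict
            from pathlib import PurePosixPath

            # (Attachment Id, original file Name) pairs, e.g. from Attachment.csv
            records = [
                ("00P5g000002aBCDEFG", "invoice.pdf"),
                ("00P5g000002aBCDXYZ", "invoice.pdf"),  # collision
                ("00P5g000002aBCDLMN", "receipt.pdf"),
            ]

            by_name = defaultdict(list)
            for rec_id, name in records:
                by_name[name].append(rec_id)

            for name, ids in by_name.items():
                if len(ids) > 1:  # two different files share one name
                    suffix = PurePosixPath(name).suffix
                    for rec_id in ids:
                        print(f"rename {name} -> {rec_id}{suffix}; "
                              f"update Name in Attachment.csv")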

          Validate and Initiate the Data Load

          After you structure your S3 directories and place your corresponding CSVs in the root of the S3 bucket, you can initiate the data load.

          We recommend that you validate the import process first.

          1. Go to the Activities tab in Archive.
          2. Before you start the archive, you can download the activity log to validate that all CSVs and files loaded successfully.
            This log shows the number of records, such as Cases and Attachments, and a list of the physical files that are ready for import. You can use this log to identify failures, such as missing files, before you commit to the archive.

          Initiate the Data Load from AWS S3 to Archive

          Start the process of loading your prepared data from an AWS S3 bucket into the secure staging area. This action is the first active step in the import process and prepares your data for the final archival step in the Archive managed package.

          Note: This content relates to Archive. For Salesforce Archive, see Store Data Externally with Salesforce Archive.

          Initiate the Data Load

          Warning: To prevent errors, ensure your S3 source folder contains only the CSV files you intend to import. Remove any test files, backups, or unrelated reports before starting.
          1. From the Archive Home Page, click the Import Data icon.
          2. Enter the connection details for your S3 bucket:
            • AWS Access Key ID
            • AWS Secret Access Key
            • S3 bucket Path

              Example: //my-example-bucket/path/to/folder

            • S3 Region

              Example: us-east-2

          3. Click Import to begin the process.
            Note: Import is currently available only from an S3 bucket.
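
          If the connection fails at this step, you can test the same details outside the UI. Here's a minimal sketch with boto3; the keys, bucket, and region are placeholders that mirror the form fields above.

            # Sketch: pre-flight check of the S3 connection details.
            # All values are placeholders; substitute your own.
            import boto3

            s3 = boto3.client(
                "s3",
                aws_access_key_id="AKIA...",      # AWS Access Key ID
                aws_secret_access_key="...",      # AWS Secret Access Key
                region_name="us-east-2",          # S3 Region
            )
            # Succeeds only if the keys, bucket, and region all line up.
            s3.list_objects_v2(
                Bucket="my-example-bucket", Prefix="path/to/folder/", MaxKeys=1
            )
            print("Connection details are valid.")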

          Monitor the Data Load Process

          1. Validation: The system first validates the Amazon S3 connection and file formats. If an error occurs, the process stops, and a message is displayed.
          2. Staging: Upon successful validation, the data is loaded into the secure Archive staging area.
          3. Track Progress: Go to the Activities tab to view the status of the Import-data-load activity. Progress is measured by the number of files processed.
            Note: The Import-data-load action is only visible on the Activities tab and doesn't appear in the Recent Activities section on the homepage.

          When the Import-data-load activity is completed, you must Create and Run an Archive Import Policy.

          Create and Run an Archive Import Policy

          After your offloaded Salesforce data has been successfully loaded into the Archive staging area, you must create and run a special import policy to complete the process. Configure, execute, and monitor this policy to move your staged data permanently into the Archive managed package.

          Note: This content relates to Archive. For Salesforce Archive, see Store Data Externally with Salesforce Archive.

          This feature is only available after data is successfully loaded into the staging area. Ensure the Import-data-load activity has completed successfully before proceeding.

          Create an Archive Import Policy

          The option to create an import policy appears only after the system detects that data exists in the staging area.

          1. Go to the Policies tab.
          2. Select New Archive Policy.
          3. Click the Archive from Imported Data tab.
          4. Configure the policy settings as needed.
            Important: To archive 1–2 million records daily, archive the entire object tree as a single process. In your Import Policy, select Archive All Related Records. This reduces processing overhead and optimizes S3 file creation.

          Key Characteristics of an Import Policy

          • On-Demand Only: The scheduler is disabled. You must use the Run Now option.
          • Runs on Staged Data: The policy processes data from the staging area, not your live Salesforce org, so there's no performance impact.
          • No Query Customization: The policy automatically processes all staged records for the selected root object. Filters aren't available.

          Configure Relationship Archiving

          • Don't Archive Related Records: This setting performs a targeted archive. It includes only the primary records that match your query and their directly dependent child records (those in a cascade-delete relationship). All other related records, such as those in standard lookup relationships (for example, Contacts on an Account), are excluded from Archive.
          • Archive All Related Records: Select this option to automatically archive all child records connected to the root record via a lookup relationship. This is a comprehensive way to ensure that all related data is archived together.
          • Manually Select Relationships: This option provides specific control, allowing you to choose exactly which direct child relationships to include in the archive (first level only). After you select this option, a list of available child relationships appears, and you can select the specific objects to archive along with the root sObject.

          Run and Monitor the Archive Import Policy

          1. Execute the Policy: Use the Run Now option on the policy to start the final archiving process.
          2. Monitor the Archive Job
            • Track the status in the Activities tab. The job is logged with the action type Import-data-archive.
            • If a job is aborted, the state of the staged data is saved so the job can be resumed later.

          The system must complete the full Import-data-load and Import-data-archive cycle for the current dataset before a new import can begin. Once the Import-data-archive activity is complete, your data is securely stored, and you're free to start a new import cycle.

          Retention Policy

          Select the retention criteria from one of these date fields.

          • Archived Date
          • Created Date
          • Last Modified Date

          Select the amount of time you want to retain the archived data. The minimum retention period is one month, and the maximum is 99 years and 11 months.

          A retention period of zero years and zero months means that data is automatically purged within 24 hours.

          Archive Import FAQs

          Frequently asked questions about Archive Import.

          Note: This content relates to Archive. For Salesforce Archive, see Store Data Externally with Salesforce Archive.

          Q: What do I do if I can't connect to my Amazon S3 bucket?

          A: If you can't connect to your S3 bucket, check your credentials and permissions. Make sure that you have the correct access key, secret key, and path.

          Q: What happens if the file names or data are incorrect during the Import Data loading stage?

          A: If the file names or data are incorrect, you can identify them from a list in the audit file in the Activities tab. You can fix these errors and try the loading process again.

          Q: If an import is in progress, can I start a new one?

          A: No, if an import is running, you can't start a new one. An error message appears in the UI that says another process is already in progress. Wait for the current import to finish, or contact support to release the lock. After you archive all the imported records, you can start a new import. Make sure to import all the required data in the same process.

          Q: Should I separate my import into multiple jobs by file type, such as loading tasks first and then images?

          A: No. For optimal performance, include all CSVs and content files in a single import job. Running separate jobs for different file types, such as loading tasks in one job and images in another job, requires Archive to transfer the database to and from S3 repeatedly. This repetition significantly slows down the loading process.

          Q: Can I run multiple archive policies at the same time?

          A: No, you can only run one archive policy at a time. If you try to run another policy, you get an error message in the UI. Because you're working in the same database, you can't have one archive process delete data while another process is reading it.

          Q: How should I select relations when archiving imported data?

          A: When you define your policy, specify what lookup relationships to include. Master-detail relationships are included automatically.

          If you aren't sure which lookup relationships to include, select All Relations. This option makes sure that the app archives the root record and all its related child records. If you unarchive a related child record later, the app restores the entire record hierarchy, starting from the root record.

          Q: How can I monitor the progress of my archive process?

          A: You can monitor the progress of your archive process by using a progress bar in the Activities tab. It shows the percentage of archived records out of the total number of records.

          Q: What happens if the archive process fails to delete records from the secure staging area?

          A: If the archive process fails to delete records from the staging area, an error message appears in the audit file, but the archive still runs to completion. To prevent duplicates, the system doesn't archive the same record more than once, even if a cleanup error occurs afterward.

          Q: If the archive process gets stuck, can I restart it manually?

          A: Yes, if the archive process gets stuck, and a new process doesn't start automatically, you can restart it manually. However, if you experience this issue, contact support to make sure that everything is working correctly.

          Q: What do I do if I don't see the audit file in the Activities tab?

          A: If you don't see the audit file or activity logs in the Activities tab, check your filters. Make sure you're not filtering by a start or end date that excludes the current activity. If the issue persists, contact support for assistance.

          Q: How many records can be archived in a single job?

          A: A single archive job can handle up to one million records. If you have more than one million records to archive, the process automatically splits into multiple jobs, each handling up to one million records.

          Q: If there are leftover records from a previous archive, can I start a new import?

          A: No, you must archive the leftover records before starting a new import. If you get stuck, support can provide information about which records are missing and require archiving.

          Q: How long does the archive process take?

          A: The time it takes to archive records from the staging area depends on the number of records in your database and how many you're archiving. The progress bar on the Activities tab helps you estimate the remaining time.

          Q: What if I see a message that says an object is archived, but it wasn't?

          A: Under normal conditions, this message appears only for objects that were actually archived. If you see it for an object that wasn't archived, the message is left over from a previous failed delete operation. Contact support to resolve the issue and make sure that the records are properly archived.
