Amazon S3 Output Connection
Create a remote connection using the Amazon S3 output connector to write .csv data from
Salesforce Data Pipelines to an Amazon S3 bucket when using Data Prep.
How Salesforce Data Pipelines Output Connectors with Output Nodes Work
With Salesforce Data Pipelines connectors, you can write the
outcome of a recipe to an external system for further analysis, business automation, and
storage. After you configure an output connector to the data source and folder path using
the steps in this guide, create a recipe using Data Prep. Add an Output node to the recipe
and set it to write to an output connection. Select the S3 connection you configured, and
choose which S3 object will contain the output data. When the recipe runs, Salesforce Data Pipelines writes data to the specified object in the folder path within the
directory. For example, if you specified the folder path
...support/servicedata and chose the object
APAC, we write to the folder APAC at the path ...support/servicedata/APAC. The folder
contains these files:
- The output dataset .csv file, with the naming standard folderpath_0.csv. In our example, the output is APAC_0.csv. For larger datasets, Salesforce Data Pipelines automatically partitions the data into multiple .csv files with increasing numbers appended to the name. In our example, there could be an APAC_1.csv, an APAC_2.csv, and more until all data is written. This file doesn’t have a header row.
- The metadata file schema.json, with information on each column including the name, data type, and precision.
- The _SUCCESS file, generated when the .csv is completely written.
When you run the recipe again, the previously generated files are deleted and new versions of the files are written.
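
The file layout described above can be consumed with a short script. The following is a minimal sketch, not an official client: it assumes the output folder has already been copied to a local directory (for example, with an S3 sync tool), that the files are UTF-8 encoded, and that you supply the column names yourself, because the layout of schema.json isn't described here. The function names `partition_files` and `read_output` are hypothetical.

```python
import csv
import re
from pathlib import Path


def partition_files(folder: Path, object_name: str) -> list[Path]:
    """Return the <object_name>_<n>.csv files in numeric order of n."""
    pattern = re.compile(rf"^{re.escape(object_name)}_(\d+)\.csv$")
    numbered = []
    for path in folder.iterdir():
        match = pattern.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    # Sort on the integer suffix so that _10 comes after _2, not before it.
    return [path for _, path in sorted(numbered)]


def read_output(folder: Path, object_name: str, column_names: list[str]) -> list[dict]:
    """Read every partition, but only once the _SUCCESS marker exists."""
    if not (folder / "_SUCCESS").exists():
        raise RuntimeError("_SUCCESS not found; the write may still be in progress")
    rows = []
    for path in partition_files(folder, object_name):
        # Assumption: UTF-8 encoding. The .csv files have no header row,
        # so each record is paired with the caller-supplied column names.
        with path.open(newline="", encoding="utf-8") as handle:
            for record in csv.reader(handle):
                rows.append(dict(zip(column_names, record)))
    return rows
```

Sorting on the numeric suffix rather than the file name keeps partitions in write order when there are ten or more of them.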
Enable and Add the Amazon S3 Output Connector
- From Setup, in the Quick Find box, enter Analytics, select Analytics, and then select Settings.
- Select Enable Amazon S3 output connection and then save your changes.
- In the Data Manager, click the Connections tab.
- Click New Connection.
- Click Output.
- Click the connector’s icon and enter its settings, as described in the Connection Settings section.
- Click Save & Test. Save & Test validates your settings by attempting to connect to the source. If the connection fails, Salesforce Data Pipelines shows possible reasons.
Connection Settings
All settings require a value, unless otherwise indicated.
| Connection Setting | Description |
|---|---|
| Connection Name | Identifies the connection. Use a convention that lets you easily distinguish between different connections. |
| Developer Name | API name for the connection. The developer name can’t include spaces, and you can’t change it after you create the connection. |
| Description | Description of the connection. |
| Authentication Type | For standard authentication, enter Root. For AWS Identity and Access Management (IAM) authentication, enter IAM. For granular access to AWS data, use IAM authentication, setting up IAM users and roles in AWS. For more information on AWS IAM, see Getting Started with IAM on AWS. |
| Secret Key | Your Amazon secret access key. |
| Region Name | Region of your S3 service. The default region is US East (N. Virginia). For the list of region names, see the AWS service endpoints documentation. |
| Folder Path | Path to the folder that you want to connect to. The path must start with the bucket name and can’t include the name of the subfolder whose data you want to sync. |
| Access Key | Your Amazon S3 bucket access key ID. |
| AWS Role ARN | For IAM authentication only, the ARN of the IAM role. You can find it on the IAM Role Setup in AWS. |
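
The rules in the settings table can be checked before you enter values in the connection dialog. The sketch below is illustrative only: the `S3OutputSettings` class and `validate` function are hypothetical and are not part of any Salesforce API. It assumes that every setting except AWS Role ARN requires a value and that the Role ARN is required when the authentication type is IAM.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class S3OutputSettings:
    # Hypothetical container mirroring the rows of the settings table.
    connection_name: str
    developer_name: str
    description: str
    authentication_type: str  # "Root" or "IAM"
    access_key: str
    secret_key: str
    region_name: str
    folder_path: str
    aws_role_arn: Optional[str] = None  # IAM authentication only


def validate(settings: S3OutputSettings, bucket_name: str) -> list[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    for field in fields(settings):
        if field.name != "aws_role_arn" and not getattr(settings, field.name):
            problems.append(f"{field.name} requires a value")
    if " " in settings.developer_name:
        problems.append("Developer Name can't include spaces")
    if settings.authentication_type not in ("Root", "IAM"):
        problems.append("Authentication Type must be Root or IAM")
    if settings.authentication_type == "IAM" and not settings.aws_role_arn:
        problems.append("AWS Role ARN is required for IAM authentication")
    # The folder path must start with the bucket name.
    if settings.folder_path.split("/", 1)[0] != bucket_name:
        problems.append("Folder Path must start with the bucket name")
    return problems
```

The check doesn't contact AWS. Save & Test remains the way to confirm that the credentials and path actually work.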
Keep these behaviors in mind when working with the Amazon S3 output connector.
- You can use an output connection more than once per recipe, but each output node must use a different object. Each output node can use up to the connector's per-run limit independently, but is still subject to the rolling 24-hour limit.
- Output connections are only available for recipes built with Data Prep.
- Each output connection only supports one folder path. To output to another folder path, create another output connection.
- When the prior run’s files are deleted in preparation for the current run, the earlier version of an output dataset is inaccessible. Set up a process to copy or use the output after each run, and use the _SUCCESS file as an indication that the write is complete.
- Null values are output to AWS as null-[token], where the token is a string of characters. For example, null-OrJLA4JEfCjToa4DDN or null-PaKIW3MEfCaQou9WXM. To interpret the null values, you may need to implement post-processing using the AWS CLI or on the target where data is consumed.
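
Post-processing of the null placeholders on the consuming side could look like the following sketch. It assumes that tokens match the shape of the two examples above (18 alphanumeric characters after `null-`); the actual token format isn't specified, so treat `NULL_TOKEN` as an adjustable guess. The helper names `decode_null` and `decode_row` are hypothetical.

```python
import re
from typing import Optional

# Assumption: tokens look like the documented examples,
# that is, "null-" followed by 18 alphanumeric characters.
NULL_TOKEN = re.compile(r"null-[A-Za-z0-9]{18}")


def decode_null(value: str, pattern: re.Pattern = NULL_TOKEN) -> Optional[str]:
    """Return None when the whole value is a null placeholder, else the value."""
    return None if pattern.fullmatch(value) else value


def decode_row(row: list[str]) -> list[Optional[str]]:
    """Apply decode_null to every field in a parsed .csv record."""
    return [decode_null(value) for value in row]
```

Matching the full value with a fixed token length reduces the chance that a legitimate string that merely starts with `null-` is treated as a null.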

