Amazon S3 Output Connection

Create a remote connection using the Amazon S3 output connector to write .csv data from Salesforce Data Pipelines to an Amazon S3 bucket when using Data Prep.

Note: If you want to sync data from an Amazon S3 bucket to Salesforce Data Pipelines, use an Amazon S3 Connection instead.

How Salesforce Data Pipelines Output Connectors with Output Nodes Work

With Salesforce Data Pipelines connectors, you can write the outcome of a recipe to an external system for further analysis, business automation, and storage. After you configure an output connector to the data source and folder path using the steps in this guide, create a recipe using Data Prep. Add an Output node to the recipe and choose to write to an output connection. Select the S3 connection you configured, and choose which S3 object will contain the output data.

When the recipe runs, Salesforce Data Pipelines writes the data to a folder named after the specified object, inside the connection's folder path. For example, if you specified the folder path ...support/servicedata and chose the object APAC, the data is written to the folder APAC at the path ...support/servicedata/APAC. The folder contains these files:

The output dataset .csv file, with the naming standard folderpath_0.csv, where folderpath is the name of the object's folder. In our example, the output is APAC_0.csv. For larger datasets, Salesforce Data Pipelines automatically partitions the data into multiple .csv files with increasing numbers appended to the name. In our example, there could be an APAC_1.csv, an APAC_2.csv, and more until all data is written. The .csv files don't have a header row.
The metadata file schema.json, with information on each column including the name, data type, and precision.
The _SUCCESS file, generated when the .csv is completely written.
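Because the .csv files have no header row and a large dataset can span several partitions, a consumer usually has to reassemble them. The following Python is a minimal sketch of one way to do that, assuming the output folder has already been downloaded to a local directory. The function names, the column_names argument, and the UTF-8 encoding are assumptions for illustration. The layout of schema.json isn't described in this article, so the sketch takes the column names from the caller instead of parsing that file.

```python
import csv
import re
from pathlib import Path
from typing import Dict, List


def partition_files(folder: Path, object_name: str) -> List[Path]:
    """Return the partition .csv files in numeric order (0, 1, 2, ..., 10)."""
    # Assumption: partitions are named <object>_<n>.csv, as in APAC_0.csv.
    pattern = re.compile(rf"^{re.escape(object_name)}_(\d+)\.csv$")
    numbered = []
    for path in folder.iterdir():
        match = pattern.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    # Sort on the integer suffix so that _10 comes after _2.
    return [path for _, path in sorted(numbered, key=lambda item: item[0])]


def read_output(folder: Path, object_name: str,
                column_names: List[str]) -> List[Dict[str, str]]:
    """Read every partition into dicts, supplying the header the files lack."""
    # The _SUCCESS file signals that the .csv data is completely written.
    if not (folder / "_SUCCESS").exists():
        raise RuntimeError("No _SUCCESS file; the write may not be complete")
    rows: List[Dict[str, str]] = []
    for path in partition_files(folder, object_name):
        # Assumption: the files are UTF-8 encoded.
        with path.open(newline="", encoding="utf-8") as handle:
            rows.extend(csv.DictReader(handle, fieldnames=column_names))
    return rows
```

For example, read_output(Path("APAC"), "APAC", ["Id", "Region"]) would return one dictionary per row, where Id and Region are hypothetical column names that you would replace with the names recorded in schema.json.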

When you run the recipe again, the previously generated files are deleted and new versions of the files are written.
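Because each run replaces the previous files, you might want to copy a completed run elsewhere before the next one starts. The sketch below shows one possible approach, assuming you pass in a boto3-style S3 client (for example, boto3.client("s3")) that provides list_objects_v2 and copy_object. The function names, the archive prefix, and the timestamp format are illustrative assumptions, not part of Salesforce Data Pipelines. Note that the bucket and the key prefix are passed separately here, whereas the connector's folder path starts with the bucket name.

```python
from datetime import datetime, timezone
from typing import List


def list_keys(s3, bucket: str, prefix: str) -> List[str]:
    """List every object key under prefix, following pagination."""
    keys: List[str] = []
    token = None
    while True:
        kwargs = {"Bucket": bucket, "Prefix": prefix}
        if token:
            kwargs["ContinuationToken"] = token
        page = s3.list_objects_v2(**kwargs)
        # "Contents" is absent when no objects match the prefix.
        keys.extend(item["Key"] for item in page.get("Contents", []))
        if not page.get("IsTruncated"):
            return keys
        token = page["NextContinuationToken"]


def archive_run(s3, bucket: str, prefix: str, archive_prefix: str) -> List[str]:
    """Copy a completed run to a timestamped prefix and return the new keys."""
    prefix = prefix.rstrip("/") + "/"
    keys = list_keys(s3, bucket, prefix)
    # Only archive once the _SUCCESS marker shows the write is complete.
    if prefix + "_SUCCESS" not in keys:
        raise RuntimeError("No _SUCCESS object; the write may not be complete")
    # Hypothetical naming scheme: <archive_prefix>/<UTC timestamp>/<file name>.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    copied = []
    for key in keys:
        new_key = f"{archive_prefix.rstrip('/')}/{stamp}/{key[len(prefix):]}"
        s3.copy_object(Bucket=bucket, Key=new_key,
                       CopySource={"Bucket": bucket, "Key": key})
        copied.append(new_key)
    return copied
```

Copying within the same bucket keeps the sketch short; a different destination bucket would only change the Bucket argument passed to copy_object.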

Enable and Add the Amazon S3 Output Connector

From Setup, in the Quick Find box, enter Analytics, select Analytics, and then Settings.
Select Enable Amazon S3 output connection and then save your changes.
In the Data Manager, click the Connections tab.
Click New Connection.
Click Output.
Click the connector’s icon and enter its settings, as described in the Connection Settings section.
Click Save & Test. Save & Test validates your settings by attempting to connect to the source. If the connection fails, Salesforce Data Pipelines shows possible reasons.

Connection Settings

All settings require a value, unless otherwise indicated.

Connection Setting	Description
Connection Name	Identifies the connection. Use a convention that lets you easily distinguish between different connections.
Developer Name	API name for the connection. The developer name can’t include spaces, and you can’t change it after you create the connection.
Description	Description of the connection.
Authentication Type	For standard authentication, enter `Root`. For AWS Identity and Access Management (IAM) authentication, enter `IAM`. For granular access to AWS data, use IAM authentication, and set up IAM users and roles in AWS. For more information on AWS IAM, see Getting Started with IAM on AWS.
Secret Key	Your Amazon secret access key.
Region Name	Region of your S3 service. The default region is US East (N. Virginia). For the list of region names, see the AWS service endpoints documentation.
Folder Path	Path to the folder that you want to connect to. The path must start with the bucket name and can’t include the name of the subfolder whose data you want to sync.
Access Key	Your Amazon S3 bucket access key ID.
AWS Role ARN	For IAM authentication only, the ARN of the IAM role. You can find it on the IAM Role Setup in AWS.

Keep these behaviors in mind when working with the Amazon S3 output connector.

You can use an output connection more than once per recipe, but each output node must use a different object. Each output node can use up to the connector's per-run limit independently, but is still subject to the rolling 24-hour limit.
Output connections are only available for recipes built with Data Prep.
Each output connection only supports one folder path. To output to another folder path, create another output connection.
When the prior run’s files are deleted in preparation for the current run, the earlier version of an output dataset is inaccessible. Set up a process to copy or use the output after each run, and use the _SUCCESS file as an indication that the write is complete.
Null values are output to AWS as null-[token], where the token is a string of characters. For example, null-OrJLA4JEfCjToa4DDN or null-PaKIW3MEfCaQou9WXM. To interpret the null values, you may need to implement post-processing using AWS CLI or on the target where data is consumed.
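As one possible form of the post-processing mentioned above, the following Python sketch replaces null tokens with None after a row has been parsed. It assumes the token consists only of letters and digits, as in the two examples shown; the pattern and the function name are illustrative, so adjust them if your output uses other characters.

```python
import re
from typing import List, Optional

# Assumption: the token after "null-" is alphanumeric, as in
# null-OrJLA4JEfCjToa4DDN. This is inferred from the examples only.
NULL_TOKEN = re.compile(r"null-[A-Za-z0-9]+")


def restore_nulls(row: List[str]) -> List[Optional[str]]:
    """Return the row with null-[token] placeholders replaced by None."""
    # fullmatch ensures only values that are entirely a placeholder are
    # replaced; text that merely contains "null-" is left unchanged.
    return [None if NULL_TOKEN.fullmatch(value) else value for value in row]
```

A genuine field value that happens to look like null-abc would also be converted, so check a sample of your data before relying on a pattern like this.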
