Azure Data Lake Output Connection (Beta)
Create a remote connection using the Azure Data Lake output connector to write .csv data from Salesforce Data Pipelines to Microsoft Azure Data Lake Storage Gen2 when using Data Prep.
How Azure Data Lake Output Connection Works
With Salesforce Data Pipelines connectors, you
can write the outcome of a recipe to an external system for further analysis, business
automation, and storage. After you configure an output connector to the data source and
folder path using the steps in this guide, create a recipe using Data Prep. Add an Output
node to the recipe and select to write to the output connection. Select the Azure Data Lake
connection you configured, and choose which object will contain the output data. When the
recipe runs, Salesforce Data Pipelines writes data to the specified object in the folder
path within the directory. For example, if you specified the folder path
support/servicedata and chose the object APAC, Salesforce Data Pipelines writes to the
folder APAC at the path ...support/servicedata/APAC. The folder
contains these files:
- The output dataset .csv file, with the naming standard folderpath_0.csv. In our example, the output is APAC_0.csv. For larger datasets, Salesforce Data Pipelines automatically partitions the data into multiple .csv files with increasing numbers appended to the name. In our example, there could be an APAC_1.csv, an APAC_2.csv, and more until all data is written. The .csv file doesn’t have a header row.
- The metadata file schema.json, with information on each column, including the name, data type, and precision.
- The _SUCCESS file, generated when the .csv is completely written.
When you run the recipe again, the previously generated files are deleted and new versions of the files are written.
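A downstream consumer typically needs to confirm that the run finished, collect the partition files in order, and attach column names to the headerless rows. The following is a minimal Python sketch of that logic, not part of the connector. The helper names (`partition_files`, `is_complete`, `read_rows`) are hypothetical, the file-name pattern is inferred from the APAC example above, and the column names are passed in by the caller because the exact layout of schema.json isn't described here.

```python
import csv
import io
import re


def partition_files(names, object_name):
    """Return the partition .csv names for object_name, ordered by partition number.

    Assumes the <object>_<n>.csv pattern shown in the APAC example.
    """
    pattern = re.compile(rf"^{re.escape(object_name)}_(\d+)\.csv$")
    numbered = []
    for name in names:
        match = pattern.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    # Sort numerically so that APAC_10.csv comes after APAC_2.csv.
    return [name for _, name in sorted(numbered)]


def is_complete(names):
    """True when the _SUCCESS marker is present in the folder listing."""
    return "_SUCCESS" in names


def read_rows(csv_text, column_names):
    """Parse headerless .csv text into dicts keyed by caller-supplied column names."""
    reader = csv.reader(io.StringIO(csv_text))
    return [dict(zip(column_names, row)) for row in reader]


if __name__ == "__main__":
    listing = ["APAC_1.csv", "schema.json", "APAC_0.csv", "_SUCCESS"]
    if is_complete(listing):
        print(partition_files(listing, "APAC"))
```

Sorting by the numeric suffix rather than by file name keeps the partitions in write order once there are ten or more of them.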
Enable and Add the Azure Data Lake Output Connector
- From Setup, in the Quick Find box, enter Analytics, select Analytics, and then Settings.
- Select Enable Azure Data Lake output connection and then save your changes.
- In the Data Manager, click the Connections tab.
- Click New Connection.
- Click Output.
- Click the connector’s icon and enter its settings, as described in the Connection Settings section.
- Click Save & Test. Save & Test validates your settings by attempting to connect to the source. If the connection fails, Salesforce Data Pipelines shows possible reasons.
Connection Settings
All settings require a value, unless otherwise indicated.
| Connection Setting | Description |
|---|---|
| Connection Name | Identifies the connection. Use a convention that lets you easily distinguish between different connections. |
| Developer Name | API name for the connection. The developer name can’t include spaces, and you can’t change it after you create the connection. |
| Description | Description of the connection for internal use. |
| Access Key | Your Azure Data Lake access key. To learn more, see Microsoft’s Azure Data Lake Help. |
| Account | Your Azure Data Lake account name. |
| Container | The name of the Azure Data Lake container that the connector will write to. |
| Folder Path | Path to the folder that you want to connect to. There must be a subfolder within the specified folder. For example, to write to support/servicedata, specify support for the folder path property in the connection, then choose servicedata when configuring the recipe output node. |
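The relationship between the Folder Path setting and the object chosen in the recipe output node can be sketched as simple path handling. This Python example is illustrative only: `split_target` and `output_folder` are hypothetical helper names, and the forward-slash path form is assumed from the examples in this guide.

```python
import posixpath


def split_target(target):
    """Split a desired target folder into (connection folder path, object).

    The last path segment becomes the object chosen in the output node;
    everything before it becomes the connection's Folder Path.
    """
    parent, _, leaf = target.strip("/").rpartition("/")
    if not parent:
        # The connection needs a folder that contains the target subfolder.
        raise ValueError("target must have a parent folder and a subfolder")
    return parent, leaf


def output_folder(folder_path, object_name):
    """Join the connection's Folder Path and the chosen object into one path."""
    return posixpath.join(folder_path.strip("/"), object_name.strip("/"))


if __name__ == "__main__":
    print(split_target("support/servicedata"))
    print(output_folder("support/servicedata", "APAC"))
```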
Keep these behaviors in mind when working with the Azure Data Lake output connector.
- You can use an output connection more than once per recipe, but each output node must use a different object. Each output node can use up to the connector’s per-run limit independently, but is still subject to the rolling 24-hour limit.
- Output connections are only available for recipes built with Data Prep.
- Each output connection only supports one folder path. To output to another folder path, create another output connection.
- When the prior run’s files are deleted in preparation for the current run, the earlier version of an output dataset is inaccessible. Set up a process to copy or use the output after each run, and use the _SUCCESS file as an indication that the write is complete.
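One way to preserve each run's output is to copy the folder elsewhere once the _SUCCESS file appears. The sketch below assumes, purely for illustration, that the output folder is reachable as a local or mounted directory. `snapshot_if_complete`, `archive_root`, and `run_label` are hypothetical names, and a real process would likely use your Azure tooling of choice instead of a local copy.

```python
import shutil
from pathlib import Path


def snapshot_if_complete(output_dir, archive_root, run_label):
    """Copy output_dir to archive_root/run_label if the _SUCCESS marker exists.

    Returns the destination path, or None when the write isn't complete yet.
    """
    src = Path(output_dir)
    if not (src / "_SUCCESS").is_file():
        # The .csv files may still be partially written; don't copy yet.
        return None
    dest = Path(archive_root) / run_label
    # copytree raises FileExistsError if dest exists, so a snapshot is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

Calling this after each recipe run with a unique `run_label`, such as a run date, keeps earlier versions available after the next run deletes the connector's copy.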

