Azure Data Lake Output Connection (Beta)

Create a remote connection using the Azure Data Lake output connector to write .csv data from Salesforce Data Pipelines to Microsoft Azure Data Lake Storage Gen2 when using Data Prep.

Note If you want to sync data from Microsoft Azure Synapse Analytics or dedicated SQL pool to Salesforce Data Pipelines, use a Microsoft Azure Synapse Analytics Connection instead. Or, to sync data from Microsoft Azure SQL Database to Salesforce Data Pipelines, use a Microsoft Azure SQL Database Connection.

How Azure Data Lake Output Connection Works

Note This feature is a Beta Service. Customer may opt to try such Beta Service in its sole discretion. Any use of the Beta Service is subject to the applicable Beta Services Terms provided at Agreements and Terms.

With Salesforce Data Pipelines connectors, you can write the outcome of a recipe to an external system for further analysis, business automation, and storage. After you configure an output connector to the data source and folder path using the steps in this guide, create a recipe using Data Prep. Add an Output node to the recipe and choose to write to an output connection. Select the Azure Data Lake connection you configured, and choose which object will contain the output data. When the recipe runs, Salesforce Data Pipelines writes data to the specified object in the folder path within the directory. For example, if you specified the folder path support/servicedata and chose the object APAC, we write to the folder APAC at the path ...support/servicedata/APAC. The folder contains these files:

The output dataset .csv file, with the naming standard folderpath_0.csv. In our example, the output is APAC_0.csv. For larger datasets, Salesforce Data Pipelines automatically partitions the data into multiple .csv files with increasing numbers appended to the name. In our example, there could be an APAC_1.csv, an APAC_2.csv, and more until all data is written. This file doesn’t have a header row.
The metadata file schema.json, with information on each column including the name, data type, and precision.
The _SUCCESS file, generated when the .csv is completely written.

When you run the recipe again, the previously generated files are deleted and new versions of the files are written.
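The folder layout described above can be read back with a short script. The following is a minimal sketch, not part of the product: it assumes the output folder has already been downloaded or mounted as a local directory, that the files are UTF-8 encoded, and that the partition files follow the objectname_N.csv pattern from the example. The function names partition_files and read_output are hypothetical. Because the .csv output has no header row, the sketch returns raw rows; column names would come from schema.json, whose exact layout isn't described here.

```python
import csv
import re
from pathlib import Path


def partition_files(folder: Path, object_name: str) -> list[Path]:
    """Return the <object>_<n>.csv files in numeric order (_0, _1, _2, ...)."""
    pattern = re.compile(rf"^{re.escape(object_name)}_(\d+)\.csv$")
    numbered = []
    for path in folder.iterdir():
        match = pattern.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    # Sort by the numeric suffix so that _10 comes after _2.
    return [path for _, path in sorted(numbered)]


def read_output(folder: Path, object_name: str) -> list[list[str]]:
    """Read every row from a completed output folder (no header row expected)."""
    if not (folder / "_SUCCESS").exists():
        raise RuntimeError(f"{folder} has no _SUCCESS file; the write may be incomplete")
    rows: list[list[str]] = []
    for path in partition_files(folder, object_name):
        # UTF-8 is an assumption; adjust if your files use another encoding.
        with path.open(newline="", encoding="utf-8") as handle:
            rows.extend(csv.reader(handle))
    return rows
```

For the example above, read_output(Path("APAC"), "APAC") would return the rows of APAC_0.csv, APAC_1.csv, and so on, in order.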

Enable and Add the Azure Data Lake Output Connector

From Setup, in the Quick Find box, enter Analytics, select Analytics, and then Settings.
Select Enable Azure Data Lake output connection and then save your changes.
In the Data Manager, click the Connections tab.
Click New Connection.
Click Output.
Click the connector’s icon and enter its settings, as described in the Connection Settings section.
Click Save & Test. Save & Test validates your settings by attempting to connect to the source. If the connection fails, Salesforce Data Pipelines shows possible reasons.

All settings require a value, unless otherwise indicated.

Connection Setting	Description
Connection Name	Identifies the connection. Use a convention that lets you easily distinguish between different connections.
Developer Name	API name for the connection. The developer name can’t include spaces, and you can’t change it after you create the connection.
Description	Description of the connection for internal use.
Access Key	Your Azure Data Lake access key. To learn more, see Microsoft’s Azure Data Lake Help.
Account	Your Azure Data Lake account name.
Container	The name of the Azure Data Lake container that the connector will write to.
Folder Path	Path to the folder that you want to connect. There must be a subfolder within the specified folder. For example, to write to `support/servicedata`, specify `support` for the folder path property in the connection, then choose `servicedata` when configuring the recipe output node.
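
The way the Folder Path setting and the object chosen in the Output node combine can be illustrated with a small helper. This is a sketch for illustration only: output_folder and first_csv_path are hypothetical names, not part of any Salesforce or Azure API, and the paths are relative to whatever precedes the folder path in your storage account.

```python
from pathlib import PurePosixPath


def output_folder(folder_path: str, object_name: str) -> str:
    """Join the connection's folder path and the object chosen in the Output node."""
    return str(PurePosixPath(folder_path.strip("/")) / object_name)


def first_csv_path(folder_path: str, object_name: str) -> str:
    """Relative path of the first output file, named <object>_0.csv in the object folder."""
    return f"{output_folder(folder_path, object_name)}/{object_name}_0.csv"


print(output_folder("support", "servicedata"))        # support/servicedata
print(first_csv_path("support/servicedata", "APAC"))  # support/servicedata/APAC/APAC_0.csv
```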

Keep these behaviors in mind when working with the Azure Data Lake output connector.

You can use an output connection more than once per recipe, but each output node must use a different object. Each output node can use up to the connector’s per-run limit independently, but is still subject to the rolling 24-hour limit.
Output connections are only available for recipes built with Data Prep.
Each output connection only supports one folder path. To output to another folder path, create another output connection.
Because the prior run’s files are deleted in preparation for the current run, the earlier version of an output dataset is no longer accessible. Set up a process to copy or use the output after each run, and use the _SUCCESS file as an indication that the write is complete.
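
One way to approach the copy process mentioned above is to wait for the _SUCCESS file and then copy the whole output folder to a location that later runs won't touch. The following is a minimal sketch under stated assumptions: the output folder is reachable as a local directory (for example, through a mount or a sync job that you set up separately), and wait_for_success, archive_run, and the run label are hypothetical names chosen for this example. Accessing Azure Data Lake Storage directly would require Microsoft's tooling, which isn't shown here.

```python
import shutil
import time
from pathlib import Path


def wait_for_success(folder: Path, timeout_s: float = 600.0, poll_s: float = 5.0) -> bool:
    """Poll until the _SUCCESS marker exists; return False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if (folder / "_SUCCESS").exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def archive_run(folder: Path, archive_root: Path, run_label: str) -> Path:
    """Copy a completed output folder to archive_root/run_label."""
    if not (folder / "_SUCCESS").exists():
        raise RuntimeError(f"{folder} has no _SUCCESS file; refusing to copy a partial write")
    destination = archive_root / run_label
    # copytree raises FileExistsError if the destination already exists,
    # which prevents one archived run from silently overwriting another.
    shutil.copytree(folder, destination)
    return destination
```

A scheduled job could call wait_for_success after each recipe run and then archive_run with a label such as the run date, so that each version of the output remains available after the next run replaces the files.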
