Databricks Connection
Sync Databricks to Salesforce Data Pipelines by creating a remote connection.
Create the Connection
- On the Data Manager Connections tab, click New Connection.
- Click the name of the connector, and click Next.
- Enter the connector settings.
- To validate your settings and attempt to connect to the source, click Save & Test. If the connection fails, Salesforce Data Pipelines shows possible reasons.
Connect to Databricks with OAuth
To use the Databricks connector with OAuth authentication, you must configure your Databricks service principal and generate a client ID and secret. For more information, see the Databricks help topic Authorize unattended access to Databricks resources with a service principal using OAuth.
Databricks uses machine-to-machine (M2M) OAuth authentication, which provides additional security and lets you control, at a fine granularity, which objects can be accessed and which access privileges are granted on them. With M2M OAuth, no external authentication provider is required.
You specify the client ID and client secret to use for authentication in the connection settings. Databricks sets an expiration date on these values, which you must manage. When you generate the OAuth secret in Databricks, you can set the secret lifetime in days, with a maximum of 730 days (2 years). When the client ID and secret values expire, update your Databricks connection with the new values.
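Behind the scenes, the M2M OAuth flow exchanges the service principal's client ID and secret for a short-lived access token at the workspace's `/oidc/v1/token` endpoint using the standard `client_credentials` grant. The following Python sketch builds such a token request; the hostname and credentials are placeholder assumptions, and the helper `build_m2m_token_request` is illustrative, not part of the connector:

```python
from base64 import b64encode
from urllib.parse import urlencode

def build_m2m_token_request(server_hostname: str, client_id: str, client_secret: str):
    """Build an M2M OAuth token request for a Databricks workspace.

    Databricks exchanges a service principal's client ID and secret for a
    short-lived access token via the workspace /oidc/v1/token endpoint,
    using the client_credentials grant.
    """
    url = f"https://{server_hostname}/oidc/v1/token"
    # Client ID and secret are passed as HTTP basic auth credentials.
    credentials = b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {credentials}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # Request a token scoped to all workspace APIs.
    body = urlencode({"grant_type": "client_credentials", "scope": "all-apis"})
    return url, headers, body
```

The connector performs this exchange for you; the sketch only shows why expired client ID and secret values cause the connection to stop authenticating until they're updated.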
Connection Settings
All settings are required.
| Setting | Description |
|---|---|
| Connection Name | The connection name. Use a name that lets you easily distinguish between different connections. |
| Developer Name | The developer name is used in your recipes to reference data extracted through this connection. The name can't include spaces. After you create the connection, you can't change the developer name. |
| Description | The connection description. |
| Authentication Type | The authentication type. Valid values are OAuth or Token. |
| Token | Required for Token authentication only. The personal access token for Databricks. You can find the value on your Databricks User Settings page on the Access Tokens tab. |
| Client Id | Required for OAuth authentication only. The Databricks-generated client ID. You can find the value on your Databricks Service Principal page on the Secrets tab. This value expires and must be updated when a new secret is generated. |
| Client Secret | Required for OAuth authentication only. The Databricks-generated client secret. You can find the value on your Databricks Service Principal page on the Secrets tab. This value expires and must be updated when a new secret is generated. |
| Server | The server hostname of the Databricks cluster. |
| HTTPPath | The HTTP path of the Databricks cluster. |
| Database | The name of the Databricks database. |
| Schema | The name of the Databricks schema. |
| Catalog | The name of the Databricks catalog. Defaults to hive_metastore. |
For more information, see Establishing a Connection.
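Put together, an OAuth connection's settings might look like the following sketch. All hostnames, paths, and identifiers here are illustrative placeholders, not real values:

```python
# Illustrative Databricks connection settings (placeholder values only).
connection = {
    "Connection Name": "Databricks Sales Data",
    "Developer Name": "Databricks_Sales_Data",  # no spaces; can't change after creation
    "Description": "Syncs sales tables from Databricks",
    "Authentication Type": "OAuth",
    "Client Id": "<service-principal-client-id>",
    "Client Secret": "<oauth-secret>",
    "Server": "dbc-a1b2c3d4-e5f6.cloud.databricks.com",
    "HTTPPath": "/sql/1.0/warehouses/1234567890abcdef",
    "Database": "sales_db",
    "Schema": "sales_schema",
    "Catalog": "hive_metastore",  # the default catalog
}
```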
Connection Considerations
When working with the Databricks connector, keep these behaviors in mind.
- Connected object names must start with a letter and contain only letters, digits, or underscores. Object names can't end with an underscore.
- Only field names with combinations of alphanumeric, dot, underscore, or dash characters are supported. If a connector includes field names that contain other characters, such as spaces or brackets, the sync fails.
- This connector can sync up to 10 million rows or up to 5 GB per job, whichever limit is reached first.
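A quick way to pre-check object and field names against the naming rules above is a pair of regular expressions. This is an illustrative sketch of the documented rules, not a Salesforce API:

```python
import re

# Object names: start with a letter; letters, digits, or underscores only;
# must not end with an underscore.
OBJECT_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")

# Field names: only alphanumeric, dot, underscore, or dash characters.
# Other characters (spaces, brackets, ...) cause the sync to fail.
FIELD_NAME = re.compile(r"^[A-Za-z0-9._-]+$")

def is_valid_object_name(name: str) -> bool:
    return bool(OBJECT_NAME.match(name)) and not name.endswith("_")

def is_valid_field_name(name: str) -> bool:
    return bool(FIELD_NAME.match(name))
```

Running such a check over your Databricks tables before the first sync can surface names that need renaming ahead of time instead of at sync failure.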

