Optimizing a SQL Query Activity

You are here:

Optimizing a SQL Query Activity

SQL query activities time out after 30 minutes. You can prevent timeouts and failures by optimizing your SQL query activities. If your query consistently runs longer than 10 minutes, we recommend using a different tool, such as Salesforce Customer Data Platform, to transform your data.

Review these guidelines for tips, tricks, and general information that you can use to optimize your SQL queries.

Primary Keys
Use primary keys as unique identifiers for your dataset.
Field Length and Data Type
Consider the field length and the type of data that you’re retrieving.
Data Quantity and Retention
To improve performance, limit the quantity of data in your queries and data extensions.
Searchable Arguments (SARGable Queries)
Searchable arguments, also known as SARGable queries, are operators that can use indexes. To make your query run faster, use searchable arguments when possible.
Concurrent Activities
Concurrent activity occurs when multiple processes attempt to access the same data at the same time, which can cause queries to fail or time out. For example, two processes simultaneously attempt to write to the same dataset, or a process is actively writing to a dataset that another process attempts to read.
Data Staging
You can stage your data by dividing a large query with multiple joins into small queries that are more performant. This method is especially effective for long-running queries that could otherwise time out.

Primary Keys

Use primary keys as unique identifiers for your dataset.

Be strategic when designating a field as a primary key. Having too many primary keys defeats the purpose of a unique identifier and decreases performance.
Automation Studio improves performance by automatically indexing primary keys, sendable relationship fields, and your other most-used fields. But you don’t need to designate fields as primary keys to force indexing. Overusing primary keys slows inserts into data extensions from queries, APIs, and other processes.
To analyze multiple datasets, use the same primary key field in each data extension that you want to link. Then join the data extensions using the primary key field.

Field Length and Data Type

Consider the field length and the type of data that you’re retrieving.

When defining the field length, keep the length to how many characters are needed. For example, if the field is a two-digit state code, limit the column length to 2 characters instead of the default 50.
Avoid using the data types nvarchar (max) and varchar (max).
When storing data, less is more. Only store data that’s critical to your process and to the specific table.
We recommend limiting row widths to 4,000 characters or fewer.
Only join columns of the same data type.
Use the same data type for columns that match in the source and target data extensions.

Data Quantity and Retention

To improve performance, limit the quantity of data in your queries and data extensions.

Use a SQL query when working with a clean, structured dataset where audiences have already been created. If you want to unify and segment a large dataset, we recommend using your own internal data warehouse or an application, such as Salesforce Data 360.
Don’t use a SQL query for large tasks like copying data between data extensions. Instead, use a data extension extract with an import activity in Automation Studio.
If you have a data extension with a large quantity of data, divide it into multiple data extensions that are related through foreign keys.
Only query as much data as you need. For example, if you need only 24 hours of data, don’t retrieve 30 days of data.
Only retrieve data that has changed since the last query. For example, use a Modified Date field to limit your query.
Store frequently changing data in a separate data extension. When you make requests for this data or update the data extension, avoid running other automations or requests at the same time.
If you notice performance issues with a data extension in which several columns were previously deleted, rebuild the data extension. To rebuild, create a data extension and use a query activity to transfer data from the old extension to the new extension.
Use data retention settings to specify how long your account retains data.
When building a query, choose the most performant way to write data for your use case.
- Append is typically best for event logs and other use cases where you need only the most recent data.
- Overwrite is the most performant option for most other use cases.
- Update is typically the least performant option and can significantly extend your query time.

Searchable Arguments (SARGable Queries)

Searchable arguments, also known as SARGable queries, are operators that can use indexes. To make your query run faster, use searchable arguments when possible.

Don’t use these non-searchable arguments in a WHERE clause.

OR
NOT
<
>
NOT EXISTS
NOT IN
NOT LIKE
Intrinsic functions that operate on a column value

Here are some examples of non-searchable intrinsic functions that can be rewritten into searchable arguments.

Non-Searchable Argument Searchable Argument

Non-Searchable Argument	Searchable Argument
`AND datediff(day, convert(DATE, calculated_subscription_start_dtm), GetDate()) = 26`	`AND calculated_subscription_start_dtm >= DATEADD(DAY, -26, CAST(GETDATE() AS DATE))` `AND calculated_subscription_start_dtm < DATEADD(DAY, -25, CAST(GETDATE() AS DATE))`
`WHERE convert(DATE, gsl.SendDate) >= convert(DATE, dateadd(dd, - 10, GETUTCDATE()))`	`WHERE gsl.SendDate >= DATEADD(DAY, -10, CAST(GETUTCDATE() AS DATE))` `AND gsl.SendDate < DATEADD(DAY, -9, CAST(GETUTCDATE() AS DATE))`

AND datediff(day, convert(DATE, calculated_subscription_start_dtm), GetDate()) = 26

AND calculated_subscription_start_dtm >= DATEADD(DAY, -26, CAST(GETDATE() AS DATE))

AND calculated_subscription_start_dtm < DATEADD(DAY, -25, CAST(GETDATE() AS DATE))

WHERE convert(DATE, gsl.SendDate) >= convert(DATE, dateadd(dd, - 10, GETUTCDATE()))

WHERE gsl.SendDate >= DATEADD(DAY, -10, CAST(GETUTCDATE() AS DATE))

AND gsl.SendDate < DATEADD(DAY, -9, CAST(GETUTCDATE() AS DATE))

Use SELECT DISTINCT only when duplicate data exists in multiple data extensions.
When using SELECT TOP, also use ORDER BY to avoid unpredictable results.
Specify columns by name instead of using SELECT *. When using SELECT *, the query activity parser caches the columns that exist when the query is saved. If you add another column in the future, it isn’t automatically included.
SQL Server isn’t case-sensitive, so don’t use the LOWER function.

Concurrent Activities

Concurrent activity occurs when multiple processes attempt to access the same data at the same time, which can cause queries to fail or time out. For example, two processes simultaneously attempt to write to the same dataset, or a process is actively writing to a dataset that another process attempts to read.

Prevent timeouts and performance issues by limiting query activities to one query per step of an automation.
Prevent concurrent activity by staggering your automations and creating dedicated datasets for your processes.
The majority of automations are scheduled on the hour. Improve performance by scheduling your automations at other times, such as 8:30 instead of 8:00.

Data Staging

You can stage your data by dividing a large query with multiple joins into small queries that are more performant. This method is especially effective for long-running queries that could otherwise time out.

Use multiple small queries to gather data from different data extensions and write that data to holding tables.
After all the small queries have run, use a final query to unite the data from the holding tables.

For an example, see Use Intermediate Tables to Optimize a Query.

Did this article solve your issue?

Let us know so we can improve!

Optimizing a SQL Query Activity

Primary Keys

Field Length and Data Type

Data Quantity and Retention

Searchable Arguments (SARGable Queries)

Concurrent Activities

Data Staging

General Information

Required Cookies

Functional Cookies

Advertising Cookies

General Information

Required Cookies

Functional Cookies

Advertising Cookies

Cookie List

Product Area

Feature Impact

Edition

Experience

Optimizing a SQL Query Activity

Primary Keys

Field Length and Data Type

Data Quantity and Retention

Searchable Arguments (SARGable Queries)

Concurrent Activities

Data Staging