Data Stream Settings and Refresh Modes
With data streams, you can schedule or run jobs to ingest data from a source system. These jobs pull data from the source system into the connected data stream. The method and frequency options for data ingestion depend on the type of connector. Use the data stream settings to create or change the execution parameters of a data stream.
Required Editions
| Available in: Lightning Edition |
Refresh Modes
The Refresh Mode settings determine how data is ingested into a data lake object (DLO). Choosing the appropriate mode helps keep data fresh and reduces cost by making data processing more efficient.
| Setting | Description |
|---|---|
| Incremental | This mode is available for most non-file-based data streams that have a Datetime field. Select the field that indicates when the incoming record was last modified. For each refresh, the data stream is updated with data added or changed since the last refresh. To make sure that there’s no data loss, the last few records from the previous run are reprocessed during each job run. To remove records from an incrementally updated DLO, mark deleted records in your source system with a Delete Record Flag. Incremental refresh is also supported in Zero Copy Data Federation, but deletion of records during incremental refresh operations isn't supported there. For more information, refer to Caching or Acceleration in Data Federation. |
| Upsert | This mode is available for data streams created from file-based connectors. For each refresh, the data stream is updated with data added or changed since the last refresh. When the refresh mode is set to upsert, you can efficiently delete records from a DLO using a file. See Delete Records by Using a Delete File. |
| Full Refresh | During each refresh cycle, all existing data is deleted and replaced with the newly imported dataset. If no other refresh setting is available, full refresh is the default refresh mode. |
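The difference between the three modes can be illustrated with a small in-memory sketch. This is not how Data 360 implements refreshes; it only models the behavior described in the table. The row layout, the `id` key, the `is_deleted` field (standing in for the Delete Record Flag), and the five-minute `overlap` window (standing in for "the last few records from the previous run") are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def full_refresh(target, snapshot):
    # Existing rows are discarded; the target becomes exactly the snapshot.
    return {row["id"]: row for row in snapshot}

def upsert(target, file_rows):
    # Rows with a known key are overwritten; rows with a new key are added.
    merged = dict(target)
    for row in file_rows:
        merged[row["id"]] = row
    return merged

def incremental_refresh(target, source_rows, last_run, overlap=timedelta(minutes=5)):
    # Only rows modified since the previous run (minus a small, hypothetical
    # overlap window) are applied. Rows flagged as deleted are removed.
    merged = dict(target)
    cutoff = last_run - overlap
    for row in source_rows:
        if row["last_modified"] < cutoff:
            continue
        if row.get("is_deleted"):
            merged.pop(row["id"], None)
        else:
            merged[row["id"]] = row
    return merged

t0 = datetime(2024, 1, 1)
later = t0 + timedelta(hours=2)
target = {
    1: {"id": 1, "name": "Ada", "last_modified": t0},
    2: {"id": 2, "name": "Bo", "last_modified": t0},
}
source_rows = [
    {"id": 1, "name": "Ada", "last_modified": t0},                       # unchanged
    {"id": 2, "name": "Bo", "last_modified": later, "is_deleted": True},  # flagged
    {"id": 3, "name": "Cy", "last_modified": later},                     # new
]

after_incremental = incremental_refresh(target, source_rows, last_run=t0 + timedelta(hours=1))
after_upsert = upsert(target, [{"id": 2, "name": "Bob", "last_modified": later}])
after_full = full_refresh(target, [{"id": 9, "name": "Zed", "last_modified": later}])

print(sorted(after_incremental), sorted(after_upsert), sorted(after_full))
```

In this sketch, the incremental run skips the unchanged row, removes the flagged row, and adds the new one, while the full refresh leaves only what the latest snapshot contains.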
Conditional Settings
| Setting | Description |
|---|---|
| Filter | Use a filter to extract only the records that fulfill the specified condition. The syntax of the condition depends on the source system. For example, on a source that relies on SQL92 syntax, the filter could be: s_nationkey = 10 and s_acctbal > 2000. |
| Supported Operators | For database sources, see the Supported Operators, Functions, and Keywords section below. |
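To see what the example filter selects, the sketch below applies the same condition to a few rows in an in-memory SQLite database. SQLite is only a stand-in for a SQL92-style source; the `supplier` table, the `s_suppkey` column, and the sample values are assumptions for illustration, and the exact syntax accepted in practice depends on your source system.

```python
import sqlite3

# In-memory stand-in for a database source (hypothetical table and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier (s_suppkey INTEGER, s_nationkey INTEGER, s_acctbal REAL)"
)
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?, ?)",
    [
        (1, 10, 2500.00),  # matches both conditions
        (2, 10, 1500.00),  # balance too low
        (3, 7, 9000.00),   # wrong nation key
        (4, 10, 2000.01),  # just above the balance threshold
    ],
)

# The filter condition from the table, used as the WHERE clause.
filter_condition = "s_nationkey = 10 and s_acctbal > 2000"
rows = conn.execute(
    f"SELECT s_suppkey FROM supplier WHERE {filter_condition} ORDER BY s_suppkey"
).fetchall()
matching_keys = [row[0] for row in rows]
print(matching_keys)
conn.close()
```

Only the rows that satisfy both parts of the condition are extracted; the others never reach the data stream.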
File Settings
These settings are specific to data streams that use file-based connectors.
| Setting | Description |
|---|---|
| Log an error if no file is found | When enabled and a file isn’t found during the data stream run, the line item in the refresh history for that run shows a failure status. A notification is triggered if you subscribe to data stream failure notifications. For more information, see Data Cloud: Control Event Notifications Using Flows. When this setting isn’t enabled and a file isn’t found during the data stream run, the refresh history shows 0 records but with a success status. |
| Is headerless file retrieval enabled | When set, Data 360 accepts the data records without headers. Don’t turn on this setting while you’re creating the data stream because the header information is necessary for the initial run when the schema is being generated. You can enable the setting after the initial run. To avoid data corruption, maintain the order of fields in the incoming data records. |
| Refresh only new files | When set, all files are retrieved when the data stream is first run. For subsequent runs, only new files are retrieved. |
| Refresh initial file immediately | When set, Data 360 doesn't wait until the scheduled run and retrieves the initial file immediately. |
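The warning about field order in headerless files follows from how positional mapping works in general. The sketch below is not Data 360's parser; it assumes a hypothetical three-field CSV schema captured from the header of the initial file and then maps later headerless rows to that schema by position.

```python
import csv
import io

# Initial run: the header row is present, so the schema can be derived from it.
initial_file = "id,email,city\n1,ada@example.com,London\n"
schema = next(csv.reader(io.StringIO(initial_file)))

def parse_headerless(text, schema):
    # Without a header, each value is assigned to a field purely by position.
    return [dict(zip(schema, row)) for row in csv.reader(io.StringIO(text))]

# Later run, same field order as the initial file: values land correctly.
same_order = parse_headerless("2,bo@example.com,Oslo\n", schema)

# Later run with two fields swapped: nothing fails, but the data is wrong.
swapped_order = parse_headerless("3,Paris,cy@example.com\n", schema)

print(same_order[0]["email"], swapped_order[0]["email"])
```

Because the swapped file is still syntactically valid, the mismatch is silent, which is why the field order of incoming records needs to stay fixed once headerless retrieval is enabled.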
Supported Operators, Functions, and Keywords
For database sources, the WHERE clause supports the following:
Supported Operators
- Comparison: <, >, <=, >=, =, <>
- Pattern matching: LIKE, RLIKE
- Null checks: IS NULL, IS NOT NULL
- Set membership: IN, NOT IN
- Range evaluation: BETWEEN
- Uniqueness: DISTINCT
Supported Functions and Keywords
The following SQL functions and expressions are allowed within the WHERE clause:
- CEIL
- FLOOR
- CAST
- TRIM
- TIMESTAMP
- DATE
Not Supported in the WHERE Clause
The following constructs are not supported and will result in an error if used:
- DML operations: INSERT, UPDATE, DELETE
- Complex query structures: JOIN, UNION, INTERSECT, and subqueries
- Control characters: semicolons (;)
- Any other SQL functions or keywords not explicitly listed in the supported sections above
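A filter can be screened locally against these rules before it is saved. The following is a rough sketch of such a pre-check, not the validation that Data 360 itself performs. The `check_filter` helper is hypothetical, the allowed function names are taken from the list above, and treating `SELECT` as a sign of a subquery is an assumption of this example.

```python
import re

ALLOWED_FUNCTIONS = {"CEIL", "FLOOR", "CAST", "TRIM", "TIMESTAMP", "DATE"}
# Operators and connectives that can legitimately precede "(".
KEYWORDS_BEFORE_PAREN = {"IN", "AND", "OR", "NOT", "LIKE", "RLIKE", "BETWEEN"}
# SELECT is included as a simple signal for subqueries (assumption).
FORBIDDEN_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "JOIN", "UNION", "INTERSECT", "SELECT"}

def check_filter(condition):
    """Return a list of problems found in a WHERE-clause filter string."""
    problems = []
    # Conservative choice: flag any semicolon, even inside a string literal.
    if ";" in condition:
        problems.append("semicolon")
    # Blank out quoted literals so a value such as 'UNION' is not flagged.
    stripped = re.sub(r"'(?:[^']|'')*'", "''", condition)
    words = {w.upper() for w in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", stripped)}
    for keyword in sorted(words & FORBIDDEN_KEYWORDS):
        problems.append(f"keyword {keyword}")
    # Any identifier directly followed by "(" is treated as a function call.
    for name in re.findall(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(", stripped):
        upper = name.upper()
        if upper in ALLOWED_FUNCTIONS or upper in KEYWORDS_BEFORE_PAREN:
            continue
        if upper in FORBIDDEN_KEYWORDS:
            continue  # already reported as a keyword
        problems.append(f"function {upper}")
    return problems

print(check_filter("s_nationkey = 10 and s_acctbal > 2000"))
print(check_filter("s_acctbal > 0; DELETE FROM supplier"))
print(check_filter("UPPER(s_name) = 'X'"))
```

An empty list means the sketch found nothing to object to. It does not guarantee that the source system accepts the filter, because the syntax still depends on the source.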
Schedule
The frequency setting describes the time interval between executions of your data stream job. Based on connector capabilities, supported schedule intervals can vary.
| Setting | Description |
|---|---|
| Frequency | Choose how often the data stream is refreshed. The available refresh frequencies depend on the connector. All frequency times are in UTC and apply to both full and incremental refreshes. AWS S3, Azure Blob Storage, and Google Cloud Storage data streams have 5-, 15-, or 30-minute intervals. These connectors can synchronize data and deduplicate jobs to provide these narrow intervals. |

