Why does my dataflow run after a data sync failure?
When a data sync job fails for an object that a downstream dataflow uses, the system defers the dataflow run until the failed object is successfully reprocessed. This behavior ensures that dataflows work with up-to-date and complete data, and it holds in the vast majority of cases.
Expected Behavior
A data sync job runs at 11pm and processes multiple objects.
If one of those objects fails and is referenced by a dataflow scheduled to run at 7am, the system waits for the failed object to be retried and completed successfully.
The dataflow is deferred until the dataset's timestamp reflects a successful update that occurred after the failure.
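The deferral rule described above can be illustrated with a minimal sketch. It assumes a hypothetical record with two timestamps per object; the class, field names, and function are illustrative only and are not part of any product API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SyncedObject:
    # Hypothetical record; field names are illustrative, not a product API.
    name: str
    last_failure: Optional[datetime]  # time of the most recent sync failure, if any
    last_updated: datetime            # dataset timestamp of the latest update


def should_defer(obj: SyncedObject) -> bool:
    """Defer the dataflow while the latest failure is not older than the last update."""
    if obj.last_failure is None:
        return False
    return obj.last_updated <= obj.last_failure


# The 11pm sync updated the dataset at 23:00, then the object failed at 23:05.
account = SyncedObject(
    "Account",
    last_failure=datetime(2024, 1, 1, 23, 5),
    last_updated=datetime(2024, 1, 1, 23, 0),
)
print(should_defer(account))  # True: no successful update since the failure

# A retry succeeds at 01:30, so the 7am dataflow no longer needs to wait.
account.last_updated = datetime(2024, 1, 2, 1, 30)
print(should_defer(account))  # False
```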
Why the Dataflow Sometimes Still Runs
In rare cases, the dataflow doesn’t wait, even though the data sync for an object failed.
The system determines data readiness based on dataset timestamps, such as modifiedDate or lastRun, rather than the actual success of the data sync job. The two can disagree when:
- The failure happens after the dataset update completes and timestamps are refreshed.
- The failure occurs in a way that still results in an updated lastRuntime.
In most cases, data sync failures delay dataflow execution as expected, but timestamp anomalies can create exceptions.
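As a rough illustration of how a timestamp check and the job outcome can disagree, consider the sketch below. The function, variable names, and the "Success"/"Failed" status strings are assumptions made for the example, not the system's actual logic or values.

```python
from datetime import datetime


def ready_by_timestamp(dataset_timestamp: datetime, last_dataflow_run: datetime) -> bool:
    # Timestamp-only check: any dataset timestamp newer than the last
    # dataflow run is treated as "ready", regardless of job outcome.
    return dataset_timestamp > last_dataflow_run


# Hypothetical scenario: the dataset timestamp was refreshed at 23:10,
# and the sync job then failed in a later step.
last_dataflow_run = datetime(2024, 1, 1, 7, 0)
dataset_timestamp = datetime(2024, 1, 1, 23, 10)
sync_status = "Failed"  # assumed status value

timestamp_ready = ready_by_timestamp(dataset_timestamp, last_dataflow_run)
job_succeeded = sync_status == "Success"

print(timestamp_ready)  # True: the timestamp looks fresh
print(job_succeeded)    # False: the job itself failed
```

In this scenario the timestamp check passes even though the sync failed, which is the kind of exception described above.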
To prevent these exceptions, use an event-based dataflow schedule, which validates the data sync job status, instead of a time-based schedule. For more information, see Schedule a Dataflow to Run Automatically.
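Conceptually, an event-based schedule gates the dataflow on the sync outcome rather than on the clock. The sketch below shows that idea only; the function name, the status mapping, and the "Success" value are hypothetical and do not represent the product's scheduling API.

```python
from typing import Callable, Mapping, Sequence


def run_if_sync_succeeded(
    statuses: Mapping[str, str],
    required_objects: Sequence[str],
    start_dataflow: Callable[[], None],
) -> bool:
    """Start the dataflow only if every required object synced successfully."""
    # "Success" is an assumed status value for this illustration.
    failed = [name for name in required_objects if statuses.get(name) != "Success"]
    if failed:
        return False  # at least one object is failed or missing; do not run
    start_dataflow()
    return True


started = []
statuses = {"Account": "Success", "Opportunity": "Failed"}
ran = run_if_sync_succeeded(
    statuses, ["Account", "Opportunity"], lambda: started.append("dataflow")
)
print(ran)  # False: Opportunity did not sync successfully
```

Because the gate reads the job status directly, a refreshed timestamp on a failed object cannot trigger the run.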

