Delete Records by Using a Delete File
To delete data records ingested by using the AWS S3, Google Cloud Storage (GCS), Microsoft Azure, Ingestion API, or SFTP connector, create a delete file. Create the file in CSV or Parquet file format. S3, GCS, and Azure data streams support Parquet file ingestion.
For considerations regarding various file formats, review Supported File Formats in Data 360.
When you delete more than 600,000 records between refreshes, the data stream performs a full refresh.
Delete Records
To delete records:
- Create a delete file with the list of primary keys of the records to delete.
  The primaryKey field in the delete file must match the primary key data type of the data lake object (DLO). For example, if the DLO primary key field is a number, the primary key data type in the delete file must be a number.
  - For correct file delimiter detection, wrap the primary key values in double quotes.
  - For records ingested from a CSV file, create a headerless delete file in CSV format.
  - For records ingested from a Parquet file, create a Parquet delete file with a single column named primaryKey.
- Name the delete file by using the prefix Deletion_ followed by the source file name.
  The file name is case-sensitive, so make sure that the prefix is exactly "Deletion_" and the source file name remains unchanged. For example, if the source file name is Customer*, which contains a wildcard, you can name your deletion file Deletion_Customer.csv or Deletion_Customer.parquet.
  Important: To prevent processing errors or unintended data loss, don't overwrite existing deletion files via the SFTP Connector. Instead, create a new, uniquely named file for every deletion request. Keep file names unique by appending a timestamp in this format:
  Format: Deletion_{FileNameWildcard}_YYYYMMDDHHMMSS
  Example: If your file name wildcard is UserData, name your uploaded deletion file similarly to Deletion_UserData_20240514103000.csv.
  This timestamp format guarantees that the SFTP Connector processes each deletion file independently.
- In the directory containing the source files, create a subfolder named delete.
- Upload the delete file to the delete folder.
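The steps above can be sketched in Python for the CSV case. This is a minimal illustration, not part of the product: the function name write_csv_delete_file and its parameters are hypothetical, and it assumes that you stage files in a local directory that mirrors the source directory before uploading them. The upload itself is out of scope.

```python
import csv
from pathlib import Path


def write_csv_delete_file(source_dir, source_name, primary_keys, timestamp=None):
    """Write a headerless CSV delete file under <source_dir>/delete.

    Hypothetical helper: source_name is the source file name (or the
    wildcard with the * removed), and timestamp is an optional datetime
    used for the unique SFTP naming pattern.
    """
    delete_dir = Path(source_dir) / "delete"
    delete_dir.mkdir(parents=True, exist_ok=True)

    # Optional YYYYMMDDHHMMSS suffix so that no earlier file is overwritten.
    suffix = f"_{timestamp.strftime('%Y%m%d%H%M%S')}" if timestamp else ""
    path = delete_dir / f"Deletion_{source_name}{suffix}.csv"

    with path.open("w", newline="", encoding="utf-8") as handle:
        # QUOTE_ALL wraps every primary key value in double quotes.
        writer = csv.writer(handle, quoting=csv.QUOTE_ALL, lineterminator="\n")
        for key in primary_keys:
            writer.writerow([key])  # one key per line, no header row
    return path
```

Calling write_csv_delete_file("staging", "UserData", keys, datetime(2024, 5, 14, 10, 30, 0)) would write staging/delete/Deletion_UserData_20240514103000.csv. Omit the timestamp argument to get the plain Deletion_UserData.csv name.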
Delete Encrypted Records
To delete encrypted records ingested by using an SFTP connector:
- Locate the schema file created during the SFTP connection setup.
- Verify that the schema file is in the correct format, and that it's not corrupted. For the file format requirements, see the steps for deleting records.
- Verify that the delete folder and its parent and root folders have Read, Write, and Delete permissions.
- To delete the schema file, add it to the delete folder.
Example: Your data stream ingests a file named accounts-file by using the accounts-* wildcard. The file comes from a bucket named accounts-bucket and a folder named accounts-folder.
- To delete records, create a delete folder under accounts-folder.
- Depending on the file format, create a file named Deletion_accounts-.csv or Deletion_accounts-.parquet.
- Your folder structure can be accounts-folder/delete/Deletion_accounts-.csv or accounts-folder/delete/Deletion_accounts-.parquet.
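The path in this example can be derived mechanically from the folder and the wildcard. The following sketch assumes that the delete file name is the wildcard with the * removed, as in the accounts-* and Customer* examples in this topic. The function name deletion_object_key is hypothetical.

```python
def deletion_object_key(folder, wildcard, extension):
    """Build the delete-file path for a source folder and file-name wildcard."""
    if extension not in ("csv", "parquet"):
        raise ValueError("delete files are CSV or Parquet")
    # Assumption: drop the * from the wildcard and keep the rest unchanged,
    # because the file name is case-sensitive.
    base_name = wildcard.replace("*", "")
    return f"{folder.rstrip('/')}/delete/Deletion_{base_name}.{extension}"


print(deletion_object_key("accounts-folder", "accounts-*", "csv"))
# prints accounts-folder/delete/Deletion_accounts-.csv
```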
Sample CSV delete file
"0000-0000-0000"
"0000-0000-0001"
"0000-0000-0002"
Sample Parquet delete file
primaryKey
"0000-0000-0000"
"0000-0000-0001"
"0000-0000-0002"

