Delete Data

Safe data deletion with idempotency guarantees and rollback considerations.

The Synnax client allows deletion of time ranges of data in any channel. Once a deletion operation completes, all future reads exclude the deleted data. However, the underlying files may not shrink right away: deferring the cleanup lets deletion operations be served quickly, and the unwanted data is only actually collected when the load on the cluster is low.

Deleting data is different from deleting a channel. Once a channel is deleted, it no longer exists. When data in a channel is deleted, the channel remains: the deleted time range can be written over with new data, and further data can still be deleted. Even if all of a channel's data is deleted, the channel is still in the database, albeit empty.

Deleting Data From a Channel

The delete method of the client allows deletion of data (not to be confused with the delete method of the Channel class, which deletes channels). To delete a chunk of data, simply pass in the channel name(s) or key(s) and the time range to delete. As throughout Synnax, remember that a time range is start-inclusive and end-exclusive, i.e. data at the start time stamp is deleted and data at the end time stamp is not.

For example, to remove data in the range [00:01, 00:03) on the channel_1 and channel_2 channels:

Python

client.delete(
    ["channel_1", "channel_2"],
    sy.TimeStamp(1 * sy.TimeSpan.SECOND).range(sy.TimeStamp(3 * sy.TimeSpan.SECOND))
)
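
To make the start-inclusive, end-exclusive convention concrete, here is a minimal pure-Python sketch. It does not use the Synnax client: `HalfOpenRange` and `delete_range` are hypothetical stand-ins that model timestamps as whole seconds.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class HalfOpenRange:
    """Toy stand-in for a time range: start-inclusive, end-exclusive."""

    start: int  # seconds
    end: int  # seconds

    def contains(self, t: int) -> bool:
        # The start bound is included, the end bound is not.
        return self.start <= t < self.end


def delete_range(timestamps: list[int], tr: HalfOpenRange) -> list[int]:
    """Return the timestamps that survive deleting the range tr."""
    return [t for t in timestamps if not tr.contains(t)]


samples = [0, 1, 2, 3, 4]  # sample timestamps in seconds
remaining = delete_range(samples, HalfOpenRange(1, 3))
print(remaining)  # [0, 3, 4]: 00:01 and 00:02 are gone, 00:03 is kept
```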

Using channel name(s) to delete data will delete data in all channels with the given name(s). Using keys to delete is preferable to prevent accidental deletion.
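
One way to guard against accidental deletion is to resolve names to keys first and refuse to continue when a name is ambiguous. The helper below is a sketch, not part of the Synnax API: `delete_by_unique_name` is a hypothetical name, and it assumes `client.channels.retrieve` accepts a list of names and returns channel objects exposing `name` and `key` attributes. Check both assumptions against your client version.

```python
def delete_by_unique_name(client, names, time_range):
    """Resolve names to keys, refusing to delete if a name is ambiguous.

    Assumption: client.channels.retrieve(names) returns channel objects
    with .name and .key attributes.
    """
    channels = client.channels.retrieve(names)

    # Count how many channels were returned for each name.
    counts = {}
    for ch in channels:
        counts[ch.name] = counts.get(ch.name, 0) + 1

    ambiguous = sorted(name for name, n in counts.items() if n > 1)
    if ambiguous:
        raise ValueError(f"multiple channels share these names: {ambiguous}")

    # Delete by key so that only the resolved channels are affected.
    keys = [ch.key for ch in channels]
    client.delete(keys, time_range)
    return keys
```

The helper raises before calling `delete`, so an ambiguous name never reaches the cluster.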

Idempotency

The delete method is idempotent: repeating a call on the same time range deletes nothing further and does not raise an error, and consecutive calls on overlapping time ranges are allowed:

Python

# No additional data deleted after previous example call
client.delete(
    ["channel_1", "channel_2"],
    sy.TimeStamp(1 * sy.TimeSpan.SECOND).range(sy.TimeStamp(3 * sy.TimeSpan.SECOND))
)

# Deletes the remaining data in [00:01, 00:10)
client.delete(
    ["channel_1", "channel_2"],
    sy.TimeStamp(1 * sy.TimeSpan.SECOND).range(sy.TimeStamp(10 * sy.TimeSpan.SECOND))
)
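
The same behavior can be modeled in a few lines of plain Python. This is only an illustrative sketch with a hypothetical `delete_range` function over integer second timestamps, not the Synnax client:

```python
def delete_range(timestamps, start, end):
    """Drop timestamps in the half-open range [start, end)."""
    return [t for t in timestamps if not (start <= t < end)]


data = list(range(12))  # samples at 00:00 through 00:11

once = delete_range(data, 1, 3)  # first delete of [00:01, 00:03)
twice = delete_range(once, 1, 3)  # repeated call changes nothing
assert once == twice

wider = delete_range(twice, 1, 10)  # overlapping range removes the rest
print(wider)  # [0, 10, 11]
```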

Limitations of Deletions

In some situations, delete raises an error. If any of the given channel keys or names does not exist in the database, the entire delete operation fails, no data is deleted, and a NotFound error is raised:

Python

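# Suppose 111 and 112 are keys of channels that exist. Since 113 does
# not exist, the call fails and no channel's data is deleted.
time_range_to_delete = sy.TimeStamp(1 * sy.TimeSpan.SECOND).range(
    sy.TimeStamp(3 * sy.TimeSpan.SECOND)
)
client.delete([111, 112, 113], time_range_to_delete)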

When a requested channel is not found, delete is atomic: the operation fails and no data is deleted. In all other cases, delete is not atomic: a failure while deleting data on one channel halts the entire operation and raises an error immediately, so data on channels processed before the failure may already have been deleted.
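
The difference between the two failure modes can be illustrated with a toy model. This is an assumption-laden sketch, not Synnax's implementation: the `NotFound` class, the `delete` function, the in-memory `store`, and the `fail_on` switch that simulates a mid-operation failure are all hypothetical.

```python
class NotFound(Exception):
    """Hypothetical stand-in for the client's not-found error."""


def delete(store, keys, start, end, fail_on=None):
    # Phase 1: validate every key up front. A missing key aborts the
    # whole call before any data is touched (the atomic case).
    missing = [k for k in keys if k not in store]
    if missing:
        raise NotFound(f"channels not found: {missing}")
    # Phase 2: delete channel by channel. A failure here stops the call,
    # but channels already processed stay deleted (the non-atomic case).
    for k in keys:
        if k == fail_on:  # simulated failure on one channel
            raise RuntimeError(f"delete failed on channel {k}")
        store[k] = [t for t in store[k] if not (start <= t < end)]


store = {111: [0, 1, 2, 3], 112: [0, 1, 2, 3]}

try:
    delete(store, [111, 112, 113], 1, 3)  # 113 does not exist
except NotFound:
    pass
print(store[111])  # [0, 1, 2, 3]: nothing was deleted

try:
    delete(store, [111, 112], 1, 3, fail_on=112)
except RuntimeError:
    pass
print(store[111], store[112])  # [0, 3] [0, 1, 2, 3]
```

In the second call, channel 111 has already lost its data by the time the failure on 112 is raised, so a caller that retries should rely on delete being idempotent rather than assume nothing changed.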

Index Channel Dependencies

If a delete call is made to an index channel that other channels depend on in the requested time range, an error is raised:

Python

# If my_tc is indexed by my_index_ch from 1 second to 3 seconds,
# my_index_ch cannot be deleted. This call raises an error.
client.delete(
    ["my_index_ch"],
    sy.TimeStamp(1 * sy.TimeSpan.SECOND).range(sy.TimeStamp(3 * sy.TimeSpan.SECOND))
)

# If my_tc, the dependent, is deleted at the same time as my_index_ch,
# no errors are raised.
client.delete(
    ["my_tc", "my_index_ch"],
    sy.TimeStamp(1 * sy.TimeSpan.SECOND).range(sy.TimeStamp(3 * sy.TimeSpan.SECOND))
)
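
When an index channel has many dependents, it can help to collect their keys programmatically so they are deleted in the same call. The helper below is a sketch under stated assumptions: `keys_with_dependents` is a hypothetical name, and it assumes each channel object exposes a `key` attribute and an `index` attribute holding the key of its index channel.

```python
def keys_with_dependents(channels, index_key):
    """Return the keys of every channel indexed by index_key, followed
    by index_key itself.

    Assumption: each channel object has .key and .index attributes.
    """
    dependents = [
        ch.key
        for ch in channels
        # Skip the index channel itself in case it reports its own key
        # as its index.
        if ch.index == index_key and ch.key != index_key
    ]
    return [*dependents, index_key]
```

The returned list can then be passed as the first argument to `client.delete` together with the time range, mirroring the second call in the example above.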

Active Writer Conflicts

Calling delete on a channel raises an error if that channel has an open writer whose start time is before the time range being deleted. This ensures that the writer and the deleter do not contend over data in the same region.

Python

writer = client.open_writer(
    start=sy.TimeStamp(10 * sy.TimeSpan.SECOND),
    channels=["my_tc"],
)

# Raises an error: the writer's start (00:10) is before the deleted time range [00:12, 00:30)
client.delete(
    ["my_tc"],
    sy.TimeStamp(12 * sy.TimeSpan.SECOND).range(sy.TimeStamp(30 * sy.TimeSpan.SECOND))
)

Once writers starting before the deleting time range are closed, calls to delete may proceed normally.