Proposal: support for exporting data before deleting it via drop_chunks #572
Without having much of substance to add, I'll say this looks like a suitable & desirable feature as described.
This actually looks really good - you could use …
This can be really useful for me, and I have an additional requirement for a `newer_than` filter!
@eduardotsj Interesting, do you mind providing more detail on the use case for your `newer_than` filter?
This is my main use case for exporting chunks with a `newer_than` option:
@eduardotsj Excellent, that makes perfect sense. Thank you for the additional context. I'll mark you down as interested in this feature and update you as we make more progress.
What happens if changes are made to the chunk while it's being exported? Would updates be lost when calling …? I'd only use this if …
@alanhamlett We are now actively working on this feature. Regarding what happens when changes are made to …
That would work for me, since the application layer could handle any write failures gracefully.
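One way an application could get such write failures deterministically, independent of whatever built-in behavior this feature ends up with, is to hold an exclusive lock on the chunk for the duration of the export: concurrent writers then block (or error out if they set their own lock timeout) rather than racing the copy. A minimal sketch that only builds the SQL, assuming plain PostgreSQL locking (the table name and output path are made up, and the drop itself would still go through `drop_chunks` afterwards):

```python
def export_under_lock_sql(chunk_table: str, out_path: str, lock_timeout: str = "2s") -> str:
    """Single-transaction export: take an ACCESS EXCLUSIVE lock on the
    chunk so no writes can land mid-copy, then COPY it out server-side.
    SET LOCAL lock_timeout bounds how long *we* wait to acquire the lock."""
    return "\n".join([
        "BEGIN;",
        f"SET LOCAL lock_timeout = '{lock_timeout}';",
        f"LOCK TABLE {chunk_table} IN ACCESS EXCLUSIVE MODE;",
        f"COPY {chunk_table} TO '{out_path}' WITH (FORMAT csv, HEADER true);",
        "COMMIT;",
    ])
```

The ACCESS EXCLUSIVE mode is the strongest table lock PostgreSQL offers, so nothing can slip in between the copy and the commit; a weaker mode would let reads proceed but is not needed for a chunk that is about to be dropped anyway.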
Wonderful, thank you for the feedback @alanhamlett! Will re-ping here once I have more updates.
Not implemented yet, so re-opening. |
@dianasaur323 is there an estimated timeline for this feature? Thanks!
Support for a workflow like this would be very helpful:
So this way, I could, e.g., create daily backups of data that is older than 7 days and limit the active database to just 7 days' worth of data. Then, when I need to look at older data for troubleshooting, analysis, or something else, I can restore one or more of the daily chunk backups to another instance of TimescaleDB and work with them as needed.
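A daily job along these lines could be sketched as follows. The `export_chunks` and `drop_chunks` signatures below are assumptions based on the proposal in this issue (neither the argument order nor the keyword names are final), and the backup path is made up:

```python
from datetime import datetime, timedelta, timezone

def daily_backup_sql(hypertable: str, keep_days: int = 7, out_dir: str = "/backups") -> list:
    """Build the SQL for one run of the daily job: export every chunk
    that ended more than `keep_days` ago, then drop those same chunks.
    export_chunks/drop_chunks are the *proposed* (hypothetical) API."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=keep_days)
    ts = cutoff.strftime("%Y-%m-%d %H:%M:%S")      # cutoff timestamp for both calls
    stamp = cutoff.strftime("%Y%m%d")              # date tag embedded in the file names
    return [
        f"SELECT export_chunks('{hypertable}', TIMESTAMP '{ts}', "
        f"'{out_dir}/{hypertable}-{stamp}-%s.csv');",
        f"SELECT drop_chunks(TIMESTAMP '{ts}', '{hypertable}');",
    ]
```

Running these two statements from cron once a day would keep the active database bounded while accumulating per-chunk CSV backups that can later be restored into another instance.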
@pv97 Apologies for taking a while to respond. We are actually working on a different feature in this release, so there's no specific timeline yet. For now, you'll have to go with a more manual approach.
In 2.0 you will be able to implement this with a user-defined action. |
As of now, TimescaleDB's `drop_chunks` provides an easy-to-use interface for deleting the chunks that lie entirely before a given time. Note that this is not about deleting all the data (rows) before the given time; rather, `drop_chunks` deletes the chunks whose time window ends before the specified point (i.e., based on the intervals that can be configured during hypertable creation). This is useful performance-wise: dropping a chunk is essentially just deleting an entire file from disk, while deleting individual rows goes through the full MVCC process and must later be garbage-collected by VACUUM.
However, a common use case is to dump such data into some kind of cold storage before deleting it. There is currently no easy-to-use API in TimescaleDB for this, though it can be achieved manually using the `_timescaledb_catalog.chunk` table. The following is a proposal to integrate such functionality into the TimescaleDB API by adding the following two functions:

- `show_chunks` - takes a hypertable along with a time indicator and returns the list of chunk tables representing the chunks that ended before the given time. Note that `drop_chunks` would use `show_chunks` to figure out which chunks need to be dropped, which means the user could use `show_chunks` to find out which chunks would be affected by `drop_chunks` (as a sanity check or for some other reason).
- `export_chunks` - takes the parameters above along with a file-format string and exports CSV(s) to the location specified by the format string.

Below are some use cases along with the resulting SQL:
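As one illustration of the intended workflow (the signatures here are assumptions, not the final API), a user could preview exactly what `drop_chunks` would remove and then issue the drop:

```python
def preview_then_drop_sql(hypertable: str, older_than: str):
    """Sanity-check pattern from the proposal: since drop_chunks would be
    built on show_chunks, show_chunks lists exactly the chunks that the
    drop would remove. Both signatures are hypothetical."""
    preview = f"SELECT show_chunks('{hypertable}', TIMESTAMP '{older_than}');"
    drop = f"SELECT drop_chunks(TIMESTAMP '{older_than}', '{hypertable}');"
    return preview, drop
```

Because both functions would share the same chunk-selection logic, the preview can never disagree with the drop, which is what makes it usable as a sanity check.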
And this is the exposed API:

Note that `export_chunks` would perform a server-side copy. A client-side copy (and other desired, business-specific export functionality) can be achieved with the help of a simple Python script and the `show_chunks` function. As an example, this is how chunks could be exported to separate files:
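A sketch of such a script, assuming psycopg2 as the driver and the `show_chunks` signature proposed above (the `older_than` named argument is an assumption, since the final API is not fixed in this issue):

```python
import os

def copy_sql(chunk_table: str) -> str:
    # COPY ... TO STDOUT streams rows to the connected client,
    # i.e. a client-side copy rather than a file on the server.
    return f"COPY {chunk_table} TO STDOUT WITH (FORMAT csv, HEADER true)"

def chunk_filename(chunk_table: str, out_dir: str) -> str:
    # e.g. "_timescaledb_internal._hyper_1_3_chunk" -> "<out_dir>/_hyper_1_3_chunk.csv"
    return os.path.join(out_dir, chunk_table.split(".")[-1] + ".csv")

def export_chunks_client_side(dsn: str, hypertable: str, older_than: str, out_dir: str = ".") -> None:
    """Export each chunk older than `older_than` to its own CSV file.
    The show_chunks call below uses the API *proposed* in this issue
    (hypothetical signature); psycopg2 is just one possible driver."""
    import psycopg2  # deferred import so the pure helpers above stay dependency-free
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("SELECT show_chunks(%s, older_than => %s)", (hypertable, older_than))
        for (chunk,) in cur.fetchall():
            with open(chunk_filename(chunk, out_dir), "w") as f:
                cur.copy_expert(copy_sql(chunk), f)
```

Keeping one file per chunk preserves the chunk boundaries, so an individual chunk can later be restored on its own without touching the rest of the export.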