Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: Unexpected time_bucket_gapfill behavior (with locf and last) #6528

Closed
jflambert opened this issue Jan 15, 2024 · 5 comments
Closed

Comments

@jflambert
Copy link

jflambert commented Jan 15, 2024

What type of bug is this?

Incorrect result

What subsystems and features are affected?

Gapfill

What happened?

I'm trying to "downsample" discrete readings with the time_bucket_gapfill and locf functions but I end up with unexpected results. Using the replication steps below I expect the red row to have 3 as a value. I want the bucket of the third minute to contain the last value in the previous minute, up until the third minute precisely. It seems to be doing the opposite and grabbing the last value in the third minute bucket.

image

TimescaleDB version affected

2.13.1

PostgreSQL version used

15.5

What operating system did you use?

timescaledb-ha

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

No response

How can we reproduce the bug?

create table test
(
  ts TIMESTAMPTZ,
  val INTEGER
);
select create_hypertable('test', 'ts');

delete from test;
INSERT INTO TEST (ts, val) VALUES
('2023-01-01 00:00:00', 0),
('2023-01-01 00:00:30', 0),
('2023-01-01 00:01:00', 1),
('2023-01-01 00:02:00', 2),
('2023-01-01 00:03:00', 3), -- expecting 3 for third minute
('2023-01-01 00:03:59', 4), -- expecting 4 for fourth minute
('2023-01-01 00:05:00', 5);

--explain analyze
SELECT time_bucket_gapfill('1 minute', ts) AS ts_fill, locf(last(val, ts), treat_null_as_missing=>true)
FROM test
WHERE ts BETWEEN '2023-01-01 00:00:00' AND '2023-01-01 00:07:00'
group by ts_fill
order by ts_fill ASC;
@jflambert jflambert added the bug label Jan 15, 2024
@jflambert jflambert changed the title [Bug]: Unexpected time_bucket_gapfill behavior (duplicate timestamp) [Bug]: Unexpected time_bucket_gapfill behavior (with locf and last) Jan 15, 2024
@konskov konskov self-assigned this Jan 16, 2024
@konskov
Copy link
Contributor

konskov commented Jan 16, 2024

Hello @jflambert, thank you for reaching out. This is the output of time_bucket_gapfill without locf:

        ts_fill         | last
------------------------+------
 2023-01-01 00:00:00+02 |    0
 2023-01-01 00:01:00+02 |    1
 2023-01-01 00:02:00+02 |    2
 2023-01-01 00:03:00+02 |    4
 2023-01-01 00:04:00+02 | NULL
 2023-01-01 00:05:00+02 |    5
 2023-01-01 00:06:00+02 | NULL
 2023-01-01 00:07:00+02 | NULL
(8 rows)

so I think the results that you get are consistent with gapfill/locf behavior, but I understand this is not what you are looking for.
Based on what you have described, I believe what you are looking for would be a variant of time_bucket_gapfill with an offset parameter (#3212), similar to time_bucket, but unfortunately this is not supported at the moment.

@jflambert
Copy link
Author

jflambert commented Jan 16, 2024

Hi @konskov so you confirm that I'm not doing anything wrong and that this is expected behavior. Yes I would like an option such that each bucket represents a "final" state of the previous one.

Here's my current alternative. Basically I want to round up timestamps to the next minute, unless they are exactly a given minute. Then I reduce to one row with a DISTINCT ON. Of course I can't write an awesome continous aggregate with it, but I can materialize it with a timescaledb job

CREATE FUNCTION IMM_CEIL_TIMESTAMP(TIMESTAMPTZ) RETURNS TIMESTAMPTZ AS $$
  SELECT COALESCE(NULLIF(DATE_TRUNC('MINUTE', $1), $1) + '1 MINUTE'::INTERVAL, $1);
$$ LANGUAGE SQL IMMUTABLE PARALLEL SAFE;

select distinct on (ts_ceil) IMM_CEIL_TIMESTAMP(ts) AS ts_ceil, val
from test
order by ts_ceil asc, ts desc;

image

@erimatnor
Copy link
Contributor

@jflambert I am going to relabel this as a feature request if you don't mind.

@erimatnor erimatnor added feature-request Feature proposal and removed bug labels Jan 29, 2024
@erimatnor erimatnor changed the title [Bug]: Unexpected time_bucket_gapfill behavior (with locf and last) [Feature]: Unexpected time_bucket_gapfill behavior (with locf and last) Jan 29, 2024
@erimatnor erimatnor added bug and removed feature-request Feature proposal labels Jan 29, 2024
@erimatnor
Copy link
Contributor

@jflambert Actually, if you could open a new issue with a feature request that would be better. Then I'll close this issue and link to it from the new feature request for the records.

@jflambert
Copy link
Author

@erimatnor I will gladly do this shortly. However, I think having an offset could be what I'm looking for, and there are already 2 or 3 open feature requests? Anyway I'll open a new one with my specific use case and you can decide if there is overlap with the other ones.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants