[Bug]: recompress_chunk doesn't update compression_stats #6221

Open
Tindarid opened this issue Oct 20, 2023 · 3 comments
Tindarid commented Oct 20, 2023

What type of bug is this?

Incorrect result

What subsystems and features are affected?

Compression

What happened?

Hi!

After calling recompress_chunk on a chunk that is partially compressed, the compression statistics are not updated (the hypertable has a segment_by column list).

TimescaleDB version affected

2.12.1

PostgreSQL version used

14.7

What operating system did you use?

What installation method did you use?

Other

What platform did you run on?

On prem/Self-hosted, Other

Relevant log output and stack trace

No response

How can we reproduce the bug?

1. Create a hypertable with a segment_by setting
2. Insert some data to create 1 chunk
3. Compress the data
4. Fetch compression stats
5. Insert more data (into the same chunk)
6. Recompress the data
7. Fetch compression stats -> they are equal to the stats from step 4
8. Run VACUUM FULL (I noticed that space is reclaimed only after this operation; it could also be run on the chunk itself)
9. Fetch compression stats -> they are still equal to the stats from step 4
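
A minimal SQL sketch of these steps, assuming default settings (the table definition mirrors the script further below; the chunk name is illustrative and should be replaced by whatever show_chunks actually prints):

-- Step 1: hypertable with a segmentby column.
CREATE TABLE conditions(time TIMESTAMPTZ NOT NULL, device INTEGER, temperature FLOAT);
SELECT create_hypertable('conditions', 'time');
ALTER TABLE conditions SET (timescaledb.compress, timescaledb.compress_segmentby = 'device');

-- Steps 2-4: insert, compress, fetch baseline stats.
INSERT INTO conditions
    SELECT time, (random()*30)::int, random()*80 - 40
    FROM generate_series(NOW() - INTERVAL '1 day', NOW(), '1 minute') AS time;
SELECT compress_chunk(show_chunks('conditions'));
SELECT * FROM hypertable_compression_stats('conditions');

-- Steps 5-7: insert into the same chunk (making it partially compressed),
-- recompress, and fetch the stats again.
INSERT INTO conditions
    SELECT time, (random()*30)::int, random()*80 - 40
    FROM generate_series(NOW() - INTERVAL '1 day', NOW(), '1 minute') AS time;
SELECT show_chunks('conditions');  -- e.g. _timescaledb_internal._hyper_1_1_chunk
CALL recompress_chunk('_timescaledb_internal._hyper_1_1_chunk');
SELECT * FROM hypertable_compression_stats('conditions');  -- unchanged from step 4

-- Steps 8-9: the stats stay stale even after VACUUM FULL.
VACUUM FULL conditions;
SELECT * FROM hypertable_compression_stats('conditions');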
@Tindarid Tindarid added the bug label Oct 20, 2023
@mkindahl mkindahl self-assigned this Oct 23, 2023
@mkindahl (Contributor)

@Tindarid Thank you for the bug report. This was trivial to reproduce:

import psycopg2
from psycopg2.extras import RealDictCursor

def insert_data(cursor, name):
    # Two weeks of synthetic per-minute readings.
    cursor.execute(
        f"INSERT INTO {name} "
        "SELECT time, (random()*30)::int, random()*80 - 40 "
        "FROM generate_series(NOW() - INTERVAL '14 days', NOW(), '1 minute') AS time"
    )

def get_compression_stats(conn, name):
    with conn.cursor() as cursor:
        cursor.execute("SELECT * FROM hypertable_compression_stats(%s)", (name,))
        return dict(cursor.fetchone())

def enable_compression(cursor, name, segmentby=None):
    options = ['timescaledb.compress']
    if segmentby:
        fields = ",".join(segmentby)
        options.append(f"timescaledb.compress_segmentby = '{fields}'")
    # Apply the options even when no segmentby list is given.
    cursor.execute(f"ALTER TABLE {name} SET ({','.join(options)})")

def setup(conn, name):
    with conn.cursor() as cursor:
        cursor.execute(f"DROP TABLE {name}")
        cursor.execute(
            f"CREATE TABLE {name}(time TIMESTAMPTZ NOT NULL, device INTEGER, temperature FLOAT)"
        )
        cursor.execute(r"SELECT * FROM create_hypertable(%s, 'time', 'device', 4)", (name,))
        insert_data(cursor, name)
        enable_compression(cursor, name, segmentby=['device'])
        cursor.execute("SELECT compress_chunk(show_chunks(%s))", (name,))

def recompress_chunks(conn, name):
    with conn.cursor() as cursor:
        cursor.execute("SELECT * FROM show_chunks(%s)", (name,))
        chunks = [row['show_chunks'] for row in cursor]
        for chunk in chunks:
            cursor.execute("CALL recompress_chunk(%s, if_not_compressed => true)",
                           (chunk,))

def main():
    """Entrypoint for script."""
    conn = psycopg2.connect(dbname='mats', user='mats', host='/tmp')
    conn.autocommit = True
    conn.cursor_factory = RealDictCursor
    name = 'conditions'
    setup(conn, name)
    with conn.cursor() as cursor:
        cursor.execute(f"SELECT count(*) FROM {name}")
        count_before = cursor.fetchone()['count']
    stats_before = get_compression_stats(conn, name)
    with conn.cursor() as cursor:
        insert_data(cursor, name)
    recompress_chunks(conn, name)
    with conn.cursor() as cursor:
        cursor.execute(f"SELECT count(*) FROM {name}")
        count_after = cursor.fetchone()['count']
    stats_after = get_compression_stats(conn, name)

    print("count", "::", count_before, "->", count_after)
    for key in stats_before.keys():
        print(key, "::", stats_before[key], "->", stats_after[key])
        
if __name__ == '__main__':
    main()

Producing the output:

count :: 20161 -> 40322
total_chunks :: 12 -> 12
number_compressed_chunks :: 12 -> 12
before_compression_table_bytes :: 1392640 -> 1392640
before_compression_index_bytes :: 2146304 -> 2146304
before_compression_toast_bytes :: 0 -> 0
before_compression_total_bytes :: 3538944 -> 3538944
after_compression_table_bytes :: 229376 -> 237568
after_compression_index_bytes :: 196608 -> 196608
after_compression_toast_bytes :: 704512 -> 1433600
after_compression_total_bytes :: 1130496 -> 1867776
node_name :: None -> None
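
For completeness, the chunk-level counterpart of these numbers can be inspected with chunk_compression_stats(); presumably it shows the same stale values for the recompressed chunk (table name as in the script above):

SELECT chunk_name, before_compression_total_bytes, after_compression_total_bytes
  FROM chunk_compression_stats('conditions');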

@mkindahl (Contributor)

A similar issue, but not a duplicate, is #5881

@mkindahl mkindahl removed their assignment Nov 1, 2023
@jflambert

I have a similar, but opposite, issue: #7713

In my case, the stats don't update unless I recompress the chunks.
