Multi-column support for SkipScan #3094

svenklemm · 2021-04-07T10:19:54Z

Initial implementation of skipscan only supports single column distinct but the implementation could be enhanced to support multiple columns as well.

jgm-ktg · 2021-06-08T20:19:26Z

-- index (a, b, time desc)
select distinct on (a, b) a, b, time from tbl where a in ('val1','val2') order by a, b, time desc;

The above is what we are talking about, right? Any estimate regarding when this feature will be added? Any workarounds in the meantime? Note that I cannot make a column combining a and b because I want to filter on a with the index.

jgm-ktg · 2021-06-15T14:56:24Z

                        Table "public.tbl4"
 Column |           Type           | Collation | Nullable | Default 
--------+--------------------------+-----------+----------+---------
 col    | text                     |           | not null | 
 col2   | text                     |           | not null | 
 col3   | timestamp with time zone |           | not null | 
Indexes:
    "tbl4_col_col2_col3_idx" btree (col, col2, col3 DESC)

with recursive cte(c1,c2,c3) as (
    (select distinct on (col) col, col2, col3 from tbl4) -- parens required
    union all
    select * from (
        with tmp(c) as (select c2 from cte limit 1) -- can't use cte in subquery directly
        select distinct on (tbl4.col) tbl4.col, tbl4.col2, tbl4.col3 from tbl4
        where tbl4.col=any(array(select distinct on (col) col from tbl4))
        and tbl4.col2>(select c from tmp)
    ) t
)
select * from cte;

One try at mimic of this feature... depends on every value of col2 existing for each value of col

svenklemm · 2021-06-15T16:47:49Z

You can emulate this with lateral join like so:

SELECT * FROM   (SELECT DISTINCT ON (dev) dev FROM <table>) a,   LATERAL (SELECT DISTINCT ON (time) dev, time FROM <table> WHERE dev = a.dev) b;

jgm-ktg · 2021-06-15T16:52:53Z

You can emulate this with lateral join like so

@svenklemm Thanks for the tip. Can you put this in terms of col/col2/col3 above? I only see two columns referenced... time and dev

EDIT... it is almost as if I need the distinct on (col) to provide a list of starting records... if there are N of them, this would start N recursive CTEs to consume all of col2 for each col value? Can a recursive CTE be the rhs of a lateral?

EDIT: this has too many columns from a and doesn't quite seem to work out yet...

select * from
    (select distinct on (col) col, col2, col3 from tbl7) a,
    lateral (
        with recursive cte as (
           (select distinct on (col) col, col2, col3 from tbl7 where col=a.col and col2=a.col2)
           union all
           select * from (
             with tmp(c) as (select col2 from cte limit 1)
             select distinct on (col) col, col2, col3 from tbl7 where col=a.col and col2>(select c from tmp)
           ) t
        ) select * from cte
    ) b;

svenklemm · 2021-06-15T17:25:43Z

SELECT t2.* FROM 
  (SELECT DISTINCT ON (col) col FROM tbl4 t1) t1,
  LATERAL(SELECT DISTINCT ON (col2) * FROM tbl4 t2 WHERE t2.col=t1.col ORDER BY col2, col3 DESC) t2;

This emulates DISTINCT on (col, col2) and uses skipscan for both columns

jgm-ktg · 2021-06-15T17:27:56Z

This emulates DISTINCT on (col, col2) and uses skipscan for both columns

Thank you!!

jgm-ktg · 2021-06-15T17:36:21Z

linkup=# \d tbl7
                        Table "public.tbl7"
 Column |           Type           | Collation | Nullable | Default 
--------+--------------------------+-----------+----------+---------
 col    | bigint                   |           | not null | 
 col2   | character(36)            |           | not null | 
 col3   | timestamp with time zone |           | not null | 
Indexes:
    "tbl7_col_col2_col3_idx" btree (col, col2, col3 DESC)

Nested Loop  (cost=1.12..1060514.95 rows=10000000 width=53)
   ->  Unique  (cost=0.56..3.00 rows=5 width=8)
         ->  Custom Scan (SkipScan) on tbl7 t1  (cost=0.56..2.98 rows=5 width=8)
               ->  Index Only Scan using tbl7_col_col2_col3_idx on tbl7 t1  (cost=0.56..367214.09 rows=10000000 width=8)
                     Index Cond: (col > NULL::bigint)
   ->  Unique  (cost=0.56..172102.38 rows=2000000 width=53)
         ->  Index Only Scan using tbl7_col_col2_col3_idx on tbl7 t2  (cost=0.56..167102.38 rows=2000000 width=53)
               Index Cond: (col = t1.col)

@svenklemm Any idea why it doesn't use skipscan with these alternate data types? In this case col2 is not as repetitive as with the data loaded in the first case, although there is some repetition.

EDIT:

linkup=# explain SELECT DISTINCT ON (col2) * FROM tbl4 t2 WHERE t2.col='1' ORDER BY col2, col3 DESC;
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.43..2.44 rows=5 width=12)
   ->  Custom Scan (SkipScan) on tbl4 t2  (cost=0.43..2.43 rows=5 width=12)
         ->  Index Only Scan using tbl4_col_col2_col3_idx on tbl4 t2  (cost=0.43..101287.71 rows=1976867 width=12)
               Index Cond: ((col = '1'::text) AND (col2 > NULL::text))
(4 rows)

linkup=# explain SELECT DISTINCT ON (col2) * FROM tbl7 t2 WHERE t2.col=1 ORDER BY col2, col3 DESC;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.56..172478.21 rows=2012267 width=53)
   ->  Index Only Scan using tbl7_col_col2_col3_idx on tbl7 t2  (cost=0.56..167447.54 rows=2012267 width=53)
         Index Cond: (col = 1)
(3 rows)

I notice the (col2 > NULL::text) aspect of the first plan....

EDIT: I also noticed that col2 could not be type uuid in previous testing... more went wrong sooner. We can make these fields type text if absolutely necessary.

svenklemm · 2021-06-15T18:42:39Z

Hmm not sure but the datatypes are not the reason why it doesnt use skipscan. But in general you should always use text for any character columns.

https://wiki.postgresql.org/wiki/Don't_Do_This#Text_storage

dev=# \d+ tbl4
                                            Table "public.tbl4"
 Column |           Type           | Collation | Nullable | Default | Storage  | Stats target | Description
--------+--------------------------+-----------+----------+---------+----------+--------------+-------------
 col    | bigint                   |           |          |         | plain    |              |
 col2   | character(36)            |           |          |         | extended |              |
 col3   | timestamp with time zone |           |          |         | plain    |              |
Indexes:
    "tbl4_col_col2_col3_idx" btree (col, col2, col3 DESC)
Access method: heap

dev=# explain SELECT * FROM (SELECT DISTINCT ON (col) col FROM tbl4 t1) t1, LATERAL(SELECT DISTINCT ON (col2) * FROM tbl4 t2 WHERE t2.col=t1.col) t2;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.56..2.47 rows=4 width=61)
   ->  Unique  (cost=0.28..0.74 rows=2 width=8)
         ->  Custom Scan (SkipScan) on tbl4 t1  (cost=0.28..0.74 rows=2 width=8)
               ->  Index Only Scan using tbl4_col_col2_col3_idx on tbl4 t1  (cost=0.28..435.72 rows=5004 width=8)
                     Index Cond: (col > NULL::bigint)
   ->  Unique  (cost=0.28..0.81 rows=2 width=53)
         ->  Custom Scan (SkipScan) on tbl4 t2  (cost=0.28..0.81 rows=2 width=53)
               ->  Index Only Scan using tbl4_col_col2_col3_idx on tbl4 t2  (cost=0.28..301.04 rows=2502 width=53)
                     Index Cond: ((col = t1.col) AND (col2 > NULL::bpchar))
(9 rows)

jgm-ktg · 2021-06-15T21:13:42Z

linkup=# delete from tbl8
;
DELETE 10000000
linkup=# insert into tbl8
select trunc(random()*5), substring(uuid_generate_v4()::text from 1 for 3), to_timestamp(random()*31540000) from generate_series(1,10000000);
INSERT 0 10000000
linkup=# analyze tbl8;
ANALYZE
linkup=# explain select count(t2.*) FROM 
  (SELECT DISTINCT ON (col) col FROM tbl8 t1) t1,
  LATERAL(SELECT DISTINCT ON (col2) * FROM tbl8 t2 WHERE t2.col=t1.col ORDER BY col2, col3 DESC) t2;
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=18177.56..18177.57 rows=1 width=8)
   ->  Nested Loop  (cost=1.12..18126.36 rows=20480 width=77)
         ->  Unique  (cost=0.56..3.43 rows=5 width=8)
               ->  Custom Scan (SkipScan) on tbl8 t1  (cost=0.56..3.41 rows=5 width=8)
                     ->  Index Only Scan using tbl8_col_col2_col3_idx on tbl8 t1  (cost=0.56..1230713.24 rows=10018872 width=8)
                           Index Cond: (col > NULL::bigint)
         ->  Subquery Scan on t2  (cost=0.56..3583.62 rows=4096 width=77)
               ->  Unique  (cost=0.56..3542.66 rows=4096 width=53)
                     ->  Custom Scan (SkipScan) on tbl8 t2_1  (cost=0.56..3532.42 rows=4096 width=53)
                           ->  Index Only Scan using tbl8_col_col2_col3_idx on tbl8 t2_1  (cost=0.56..605953.96 rows=2003774 width=53)
                                 Index Cond: ((col = t1.col) AND (col2 > NULL::bpchar))
(11 rows)

linkup=# delete from tbl8
;
DELETE 10000000
linkup=# insert into tbl8
select trunc(random()*5), substring(uuid_generate_v4()::text from 1 for 36), to_timestamp(random()*31540000) from generate_series(1,10000000);
INSERT 0 10000000
linkup=# analyze tbl8;
ANALYZE
linkup=# explain select count(t2.*) FROM 
  (SELECT DISTINCT ON (col) col FROM tbl8 t1) t1,
  LATERAL(SELECT DISTINCT ON (col2) * FROM tbl8 t2 WHERE t2.col=t1.col ORDER BY col2, col3 DESC) t2;
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=3694802.14..3694802.15 rows=1 width=8)
   ->  Nested Loop  (cost=1.12..3669602.14 rows=10080000 width=77)
         ->  Unique  (cost=0.56..3.47 rows=5 width=8)
               ->  Custom Scan (SkipScan) on tbl8 t1  (cost=0.56..3.45 rows=5 width=8)
                     ->  Index Only Scan using tbl8_col_col2_col3_idx on tbl8 t1  (cost=0.56..1320389.02 rows=10080002 width=8)
                           Index Cond: (col > NULL::bigint)
         ->  Subquery Scan on t2  (cost=0.56..713759.72 rows=2016000 width=77)
               ->  Unique  (cost=0.56..693599.72 rows=2016000 width=53)
                     ->  Index Only Scan using tbl8_col_col2_col3_idx on tbl8 t2_1  (cost=0.56..688559.72 rows=2016000 width=53)
                           Index Cond: (col = t1.col)
(10 rows)

@svenklemm The query plan appears to be affected by the number of distinct values in col2. Is there something like pg_hint_plan that can help? Postgres really needs a way to force a query plan... sometimes the user knows best.

svenklemm · 2021-06-15T22:13:34Z

Yes skip scan is affected by the number of distinct values. If the number of distinct values is close to the number of rows not using skipscan is more efficient.

jgm-ktg · 2021-06-15T22:22:21Z

In my sample of real data the lateral strategy you cooked up above (thanks again!) is selecting about 4M of 14M records... that 14M being a small subsample of billions. It was fine (used SkipScan) after I ran ANALYZE, but at some point the stats were messed up and I can tell you that with only 3 or 4 "col2" values per "col" (on average) it was a lot slower to avoid SkipScan. I still think this should be under user control when needed...

jgm-ktg · 2021-08-30T21:51:46Z

@svenklemm Just FYI, on 2.4.1 (Timescale Forge), the following instance of your method above seems to cause problems with EXPLAIN in the context of a distributed hypertable space-partitioned by the_id:

explain verbose select the_table.* from
(select distinct on (the_id) the_id from the_table where the_id in (1111,2222) order by the_id, receive_time desc offset 0) cid,
lateral (

        select distinct on (id) * from the_table where the_table.the_id=cid.the_id order by id, receive_time desc offset 0

) the_table;

ERROR:  [data-node-0000]: bind message supplies 0 parameters, but prepared statement "" requires 1

...the query itself runs fine. Should this be a separate bug report?

This was in the context of timescaledb.enable_remote_explain=on ... this seems to be critical when reproducing the issue.

svenklemm · 2021-08-30T22:48:51Z

I dont think this is related to SkipScan there is already a bugreport for this: #3128

jgm-ktg · 2021-08-30T23:02:42Z

I dont think this is related to SkipScan there is already a bugreport for this: #3128

Odd, the query above worked! Only the EXPLAIN failed, and even that worked again with enable_remote_explain=off

svenklemm added the enhancement label Apr 7, 2021

k-rus added the planner label Apr 12, 2021

NunoFilipeSantos added the skip-scan label May 11, 2021

jgm-ktg mentioned this issue Aug 31, 2021

LATERAL JOINS don't work on distributed hypertables #3128

Closed

NunoFilipeSantos added feature-request and removed feature-enhancement labels Sep 28, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi-column support for SkipScan #3094

Multi-column support for SkipScan #3094

svenklemm commented Apr 7, 2021

jgm-ktg commented Jun 8, 2021

jgm-ktg commented Jun 15, 2021 •

edited

Loading

svenklemm commented Jun 15, 2021

jgm-ktg commented Jun 15, 2021 •

edited

Loading

svenklemm commented Jun 15, 2021 •

edited

Loading

jgm-ktg commented Jun 15, 2021

jgm-ktg commented Jun 15, 2021 •

edited

Loading

svenklemm commented Jun 15, 2021

jgm-ktg commented Jun 15, 2021

svenklemm commented Jun 15, 2021

jgm-ktg commented Jun 15, 2021

jgm-ktg commented Aug 30, 2021 •

edited

Loading

svenklemm commented Aug 30, 2021

jgm-ktg commented Aug 30, 2021

Multi-column support for SkipScan #3094

Multi-column support for SkipScan #3094

Comments

svenklemm commented Apr 7, 2021

jgm-ktg commented Jun 8, 2021

jgm-ktg commented Jun 15, 2021 • edited Loading

svenklemm commented Jun 15, 2021

jgm-ktg commented Jun 15, 2021 • edited Loading

svenklemm commented Jun 15, 2021 • edited Loading

jgm-ktg commented Jun 15, 2021

jgm-ktg commented Jun 15, 2021 • edited Loading

svenklemm commented Jun 15, 2021

jgm-ktg commented Jun 15, 2021

svenklemm commented Jun 15, 2021

jgm-ktg commented Jun 15, 2021

jgm-ktg commented Aug 30, 2021 • edited Loading

svenklemm commented Aug 30, 2021

jgm-ktg commented Aug 30, 2021

jgm-ktg commented Jun 15, 2021 •

edited

Loading

jgm-ktg commented Jun 15, 2021 •

edited

Loading

svenklemm commented Jun 15, 2021 •

edited

Loading

jgm-ktg commented Jun 15, 2021 •

edited

Loading

jgm-ktg commented Aug 30, 2021 •

edited

Loading