Indexing a column used to ORDER BY with a constraint in PostgreSQL
I've got a modest table of about 10k rows that is often sorted by a column called 'name'. So, I added an index on that column, and now selects on it are fast:
explain analyze select * from crm_venue order by name asc limit 10;

                                  QUERY PLAN
 Limit  (cost=0.00..1.22 rows=10 width=154) (actual time=0.029..0.065 rows=10 loops=1)
   ->  Index Scan using crm_venue_name on crm_venue  (cost=0.00..1317.73 rows=10768 width=154) (actual time=0.026..0.050 rows=10 loops=1)
 Total runtime: 0.130 ms
If I increase the LIMIT to 60 (which is what I use in the application), the total runtime doesn't increase much further.
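For reference, a minimal setup that reproduces the situation above might look like the sketch below. The column types and the id column are my assumptions; only the crm_venue table, the name and delete_date columns, and the crm_venue_name index are taken from the plans.

-- assumed schema: only the table, column, and index names come from the plans above
create table crm_venue (
    id          serial primary key,   -- assumed surrogate key
    name        text not null,
    delete_date timestamp             -- null means "not deleted" (logical delete pattern)
    -- ... other columns omitted
);

create index crm_venue_name on crm_venue (name);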
Since I'm using a "logical delete pattern" on this table, I only consider entries where delete_date is null. So a common select I make is:
select * from crm_venue where delete_date is null order by name asc limit 10;
To make that query snappy I put a partial index on the name column with a constraint, like this:

create index name_delete_date_null on crm_venue (name) where delete_date is null;
Now it's fast to do the ordering with the logical delete constraint:
explain analyze select * from crm_venue where delete_date is null order by name asc limit 10;

 Limit  (cost=0.00..84.93 rows=10 width=154) (actual time=0.020..0.039 rows=10 loops=1)
   ->  Index Scan using name_delete_date_null on crm_venue  (cost=0.00..458.62 rows=54 width=154) (actual time=0.018..0.033 rows=10 loops=1)
 Total runtime: 0.076 ms
Awesome! But this is where I get myself into trouble. The application calls for more than just the first 10 rows. So, let's select some more rows:
explain analyze select * from crm_venue where delete_date is null order by name asc limit 20;

 Limit  (cost=135.81..135.86 rows=20 width=154) (actual time=18.171..18.189 rows=20 loops=1)
   ->  Sort  (cost=135.81..135.94 rows=54 width=154) (actual time=18.168..18.173 rows=20 loops=1)
         Sort Key: name
         Sort Method:  top-N heapsort  Memory: 21kB
         ->  Bitmap Heap Scan on crm_venue  (cost=4.67..134.37 rows=54 width=154) (actual time=2.355..8.126 rows=10768 loops=1)
               Recheck Cond: (delete_date IS NULL)
               ->  Bitmap Index Scan on crm_venue_delete_date_null_idx  (cost=0.00..4.66 rows=54 width=0) (actual time=2.270..2.270 rows=10768 loops=1)
                     Index Cond: (delete_date IS NULL)
 Total runtime: 18.278 ms
As you can see, it goes from 0.1 ms to 18!!

Clearly what happens is that there's a point where the ordering can no longer use the index and has to run a sort. I also noticed that if I increase the LIMIT from 20 to higher numbers, it always takes around 20-25 ms.

Am I doing something wrong, or is this a limitation of PostgreSQL? What is the best way to set up indexes for this type of query?
My guess is that since, logically, the index is comprised of pointers to a set of rows on a set of data pages, if you fetch a page that is known to have no "deleted" records on it, it doesn't have to recheck the page once it is fetched in order to filter out the deleted records.

Therefore, it may be that when you do LIMIT 10 ordered by name, the first 10 rows that come back from the index are all on a data page (or pages) with no deleted records on them. Since it knows these pages are homogeneous, it doesn't have to recheck them once it's fetched them from disk. Once you increase to LIMIT 20, at least one of the first 20 is on a mixed page, where deleted records sit alongside non-deleted records. That would then force the executor to recheck each record, since it can't fetch data pages in less than one-page increments from either disk or the cache.
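One way to check that theory is to look at which heap pages the first rows actually come from, using the ctid system column, which PostgreSQL exposes as (block number, tuple index). A quick sketch:

-- show which heap pages (the first number in ctid) the first 20 rows by name sit on
select ctid, name, delete_date
from crm_venue
where delete_date is null
order by name asc
limit 20;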
As an experiment, if you can, create an index on (delete_date, name) and issue a CLUSTER of crm_venue on that new index. This should rebuild the table in the sort order of delete_date, then name. Just to be super-sure, you should then issue a REINDEX TABLE crm_venue. Now try the query again. Since all the NOT NULLs will be clustered together on disk, it may work faster with the larger LIMIT values.
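Concretely, that experiment would be something like the sketch below. The index name is just a placeholder, and the CLUSTER syntax shown is the 8.4+ form (older releases use CLUSTER indexname ON tablename instead):

-- placeholder index name; (delete_date, name) as suggested above
create index crm_venue_delete_date_name_idx on crm_venue (delete_date, name);

-- rewrite the table in the order of that index
cluster crm_venue using crm_venue_delete_date_name_idx;

-- rebuild the table's indexes, just to be super-sure
reindex table crm_venue;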
Of course, this is all just off-the-cuff theory, so YMMV...