How to optimize a keyset pagination query that uses CTEs on a large table

Question

I tried to read up on this topic as much as I could before coming here to bother you, but here I am anyway.

I want to implement keyset pagination on this table:

create table api.subscription (
    subscription_id uuid primary key,
    token_id uuid not null,
    product_id uuid not null references api.product(product_id) deferrable,
    spid bigint null,
    attributes_snapshot jsonb not null,
    created_at timestamp not null,
    refreshed_at timestamp,
    enriched_at timestamp null,
    valid_until timestamp not null,
    is_cancelled boolean not null,
    has_been_expired boolean not null,
    has_quality_data boolean not null
);

To do that, I use this query to prepare the pagination metadata:

with book as (
    select created_at, subscription_id
    from api.subscription
    where token_id = $1
      and refreshed_at >= $2
      and valid_until >= now()
      and not is_cancelled
      and not has_been_expired
),
total as (
    select count(*) as total from book
),
page as (
    select *
    from book
    where (case when $4 is null then true
                else (created_at > $4 or (created_at = $4 and subscription_id > $5))
           end)
    order by created_at asc, subscription_id asc
    limit $3
),
last_row as (
    select last_value(created_at) over() as last_seen_created_at,
           last_value(subscription_id) over() as last_seen_subscription_id
    from page
),
ids as (
    select array_agg(subscription_id) ids from page
)
select * from last_row, total, ids

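This gives me the total number of items the user is currently paginating through, the last seen key (used to request the next page), and the IDs of the current page (for use in an IN clause (or any), but that's another story).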
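For illustration, the ids array could then be used to fetch the rows of the page itself. This is only a minimal sketch: it assumes the array is passed back as a single uuid[] parameter (the $1 here is hypothetical and unrelated to the parameters of the query above).

```sql
-- Fetch the rows of the current page from the ids array returned above.
-- $1 is assumed to be that uuid[] value (hypothetical parameter numbering).
select *
from api.subscription
where subscription_id = any($1::uuid[])
-- "= any" does not preserve array order, so the page order is re-applied here.
order by created_at asc, subscription_id asc;
```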

The problem is how long it takes when the table contains 19 million rows.

Nested Loop  (cost=50000.14..50000.22 rows=1 width=64) (actual time=143.677..143.677 rows=0 loops=1)
  Output: last_row.last_seen_created_at, last_row.last_seen_subscription_id, total.total, ids.ids
  Buffers: shared hit=395 read=24611
  CTE book
    ->  Seq Scan on api.subscription  (cost=0.00..50000.00 rows=1 width=24) (actual time=143.646..143.646 rows=0 loops=1)
          Output: subscription.created_at, subscription.subscription_id
          Filter: ((NOT subscription.is_cancelled) AND (NOT subscription.has_been_expired) AND (subscription.token_id = '73759739-7af5-4d28-91bc-897c959bddcd'::uuid) AND (subscription.valid_until >= now()) AND (subscription.refreshed_at >= (now() - '4 days'::interval)))
          Rows Removed by Filter: 1000000
          Buffers: shared hit=389 read=24611
  CTE total
    ->  Aggregate  (cost=0.02..0.03 rows=1 width=8) (never executed)
          Output: count(*)
          ->  CTE Scan on book  (cost=0.00..0.02 rows=1 width=0) (never executed)
                Output: book.created_at, book.subscription_id
  CTE page
    ->  Limit  (cost=0.03..0.04 rows=1 width=24) (actual time=143.674..143.674 rows=0 loops=1)
          Output: book_1.created_at, book_1.subscription_id
          Buffers: shared hit=395 read=24611
          ->  Sort  (cost=0.03..0.04 rows=1 width=24) (actual time=143.673..143.673 rows=0 loops=1)
                Output: book_1.created_at, book_1.subscription_id
                Sort Key: book_1.created_at, book_1.subscription_id
                Sort Method: quicksort  Memory: 25kB
                Buffers: shared hit=395 read=24611
                ->  CTE Scan on book book_1  (cost=0.00..0.02 rows=1 width=24) (actual time=143.646..143.646 rows=0 loops=1)
                      Output: book_1.created_at, book_1.subscription_id
                      Buffers: shared hit=389 read=24611
  CTE last_row
    ->  WindowAgg  (cost=0.00..0.03 rows=1 width=24) (actual time=143.675..143.675 rows=0 loops=1)
          Output: last_value(page.created_at) OVER (?), last_value(page.subscription_id) OVER (?)
          Buffers: shared hit=395 read=24611
          ->  CTE Scan on page  (cost=0.00..0.02 rows=1 width=24) (actual time=143.674..143.674 rows=0 loops=1)
                Output: page.created_at, page.subscription_id
                Buffers: shared hit=395 read=24611
  CTE ids
    ->  Aggregate  (cost=0.02..0.03 rows=1 width=32) (never executed)
          Output: array_agg(page_1.subscription_id)
          ->  CTE Scan on page page_1  (cost=0.00..0.02 rows=1 width=16) (never executed)
                Output: page_1.created_at, page_1.subscription_id
  ->  Nested Loop  (cost=0.00..0.05 rows=1 width=32) (actual time=143.677..143.677 rows=0 loops=1)
        Output: last_row.last_seen_created_at, last_row.last_seen_subscription_id, total.total
        Buffers: shared hit=395 read=24611
        ->  CTE Scan on last_row  (cost=0.00..0.02 rows=1 width=24) (actual time=143.676..143.676 rows=0 loops=1)
              Output: last_row.last_seen_created_at, last_row.last_seen_subscription_id
              Buffers: shared hit=395 read=24611
        ->  CTE Scan on total  (cost=0.00..0.02 rows=1 width=8) (never executed)
              Output: total.total
  ->  CTE Scan on ids  (cost=0.00..0.02 rows=1 width=32) (never executed)
        Output: ids.ids
Planning time: 0.580 ms
Execution time: 143.786 ms

Note: this example has only 1 million rows, but with more than 19 million rows it takes over 20 seconds.

We have had no luck so far, despite trying various index setups:

Indexes:
    "subscription_pkey" PRIMARY KEY, btree (subscription_id)
    "has_been_expired_idx" btree (has_been_expired)
    "is_cancelled_idx" btree (is_cancelled)
    "pagination_idx" btree (token_id, refreshed_at, valid_until, is_cancelled, has_been_expired)
    "refreshed_at_idx" btree (refreshed_at)
    "token_and_created_at_idx" btree (token_id, created_at)
    "token_and_refreshed_at_idx" btree (token_id, refreshed_at)
    "token_id_idx" btree (token_id)
    "valid_until_idx" btree (valid_until)
Foreign-key constraints:
    "subscription_product_id_fkey" FOREIGN KEY (product_id) REFERENCES api.product(product_id) DEFERRABLE

So my questions are:

Is there a way to improve this query by finding the right index?
Is this a good approach in the first place?

Any hints are welcome :) I hope my question makes sense.

Thanks for reading.

Jasen · Answer

Without the total, you could rewrite it like this:

with page as (
    select created_at, subscription_id
    from api.subscription
    where token_id = $1
      and refreshed_at >= $2
      and valid_until >= now()
      and not is_cancelled
      and not has_been_expired
      and (case when $4 is null then true
                else (created_at > $4 or (created_at = $4 and subscription_id > $5))
           end)
    order by created_at asc, subscription_id asc
    limit $3
),
last_row as (
    select last_value(created_at) over() as last_seen_created_at,
           last_value(subscription_id) over() as last_seen_subscription_id
    from page
),
ids as (
    select array_agg(subscription_id) ids from page
)
select * from last_row, ids
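As a side note on the cursor predicate in the query above: PostgreSQL also accepts a row-wise comparison, which for non-null values means the same thing as the "created_at > $4 or (created_at = $4 and subscription_id > $5)" pair. The following is only a sketch of the page selection on its own; the explicit casts are an assumption added so the parameter types are unambiguous, and no performance claim is intended.

```sql
-- Sketch: the same page selection with a row-wise comparison for the cursor.
-- $4/$5 are the last seen (created_at, subscription_id); both null on the first page.
select created_at, subscription_id
from api.subscription
where token_id = $1
  and refreshed_at >= $2
  and valid_until >= now()
  and not is_cancelled
  and not has_been_expired
  -- (a, b) > (c, d) is true when a > c, or a = c and b > d
  and ($4::timestamp is null
       or (created_at, subscription_id) > ($4::timestamp, $5::uuid))
order by created_at asc, subscription_id asc
limit $3;
```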

A compound index on is_cancelled, has_been_expired, token_id, plus one of refreshed_at, valid_until, or created_at, would help.
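One possible reading of that suggestion, as a sketch rather than a tested recommendation: the index name is hypothetical, and picking created_at as the fourth column is an assumption (it is the column the query orders by). The EXPLAIN statement reuses the token and the 4-day interval visible in the plan from the question, with a hypothetical page size of 50, to check whether the planner actually picks the index up.

```sql
-- Hypothetical index name; column order follows the suggestion above,
-- with created_at chosen (as an assumption) for the fourth column.
create index subscription_keyset_idx
    on api.subscription (is_cancelled, has_been_expired, token_id, created_at);

-- Check the first page (no cursor yet) and compare against the Seq Scan plan.
explain (analyze, buffers)
select created_at, subscription_id
from api.subscription
where token_id = '73759739-7af5-4d28-91bc-897c959bddcd'
  and refreshed_at >= now() - interval '4 days'
  and valid_until >= now()
  and not is_cancelled
  and not has_been_expired
order by created_at asc, subscription_id asc
limit 50;  -- hypothetical page size
```

Whether this index helps depends on how selective each column is in the real data, so the plan output is the thing to look at.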