Imagine you have one million record in a table, and you need to display them in a web page. Would you fetch 1 million records in a one shot and send it through an API? Probably not. Offset based pagination is one of the most common technique for fetching small chunk of data for one page at a time. But it has its own performance bottlenecks. checkout my previous article to know more about offset based pagination https://sureshdsk.dev/offset-based-pagination-in-postgres
In this article, we will explore about keyset based pagination. This method does not use count or offset clauses.
How it works
- Unique auto increment column is used as a key column which is mostly primary key.
- This key column is sorted so that it will produce consistent results.
- Instead using offset to skip records, it used ketset_column > last record pk from previous page condition is used
- by using ketset_column in where clause, the database does index only scan to skip records because primary key has an index by default.
Postgres keyset based pagination example
-- to fetch each page
select * from table_name where {{keyset_column}} > {{keyset_value}} order by {{keyset_column}} LIMIT {{per_page}};
-- fetch first page
select * from employee order by emp_id LIMIT 100;
-- fetch second page
-- first 100 records are skipped based on index only scan
select * from employee where emp_id > 100 LIMIT 100;
-- fetch third page
-- first 200 records are skipped based on index only scan
select * from employee where emp_id > 200 LIMIT 100;
Analyze query plan
Lets take a look at the query plan for this keyset based pagination,
explain analyze select * from employee where emp_id > 200 LIMIT 100;
-- index scan is done on employee table to fetch the results
-- index scan is faster than seq scan
Limit (cost=200.76..210.80 rows=100 width=930) (actual time=0.502..0.530 rows=100 loops=1)
-> Index Scan on employee (cost=0.00..1878.10 rows=18710 width=930) (actual time=0.010..0.445 rows=2100 loops=1)
Index Cond: (emp_id > 200)
Planning Time: 0.073 ms
Execution Time: 0.550 ms
Pros
- Faster than offset based pagination
- No expensive count query and offsets
- Pagination uses index only scan instead of sequential table scan
- Inserting/deleting a row on page n doesn't cause missing or duplicate record in page navigation
Cons
- Client needs to remember the keyset column and previous highest key value
- If sorting key is different than keyset column, we can't utilise keyset pagination