I have had a pleasure to work with various database engines. As you probably noticed I write mostly about SQL Server and Oracle so I am familiar with those two at least. But I have also had a chance to develop systems in MySQL and PostgreSQL. All those database engines except Oracle have had a feature allowing to limit a number of returned rows from a SELECT query.
Oracle Database 12c contains ANSI standard for such functionality. Let me show you how it works. As there is a huge field for performance optmizations, I will also take a look from that perspective.
Limiting a number of returned rows (called also data paging, pagination or top N rows problem) has been difficult for many years in Oracle databases. Ok, maybe not so difficult but it was not obvious how it should have been done properly. Many junior database developers stumbled over it.
SQL Server has had TOP statement and combined with ROW_NUMBER() it was quite straightforward how to create a query that returns a range of rows – for example from row 10th to row 19th. Oracle has not even had such TOP statement. So how it could have been achieved?
The old way
I am not going to present any of the incorrect attempts made by inexperienced developers. You can see below how it could have been done right:
SELECT f.*
FROM (
SELECT t.*, rownum r
FROM (
SELECT id, account_id, value
FROM transactions
ORDER BY id) t
WHERE rownum <= 19) f
WHERE r >= 10
This query returns data but only rows from the 10th row to 19th row. Why is it the correct one? Because this structure allows the database engine to recognize your intention and optimize the execution plan. If this structure is used, the engine does not read all rows and filter them out just before returning them to the client, but it knows you are interested of the particular range of rows. In this case, reading data after 19th row does not make sense. It can be even further optimized to try skipping the first 9 rows, but it is not always possible. The below picture shows an execution plan of the old of data paging in Oracle.
You can notice that not all rows are read. It can be inferred from COUNT(STOPKEY) element in the execution plan.
Never mind. Fortunately, there is an easier way to do this now.
The new way - ANSI way
Oracle implemented ANSI standards for data paging in 12c release. Now, if you want to filter out data returned by a query on the database side and you want to let the database engine doing it in a smart way, you can use OFFSET ... FETCH NEXT ... clause.
In my case, a complete query looks as below:
SELECT id, account_id, value
FROM transactions
ORDER BY id
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY
It is much simpler than the one with rownum, isn't it? But that is not all. I expect it is optimized and the database engine knows there is no need to read all rows from the table. Let's take a look at the execution plan:
Hmm … that is not what I expected. The execution plan looks worse in the case of OFFSET ... FETCH ... clause.
I use AUTOTRACE option to check number of reads in both cases. The ROWNUM version:
SET AUTOTRACE ON;
SELECT f.*
FROM (
SELECT t.*, rownum r
FROM (
SELECT id, account_id, value
FROM TRANSACTIONS
ORDER BY id) t
WHERE rownum <= 19) f
WHERE r >= 10
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19 | 988 | 4 (0)| 00:00:01 |
|* 1 | VIEW | | 19 | 988 | 4 (0)| 00:00:01 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | VIEW | | 19 | 741 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| TRANSACTIONS | 495K| 7259K| 4 (0)| 00:00:01 |
| 5 | INDEX FULL SCAN | PK_TRANSACTIONS | 19 | | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("R">=10)
2 - filter(ROWNUM<=19)
Statistics
-----------------------------------------------------------
1 DB time
3 Requests to/from client
4 consistent gets
2 consistent gets examination
2 consistent gets examination (fastpath)
4 consistent gets from cache
2 consistent gets pin
2 consistent gets pin (fastpath)
1 enqueue releases
1 enqueue requests
14 global enqueue gets sync
14 global enqueue releases
3 non-idle wait count
2 opened cursors cumulative
1 opened cursors current
1 pinned cursors current
1 recursive calls
4 session logical reads
-262144 session pga memory
101016 session uga memory
4 user calls
And the OFFSET ... FETCH ... version:
SET AUTOTRACE ON;
SELECT id, account_id, value
FROM TRANSACTIONS
ORDER BY id
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 495K| 30M| | 3158 (1)| 00:00:01 |
|* 1 | VIEW | | 495K| 30M| | 3158 (1)| 00:00:01 |
|* 2 | WINDOW SORT PUSHED RANK| | 495K| 7259K| 13M| 3158 (1)| 00:00:01 |
| 3 | TABLE ACCESS FULL | TRANSACTIONS | 495K| 7259K| | 617 (1)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=CASE WHEN (10>=0) THEN 10
ELSE 0 END +10 AND "from$_subquery$_002"."rowlimit_$$_rownumber">10)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "ID")<=CASE WHEN (10>=0) THEN 10 ELSE 0 END
+10)
Statistics
-----------------------------------------------------------
9 CPU used by this session
14 CPU used when call started
21 DB time
4 Requests to/from client
2266 consistent gets
2266 consistent gets from cache
2266 consistent gets pin
2266 consistent gets pin (fastpath)
8 global enqueue gets sync
8 global enqueue releases
3 non-idle wait count
2 opened cursors cumulative
1 opened cursors current
1 pinned cursors current
2266 session logical reads
166616 session uga memory
4 user calls
It does not look good either – 2266 consistent gets vs 4. I am curious how it compares to reading all data from the table:
SET AUTOTRACE ON;
SELECT id, account_id, value
FROM TRANSACTIONS
ORDER BY id
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 495K| 7259K| | 3158 (1)| 00:00:01 |
| 1 | SORT ORDER BY | | 495K| 7259K| 13M| 3158 (1)| 00:00:01 |
| 2 | TABLE ACCESS FULL| TRANSACTIONS | 495K| 7259K| | 617 (1)| 00:00:01 |
-------------------------------------------------------------------------------------------
Statistics
-----------------------------------------------------------
12 CPU used by this session
12 CPU used when call started
13 DB time
13 Requests to/from client
2266 consistent gets
2266 consistent gets from cache
2266 consistent gets pin
2266 consistent gets pin (fastpath)
1 enqueue releases
1 enqueue requests
4 global enqueue gets sync
4 global enqueue releases
13 non-idle wait count
2 opened cursors cumulative
1 opened cursors current
1 pinned cursors current
1 recursive calls
2266 session logical reads
6750208 session pga memory max
101016 session uga memory
14 user calls
The query without pagination also needs 2266 consistent reads. I don't see much optimization in this case. Just by looking at AUTOTRACE results, I would say OFFSET ... FETCH ... uses the same strategy as reading all data. That is bad, so bad.
Conclusions
First, I need to say I am happy that Oracle implemented pagination with OFFSET ... FETCH .... It is way more convenient to use than the ROWNUM version. Unfortunately, it is done terribly wrong ... or am I missing something? OFFSET … FETCH … appears to be slower than the old method with ROWNUM. At least in my simple test, performance of OFFSET ... FETCH ... is not satisfying. Summing it up - I am going to keep away from that feature until it is fixed.
If you have used that feature and achieved different results, please let me know.