Checksum vs Torn Page Detection - performance

Arkadiusz Fronc SQL Server 25 April 2014

page There is a setting on a database level called PAGE_VERIFY. It defines a method of page verification - how data pages are verified by SQL Server engine to make sure they are not broken. Starting from SQL Server 2005 three options are allowed: NONE, TORN_PAGE_DETECTION and CHECKSUM. Although Microsoft defined CHECKSUM as default and as a best practice, more work by a database engine needs to be done which may cause performance degradation.

For those of you who administer databases created on SQL Server 2000 and still have not switched PAGE_VERIFY to CHECKSUM, I would like to present results of performance test that I performed.

Options

Storage where SQL Server stores files can get broken in many different ways. A DBA is lucky if it is reported explicitly by OS. Unfortunately, some damages can remain invisible for Operating System and not get reported to SQL Server. A broken data page stored on a disk can be readable but some bits can have incorrect values. If it remained unnoticed, SQL Server would return wrong data for queries that need this page. A user would not notice there is something wrong and the problem could be uncovered for a long time. The longer it is uncovered the higher cost of fixing it is.

To prevent or at least reduce the risk of this problem, Microsoft implemented page verification methods on SQL Server engine level. The method can be chosen on Database Properties window on Options page.

databaseProperties pageVerify

Starting from SQL Server 2005 there are three options available:

NONE

When this option is used, SQL Server does not verify correctness of pages read from storage. From obvious reasons it generally should not be used. Even with extremely reliable storage devices and hardware configuration it is not a good choice. Any device may fail, it is better to know about it immediately.

TORN_PAGE_DETECTION

It was the default choice in SQL Server 2000. In this algorithm, for each page some bits are additionally stored in the page header during write operation. When the page is read from a disk during read operation, bits from the page header are compared to those in the page body. If a discrepancy is detected, the page is considered broken and the problem is reported to a user. It causes some small data duplication but it is very low - only small number of bits are duplicated. It is good because it does not consume much space, but unfortunately it does not detect all errors. If a broken bit is one of those that were not duplicated in the page header, the damage is not detected.

CHECKSUM

It is the default since SQL Server 2005. During write operation a checksum is calculated from the entire page and the result is saved in the page header. When the page is read from a disk, checksum is calculated again and compared to the value in the page header. The idea is similar to TORN_PAGE_DETECTION - both calculate a verification value for a page and store the result in the page header, but CHECKSUM uses the whole page for calculation which TORN_PAGE_DETECTION only a small number of bits.

Comparing to TORN_PAGE_DETECTION, it provides much better error cases coverage which makes it more reliable and sensitive for IO errors.

Performance

It is easy to guess, option NONE has no negative impact on performance of IO operations. TORN_PAGE_DETECTION requires copying some bits to the page header during writing and comparing them during reading. In theory, from all three page verification options, CHECKSUM should have the most significant impact as it requires calculating checksums for all IO operations. To verify these statements, I have performed a serious of tests.

Test description

I created a script which executed a SELECT query and returned one row by Primary Key then inserted a new row into the table. Both operations were executed in a loop one million times. It looked like below:

set nocount on

declare @numberOfTrans int = 0
declare @transID int
declare @startTime datetime
declare @value money

set @startTime = getDate()
while @numberOfTrans <= 1000000 begin
 
   select @value = value + 1
   from transactions
   where trId = @numberOfTrans - 10000

   if @value is null
      set @value = 0

   insert into transactions (userID, creationDate, value, tax)
   values (1, getDate(), @value, null)
 
   set @numberOfTrans = @numberOfTrans + 1
end

Test results

The script was executed against databases with PAGE_VERIFY set to NONE, TORN_PAGE_DETECTION and CHECKSUM. I measured amount of time needed for database engine to complete in each case. The test was repeated for each option 30 times. The results can be found in a table below:

Test id	Script duration [s]
Test id	NONE	TORN_PAGE_DETECTION	CHECKSUM
1	991	1068	971
2	1010	979	1061
3	945	978	961
4	932	924	987
5	1041	924	1002
6	899	1262	1053
7	960	1042	923
8	971	964	1083
9	968	968	984
10	1070	1028	946
11	1063	981	973
12	998	1097	1174
13	1009	1972	1044
14	1248	1511	1297
15	1430	3231	2552
16	1776	1474	1378
17	1284	1261	1420
18	1262	1258	1197
19	1049	941	1128
20	1141	1280	1270
21	1133	1099	1384
22	1057	968	1577
23	1384	1210	1214
24	1067	1033	1159
25	989	1044	1116
26	961	973	930
27	1283	1031	967
28	1286	1102	1060
29	1107	1059	993
30	934	1085	983
Average	1108.267	1191.567	1159.567

Average times are not much different. TORN_PAGE_DETECTION seems to slow down IO operations a little comparing to NONE and CHECKSUM which can be surprising but the difference is so small that it can be qualified as measurement error. Not surprising is that the script executed fastest when NONE was chosen.

I have repeated the test a few times and the result was generally similar - not much difference in average execution times. CHECKSUM has not shown up as slower than TORN_PAGE_DETECTION.

Based on my test I have no doubts to switch my databases to CHECKSUM. Nevertheless, I recommend to make your own tests that best match your case.

Making the switch

Switching a database from TORN_PAGE_DETECTION to CHECKSUM is easy and quick. It can be done by running the following statement:

alter database [TestDb] set page_verify CHECKSUM

This command is lightweight and executes quickly. It is so, because nothing is done with already existing pages. There is no calculation of checksums and updating pages involved. Each page has its own marker which defines which method is used for the page. If a page is read from a disk, database engine checks the page header for a page verification method and verifies data accordingly to an algorithm. During write operation when a page is writen to a disk, it is stored with a checksum in the page header.

If you like what I do, consider buying me a coffee :)

Buy me a coffee