A few weeks ago I showed how to configure and use FILESTREAM for storing files in a SQL Server database. The feature is quite easy to use, but requires some configuration. Is it worth doing that? This time I would like to share results of a performance test that I ran. It should help to answer a question whether it makes sense using it or not. I tested both: read and write operations.
What the documentation says
The documentation lists three conditions that should be true to consider FILESTREAM for storing BLOBs. Those conditions are:
- objects are larger than 1 MB
- fast read operations are needed
- the middle tier takes care of the application logic
It also states clearly that the traditional way of storing BLOBs using varbinary(max) data type is faster when the objects are small.
Knowing that, I decided to make three tests on files with different sizes: 94 kB, 2.6 MB and 54 MB. Each test consists of a write operation (INSERT statement) and a read operation (SELECT statement).
INSERT INTO images(id, imageFile, createDate)
SELECT NEWID(), BulkColumn, getDate()
FROM OPENROWSET(BULK 'd:/tmp/big.pdf', SINGLE_BLOB) as f;
SELECT TOP 1 *
FROM dbo.images
WHERE id IN (
SELECT TOP 5 id
FROM dbo.images
ORDER BY createDate DESC)
ORDER BY createDate
FILESTREAM with a small file
The first test inserts a small (94 kB) file into a table configured in a traditional way.
CREATE TABLE [dbo].[images](
[id] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[imageFile] [varbinary](max) NULL,
[createDate] [datetime2] NOT NULL,
UNIQUE NONCLUSTERED (
[id] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
GO
Than the same thread reads the 5th most recently inserted file. Those two set a baseline for comparison.
The second part of the test does exactly the same thing but uses a table with FILESTREAM configured:
CREATE TABLE [dbo].[images](
[id] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[imageFile] [varbinary](max) FILESTREAM NULL,
[createDate] [datetime2] NOT NULL,
UNIQUE NONCLUSTERED (
[id] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] FILESTREAM_ON [PRIMARY_FILESTREAM];
GO
This pair of operations (INSERT and SELECT) is executed 1000 times and an average duration is calculated separately for each. In a case of the small 94 kB file, the values are presented in the below table.
VARBINARY | FILESTREAM | |
insert | 7 | 18 |
select | 4 | 5 |
TOTAL | 5 | 12 |
The result show a substantial difference in a performance in favor of the traditional way. It is aligned with the documentation which warns that the small files may behave worse with FILESTREAM.
FILESTREAM with a middle size file
The second test was executed with a 2.6 MB file which exceeds the break-even point defined in the documentation which is 1 MB. Again, the test inserted and queried files 1000 times. The durations in seconds are gathered in the below table and presented on the chart.
VARBINARY | FILESTREAM | |
insert | 140 | 113 |
select | 86 | 99 |
TOTAL | 113 | 106 |
FILESTREAM with a big file
In the third test I used 54 MB file. As it would take a long time to insert them 1000 times, I reduced the number of operations to 100. It does not really matter because I measured average duration not a sum.
They are presented below.
VARBINARY | FILESTREAM | |
insert | 3958 | 1275 |
select | 5436 | 2285 |
TOTAL | 4697 | 1780 |
It is easy to say by just taking a quick look at the chart that both operations more than twice faster with FILESTREAM.
Conclusions
The tests I ran suggest that big files reveal true power of FILESTREAM - its speed. As always a decision of choosing one or the other way of implementing a specific functionality must be taken wisely with considering the particular case. FILESTREAM can significantly speed up BLOBs access but only in certain cases. For some, like small BLOBs, can slow it down. Not even mentioning the whole configuration effort needed to be made including paperwork if the company has some procedures around it. But if you think using FILESTREAM is justified in your case, do not hesitate. Feel free to use my FILESTREAM introduction as a starting point.