FILESTREAM – Sharing Experiences

FILESTREAM – Hardware/OS Consideration

There should be some considerations regarding hardware selection/configuration and OS configuration like:

Hard disk – SCSI is better for reliability and performance
RAID level – RAID 10 is recommended even though at least 50% space is redundant depending on configuration
Storage layout – all data in one volume may not be the right choice if load is high
Cluster size – 64KB and above to reduce fragmentation, also defragment periodically
Compression – cannot be enabled if cluster size is 4KB or less, may not be effective for all kinds of data, comes at the expense of CPU
Anti-virus configuration – AV can need to be setup properly so that it does not delete FILESTREAM files
Disable 8.3 naming for NTFS – if number of files are high then NTFS has to work harder to get a unique name
Turn off last access time update – since files are not accessed by explorer rather SQL Server turning off this feature is OK and it saves a lot of time
Security – SQL server need to be configured to use integrated security (Win32 file APIs to work that uses Windows token associated with the client’s thread accessing files)

FILESTREAM – What and When

If you have to store files (BLOB) in MS SQL Server database, you might consider using FILESTREAM (started from MS SQL Server 2008). This is essentially enabling FILESTREAM attribute of varbinary (max) type column. Some other setups are necessary prior to that.

FILESTREAM combines MS SQL Server with NTFS file system. Database is good at reading/writing/processing small pieces (rows and columns) of information scattered around in the database files while file system is good at reading/writing files as a whole. File system start performing better when the files are larger than 1 MB.

By combining the two you can get the best of the both worlds, like getting past the low max size limit of database BLOB and use max NTFS size, good streaming performance of file system, integration with structured data (transactional consistency), better fragmentation management, manageability (backups etc can be done together) etc.

However if you are doing frequent small updates FILESTREAM performs poor. This is because here it has to work more – guarantee transactional consistency as well write to file system that essentially creates/writes a new copy and delete the old one. It also takes more time to take backup.

FILESTREAM data can be accessed in two ways: standard Transact-SQL and by using Win32 APIs. If you want the best performance of File System you need to use Win32 APIs. There are provisions for the Win32 APIs to work within the context of database transaction.

FILESTREAM BLOBS use file system cache and does not use database buffer pool. This improves query processing and overall database performance.

However do note that certain features like Transparent Data Encryption (TDE) are not supported with FILESTREAM.

To summarize, if you are to save files 1 MB or larger in database that will be read a lot with few small updates and that a middle tier is available where you can use Win32 APIs for accessing them then FILESTREAM is a good choice.

Reference

Index

Tag: FILESTREAM

Other Posts

FILESTREAM – Hardware/OS Consideration

FILESTREAM – What and When