MS SQL Server Nvarchar Issues

Do you need to store non-English characters, like Chinese characters?

Then you need UNICODE characters. nvarchar, ntext, nchar are your options. n is for ISO synonym national. Uses UNICODE UCS-2 character set, 2 bytes for a character, double than that of char/varchar/text. This is a price you need to pay for multi-ligual support.

nchar or nvarchar/ntext?

If you know the data length is always going to be around a fixed size, say, 20 characters then nchar, else nvarchar/ntext.

nvarchar ($fixed size$) or nvarchar (max)/ntext?

If you know your data is always going to be somewhere between a fixed range, say, 1 to 100 characters and would never exceed 100 then nvarchar (100). It makes sense, looks good. Use fixed length nvarchar as long the size you know would be less than 8000 bytes. More than 8K means the data would be saved out-row. Fixed length nvarchar can’t help.

If your data size can be NULL to 2 GB and anything in between then you have no option but nvarchar (max)/ntext.

nvarchar (max) or ntext?

Go for nvarchar (max), not ntext. ntext will not be supported in future and nvarchar (max) better in performance than ntext.

Is nvarchar (max) bad for performance?

SELECT 
[name] AS tablename, 
large_value_types_out_of_row 
FROM sys.tables

If large_value_types_out_of_row for the table that contains your nvarchar (max) column is 0 (=OFF) then MS SQL Server would try to save the data in-row, given that the row fits in 8K page size. If the row does not fit in 8K then the nvarchar (max) data would go out-row, one or more extra pages depending on the nvarchar (max) column data size.

If MS SQL Server has to read extra pages for a row, performance would suffer. But what to do? The data is big, right?

If large_value_types_out_of_row is 1 (=ON), then no matter how small is your nvarchar (max) column data, it will always be saved out-row.

In-row or out-row?

If the select statements you use for the table excludes the nvarchar (max) columns then use out-row. Because when you will be reading the non-nvarchar (max) columns, you are not reading the other pages used to store long nvarchar (max). Not only that, since all those big data are not in the same page, your single page can now fit more rows with the small data. You will be doing good here.

But if you have to read the whole row (all columns) at one go, then in-row good.

How to change in-row to out-row and vice versa?

You can change anytime.

sp_tableoption N'MyTableName', 'large value types out of row', 'ON'

However, doing so won’t immediately change the existing data layout. New inserted data would follow the new way. Existing data, if updated would be saved in the new layout. Well, you can always read the existing data, change it and then update it again with the old value. That way, you save all your data in the new layout.

What about ntext?

Well, we already decided not to use next. But since we are discussing this let’s see the respective issues.

This is how you can know the existing layout.

SELECT 
[name] AS tablename, 
text_in_row_limit 
FROM sys.tables

text_in_row_limit = 0 says, the option is not set. If non-zero, say, 256 then that is the number of bytes that will be saved in-row.

This is how to change it.

sp_tableoption N'MyTableName', 'text in row', ON

text_in_row_limit would be 256.

sp_tableoption N'MyTableName', 'text in row', '1000'

If you want that limit to be 1000 bytes. Yes, setting itself won’t change the existing data layout just like nvarchar.

How full are the pages?

Remember, you pages should be full as much as possible. Otherwise you will be reading a page but get very small data. You will be doing bad.

SELECT 
page_count, 
avg_page_space_used_in_percent 
FROM sys.dm_db_index_physical_stats(DB_ID(N'MyDatabaseName'), 
                                    OBJECT_ID(N'MyTableName'), 
                                    NULL, NULL, 'DETAILED')

But ignore small tables having less than 1000 pages (page_count).

Are the rows fitting in 8K?

SELECT 
OBJECT_NAME(object_id) as TableName, 
OBJECT_NAME(index_id) as IndexName, 
min_record_size_in_bytes, 
max_record_size_in_bytes, 
avg_record_size_in_bytes 
FROM sys.dm_db_index_physical_stats(DB_ID(N'MyDatabaseName'), 
                                    OBJECT_ID(N'MyTableName'), 
                                    NULL, NULL, 'DETAILED')

It is better to design tables in way so that data of n (n >= 1) rows fit in 8K, a page.

Is there any other limitation for nvarchar (max)?

You cannot create an index on nvarchar (max) column.

Also avoid using nvarchar (max) column in where clause.

What are the issues for migrating varchar to nvarchar?

Your migrated data is going to be double. Some rows might no longer fit in-row. They may over flow to out-row. Performance might be affected.

Index

Author: Gopal Das

I am working as a Principal Consultant at JLT Interactive Pte Ltd, Singapore. I have a BS in Computer Science & Engineering from Khulna University and ME in Internet Science & Engineering from Indian Institute of Science (IISc). I have a few publications on Query Optimization in RDBMS in ACM SIGMOD, IEEE ICDE etc. I was a founding team member and VP Engineering of iTwin, a spinoff from A*STAR. I am working as a software engineer for 14 years. I am interested in Algorithms, Database, Security, Machine Learning, and Blockchain among others. I am a father of two children and live in Singapore. www.linkedin.com/in/dasgopal https://github.com/gopalcdas

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s