What is Collation?
Collation in SQL Server is a predefined set of rules that determine how data is saved, accessed, and compared. In other words, it's a configuration setting that indicates how the database engine should handle character data. SQL Server has a vast number of collations for dealing with the language and regional differences that come with supporting users and applications worldwide. This article will discuss collations and show a few examples to deal with collations.
Where do I find collations?
SQL collation can be found at the server, database, and column levels. It should be noted that there is no need to be the same collation settings on the server, database, and column. We can also use certain collations by updating our queries. If the collation is not consistent, we will appreciate the necessity of defining the correct collation across our environment as there is a high risk of unforeseen errors.
What are the different types of collations in SQL Server?
SQL Server provides the below system function to get the full list of available collations:
What are the different options in the collation name?
The collation name contains the following options. These options perform differently to deal with character data for sorting and searching operations:
CS: case-sensitive
AI: accent-insensitive
KS: kana type-sensitive
WS: width-sensitive
SC: supplementary characters
UTF8: Encoding standard
"CS" / "CI" – Obviously, this one is just case sensitivity between upper and lower case. The default during SQL Server install as well as a database creation is "case insensitive". With case-insensitive, your queries will execute equally whether you use upper or lower case or a combination of both. However, if you set this to case-sensitive, you will need to get the upper and lower case correct or the query will not return the expected results. Thus, in a "CS" case-sensitive collation, HumanResources.Department is different from humanresources.department and SQL treats them as two different objects.
"AS" / "AI" – This is referring to accent sensitivity. An example would be that 'a' is the same as 'á' if accent-insensitive is set (using "AI" instead of "AS"), but 'a' is not the same as 'á' if accent-sensitive is called.
"KS" - Most people will never see or use this one. KS or Kanatype distinguishes between the two types of "Kana" characters for Japan. Those are Hiragana and Katakana. If "KS" is present, SQL Server treats them as equal for sorting purposes.
"WS" – Width sensitive distinguishes between a single-byte character and a double-byte character. If "WS" is not set/selected, SQL Server treats the single-byte and double-byte characters as the same for sorting purposes.
"VSS" – Somewhat relating back to the "KS" or Kanatype, this too is referencing Japanese characters so you most likely will not see nor use this in your day to day activities.
"_SC" – This will always be at the end of the arguments. It only affects how built-in functions work with surrogate pairs. Without "_SC" at the end, SQL Server doesn't see single supplementary characters; instead it sees two code points that make up a surrogate pair.
Suppose we are using the case-sensitive option in SQL collation. The database engine will behave differently when the query operation is looking for "Andrew" or "andrew". If the query does not find the row with the first name "Andrew," the query will return no results. It is due to the collation's "CS-Case sensitive" option.
Collation Levels
Collations can be specified at several levels of the database engine in SQL Server, but by default, every level inherits the parent level's collation settings. The following are a list of collation supported in SQL Server in descending order:
- SQL Server instance level collation
- Database level collation
- Column level collation
- Query level collation
Ref: https://www.javatpoint.com/sql-server-collation
Post a Comment
0 Comments