I don’t understand Collation? (Mysql, RDBMS, Character sets)

I Understand Character sets but I don’t understand Collation. I know you get a default collation with every Character set in Mysql or any RDBMS but I still don’t get it! Can someone please explain in layman terms?

Thank you in advance 😉


Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

The main point of a database collation is determining how data is sorted and compared.

Case sensitivity of string comparisons

SELECT "New York" = "NEW YORK";`

will return true for a case insensitive collation; false for a case sensitive one.

Which collation does which can be told by the _ci and _cs suffix in the collation’s name. _bin collations do binary comparisons (strings must be 100% identical).

Comparison of umlauts/accented characters

the collation also determines whether accented characters are treated as their latin base counterparts in string comparisons.

SELECT "Düsseldorf" =  "Dusseldorf";
SELECT "Èclair" =      "Eclair";

will return true in the former case; false in the latter. You will need to read each collation’s description to find out which is which.

String sorting

The collation influences the way strings are sorted.

For example,

  • Umlauts Ä Ö Ü are at the end of the alphabet in the finnish/swedish alphabet latin1_swedish_ci
  • they are treated as A O U in German DIN-1 sorting (latin_german1_ci)
  • and as AE OE UE in German DIN-2 sorting (latin_german2_ci). (“phone book” sorting)
  • In latin1_spanish_ci, “ñ” (n-tilde) is a separate letter between “n” and “o”.

These rules will result in different sort orders when non-latin characters are used.

Using collations at runtime

You have to choose a collation for your table and columns, but if you don’t mind the performance hit, you can force database operations into a certain collation at runtime using the COLLATE keyword.

This will sort table by the name column using German DIN-2 sorting rules:

FROM table
ORDER BY name COLLATE latin1_german2_ci;

Using COLLATE at runtime will have performance implications, as each column has to be converted during the query. So think twice before applying this do large data sets.

MySQL Reference:

Method 2

Collation is information about how strings should be sorted and compared.

It contains for example case sensetivity, e.g. whether a = A, special character considerations, e.g. whether a = á, and character order, e.g. whether O < Ö.

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Notify of

Inline Feedbacks
View all comments
Would love your thoughts, please comment.x