Mimer SQL Unicode Collation Charts
Prerequisite
To display properly, the collation charts available from this page may
require additional fonts to be installed on your computer.
We suggest that you look at the
Unicode Display Problems?
page, or the
Babelstone Custom Font List,
for help in this matter.
Alternatively, to get a complete character display, use the PDF link provided, which opens a PDF version of the respective chart.
Languages - Predefined and Downloadable
Below are specifications of the sorting adjustments for various languages, so-called tailorings,
needed to get the correct national sort order, as opposed to the Unicode default sort order.
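To illustrate what a tailoring changes, here is a minimal Python sketch. It is not the Unicode Collation Algorithm and not Mimer SQL syntax: `default_key` is a rough stand-in for the default order (accented letters sort with their base letter), and `swedish_key` is a hypothetical toy tailoring that places å, ä and ö after z, as Swedish ordering requires. Both function names and the rank values are assumptions made for this example.

```python
import unicodedata

def default_key(word):
    # Rough stand-in for the default order: strip accents so that
    # accented letters compare equal to their base letter.
    decomposed = unicodedata.normalize("NFD", word.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Hypothetical toy tailoring: rank these three letters just after 'z'.
SWEDISH_TAILORING = {"å": ord("z") + 1, "ä": ord("z") + 2, "ö": ord("z") + 3}

def swedish_key(word):
    ranks = []
    for ch in unicodedata.normalize("NFC", word.casefold()):
        if ch in SWEDISH_TAILORING:
            ranks.append(SWEDISH_TAILORING[ch])
        else:
            # Fall back to the untailored treatment for every other character.
            ranks.extend(ord(base) for base in default_key(ch))
    return tuple(ranks)

words = ["zebra", "ärlig", "apa", "öl", "åka"]
default_order = sorted(words, key=default_key)
swedish_order = sorted(words, key=swedish_key)
print(default_order)
print(swedish_order)
```

In the untailored order the words starting with å and ä are interleaved with those starting with a, while the tailored key moves them behind zebra.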
In the table below, languages with their names bolded are among the predefined collations included in the current version of Mimer SQL.
For a summary, see our Collation Tailorings overview.
For some of the languages that are not bolded, a collation definition is available and can be used by copying and pasting it.
Where applicable (see Uyghur, for example), the respective language's page contains a Collation link (at the top of the page) that leads to the CREATE COLLATION statement used to define the collation.
Scripts
In this context a script is a collection of symbols used to represent textual information.
The Unicode Character Database (UCD)
provides data for a mapping from Unicode characters to script names.
European Ordering Rules (EOR) is a standard
that defines how Latin, Greek and Cyrillic scripts should be sorted.
It should provide guidance on sorting European repertoires in Unicode.
ISO/IEC 8859-1 (SQL datatype CHAR)
The following script for Latin-1 representation is used with the CHAR datatype in SQL.
Unicode (SQL datatype NCHAR)
Below are scripts for the Unicode representation,
used with the NCHAR datatype in SQL.
The Default Unicode Collation Element Table (DUCET) is provided in the
AllKeys table, as stated in the
specification for the Unicode Collation Algorithm (UCA).
This table provides a mapping from characters to collation elements.
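As a sketch of what such a mapping looks like in practice, the following Python snippet parses single entries in the style of the AllKeys file. It assumes the three-weight entry form `[.pppp.ssss.tttt]`, where a leading `*` instead of `.` marks a variable element. The function name `parse_allkeys_line` is made up for this example, and the weights in the sample lines are illustrative only, not taken from a specific DUCET version.

```python
import re

# One collation element: marker ('.' or '*') followed by three hex weights.
ELEMENT = re.compile(r"\[([.*])([0-9A-F]{4,5})\.([0-9A-F]{4})\.([0-9A-F]{4})\]")

def parse_allkeys_line(line):
    """Return (code_points, elements) for a data line, or None otherwise."""
    line = line.split("#", 1)[0].strip()   # drop the trailing comment
    if not line or line.startswith("@"):   # blank line or directive such as @version
        return None
    chars, elements = line.split(";", 1)
    code_points = tuple(int(cp, 16) for cp in chars.split())
    parsed = [
        (marker == "*", int(p, 16), int(s, 16), int(t, 16))
        for marker, p, s, t in ELEMENT.findall(elements)
    ]
    return code_points, parsed

# Illustrative sample lines (weights are assumptions, not real DUCET values).
letter = parse_allkeys_line("0041  ; [.1C47.0020.0008] # LATIN CAPITAL LETTER A")
space = parse_allkeys_line("0020  ; [*0209.0020.0002] # SPACE")
print(letter)
print(space)
```

Each parsed element is a tuple of a variable flag followed by the primary, secondary and tertiary weights.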
The following scripts represent different parts of the table,
given in the order they are defined.
The Variable script above includes characters that may be set to Ignorable by using a collation option.
These characters include the space, punctuation marks and most symbols.
The Common script above includes digits, currency symbols, etc.
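The effect of treating Variable characters as Ignorable can be sketched in Python as follows. This is a toy approximation, not the Unicode Collation Algorithm: it assumes that the Variable set can be approximated by the Unicode general categories for separators, punctuation and symbols, with currency symbols excluded, and it uses plain code point comparison as a stand-in for the non-ignorable setting. The names `is_variable` and `ignorable_key` are hypothetical.

```python
import unicodedata

def is_variable(ch):
    # Approximation: separators (Z*), punctuation (P*) and symbols (S*),
    # except currency symbols (Sc), which are not treated as variable here.
    category = unicodedata.category(ch)
    return category[0] in "ZP" or (category[0] == "S" and category != "Sc")

def ignorable_key(text):
    # Drop variable characters so they no longer influence the comparison.
    return "".join(ch for ch in text if not is_variable(ch))

names = ["deluxe", "de luca", "de-luca", "delta"]
non_ignorable = sorted(names)                  # space and hyphen affect the order
ignorable = sorted(names, key=ignorable_key)   # space and hyphen are skipped
print(non_ignorable)
print(ignorable)
```

With variable characters ignored, "de luca" and "de-luca" compare as "deluca" and therefore sort after "delta" rather than before it.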