Status: Last updated 21 December 2011.

Evaluation of the MS 'unicode' (including the MS 'utf-16') and the MS 'unicodeFFFE' charset declarations

MS 'unicode'/MS 'utf-16' and MS 'unicodeFFFE' are described in the following documents: Character Set Recognition and Code Page Identifiers. The main goal of the tests is to document how MS 'unicode'/MS 'utf-16' and MS 'unicodeFFFE' are handled compared with the official IANA 'utf-16', 'utf-16be' and 'utf-16le' charsets.
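As background for the comparison, the following minimal sketch shows how the three IANA labels differ with respect to the byte order mark. It uses Python's built-in codecs purely as an illustration; it says nothing about how any browser in these tests behaves, and the variable names are made up for the example.

```python
import codecs

text = "A"
be_bytes = text.encode("utf-16-be")  # big-endian code units, no BOM
le_bytes = text.encode("utf-16-le")  # little-endian code units, no BOM

# Prepend the byte order mark that matches each byte order.
with_be_bom = codecs.BOM_UTF16_BE + be_bytes
with_le_bom = codecs.BOM_UTF16_LE + le_bytes

# The plain 'utf-16' label reads the BOM, picks the byte order from it,
# and removes it from the decoded text.
decoded_be = with_be_bom.decode("utf-16")
decoded_le = with_le_bom.decode("utf-16")

# The 'utf-16be' label fixes the byte order in advance, so a leading BOM
# is decoded as an ordinary U+FEFF character and stays in the text.
kept_bom = with_be_bom.decode("utf-16-be")
```

In this illustration, `decoded_be` and `decoded_le` are both the bare letter, while `kept_bom` still starts with U+FEFF.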

Legend, test principles, reading advice and results

See the section on evaluation results at the end of the page.

Same mark-up
The mark-up is the same in the XML section and the HTML section.
Abbreviations
koi8r = koi8-r; utf8 = utf-8; utf16 = utf-16; utf16be = utf-16be; utf16le = utf-16le

Note! The tests could contain errors! Be critical! If you suspect that some part of the test is in error, then contact me via the relevant channel …

XML tests

XML table 1: Defaults.
The files carry no byte order mark (BOM) or any other explicit encoding information.
koi8-r | utf-8 | 16be | 16le
default | default | default | default
XML table 2: BOM.
Each file starts with the byte order mark (BOM).
koi8-r | utf-8 | 16be | 16le
bom | bom | bom | bom
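For reference, the BOM check that these files exercise can be sketched as follows. This is a minimal illustration that covers only the UTF-8 and UTF-16 signatures relevant to this page; the names `BOMS` and `sniff_bom` are hypothetical and not taken from any browser.

```python
import codecs
from typing import Optional

# Byte order marks relevant to the tests on this page, with the
# encoding each one signals.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16le"),  # FF FE
)


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding signalled by a leading BOM, or None if there is none."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    return None
```

A file without a BOM, such as a plain KOI8-R file, yields `None`, so the decoder has to fall back on other information.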
XML table 3: HTTP + BOM.
The HTTP charset parameter declares an encoding for files that also include the BOM.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
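The question these files probe is which source wins when the HTTP charset parameter and the BOM disagree. The sketch below makes that choice explicit. It is an assumption-laden illustration, not a description of any tested browser: `http_charset` and `choose_encoding` are hypothetical helpers, and the `bom_wins` flag stands in for whatever policy a given browser applies.

```python
from email.message import Message
from typing import Optional


def http_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter (lower-cased) from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def choose_encoding(http_label: Optional[str],
                    bom_label: Optional[str],
                    bom_wins: bool) -> Optional[str]:
    """Pick between the HTTP label and the BOM label.

    bom_wins is a hypothetical policy switch: True models "BOM used
    instead of HTTP", False models "HTTP trusted".
    """
    if bom_label is not None and (bom_wins or http_label is None):
        return bom_label
    return http_label
```

For example, `choose_encoding(http_charset("text/xml; charset=unicode"), "utf-16be", bom_wins=True)` returns `"utf-16be"`, while the same call with `bom_wins=False` returns `"unicode"`.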
XML table 4: XML Encoding declaration.
Each file contains an XML declaration with an encoding declaration inside.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
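Each cell in this table pairs a declared label (the row) with an actual file encoding (the column), and the two need not agree. A minimal sketch of how such a file could be produced is given here; `make_xml` is a hypothetical helper and the `<root/>` document is a placeholder, not the mark-up used by the real test files.

```python
def make_xml(declared_label: str, file_codec: str) -> bytes:
    """Build a tiny XML document whose encoding declaration says
    declared_label while the bytes are actually written in file_codec."""
    doc = f'<?xml version="1.0" encoding="{declared_label}"?>\n<root/>'
    return doc.encode(file_codec)


# Declared as utf-16, but stored as UTF-8: the declaration is readable
# as plain ASCII bytes.
mismatch_utf8 = make_xml("utf-16", "utf-8")

# Declared as koi8-r, but stored as big-endian UTF-16 without a BOM:
# every ASCII character is preceded by a zero byte.
mismatch_16be = make_xml("koi8-r", "utf-16-be")
```

Here `mismatch_utf8` begins with the literal bytes of `<?xml`, whereas `mismatch_16be` begins with `00 3C 00 3F`, so a parser cannot read the declaration without first guessing the byte layout.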
XML table 5: HTTP + meta.
Charset is sent via HTTP and the meta element.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
XML table 6: HTTP + XML declaration.
Charset is announced via HTTP. In addition, the file contains the XML prolog.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
XML table 7: HTTP.
Charset is sent via HTTP only.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
XML table 8: META.
Charset is sent via the meta element.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
XML table 9: XML prolog.
No explicit encoding declaration, but the file contains the XML prolog.
koi8-r | utf-8 | 16be | 16le
xmldec | xmldec | xmldec | xmldec
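When a file has neither a BOM nor an encoding declaration, a parser can still look at how the bytes of `<?xml` are laid out, along the lines of the autodetection notes in Appendix F of the XML 1.0 specification. The following is a simplified sketch of that idea, limited to the encoding families used on this page; `sniff_xml_prolog` is a hypothetical name.

```python
from typing import Optional


def sniff_xml_prolog(data: bytes) -> Optional[str]:
    """Guess the encoding family from the first four bytes of '<?xml'.

    Only the byte patterns relevant to this page are checked.
    """
    head = data[:4]
    if head == b"\x00<\x00?":
        return "utf-16be"          # 00 3C 00 3F: 16-bit units, big-endian
    if head == b"<\x00?\x00":
        return "utf-16le"          # 3C 00 3F 00: 16-bit units, little-endian
    if head == b"<?xm":
        return "ascii-compatible"  # 3C 3F 78 6D: UTF-8, KOI8-R and similar
    return None
```

The third case cannot distinguish UTF-8 from KOI8-R, because both encode the prolog with the same ASCII bytes; only an encoding declaration or external information can settle that.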

HTML tests

HTML table 1: Defaults.
The files carry no byte order mark (BOM) or any other explicit encoding information.
koi8-r | utf-8 | 16be | 16le
default | default | default | default
HTML table 2: BOM.
Each file starts with the byte order mark (BOM).
koi8-r | utf-8 | 16be | 16le
bom | bom | bom | bom
HTML table 3: HTTP + BOM.
The HTTP charset parameter declares an encoding for files that also include the BOM.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
HTML table 4: XML Encoding declaration.
Each file contains an XML declaration with an encoding declaration inside.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
HTML table 5: HTTP + meta.
Charset is sent via HTTP and the meta element.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
HTML table 6: HTTP + XML declaration.
Charset is announced via HTTP. In addition, the file contains the XML prolog.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
HTML table 7: HTTP.
Charset is sent via HTTP only.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
HTML table 8: META.
Charset is sent via the meta element.
koi8-r | utf-8 | 16be | 16le
koi8r | koi8r | koi8r | koi8r
unicode | unicode | unicode | unicode
unicodefffe | unicodefffe | unicodefffe | unicodefffe
utf16 | utf16 | utf16 | utf16
utf16be | utf16be | utf16be | utf16be
utf16le | utf16le | utf16le | utf16le
utf8 | utf8 | utf8 | utf8
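A meta declaration can only be found if the scanner can read the bytes around it, which matters for the UTF-16 columns of this table. The sketch below is a deliberately simplified, regex-based stand-in for a browser's meta prescan, not the real algorithm; `meta_charset` is a hypothetical helper and the 1024-byte window is an assumption based on the HTML5 prescan description.

```python
import re
from typing import Optional

# Simplified pattern: a <meta ...> tag containing charset=VALUE, with
# optional quotes. Real browsers use a more careful tokenising scan.
META_CHARSET = re.compile(
    rb"<meta[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)",
    re.IGNORECASE,
)


def meta_charset(data: bytes, window: int = 1024) -> Optional[str]:
    """Return the lower-cased charset label from a meta element, if an
    ASCII-level scan of the first `window` bytes can find one."""
    match = META_CHARSET.search(data[:window])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

With this sketch, `meta_charset(b'<meta charset="koi8-r">')` returns `"koi8-r"`, but the same mark-up stored as UTF-16 returns `None`, because the zero bytes between the characters hide the tag from a byte-level scan.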
HTML table 9: XML prolog.
No explicit encoding declaration, but the file contains the XML prolog.
koi8-rutf-816be16le
xmldecxmldecxmldecxmldec

XML Evaluation

Evaluation of XML table 1: Defaults.
Browser | koi8-r | utf-8 | 16be | 16le
IE9 | n | y | y | y
FF8 | n | y | y | y
O11 | n | y | y | y
Saf | n | y | n | n
Chr | n | y | n | n
Evaluation of XML table 2: BOM.
Browser | koi8-r | utf-8 | 16be | 16le
IE9 | n | y | y | y
FF8 | n | y | y | y
O11 | n | y | y | y
Saf | n | y | y | y
Chr | n | y | y | y
Evaluation of XML table 3: HTTP + BOM.
Browser | koi8-r | utf-8 | 16be | 16le
IE9 | b | b | b
FF8 | m | m | m
  First m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-8. BOM used instead of HTTP: unicode, unicodefffe, utf-16le. UTF-16 defaults to: UTF-16BE.
  Second m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
  Third m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
O11 | m | m | m
  First m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
  Second m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
  Third m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
Saf | b | b | b
Chr | b | b | b
Evaluation of XML table 4: XML encoding declaration
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8 | t | t/s | s | s
  t/s cell: Enc. decl. trusted: koi8-r, utf-8. Sniffs UTF-8 instead: all the other encodings.
  First s cell: Sniffs UTF-8 instead: all the other encodings.
  Second s cell: Sniffs the encoding: all encodings.
O11 | t | t/s (like FF8) | s (like FF8) | s (like FF8)
Saf
Chr
Evaluation of XML table 5: HTTP + meta.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr
Evaluation of XML table 6: HTTP + XML declaration.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr
Evaluation of XML table 7: HTTP.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr
Evaluation of XML table 8: meta.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr
Evaluation of XML table 9: XML prolog.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr

HTML Evaluation

Evaluation of HTML table 1: Defaults.
Browser | koi8-r | utf-8 | 16be | 16le
IE9 | n | n | n | n
FF8 | n | n | n | n
O11 | n | y | y | y
Saf | n | n | n | n
Chr | n | y | n | n
Evaluation of HTML table 2: BOM.
Browser | koi8-r | utf-8 | 16be | 16le
IE9 | n | y | y | y
FF8 | n | y | y | y
O11 | n | y | y | y
Saf | n | y | y | y
Chr | n | y | y | y
Evaluation of HTML table 3: HTTP + BOM.
Browser | koi8-r | utf-8 | 16be | 16le
IE9 | b | b | b
FF8 | m | m | m
  First m cell: HTTP trusted: utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: koi8-r, unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
  Second m cell: HTTP trusted: utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: koi8-r, unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
  Third m cell: HTTP trusted: utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: koi8-r, unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
O11 | m | m | m
  First m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
  Second m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
  Third m cell: HTTP trusted: koi8-r, utf-16, utf-16be, utf-16le, utf-8. BOM used instead of HTTP: unicode, unicodefffe. UTF-16 defaults to: UTF-16BE.
Saf | b | b | b
Chr | b | b | b
Evaluation of HTML table 4: XML encoding declaration
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8 | i | i | i | i
O11
Saf
Chr
Evaluation of HTML table 5: HTTP + meta.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr
Evaluation of HTML table 6: HTTP + XML declaration.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr
Evaluation of HTML table 7: HTTP.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr
Evaluation of HTML table 8: meta.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr
Evaluation of HTML table 9: XML prolog.
Browser | koi8-r | utf-8 | 16be | 16le
IE9
FF8
O11
Saf
Chr