blob: ec9b8185e266c2ef39b17650b3addf7608b2b9b6 [file] [log] [blame]
/****************************************************************************
**
** Copyright (C) 2016 The Qt Company Ltd.
** Contact: https://www.qt.io/licensing/
**
** This file is part of the documentation of the Qt Toolkit.
**
** $QT_BEGIN_LICENSE:FDL$
** Commercial License Usage
** Licensees holding valid commercial Qt licenses may use this file in
** accordance with the commercial license agreement provided with the
** Software or, alternatively, in accordance with the terms contained in
** a written agreement between you and The Qt Company. For licensing terms
** and conditions see https://www.qt.io/terms-conditions. For further
** information use the contact form at https://www.qt.io/contact-us.
**
** GNU Free Documentation License Usage
** Alternatively, this file may be used under the terms of the GNU Free
** Documentation License version 1.3 as published by the Free Software
** Foundation and appearing in the file included in the packaging of
** this file. Please review the following information to ensure
** the GNU Free Documentation License version 1.3 requirements
** will be met: https://www.gnu.org/licenses/fdl-1.3.html.
** $QT_END_LICENSE$
**
****************************************************************************/
/*!
\page codec-big5.html
\title Big5 Text Codec
\ingroup codecs
The Big5 codec provides conversion to and from the Big5 encoding.
The code was originally contributed by Ming-Che Chuang
\<mingche@cobra.ee.ntu.edu.tw\> for the Big-5+ encoding, and was
included in Qt with the author's permission, and the grateful
thanks of the Qt team. (Note: Ming-Che's code is QPL'd, as
per an mail to qt-info@nokia.com.)
However, since Big-5+ was never formally approved, and was never
used by anyone, the Taiwan Free Software community and the Li18nux
Big5 Standard Subgroup agree that the de-facto standard Big5-ETen
(zh_TW.Big5 or zh_TW.TW-Big5) be used instead.
The Big5 is currently implemented as a pure subset of the
Big5-HKSCS codec, so more fine-tuning is needed to make it
identical to the standard Big5 mapping as determined by
Li18nux-Big5. See \l{http://www.autrijus.org/xml/} for the draft
Big5 (2002) standard.
James Su \<suzhe@turbolinux.com.cn\> \<suzhe@gnuchina.org\>
generated the Big5-HKSCS-to-Unicode tables with a very
space-efficient algorithm. He generously donated his code to glibc
in May 2002. Subsequently, James has kindly allowed Anthony Fok
\<anthony@thizlinux.com\> \<foka@debian.org\> to adapt the code
for Qt.
\sa{Text Codecs: Big5, Big5-HKSCS}
*/
/*!
\page codec-big5hkscs.html
\title Big5-HKSCS Text Codec
\ingroup codecs
The Big5-HKSCS codec provides conversion to and from the
Big5-HKSCS encoding.
The codec grew out of the QBig5Codec originally contributed by
Ming-Che Chuang \<mingche@cobra.ee.ntu.edu.tw\>. James Su
\<suzhe@turbolinux.com.cn\> \<suzhe@gnuchina.org\> and Anthony Fok
\<anthony@thizlinux.com\> \<foka@debian.org\> implemented HKSCS-1999
QBig5hkscsCodec for Qt-2.3.x, but it was too late in Qt development
schedule to be officially included in the Qt-2.3.x series.
Wu Yi \<wuyi@hancom.com\> ported the HKSCS-1999 QBig5hkscsCodec to
Qt-3.0.1 in March 2002.
With the advent of the new HKSCS-2001 standard, James Su
\<suzhe@turbolinux.com.cn\> \<suzhe@gnuchina.org\> generated the
Big5-HKSCS<->Unicode tables with a very space-efficient algorithm.
He generously donated his code to glibc in May 2002. Subsequently,
James has generously allowed Anthony Fok to adapt the code for
Qt-3.0.5.
Currently, the Big5-HKSCS tables are generated from the following
sources, and with the Euro character added:
\list 1
\li \l{http://www.microsoft.com/typography/unicode/950.txt}
\li \l{http://www.info.gov.hk/digital21/chi/hkscs/download/big5-iso.txt}
\li \l{http://www.info.gov.hk/digital21/chi/hkscs/download/big5cmp.txt}
\endlist
There may be more fine-tuning to the QBig5hkscsCodec to maximize its
compatibility with the standard Big5 (2002) mapping as determined by
Li18nux Big5 Standard Subgroup. See \l{http://www.autrijus.org/xml/}
for the various Big5 CharMapML tables.
\sa{Text Codecs: Big5, Big5-HKSCS}
*/
/*!
\page codec-eucjp.html
\title EUC-JP Text Codec
\ingroup codecs
The EUC-JP codec provides conversion to and from EUC-JP, the main
legacy encoding for Unix machines in Japan.
The environment variable \c UNICODEMAP_JP can be used to
fine-tune the JIS, Shift-JIS, and EUC-JP codecs. The \l{ISO
2022-JP (JIS) Text Codec} documentation describes how to use this
variable.
Most of the code here was written by Serika Kurusugawa,
a.k.a. Junji Takagi, and is included in Qt with the author's
permission and the grateful thanks of the Qt team.
\sa{Text Codec: EUC-JP}
*/
/*!
\page codec-euckr.html
\title EUC-KR Text Codec
\ingroup codecs
The EUC-KR codec provides conversion to and from EUC-KR, KR, the
main legacy encoding for Unix machines in Korea.
It was largely written by Mizi Research Inc. Here is the
copyright statement for the code as it was at the point of
contribution. The subsequent modifications are covered by
the usual copyright for Qt.
\sa{Text Codec: EUC-KR}
*/
/*!
\page codec-gbk.html
\title GBK Text Codec
\ingroup codecs
The GBK codec provides conversion to and from the Chinese
GB18030/GBK/GB2312 encoding.
GBK, formally the Chinese Internal Code Specification, is a commonly
used extension of GB 2312-80. Microsoft Windows uses it under the
name codepage 936.
GBK has been superseded by the new Chinese national standard
GB 18030-2000, which added a 4-byte encoding while remaining
compatible with GB2312 and GBK. The new GB 18030-2000 may be described
as a special encoding of Unicode 3.x and ISO-10646-1.
Special thanks to charset gurus Markus Scherer (IBM),
Dirk Meyer (Adobe Systems) and Ken Lunde (Adobe Systems) for publishing
an excellent GB 18030-2000 summary and specification on the Internet.
Some must-read documents are:
\list
\li \l{ftp://ftp.oreilly.com/pub/examples/nutshell/cjkv/pdf/GB18030_Summary.pdf}
\li \l{http://oss.software.ibm.com/cvs/icu/~checkout~/charset/source/gb18030/gb18030.html}
\li \l{http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-2000.xml}
\endlist
The GBK codec was contributed to Qt by
Justin Yu \<justiny@turbolinux.com.cn\> and
Sean Chen \<seanc@turbolinux.com.cn\>. They may also be reached at
Yu Mingjian \<yumj@sun.ihep.ac.cn\>, \<yumingjian@china.com\>
Chen Xiangyang \<chenxy@sun.ihep.ac.cn\>
The GB18030 codec Qt functions were contributed to Qt by
James Su \<suzhe@gnuchina.org\>, \<suzhe@turbolinux.com.cn\>
who pioneered much of GB18030 development on GNU/Linux systems.
The GB18030 codec was contributed to Qt by
Anthony Fok \<anthony@thizlinux.com\>, \<foka@debian.org\>
using a Perl script to generate C++ tables from gb-18030-2000.xml
while merging contributions from James Su, Justin Yu and Sean Chen.
A copy of the source Perl script is available at
\l{http://people.debian.org/~foka/gb18030/gen-qgb18030codec.pl}
\sa{Text Codec: GBK}
*/
/*!
\page codecs-jis.html
\title ISO 2022-JP (JIS) Text Codec
\ingroup codecs
The JIS codec provides conversion to and from ISO 2022-JP.
The environment variable \c UNICODEMAP_JP can be used to
fine-tune the JIS, Shift-JIS, and EUC-JP codecs. The mapping
names are as for the Japanese XML working group's
\l{XML Japanese Profile},
because it names and explains all the
widely used mappings. Here are brief descriptions, written by
Serika Kurusugawa:
\list
\li "unicode-0.9" or "unicode-0201" for Unicode style. This assumes
JISX0201 for 0x00-0x7f. (0.9 is a table version of jisx02xx mapping
used for Unicode 1.1.)
\li "unicode-ascii" This assumes US-ASCII for 0x00-0x7f; some
chars (JISX0208 0x2140 and JISX0212 0x2237) are different from
Unicode 1.1 to avoid conflict.
\li "open-19970715-0201" ("open-0201" for convenience) or
"jisx0221-1995" for JISX0221-JISX0201 style. JIS X 0221 is JIS
version of Unicode, but a few chars (0x5c, 0x7e, 0x2140, 0x216f,
0x2131) are different from Unicode 1.1. This is used when 0x5c is
treated as YEN SIGN.
\li "open-19970715-ascii" ("open-ascii" for convenience) for
JISX0221-ASCII style. This is used when 0x5c is treated as REVERSE
SOLIDUS.
\li "open-19970715-ms" ("open-ms" for convenience) or "cp932" for
Microsoft Windows style. Windows Code Page 932. Some chars (0x2140,
0x2141, 0x2142, 0x215d, 0x2171, 0x2172) are different from Unicode
1.1.
\li "jdk1.1.7" for Sun's JDK style. Same as Unicode 1.1, except that
JIS 0x2140 is mapped to UFF3C. Either ASCII or JISX0201 can be used
for 0x00-0x7f.
\endlist
In addition, the extensions "nec-vdc", "ibm-vdc" and "udc" are
supported.
For example, if you want to use Unicode style conversion but with
NEC's extension, set \c UNICODEMAP_JP to \c {unicode-0.9,
nec-vdc}. (You will probably need to quote that in a shell
command.)
Most of the code here was written by Serika Kurusugawa,
a.k.a. Junji Takagi, and is included in Qt with the author's
permission and the grateful thanks of the Qt team.
\sa{Text Codec: ISO 2022-JP (JIS)}
*/
/*!
\page codec-sjis.html
\title Shift-JIS Text Codec
\ingroup codecs
The Shift-JIS codec provides conversion to and from Shift-JIS, an
encoding of JIS X 0201 Latin, JIS X 0201 Kana and JIS X 0208.
The environment variable \c UNICODEMAP_JP can be used to
fine-tune the codec. The \l{ISO 2022-JP (JIS) Text Codec}
documentation describes how to use this variable.
Most of the code here was written by Serika Kurusugawa, a.k.a.
Junji Takagi, and is included in Qt with the author's permission
and the grateful thanks of the Qt team. Here is the
copyright statement for the code as it was at the point of
contribution. The subsequent modifications are covered by
the usual copyright for Qt.
\sa{Text Codec: Shift-JIS}
*/
/*!
\page codec-tscii.html
\title TSCII Text Codec
\ingroup codecs
The TSCII codec provides conversion to and from the Tamil TSCII
encoding.
TSCII, formally the Tamil Standard Code Information Interchange
specification, is a commonly used charset for Tamils. The
official page for the standard is \l{Information Technology in Tamil}
Tamil uses composed Unicode which might cause some
problems if you are using Unicode fonts instead of TSCII fonts.
A Tamil codepage layout can be found on \l {Tamil Script Code}.
Most of the code was written by Hans Petter Bieker and is
included in Qt with the author's permission and the grateful
thanks of the Qt team.
\sa{Text Codec: TSCII}
*/