Chilkat .NET
Components for Email, Zip Compression, Encryption, XML, S/MIME, and Character
Encoding Conversion
Character encoding conversion class. Convert text or HTML documents from one charset to another. All major character encodings are supported, including utf-8, ucs-2 (unicode), iso-8859-*, windows-*, shift-jis, iso-2022-jp/kr/cn, euc-jp/kr/cn, gb2312, big5, tis-620, and more.
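As a rough illustration of what a charset conversion involves, the sketch below uses Python's standard library rather than the Chilkat component. It decodes bytes from a source charset and re-encodes them in a destination charset. The function name `convert_bytes` and its parameters are hypothetical; they only mirror the roles of the FromCharset and ToCharset properties.

```python
# Assumption: a standard-library stand-in for the idea of charset conversion.
# This is not the Chilkat API.

def convert_bytes(data: bytes, from_charset: str, to_charset: str) -> bytes:
    """Decode data using from_charset, then re-encode it using to_charset."""
    text = data.decode(from_charset)   # raises UnicodeDecodeError on invalid input
    return text.encode(to_charset)     # raises UnicodeEncodeError if unrepresentable


latin1_bytes = b"caf\xe9"              # "café" encoded as iso-8859-1
utf8_bytes = convert_bytes(latin1_bytes, "iso-8859-1", "utf-8")
print(utf8_bytes)                      # b'caf\xc3\xa9'
```

The single byte 0xE9 in iso-8859-1 becomes the two-byte sequence 0xC3 0xA9 in utf-8, which is why the source charset must be known (or detected) before converting.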
Properties | Description | ||
| FromCharset | Character encoding of source data,
such as "iso-8859-1", "utf-8", or "euc-kr".
|
|||
| LastError | XML error information for the last failed method call.
|
|||
| LastInputAsHex | The input data for the most recent character encoding conversion
in hexadecimal format. Useful for debugging.
|
|||
| LastInputAsQP | The input data for the most recent character encoding conversion
in quoted-printable format. Useful for debugging.
|
|||
| LastMethodFailed | True if the last method call failed.
|
|||
| LastOutputAsHex | The output data for the most recent character encoding conversion
in hexadecimal format. Useful for debugging.
|
|||
| LastOutputAsQP | The output data for the most recent character encoding conversion
in quoted-printable format. Useful for debugging.
|
|||
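| SaveLast | If true, the component saves
the input and output data for the last conversion, which can then be retrieved
in hexadecimal or quoted-printable format through the LastInputAsHex, LastInputAsQP,
LastOutputAsHex, and LastOutputAsQP properties.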
|
|||
| ToCharset | Character encoding of destination,
such as "iso-8859-1", "utf-8", or "euc-kr".
|
|||
| Version | The version of the component, such as "9.0.0".
|
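To show why hexadecimal and quoted-printable views are useful for debugging, the sketch below renders the same bytes both ways with Python's standard library. It is not the Chilkat API, and the exact formatting that the LastInputAsHex and LastInputAsQP properties produce (for example, letter case or line wrapping) is not specified here. The helper names `as_hex` and `as_qp` are hypothetical.

```python
# Assumption: standard-library illustration of the two debug views.
# The component's own output format may differ.
import quopri


def as_hex(data: bytes) -> str:
    """Render every byte as two hexadecimal digits."""
    return data.hex()


def as_qp(data: bytes) -> str:
    """Render bytes as quoted-printable: printable ASCII stays readable,
    other bytes appear as =XX escapes."""
    return quopri.encodestring(data).decode("ascii")


sample = b"caf\xe9"        # "café" in iso-8859-1
print(as_hex(sample))      # 636166e9
print(as_qp(sample))       # caf=E9
```

The hexadecimal view shows every byte exactly, while the quoted-printable view keeps the ASCII portion readable and highlights only the non-ASCII bytes, which is often enough to spot a wrongly assumed source charset.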
Methods | Description | ||
| ConvertData | Converts text data from one charset to another. Returns the converted data. The text is converted according to the FromCharset and ToCharset properties.
|
|||
| ConvertFile | Convert a file from one character encoding to another.
The file is converted according to the FromCharset and ToCharset properties.
Returns true if the conversion is successful.
|
|||
| ConvertFromUnicode | Convert Unicode wide-character data to a multibyte character encoding
such as iso-8859-1, utf-8, shift-jis, big5, gb2312, tis-620, etc.
|
|||
| ConvertHtml | Converts HTML text to another charset. The "from" charset is determined by parsing out the charset information found in a META-tag, or if that is not present, it uses the FromCharset property. The HTML is converted to the charset specified by the ToCharset property. The META-tag is updated, or added if it previously did not exist. The converted HTML is returned. If the HTML could not be converted, a NULL is returned.
|
|||
| ConvertHtmlFile | Converts an HTML file to another charset. The "from" charset is determined by parsing out the charset information found in a META-tag, or if that is not present, it uses the FromCharset property. The HTML is converted to the charset specified by the ToCharset property. The META-tag is updated, or added if it previously did not exist. Returns true if successful, otherwise false.
|
|||
| ConvertToUnicode | Convert multibyte character data to a Unicode wide character string.
Multibyte character encodings include iso-8859-*, euc-*, iso-2022-*, windows-125*, utf-8,
and anything else except for ucs-2 or Unicode.
|
|||
| DetectCharset | Tries to detect the charset by examining the character data. This method is more accurate when more data is available to examine, but is never 100% accurate.
|
|||
| DownloadHtml | A convenience method for downloading the HTML from a URL. It does not download the externally referenced parts of the Web page; it downloads only the HTML text of the page. A NULL is returned if the Web page could not be downloaded.
|
|||
| From_BIG5 | Sets the FromCharset property to Chinese Big5
|
|||
| From_EUC_CN | Sets the FromCharset property to euc-cn
|
|||
| From_EUC_JP | Sets the FromCharset property to euc-jp
|
|||
| From_EUC_KR | Sets the FromCharset property to euc-kr
|
|||
| From_GB2312 | Sets the FromCharset property to Chinese GB2312
|
|||
| From_ISO_2022_JP | Sets the FromCharset property to iso-2022-jp
|
|||
| From_ISO_2022_KR | Sets the FromCharset property to iso-2022-kr
|
|||
| From_ISO_8859_1 | Sets the FromCharset property to iso-8859-1
|
|||
| From_ISO_8859_2 | Sets the FromCharset property to iso-8859-2
|
|||
| From_ISO_8859_3 | Sets the FromCharset property to iso-8859-3
|
|||
| From_ISO_8859_4 | Sets the FromCharset property to iso-8859-4
|
|||
| From_ISO_8859_5 | Sets the FromCharset property to iso-8859-5
|
|||
| From_ISO_8859_6 | Sets the FromCharset property to iso-8859-6
|
|||
| From_ISO_8859_7 | Sets the FromCharset property to iso-8859-7
|
|||
| From_ISO_8859_8 | Sets the FromCharset property to iso-8859-8
|
|||
| From_ISO_8859_9 | Sets the FromCharset property to iso-8859-9
|
|||
| From_KOI8_R | Sets the FromCharset property to koi8-r
|
|||
| From_KOI8_U | Sets the FromCharset property to koi8-u
|
|||
| From_SHIFT_JIS | Sets the FromCharset property to shift-jis (sjis)
|
|||
| From_US_ASCII | Sets the FromCharset property to us-ascii
|
|||
| From_UTF_8 | Sets the FromCharset property to utf-8
|
|||
| From_Windows_1250 | Sets the FromCharset property to Windows-1250
|
|||
| From_Windows_1251 | Sets the FromCharset property to Windows-1251
|
|||
| From_Windows_1252 | Sets the FromCharset property to Windows-1252
|
|||
| From_Windows_1253 | Sets the FromCharset property to Windows-1253
|
|||
| From_Windows_1254 | Sets the FromCharset property to Windows-1254
|
|||
| From_Windows_1255 | Sets the FromCharset property to Windows-1255
|
|||
| From_Windows_1256 | Sets the FromCharset property to Windows-1256
|
|||
| From_Windows_1257 | Sets the FromCharset property to Windows-1257
|
|||
| From_Windows_1258 | Sets the FromCharset property to Windows-1258
|
|||
| GetHtmlCharset | Parses HTML text and returns the charset, such as "iso-8859-1", found in the
META-tag that specifies the document's charset. The string "unknown" is returned if
the charset was not specified. This method does not try to detect the charset
by examining the character data (as DetectCharset does) but simply looks for the META
tag containing the charset information.
|
|||
| GetHtmlFileCharset | Parses an HTML file and returns the charset, such as "iso-8859-1", found in the
META-tag that specifies the document's charset. The string "unknown" is returned if
the charset was not specified. This method does not try to detect the charset
by examining the character data (as DetectCharset does) but simply looks for the META
tag containing the charset information.
|
|||
| ReadFile | Convenience method for reading the entire contents of a file into memory.
|
|||
| SaveLastError | Save the XML error log for the last failed method call.
|
|||
| To_BIG5 | Sets the ToCharset property to Chinese Big5
|
|||
| To_EUC_CN | Sets the ToCharset property to euc-cn
|
|||
| To_EUC_JP | Sets the ToCharset property to euc-jp
|
|||
| To_EUC_KR | Sets the ToCharset property to euc-kr
|
|||
| To_GB2312 | Sets the ToCharset property to Chinese GB2312
|
|||
| To_ISO_2022_JP | Sets the ToCharset property to iso-2022-jp
|
|||
| To_ISO_2022_KR | Sets the ToCharset property to iso-2022-kr
|
|||
| To_ISO_8859_1 | Sets the ToCharset property to iso-8859-1
|
|||
| To_ISO_8859_2 | Sets the ToCharset property to iso-8859-2
|
|||
| To_ISO_8859_3 | Sets the ToCharset property to iso-8859-3
|
|||
| To_ISO_8859_4 | Sets the ToCharset property to iso-8859-4
|
|||
| To_ISO_8859_5 | Sets the ToCharset property to iso-8859-5
|
|||
| To_ISO_8859_6 | Sets the ToCharset property to iso-8859-6
|
|||
| To_ISO_8859_7 | Sets the ToCharset property to iso-8859-7
|
|||
| To_ISO_8859_8 | Sets the ToCharset property to iso-8859-8
|
|||
| To_ISO_8859_9 | Sets the ToCharset property to iso-8859-9
|
|||
| To_KOI8_R | Sets the ToCharset property to koi8-r
|
|||
| To_KOI8_U | Sets the ToCharset property to koi8-u
|
|||
| To_SHIFT_JIS | Sets the ToCharset property to shift-jis (sjis)
|
|||
| To_US_ASCII | Sets the ToCharset property to us-ascii
|
|||
| To_UTF_8 | Sets the ToCharset property to utf-8
|
|||
| To_Windows_1250 | Sets the ToCharset property to Windows-1250
|
|||
| To_Windows_1251 | Sets the ToCharset property to Windows-1251
|
|||
| To_Windows_1252 | Sets the ToCharset property to Windows-1252
|
|||
| To_Windows_1253 | Sets the ToCharset property to Windows-1253
|
|||
| To_Windows_1254 | Sets the ToCharset property to Windows-1254
|
|||
| To_Windows_1255 | Sets the ToCharset property to Windows-1255
|
|||
| To_Windows_1256 | Sets the ToCharset property to Windows-1256
|
|||
| To_Windows_1257 | Sets the ToCharset property to Windows-1257
|
|||
| To_Windows_1258 | Sets the ToCharset property to Windows-1258
|
|||
| UnlockComponent | Unlocks the component so that its full functionality can be used.
Returns true if the unlock code is valid.
|
|||
| VerifyData | Verify that a memory buffer contains only characters that are valid for the specified character set. Returns true if valid.
|
|||
| VerifyFile | Verify that a file contains only characters that are valid for the specified character set. Returns true if the file is valid.
|
|||
| WriteFile | Convenience function for writing text to a file.
|
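Two of the behaviors described in the table above, looking up the charset declared in an HTML META tag and checking that a buffer is valid for a given charset, can be sketched with Python's standard library. This is an illustration of the ideas only, not the Chilkat API. The names `html_meta_charset` and `is_valid_for_charset` are hypothetical, and the regular expression is a simplification that does not cover every form of HTML markup.

```python
# Assumption: simplified standard-library sketches, not the component's
# actual parsing or validation logic.
import re

# Matches both <meta charset="..."> and
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    r"""<meta\b[^>]*?\bcharset\s*=\s*["']?\s*([A-Za-z0-9._:-]+)""",
    re.IGNORECASE,
)


def html_meta_charset(html: str) -> str:
    """Return the charset named in a META tag, or "unknown" if none is declared."""
    match = _META_CHARSET.search(html)
    return match.group(1).lower() if match else "unknown"


def is_valid_for_charset(data: bytes, charset: str) -> bool:
    """Return True if data decodes cleanly in charset.
    An unrecognized charset name raises LookupError."""
    try:
        data.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


page = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"></head></html>'
print(html_meta_charset(page))                     # iso-8859-1
print(html_meta_charset("<html></html>"))          # unknown
print(is_valid_for_charset(b"caf\xe9", "utf-8"))   # False
```

The last call returns False because the lone byte 0xE9 is not a complete utf-8 sequence, even though the same bytes are valid iso-8859-1. A declared META charset and a validity check therefore complement each other: the first says what the document claims, the second whether the bytes are consistent with that claim.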
Copyright 2002, Chilkat Software, Inc. All Rights Reserved.