Unicode {ISOcodes} | R Documentation |
Unicode Data
Description
Basic Unicode data, including the Universal Character Set (UCS) code
points as defined by the ISO/IEC 10646 International Standard.
Usage
data("Unicode")
Format
A data frame with the following variables:
Code
:- a character vector with the UCS/Unicode hex
codes
Name
:- a character vector with the Unicode character
names
General_Category
:- a factor providing a basic
classification into various character types.
Canonical_Combining_Class
:- a factor giving the classes
used for the Canonical Ordering Algorithm in the Unicode
Standard.
Bidi_Class
:- a factor giving the categories required by
the Bidirectional Behavior Algorithm in the Unicode Standard.
Decomposition
:- a character vector giving the
decomposition types and mappings published with the character
names in the Unicode Standard.
Numeric_Value_Decimal_Digit
:- a character vector giving
the numeric (integer) value of the character if it has the decimal
digit property.
Numeric_Value_Digit
:- a character vector giving the
numeric (integer) value of the character if it has the digit
property.
Numeric_Value
:- a character vector giving the numeric
(integer or rational) value of the character if it has the numeric
property.
Bidi_Mirrored
:- a factor with levels "Y" and "N" indicating whether the
character has been identified as a “mirrored” character in
bidirectional text or not.
Unicode_1_Name
:- a character vector with the old name
as published in Unicode 1.0.
ISO_Comment
:- a character vector with the ISO 10646
comment.
Simple_Uppercase_Mapping
:- a character vector with the
(hex code of the) simple uppercase mapping. Omitted if the
uppercase is the same as the code point itself.
Simple_Lowercase_Mapping
:- a character vector with the
(hex code of the) simple lowercase mapping. Omitted if the
lowercase is the same as the code point itself.
Simple_Titlecase_Mapping
:- a character vector with the
(hex code of the) simple titlecase mapping. Omitted if the
titlecase is the same as the code point itself.
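As a rough illustration of how the character-valued columns described above might be used, the following sketch converts hex codes to characters, applies the simple uppercase mapping, and parses rational numeric values. It works on a small hand-built data frame rather than the real data set, so that it runs without the package. The three rows, the helper parse_numeric_value(), the added columns Char and Upper, and the assumption that omitted values are stored as "" or NA are all illustrative and not taken from the package.

```r
# Hypothetical three-row excerpt that mimics some of the documented columns.
# With the package installed, data("Unicode") would supply the real data frame.
uc <- data.frame(
  Code = c("0061", "00BD", "0041"),
  Name = c("LATIN SMALL LETTER A", "VULGAR FRACTION ONE HALF",
           "LATIN CAPITAL LETTER A"),
  Numeric_Value = c("", "1/2", ""),
  Simple_Uppercase_Mapping = c("0041", "", ""),
  stringsAsFactors = FALSE
)

# Hex code -> integer code point -> one-character string.
code_point <- strtoi(uc$Code, base = 16L)
uc$Char <- intToUtf8(code_point, multiple = TRUE)

# The mapping is omitted when the uppercase equals the code point itself,
# so fall back to Code in that case. Treating "" and NA as "omitted" is an
# assumption about how the data frame stores missing mappings.
mapping <- uc$Simple_Uppercase_Mapping
omitted <- is.na(mapping) | !nzchar(mapping)
upper_code <- ifelse(omitted, uc$Code, mapping)
uc$Upper <- intToUtf8(strtoi(upper_code, base = 16L), multiple = TRUE)

# Numeric_Value is a character vector holding integers or rationals such as
# "1/2". This hypothetical helper turns them into doubles, NA when absent.
parse_numeric_value <- function(x) {
  parts <- strsplit(x, "/", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) == 0L) return(NA_real_)   # empty string: no numeric value
    p <- as.numeric(p)
    if (length(p) == 2L) p[1] / p[2] else p[1]
  }, numeric(1))
}
num <- parse_numeric_value(uc$Numeric_Value)

print(uc[, c("Code", "Char", "Upper")])
print(num)
```

Keeping the codes as character vectors and converting on demand with strtoi() avoids losing leading zeros, which is presumably why the data frame stores them as text.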
Source
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
References
http://en.wikipedia.org/wiki/Unicode,
http://en.wikipedia.org/wiki/ISO_10646;
http://www.unicode.org/Public/UNIDATA/UCD.html for details on
the Unicode data sets.
[Package ISOcodes version 0.1-1 Index]