Java Unicode 4.1.0 Constants (UCC)

Summary

Have you ever wondered what the Unicode value of a special character is? Or have you come across 'magic' character codes in some code? Then you might want to use these Java constants: UCC, the UniCode Constants, are derived from the Unicode Character Database. For every character there is a constant with its official name and the corresponding char or int value. All characters of Unicode 4.1.0 up to U+1FFFF are covered, except CJK ideographs.

Installation

To install UCC, simply extract ucc-*.zip and add ucc.jar to your classpath. UCC is JDK 1.1 compliant and does not depend on any other libraries. To use characters beyond U+FFFF (supplementary code points, which do not fit into a single char), you need Java 1.5 or newer.
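Why Java 1.5 matters for the higher characters can be seen with the plain JDK alone. The following is a minimal sketch that does not use UCC at all; the class name SupplementaryDemo and the local constant are made up for illustration, and only standard java.lang.Character methods are called.

```java
public class SupplementaryDemo {
    // Illustrative local constant, not taken from UCC:
    // U+1010E (AEGEAN NUMBER EIGHT) lies above U+FFFF, so it needs an int.
    static final int AEGEAN_NUMBER_EIGHT = 0x1010E;

    // Reports whether a code point is supplementary and how many
    // UTF-16 code units (Java chars) are needed to represent it.
    static String describe(int codePoint) {
        char[] units = Character.toChars(codePoint);
        StringBuilder sb = new StringBuilder();
        sb.append("supplementary=").append(Character.isSupplementaryCodePoint(codePoint));
        sb.append(" units=").append(units.length);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0039));              // supplementary=false units=1
        System.out.println(describe(AEGEAN_NUMBER_EIGHT)); // supplementary=true units=2
    }
}
```

Character.toChars and Character.isSupplementaryCodePoint were introduced in Java 1.5, which is why older JDKs can only work with the constants that fit into a char.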

Usage

For each Unicode block, e.g. Basic Latin (U+0000..U+007F) or Aegean Numbers (U+10100..U+1013F), there is a separate interface named after the block that defines all code points assigned in it. First you need to import the blocks.

   import unicode.AegeanNumbers;
   import unicode.BasicLatin;
   import unicode.NumberForms;

Then you can use the constants in your code:

   System.out.println("count=" + Character.charCount(BasicLatin.DIGIT_NINE)); // -> 1
   System.out.println("value=" + Character.getNumericValue(BasicLatin.DIGIT_NINE)); // -> 9

   System.out.println("count=" + Character.charCount(NumberForms.ROMAN_NUMERAL_FIVE_HUNDRED)); // -> 1
   System.out.println("value=" + Character.getNumericValue(NumberForms.ROMAN_NUMERAL_FIVE_HUNDRED)); // -> 500

   System.out.println("count=" + Character.charCount(AegeanNumbers.NUMBER_EIGHT)); // -> 2
   System.out.println("value=" + Character.getNumericValue(AegeanNumbers.NUMBER_EIGHT)); // -> 8
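Constants like these can also be combined into a String. The sketch below is self-contained and therefore uses two stand-in interfaces instead of the real UCC ones; the names BasicLatinSketch, AegeanNumbersSketch and join are hypothetical, and the assumption that UCC declares small values as char and larger ones as int is based only on the description above.

```java
public class CodePointStrings {
    // Hypothetical stand-ins for UCC block interfaces;
    // the real unicode.BasicLatin and unicode.AegeanNumbers may differ.
    interface BasicLatinSketch {
        char DIGIT_NINE = 0x0039;
    }

    interface AegeanNumbersSketch {
        int NUMBER_EIGHT = 0x1010E;
    }

    // Builds a String from any mix of char and int code points.
    // appendCodePoint emits a surrogate pair where one is needed.
    static String join(int... codePoints) {
        StringBuilder sb = new StringBuilder();
        for (int cp : codePoints) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = join(BasicLatinSketch.DIGIT_NINE, AegeanNumbersSketch.NUMBER_EIGHT);
        System.out.println("chars=" + s.length());                            // chars=3
        System.out.println("codePoints=" + s.codePointCount(0, s.length())); // codePoints=2
    }
}
```

Note that String.length() counts UTF-16 code units, so a string holding supplementary characters is longer than its number of code points.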

License Agreement

UCC is open source, licensed under the GPL.

Version History