Summary and additional reference

Enabling your product for the world market makes economic sense, and the steps above show that the process is relatively straightforward. Now here's the quiz we mentioned in the introduction:

True or False: The majority of IBM's worldwide software sales revenue is within the United States.

False. Indeed, more than 50% of IBM software revenue comes from outside the United States.

Fortunately, those developers with products based on the Eclipse platform benefit from having ready translations of the base product. All that is left is to follow the clear steps outlined in this article to open your Eclipse-based product to a worldwide market!

Eclipse-specific (non-Java) translatable resources

Here is a summary of the previously presented list of translatable resources, along with a brief explanation of how they are processed.

Table 2. Eclipse-specific (non-Java) translatable resources

Translated items

Required or optional

High-level steps

Plug-in files


  1. Bundle the plug-in *.properties files and send for translation, including plugin.properties
  2. Convert the translated *.properties files
  3. Insert the converted files into the plug-in directory of the NL fragment

Plug-in "About" file


  1. Bundle the plug-in about.html and send for translation (displayed from Help > About > Plugin Info... > More Info...)
  2. The about.html file can reference external Web sites, so you may need to update it to pass a language indicator or to refer to a country-specific Web site

Online help


  1. Bundle the *.html files and send for translation.
  2. Bundle the *.properties files of your online help and send for translation, including doc.properties
  3. Convert the translated *.properties files
  4. Zip the *.html files into doc.zip, then copy it, the *.xml wiring, and the *.properties files into the online help of the NL fragment



To localize the splash screen, create locale subdirectories under eclipse/splash. The names of these directories follow the standard Java locale-naming conventions. For example, the platform looks up the splash screen for the US English locale (en_US) in the following order:

  1. eclipse\splash\en_US\<image file>
  2. eclipse\splash\en\<image file>
  3. eclipse\splash\<image file>

    where <image file> is the name of the splash file (for example, splash_full.bmp or splash_full.xpm).
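The lookup order above can be sketched in a few lines of Java. This is only an illustration of the fallback rule described in the text, not the platform's actual implementation; the class name SplashLookup and the method candidates are hypothetical, and forward slashes are used in place of the backslashes shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SplashLookup {
    /** Returns candidate splash paths, most specific locale first. */
    static List<String> candidates(Locale locale, String imageFile) {
        List<String> paths = new ArrayList<>();
        String lang = locale.getLanguage();
        String country = locale.getCountry();
        if (!lang.isEmpty() && !country.isEmpty()) {
            // language plus country, for example en_US
            paths.add("eclipse/splash/" + lang + "_" + country + "/" + imageFile);
        }
        if (!lang.isEmpty()) {
            // language only, for example en
            paths.add("eclipse/splash/" + lang + "/" + imageFile);
        }
        // locale-neutral default
        paths.add("eclipse/splash/" + imageFile);
        return paths;
    }

    public static void main(String[] args) {
        for (String path : candidates(Locale.US, "splash_full.bmp")) {
            System.out.println(path);
        }
    }
}
```

For Locale.US this prints the en_US path, then the en path, then the default path, matching the three steps listed above.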

Product configuration*


  1. eclipse\install\install.properties (indicates dominant application)
  2. eclipse\configurations\configuration_n\install.xml
  3. eclipse\dominant_application_plugin_directory\platform.ini (immutable)
  4. eclipse\dominant_application_plugin_directory\platform.properties (immutable)

Plug-in product files*


  1. Translate eclipse\dominant_application_plugin_directory\product.properties
  2. Translate eclipse\dominant_application_plugin_directory\welcome.xml
  3. Save your translated welcome.xml in UTF-8 format (using Notepad, for example)
  4. Insert the translated welcome.xml under the NL fragment



  1. Bundle your license and notice files (license.html, notice.html, cpl-v05.html, and about.html in the eclipse directory) and send for translation
  2. Place them in the root directory and the plug-in directories of your NL fragment

For more information on translatable resources, see the article on the eclipse.org Web site "Creating Product Branding" (see Resources) by Greg Adams.
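Table 2 does not spell out what the "convert the translated *.properties files" step involves. A common reading, and the assumption made here, is that non-ASCII characters are rewritten as Unicode escapes, because Properties.load(InputStream) reads its input as ISO-8859-1. JDK releases before 9 shipped the native2ascii tool for this purpose. The following is a minimal sketch of the same idea; PropertiesEscaper is a hypothetical name, not part of Eclipse or the JDK.

```java
class PropertiesEscaper {
    /** Replaces every UTF-16 code unit above 0x7F with a backslash-u escape. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                // four lowercase hex digits, zero padded
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The compiler turns the escape in this literal into a real e acute,
        // so the method receives a genuine non-ASCII character.
        String translated = "menu.edit=\u00e9dition";
        System.out.println(escapeNonAscii(translated));
    }
}
```

The program prints the property line with the e acute written as the six-character sequence backslash, u, 0, 0, e, 9, which is safe to store in an ISO-8859-1 properties file.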

Unicode codepoints of common accented Latin characters

Table 3. Unicode codepoints of common accented Latin characters



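Codepoint  Character  Name
\u00e0     à          a grave
\u00e1     á          a acute
\u00c0     À          A grave
\u00c1     Á          A acute
\u00c2     Â          A circumflex
\u00e2     â          a circumflex
\u00c3     Ã          A tilde
\u00e4     ä          a dieresis
\u00c4     Ä          A dieresis
\u00e8     è          e grave
\u00c8     È          E grave
\u00e9     é          e acute
\u00c9     É          E acute
\u00ea     ê          e circumflex
\u00eb     ë          e dieresis
\u00cb     Ë          E dieresis
\u00ca     Ê          E circumflex
\u00ef     ï          i dieresis
\u00ec     ì          i grave
\u00ed     í          i acute
\u00cc     Ì          I grave
\u00cd     Í          I acute
\u00ee     î          i circumflex
\u00ce     Î          I circumflex
\u00f6     ö          o dieresis
\u00d6     Ö          O dieresis
\u00e3     ã          a tilde
\u00f4     ô          o circumflex
\u00d4     Ô          O circumflex
\u00f2     ò          o grave
\u00d2     Ò          O grave
\u00f3     ó          o acute
\u00d3     Ó          O acute
\u00f5     õ          o tilde
\u00d5     Õ          O tilde
\u00f1     ñ          n tilde
\u00d1     Ñ          N tilde
\u00f9     ù          u grave
\u00d9     Ù          U grave
\u00fa     ú          u acute
\u00da     Ú          U acute
\u00fb     û          u circumflex
\u00db     Û          U circumflex
\u00fc     ü          u dieresis
\u00dc     Ü          U dieresis
\u00df     ß          s sharp

Special symbols

Codepoint  Character  Name
\u00ba     º          masculine ordinal indicator
\u00a7     §          section sign
\u00aa     ª          feminine ordinal indicator
\u00ac     ¬          not sign
\u00b9     ¹          1 superscript
\u00b2     ²          2 superscript
\u00b3     ³          3 superscript
\u00a3     £          pound sign
\u00a2     ¢          cents sign
\u00b0     °          degree sign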
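To see how codepoints such as those for u dieresis and s sharp are consumed, the sketch below feeds an escaped property line to java.util.Properties, which decodes backslash-u escapes when it loads text. The class name EscapedPropertiesDemo and the key greeting are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class EscapedPropertiesDemo {
    /** Loads properties text and returns the value stored under key. */
    static String load(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Doubled backslashes keep the escapes literal in the Java string,
        // so Properties (not the compiler) decodes them.
        String text = "greeting=Gr\\u00fc\\u00df Gott\n";
        String value = load(text, "greeting");
        // Compare against the same characters written as compiler-level escapes.
        System.out.println(value.equals("Gr\u00fc\u00df Gott")); // prints true
    }
}
```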



Codepoint

Characters can be represented by one or more bytes of information. Codepoints are the numeric values, usually written in hexadecimal, assigned to each graphic character.


Codepage

A codepage is a specification of codepoints for each graphic character in a set, or in a collection of graphic character sets. Within a given codepage, a codepoint can have only one specific meaning. You can display the active codepage on the Windows® operating system with the CHCP command (only one codepage is active at any given moment).


Encoding

The codepage associated with a given piece of data. A file is said to be "encoded" in a given codepage; for example, on a US-English machine Notepad has traditionally saved its data in the ANSI codepage (Windows-1252) by default, whereas codepage 437 is the OEM codepage that CHCP reports in a console window. The Save As dialog allows the user to select several other encodings, Unicode and UTF-8 most notable among them.
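A short Java sketch makes the effect of an encoding concrete: the same character produces different bytes depending on the charset used to write it. Only charsets that every Java platform is required to support are used here; EncodingDemo and hex are illustrative names.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    /** Formats bytes as space-separated uppercase hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e acute
        System.out.println(hex(text.getBytes(StandardCharsets.ISO_8859_1))); // E9
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8)));      // C3 A9
    }
}
```

A reader that assumes the wrong encoding will misinterpret these bytes, which is why the encoding of a file has to be known or agreed upon.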

Internationalization (sometimes abbreviated "I18N")

Internationalization refers to the process of developing programs without prior knowledge of the language, cultural data, or character encoding schemes they are expected to handle. In system terms, it refers to the provision of interfaces that enable internationalized programs to change their behavior at run time for specific language operation.

Single-Byte Coded Character Set (SBCS)

In a single-byte coded character set, a one-byte codepoint represents each character in the set. Typically, SBCS is used to represent the characters of the English language, the European languages, the Cyrillic languages, the Arabic language, and the Hebrew language, to name a few.

Double-Byte Coded Character Set (DBCS)

In a double-byte coded character set (DBCS), a two-byte codepoint represents each character in the set. Languages that are ideographic in nature, such as Japanese, Chinese, and Korean, have more characters than can be represented internally by 256 code points and thus require double-byte coded character sets.

Localization (sometimes abbreviated "L10N")

Localization refers to the process of establishing information within a computer system specific to each supported language, cultural data, and coded character set combination.

Mixed-Byte Character Set (MBCS)

A mixed-byte coded character set is a set of characters containing both single-byte characters and double-byte characters. In an MBCS, each byte of data must be examined to determine whether it is a single-byte character or the first byte of a double-byte character. If the byte is in a certain range (greater than X'80', for example), then it is the first byte of a double-byte character.
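The byte-by-byte examination described above can be sketched as follows. The lead-byte test uses the "greater than X'80'" example from the text as a stand-in; real mixed-byte codepages define their own lead-byte ranges, so treat isLeadByte as a hypothetical rule rather than the definition of any particular codepage.

```java
class MixedByteScanner {
    /** Hypothetical lead-byte rule from the example in the text: above X'80'. */
    static boolean isLeadByte(byte b) {
        return (b & 0xFF) > 0x80;
    }

    /** Counts characters in a mixed-byte buffer under the rule above. */
    static int countCharacters(byte[] data) {
        int count = 0;
        int i = 0;
        while (i < data.length) {
            // A lead byte consumes its trail byte as well.
            i += isLeadByte(data[i]) ? 2 : 1;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'A', one double-byte character (0x88 0x9F), then 'B'
        byte[] data = { 0x41, (byte) 0x88, (byte) 0x9F, 0x42 };
        System.out.println(countCharacters(data)); // prints 3
    }
}
```

Four bytes yield three characters, which is why byte length and character length cannot be used interchangeably with mixed-byte data.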


NLS

National Language Support.


Unicode

Directly from http://www.unicode.org/: "Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language."


While Java's text manipulation classes are Unicode-centric, data stored outside your program's control often is not. Java programmers must account for the encoding of such data by performing local codepage-to-Unicode conversions where necessary.
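One way to perform such a conversion is to name the charset explicitly when turning bytes into characters, rather than relying on the platform default. The sketch below decodes the same bytes twice to show what goes wrong when the assumed encoding does not match the data; CodepageToUnicode and readLine are illustrative names, and the byte array stands in for a file on disk.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CodepageToUnicode {
    /** Decodes the first line of data using the given charset. */
    static String readLine(byte[] data, Charset charset) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "nino" with n tilde in the third position, stored as ISO-8859-1
        byte[] latin1 = { 'n', 'i', (byte) 0xF1, 'o' };
        String right = readLine(latin1, StandardCharsets.ISO_8859_1);
        String wrong = readLine(latin1, StandardCharsets.UTF_8);
        System.out.println(right.equals("ni\u00f1o")); // prints true
        System.out.println(wrong.equals("ni\u00f1o")); // prints false
    }
}
```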
