Localizing Cocoa


Date: Jan 8, 2010

Article is provided courtesy of Addison-Wesley Professional.


Cocoa provides a lot of features for making it easy to localize your code. David Chisnall, author of Cocoa Programming Developer's Handbook, explains how thinking about localization early on will save a lot of work in the long run.

Localization is a big topic. Most developers start by writing their applications for their own language, using their own conventions, and then worry about translation when the application is finished. This can create a lot of work for them later on.

Good programmers are lazy and try to avoid anything that will create extra work for them later. It's quite likely that you'll want people in other locales to use your application later. Even a country like the USA only accounts for a small fraction of the global market, and localized versions make a program a lot more attractive to end users. Thinking about localization early on will save a lot of work eventually.

Translating GUIs

Cocoa stores user interfaces in nib files. With newer versions of Apple's developer tools, these are created from xib files, which contain XML descriptions of how all of the objects in the user interface fit together.

Nib files are intended to be created by a user interface designer, who may or may not be a programmer. All of the strings for displaying in the user interface are stored in the nib. This is intentional, because the size of various user interface elements depends on the size of the text stored in them.

In a large organization, the user interface designers and the translators may be different people. To help with this, Apple provides an ibtool utility, which extracts strings from a nib or xib file and then lets you merge them back in later. Prior to 10.5, this was called nibtool. The "n" was dropped because it now works with xib files as well.

The first step in translating a Cocoa user interface is usually to extract the strings, translate them, and then merge them back in. You do that with these two commands:

$ ibtool --generate-strings-file es.lproj/MainMenu.strings en.lproj/MainMenu.nib
$ ibtool --strings-file es.lproj/MainMenu.strings --write es.lproj/MainMenu.nib en.lproj/MainMenu.nib

The first command extracts the strings from the English nib into es.lproj/MainMenu.strings. The second creates a Spanish nib by merging that file back into the English layout. Between these two commands, you should translate the strings file.

You then have a translated nib file, although some of the strings may be too big or too small for the user interface elements containing them. The next step is to resize everything and make it look tidy.

This isn't quite the end. You might need to change some of the images in your interfaces to correspond more closely with local associations, and you may want to reorder some of the elements in your interface based on the local reading order. For example, English is read from left to right and then top to bottom, so this is the order in which you should present interface elements that the user will interact with sequentially. In a locale with a right-to-left reading order, you might want to put them the other way around.

One of the things that is very easy to localize if you remember, and very hard if you forget, is time. The simplest problem is handling time zones. This is something that Apple's iCal does very badly. If you are traveling to a different time zone and have a meeting when you arrive, you will arrange it based on the local time at the destination and enter it in the calendar. When you arrive, you'll set the system clock to the new time zone, and find that the meeting time moved. A better solution would be to allow the user to define a time zone when entering a time, rather than forcing a specific locale.

A more complex problem is that of calendars. A calendar is a way of converting from an abstract time, like now, to something relative to a fixed reference point, like the 8th of January, 2010. In most of the western world, we use the Gregorian calendar, which has a complex system of leap years to stay roughly synchronized with the solar year, so the solstice falls on roughly the same date every year. In Islamic countries they tend to use a lunar calendar, which shifts by about 11 days every solar year. This made sense in more equatorial regions, where variation between seasons was small, but the moon was a constant fixture.

Other countries have their own calendars. OS X supports a small selection of the most common ones. A well-written Cocoa program cleanly separates dates used for storage from dates used for presentation.

When you want to store a date, you should use an NSDate object. This stores a time interval, in seconds, from a fixed reference date. NSDate uses the first of January, 2001 (GMT) as its reference date, and can also convert to and from the UNIX epoch (the first of January, 1970). As such, it is independent of time zones and calendars. Before you display it to the user, you can convert it using the NSCalendar class. This can create an NSDateComponents object, encapsulating information like the day of the week, month, and year, from an NSDate.
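The separation between a stored NSDate and its presentation can be sketched roughly as follows. This is a minimal illustration, not code from the book: DescribeDate is a hypothetical helper, and it assumes ARC and the NSCalendarUnit constant names from more recent SDKs (older SDKs spell them NSYearCalendarUnit and so on).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Cocoa): breaks a stored NSDate into
// year, month, and day using the calendar and time zone supplied by the
// caller. The NSDate itself carries no calendar or time zone.
static NSString *DescribeDate(NSDate *date, NSCalendar *calendar, NSTimeZone *zone)
{
    // Work on a copy so that the caller's calendar is left unchanged.
    NSCalendar *cal = [calendar copy];
    cal.timeZone = zone;

    NSDateComponents *parts =
        [cal components:(NSCalendarUnitYear | NSCalendarUnitMonth | NSCalendarUnitDay)
               fromDate:date];

    // A fixed layout keeps the sketch short; a real application would
    // hand the date to an NSDateFormatter for display.
    return [NSString stringWithFormat:@"%ld-%02ld-%02ld",
            (long)parts.year, (long)parts.month, (long)parts.day];
}
```

Passing the same NSDate with a different calendar or a different time zone gives a different description, while the stored value does not change.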

In older code, you may come across the NSCalendarDate class. This was an NSDate subclass that incorporated information about the Gregorian calendar. It was not well suited to localization. It is deprecated on Snow Leopard and its use hasn't been recommended since 10.4.

It is incredibly tempting to hard-code strings in Cocoa. Creating a literal string is very easy. There are very few cases where this is actually a sensible thing to do. If you are using a string as the name of a key in a dictionary, or the name of an exception, then you should declare it like this:

extern NSString *kSomeKey;

And then define it in an implementation file like this:

NSString *kSomeKey = @"kSomeKey";

This means that everywhere you use kSomeKey you will be using the same instance. This saves a small amount of memory but, more importantly, means that you can use pointer comparison to compare the two strings, which is a lot faster than comparing their values. This is such a common idiom that EtoileFoundation provides some macros that let you declare constant strings even more easily.

For strings that are presented to the user, Cocoa provides some functions that make localization easier. The NSLocalizedString() function looks up a string at run time. These strings are loaded from a .strings file (Localizable.strings by default) in the bundle's .lproj directory for the user's language. This is a simplified version of a property list, which just contains a set of key-value pairs. If the strings file is not available, or does not contain the key, this function returns the input string. This makes it trivial to use while developing: you just pass it the version for your own locale and it is almost equivalent to using hard-coded strings.

You can automatically generate a strings file for a project with the genstrings utility. This will create a strings file containing all of the strings that your code tries to access. The first argument to NSLocalizedString() will be used as a key in the strings table and the second argument will be used as a comment above the string. This is used to provide some metadata to translators. For example, if a string is a single word, like "Print," then you might want to indicate that you mean the verb, not the noun, as they might be completely different words in the target language. Your application would look quite silly if it had a file menu item which said the local language version of print-as-in-printed-artwork.
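As a rough sketch of the "Print" case described above, the call might look like this. PrintMenuTitle is a hypothetical helper and the comment text is only an example of what you might tell a translator.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper returning the title for a File menu item.
static NSString *PrintMenuTitle(void)
{
    // First argument: the key, which doubles as the development-language
    // text. Second argument: a note for the translator. genstrings copies
    // the note into the generated strings file as a comment above the
    // "Print" entry.
    return NSLocalizedString(@"Print",
                             @"File menu item. The verb, as in 'print this document'.");
}
```

When no strings file supplies a translation for the key, the function returns @"Print" unchanged, which is why this is safe to use from the first day of development.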

To simplify creation of localized strings, GNUstep provides this macro:

#define _(X) NSLocalizedString (X, @"")

This lets you define localized strings trivially, with just three more characters to type than a non-localized string. In your code, you'd just use _(@"Some string") instead of @"Some string".

Looking up the localized version of something can be relatively expensive, so it's quite tempting to cache the result. The best advice is to never do this. You only ever need to localize things that are displayed to the user, and most of the time the cost of drawing something on the screen is a lot more than the cost of getting the localized version.

Occasionally, you may discover that the localization is a bottleneck. In this case, you can try caching localized resources. This can be problematic if the user then switches locales.

Most things will be updated automatically, but there are two exceptions. The first is anything that you've cached, and the second is nib files. Newly instantiated nib files will be selected from the new locale, but existing ones will come from the old locale. This can be confusing for the user.

You can spot when the locale has changed by registering to observe the NSCurrentLocaleDidChangeNotification notification from the NSLocale class.
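If you do decide to cache localized strings, one way to keep the cache honest is to empty it when that notification arrives. The following is a minimal sketch under a few assumptions: LocalizedTitleCache and its methods are hypothetical names, the code is compiled with ARC, and strings come from the main bundle's default table.

```objc
#import <Foundation/Foundation.h>

// Hypothetical cache of localized strings that empties itself whenever
// the user's locale changes.
@interface LocalizedTitleCache : NSObject
- (NSString *)titleForKey:(NSString *)key;
- (NSUInteger)count;
@end

@implementation LocalizedTitleCache
{
    NSMutableDictionary *_titles;
}

- (instancetype)init
{
    if ((self = [super init])) {
        _titles = [NSMutableDictionary new];
        // Ask to be told when the current locale changes.
        [[NSNotificationCenter defaultCenter]
            addObserver:self
               selector:@selector(localeDidChange:)
                   name:NSCurrentLocaleDidChangeNotification
                 object:nil];
    }
    return self;
}

- (void)dealloc
{
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

- (NSString *)titleForKey:(NSString *)key
{
    NSString *title = _titles[key];
    if (title == nil) {
        // With a nil value, the bundle returns the key itself when no
        // translation is found.
        title = [[NSBundle mainBundle] localizedStringForKey:key
                                                       value:nil
                                                       table:nil];
        _titles[key] = title;
    }
    return title;
}

- (NSUInteger)count
{
    return _titles.count;
}

- (void)localeDidChange:(NSNotification *)note
{
    // Everything cached so far may now be stale.
    [_titles removeAllObjects];
}
@end
```

This only covers the cached strings. Nib files that are already loaded still need to be dealt with separately.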

Some things vary between locales, but should not always be localized. One example of poor localization is the OmniOutliner application. This allows you to select a type for a column, and one of the types is currency. This uses the currency symbol for the current locale. If you produce an outline of costs in the UK, this will be pounds sterling. If you then send it to colleagues in Japan and the USA, they will see it in Yen and US Dollars respectively. For the reader in Japan, the amounts will be roughly two orders of magnitude away from the ones that the original author created. If you send it to someone in Zimbabwe, it will be even worse.

The calendar example from earlier is another case where selecting the current locale is far from ideal. It prevents the user from easily setting a meeting in a different time zone, which makes iCal a problematic application for anyone who travels on business.

These are two very different cases. The first would be solved by allowing the user to select a currency, rather than only permitting the current one, and then using NSLocale to get the default presentation attributes for that currency.
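One possible shape for the currency fix is to store an ISO 4217 currency code next to each amount and give both to an NSNumberFormatter, so that the reader's locale decides the layout but not the currency. This is a sketch only: FormatAmount is a hypothetical helper, and it assumes ARC and property syntax for the formatter.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: formats an amount in the currency chosen by the
// author, using the reader's locale only for presentation details such
// as digit grouping and symbol placement.
static NSString *FormatAmount(NSDecimalNumber *amount,
                              NSString *currencyCode,
                              NSLocale *displayLocale)
{
    NSNumberFormatter *formatter = [NSNumberFormatter new];
    formatter.locale = displayLocale;
    formatter.numberStyle = NSNumberFormatterCurrencyStyle;
    // Override the locale's default currency with the stored one.
    formatter.currencyCode = currencyCode;
    return [formatter stringFromNumber:amount];
}
```

An outline written in pounds then stays in pounds when a reader in the USA or Japan opens it, because the currency travels with the data instead of being inferred from the current locale.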

The second one is addressed by presenting the data in the current locale, but allowing a different one to be specified when entering it. If you have a conference call scheduled with someone in a different time zone, they will often give you the time in their time zone. A calendar application should let you enter this time complete with the time zone, but then display it in your time zone. iCal almost gets this right. It separates the storage and display formats, storing the data in a time-zone agnostic way and only using the locale information for display, but it fails by assuming that the user's current locale is the only useful input format. Specifying other display formats would also be useful for iCal. If you're going to be in a different country between certain dates, then it would be useful to be able to specify a different time zone for displaying events on those dates.
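The idea of entering a time in one time zone and displaying it in another can be sketched with two NSDateFormatter instances. ParseMeeting and DisplayMeeting are hypothetical helpers, and the fixed yyyy-MM-dd HH:mm format is an assumption made to keep the example predictable. A real application would normally use the localized date and time styles for display.

```objc
#import <Foundation/Foundation.h>

// Builds a formatter with a fixed, locale-independent layout in the
// given time zone. Hypothetical helper for this sketch.
static NSDateFormatter *MeetingFormatter(NSTimeZone *zone)
{
    NSDateFormatter *formatter = [NSDateFormatter new];
    formatter.locale = [NSLocale localeWithLocaleIdentifier:@"en_US_POSIX"];
    formatter.dateFormat = @"yyyy-MM-dd HH:mm";
    formatter.timeZone = zone;
    return formatter;
}

// Interprets the text as a wall-clock time in the zone where the
// meeting was arranged and returns a zone-independent NSDate.
static NSDate *ParseMeeting(NSString *text, NSTimeZone *entryZone)
{
    return [MeetingFormatter(entryZone) dateFromString:text];
}

// Renders the stored NSDate in whichever zone the user wants to see.
static NSString *DisplayMeeting(NSDate *date, NSTimeZone *displayZone)
{
    return [MeetingFormatter(displayZone) stringFromDate:date];
}
```

The stored NSDate is the same whichever zone was used to enter it, so changing the system time zone on arrival changes only how the meeting is displayed, not when it happens.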

When it comes to localization, it is important to think about use cases before blindly applying the techniques in this article.
