Parlez-Vous Klingon?

Paper presented at the 1997 Pacific Northwest Software Quality Conference

Testing Internationalized Software with Artificial Locales

Introduction

yIvoq 'ach yI'ol
("Trust, but Verify.") [1]

Many companies would like their software to run in multiple languages so that they can market it around the world. I18N [2] is a software methodology that makes it possible to create internationalized software at a reasonable cost by separating the executable code from any user interface components. To adapt I18N software to a new language, only the user interface needs to be translated; the executable code remains unchanged.

Writing I18N software can be straightforward. However, there are many linguistic, technical and psychological obstacles to performing good testing on that software. To exercise I18N code thoroughly, translated user messages are needed, but these translations are typically not available until late in the development cycle. And if the testers do not understand the language, the translated text may be intimidating and frustrating to use.

To overcome these obstacles, we have created artificial user interface locales such as Klingon(tm) and Swedish Chef(tm) for testing I18N code. This paper explains the reasoning behind the locales and how they provide better testing of our software early in the development lifecycle. It also describes a specific application of these techniques to the generation of UNIX message catalogs.

Writing Internationalized Code

qo'mey poSmoH Hol
("Language opens worlds.") [3]

In UNIX-based internationalization, strings that are to be displayed to the user are separated from the executable code and placed into files called message catalogs. Each string in the message catalog is indexed by a set and message number, and the I18N code accesses a string by this index. (Some non-UNIX systems use application resource files instead of message catalogs; the only difference is that application resources are accessed by the resource name rather than a numeric index.)

Instead of having hardcoded strings in a program such as

		printf("hello, world\n");

I18N stores the string in a message catalog as follows:

		$set 1
		5 hello, world\n

The code retrieves this string by accessing message number 5 in message set 1. This arrangement is very useful for translations. To translate our "hello, world" program into French, a translator merely changes the message string to be

		$set 1
		5 bonjour, le monde\n

No changes need to be made to the I18N code to support the French version of "hello, world".
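On the code side, the lookup might be sketched as follows, assuming the POSIX catgets() interface from <nl_types.h>. The catalog name "hello" and the function names are hypothetical; the set and message numbers are the ones used in the catalog above.

```c
#include <locale.h>
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical catalog name, used only for illustration. */
#define HELLO_CATALOG "hello"

/* Look up message 5 in set 1. If the catalog is missing or has no such
 * message, catgets() returns the English default passed as the last
 * argument. */
const char *hello_text(nl_catd catd)
{
    return catgets(catd, 1, 5, "hello, world\n");
}

/* Open the catalog for the user's locale, print the greeting, clean up. */
void say_hello(void)
{
    setlocale(LC_ALL, "");
    nl_catd catd = catopen(HELLO_CATALOG, NL_CAT_LOCALE);
    fputs(hello_text(catd), stdout);
    if (catd != (nl_catd)-1)
        catclose(catd);
}
```

Passing the English text as the default keeps the program usable when no catalog is installed, which is one common convention rather than a requirement.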

As a real-world example, the English and Japanese Text Editor Help Menus in Figure 1 below were both generated by the same software; only the message catalog was changed.

Figure 1: English and Japanese Help Menus

Problems with Testing Internationalized Code

Hoch 'ebmey tIjon
("Capture all opportunities.")

Here is the typical process for developing internationalized applications:

1. The development team writes an application using the default English locale.
2. The development team distributes the English locale message catalog to localizers.
3. The localizers translate the English locale messages into other languages.
4. The development team receives message catalogs back from the localizers.
5. The development team integrates and tests the application with the new message catalogs.
6. The localizers test the localized application to verify the translation.

Since message translation (step 3) is a lengthy procedure, the development team may have to wait several weeks before testing the internationalized parts of their code. Precious test and debugging time is lost waiting for the translated catalog to come back from the localizers.

There are several common mistakes people make when writing I18N code. One mistake is to neglect to leave enough room in the message buffer for the translated message string. Some languages, such as German, require more space than English does for the same message. The length of the translated message string cannot be known until runtime, though a good rule of thumb is to allow for 60% text growth during translation. If the translated message string is still too long for the application to handle, the application should deal with it gracefully, for instance, by truncating the translated string to a reasonable length.
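The sizing rule and the graceful fallback could look something like the sketch below. The function names are ours, and the 60% figure is simply the rule of thumb quoted above, not a guarantee.

```c
#include <assert.h>
#include <string.h>

/* Bytes to reserve for a translation of an English message that is
 * english_len bytes long: the original length, plus 60% growth (rounded
 * up), plus the terminating NUL. */
size_t translated_capacity(size_t english_len)
{
    return english_len + (english_len * 6 + 9) / 10 + 1;
}

/* Copy msg into buf, truncating rather than overflowing.
 * Returns 1 if the message was truncated, 0 if it fit. */
int copy_message(char *buf, size_t bufsize, const char *msg)
{
    assert(buf != NULL && bufsize > 0 && msg != NULL);
    size_t len = strlen(msg);
    int truncated = len >= bufsize;
    if (truncated)
        len = bufsize - 1;
    memcpy(buf, msg, len);
    buf[len] = '\0';
    return truncated;
}
```

Note that this truncates by bytes. In a multibyte codeset the cut could land in the middle of a character, so a production version would need to truncate on a character boundary.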

A second common mistake is failing to accommodate languages, such as Japanese, that require two bytes to store a single character. Most Western languages require only one byte of storage per character, but several Far Eastern languages have large character sets and need more than one byte per character. If the code does not handle double-byte characters gracefully, the results can range from corrupted characters to a crash of the application.
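To make the byte-versus-character distinction concrete, here is a small sketch that counts characters in a string. It assumes the Shift-JIS encoding, in which bytes 0x81-0x9F and 0xE0-0xFC introduce a two-byte character; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* In Shift-JIS, bytes 0x81-0x9F and 0xE0-0xFC start a two-byte character. */
int sjis_is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Count characters (not bytes) in a NUL-terminated Shift-JIS string. */
size_t sjis_char_count(const char *s)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;
    while (*p != '\0') {
        /* Skip the trail byte too, unless the string ends mid-character. */
        if (sjis_is_lead_byte(*p) && p[1] != '\0')
            p += 2;
        else
            p += 1;
        count++;
    }
    return count;
}
```

Code that assumes strlen() returns a character count, or that steps through a string one byte at a time, is the kind of code this class of bug tends to live in.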

Testing for these kinds of errors is difficult because translated message catalogs are necessary. As noted above, it may be many weeks before an actual translated message catalog is available. Even when the message catalogs are available, many people, lacking familiarity with the target languages, are reluctant to test the application in locales such as Japanese, which are the most likely to expose problems.

Also, there is no guarantee that the strings in an official catalog will expose defects in the code. If, for instance, the code does not handle long strings gracefully, we will not see that defect unless the message catalog we are using has very long strings.

We need a fast, simple way to generate message strings that are easy to use in testing and are likely to expose defects.

Our solution has been to construct artificial language message strings that mimic the kinds of problems we see in actual message strings. This instant creation of a translated message string against which to test our software provides us with quick feedback about how our application will perform with actual localized components.

Creating an Artificial Locale Message Catalog

tlhutlhmeH HIq ngeb qaq law' bII qaq puS
("Drinking fake ale is better than drinking water.")

Translating the messages in a message catalog is more difficult than translating plain text, regardless of whether it is a human or a machine doing the translation. The problem lies in the fact that a message catalog, in addition to having words that must be translated, also has a very definite structure that must be followed.

At the very simplest level, all strings in a message catalog are indexed by a set number and a message number. Any valid translation of a message catalog must include all the same set and message numbers as the original.

Many applications use menu mnemonics. To preserve the functionality of the menus, these mnemonics should be kept as single ASCII characters associated with a menu label.

Format conversion specifiers such as "%s" or "%d" in the message strings must be copied into the new catalog, unchanged except for the possible addition of positional parameters. [4]
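As an illustration of positional parameters, the sketch below formats the same two arguments with two different format strings. It assumes a printf family that supports the POSIX "%n$" notation (this is not part of ISO C), and the function name and sample messages are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Format a two-argument message. In a real program fmt would come from
 * the message catalog; it must consume two string arguments. */
int format_not_found(char *buf, size_t bufsize, const char *fmt,
                     const char *file, const char *dir)
{
    assert(buf != NULL && bufsize > 0 && fmt != NULL);
    return snprintf(buf, bufsize, fmt, file, dir);
}
```

With the format "%1$s not found in %2$s" the file name appears first; a localizer who needs the directory first can write "%2$s: %1$s missing" without any change to the calling code.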

Control characters such as "\n" and "\t" must be copied into the new catalog unchanged.

All other characters in the message string can be "translated" in any way we deem useful.

Our approach to creating a new message catalog is straightforward. We pass each English message string through a filter that separates the characters we wish to translate from those, such as control characters and format specifiers, that we do not. We then run one of several filters on the characters we wish to translate. We then construct the new message string by merging the newly translated characters with the untranslated characters from the earlier string.
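The separate-translate-merge approach can be sketched in a single pass, as below. This is not the filter we used, only a minimal illustration: the names are hypothetical, it recognizes printf-style specifiers and two-character backslash escapes only, and it does not attempt to protect menu mnemonics.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A "translation" filter maps one translatable character to another. */
typedef int (*char_filter)(int c);

/* Copy msg to out, passing ordinary characters through filter while
 * copying conversion specifiers and backslash escapes unchanged.
 * Returns 0 on success, -1 if out is too small. */
int translate_message(const char *msg, char *out, size_t outsize,
                      char_filter filter)
{
    assert(msg != NULL && out != NULL && outsize > 0);
    size_t o = 0;
    const char *p = msg;
    while (*p != '\0') {
        const char *end = p + 1;      /* one past the chunk to copy */
        int protect = 0;
        if (*p == '\\' && p[1] != '\0') {
            end = p + 2;              /* escape such as \n or \t */
            protect = 1;
        } else if (*p == '%') {
            end += strspn(end, "0123456789$-+ #.*"); /* position, flags, width */
            end += strspn(end, "hlLjzt");            /* length modifiers */
            if (*end != '\0')
                end++;                /* conversion character, or second '%' */
            protect = 1;
        }
        for (; p < end; p++) {
            if (o + 1 >= outsize)
                return -1;
            out[o++] = protect ? *p : (char)filter((unsigned char)*p);
        }
    }
    out[o] = '\0';
    return 0;
}

/* Example filter, used here only to make the effect visible. */
int shout(int c)
{
    return toupper(c);
}
```

Run over the catalog text "File %s not found\n" with the shout filter, this yields "FILE %s NOT FOUND\n": the words change, while the specifier and the escape survive intact. Any of the locale filters described in the following sections could be plugged in the same way, provided they are adapted to the one-character-in, one-character-out interface or the interface is widened to strings.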

The Swedish Chef Locale

Dal pagh jagh
("No enemy is boring.")

To verify that the software handles long strings correctly, we create a locale where the strings are all significantly longer than their English counterparts. A simple implementation might be to append a static string, such as the alphabet, to each string in the catalog. Preserving the original string in this way makes it easy for the person looking at the screen to determine what the original string said.
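The simple padding implementation might be sketched as follows. The function name is hypothetical, and placing the padding before a trailing "\n" escape (so that the newline stays last) is an assumption of this sketch rather than something the approach requires.

```c
#include <assert.h>
#include <string.h>

#define PADDING " abcdefghijklmnopqrstuvwxyz"

/* Write msg plus PADDING into out. If msg ends in the two-character
 * escape "\n", the padding goes before it so the newline stays last.
 * Returns 0 on success, -1 if out is too small. */
int pad_message(const char *msg, char *out, size_t outsize)
{
    assert(msg != NULL && out != NULL);
    size_t len = strlen(msg);
    size_t pad = strlen(PADDING);
    size_t tail = (len >= 2 && strcmp(msg + len - 2, "\\n") == 0) ? 2 : 0;
    if (len + pad + 1 > outsize)
        return -1;
    memcpy(out, msg, len - tail);                 /* original text */
    memcpy(out + len - tail, PADDING, pad);       /* padding */
    memcpy(out + len - tail + pad, msg + len - tail, tail + 1); /* escape, NUL */
    return 0;
}
```

A label such as "Help" becomes "Help abcdefghijklmnopqrstuvwxyz", which is still readable but long enough to stress a tightly sized buffer or widget.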

We chose a slightly more whimsical approach. We found a program called "Encheferizer" [5] on the Internet that converts English words into a semblance of the speech patterns of the Swedish Chef(tm) from the classic Muppet(tm) television show. (To ensure that the "translated" strings would all be long, we modified the Encheferizer code slightly to append "Bork! Bork! Bork!" to each string.)

Here is the Swedish Chef version of the Help Menu:

Figure 2: Swedish Chef Help Menu

As you can see, the modified "Encheferizer" expands the size of the strings in this menu to roughly double their original size. Other "translation" filters can be substituted if a greater effect is desired.

The Wide Locale

mataHmeH maSachnIS
("To survive, we must expand.")

A common place for programs to fail is in handling codesets that require more than one byte to represent a character. Japanese SJIS (pronounced "Shift-JIS"), for instance, has a very large character set and requires two bytes to store most of its characters. Double-byte characters often cause trouble for an I18N program, and it is useful to verify handling of double-byte characters early in development.

There are two main problems with using the actual Japanese SJIS locale to test the handling of double-byte characters. First, the translation may take weeks. And second, even after the translation is completed, most testers don't feel comfortable running the application in Japanese.

We answer these problems by filtering the strings we wish to translate through a small C language program that maps ASCII characters into recognizable double-byte counterparts that are intelligible to English readers.

Figure 3: Wide C Help Menu

It is important to note that the strings in the menu, such as "O v e r v i e w", are not made up of ASCII characters; rather, they are composed of double-byte characters that look like their ASCII counterparts. (For accuracy's sake, the actual characters are JIS X 0208 Latin, using the SJIS encoding.) The actual representation of the string "O v e r v i e w (v)" in the message catalog is

\202n\202\226\202\205\202\222\202\226\202\211\202\205\202\227(v)
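A mapping of this kind might be sketched as below. This is an illustration, not our original program: the function name is hypothetical, and the byte offsets are those of the JIS X 0208 Latin letters and digits in the SJIS encoding, which agree with the octal representation shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Convert ASCII letters and digits in msg to their two-byte JIS X 0208
 * counterparts in Shift-JIS encoding; copy every other byte unchanged.
 * Returns 0 on success, -1 if out is too small. */
int widen_message(const char *msg, char *out, size_t outsize)
{
    assert(msg != NULL && out != NULL && outsize > 0);
    size_t o = 0;
    for (const char *p = msg; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        unsigned char trail = 0;
        if (c >= 'A' && c <= 'Z')
            trail = (unsigned char)(0x60 + (c - 'A'));
        else if (c >= 'a' && c <= 'z')
            trail = (unsigned char)(0x81 + (c - 'a'));
        else if (c >= '0' && c <= '9')
            trail = (unsigned char)(0x4F + (c - '0'));

        size_t need = trail ? 2 : 1;
        if (o + need >= outsize)      /* leave room for the NUL */
            return -1;
        if (trail) {
            out[o++] = (char)0x82;    /* lead byte shared by all of these */
            out[o++] = (char)trail;
        } else {
            out[o++] = (char)c;
        }
    }
    out[o] = '\0';
    return 0;
}
```

The function widens every letter it sees, so it should be applied only to the translatable portion of a message; format specifiers and mnemonics such as "(v)" must be separated out first and merged back in afterwards.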

The Wide locale is useful for finding places where the code does not handle double-byte characters. It is also useful for locating strings that are hardcoded into the software. Since we have translated all the strings in the message catalog into double-byte format, any strings that look "thin" must be coming from somewhere else. The most common source for these strings is that they are hardcoded into the software, or that the software is failing to access the message catalog correctly.

The Troublesome Wide Locale

bIQapqu'meH tar DaSop 'e' DatIvnIS
("To really succeed, you must enjoy eating poison.")

A further twist on the Wide Locale scheme is to prepend to each label a string of double-byte characters which are known to be difficult to process because their second bytes correspond to ASCII delimiter characters such as backslash ('\') or double quote ('"'). These types of characters are present in only a few codesets, such as ja_JP.SJIS, but can be the source of much difficulty. Typically, they will be misinterpreted if the system is not configured properly to interpret them, causing undetected data corruption.
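The sketch below shows how such a character can mislead code that scans bytes rather than characters. It uses the SJIS katakana "so", encoded as the bytes 0x83 0x5C, whose second byte is the ASCII backslash; this is a well-known example and not necessarily one of the characters in our test string. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the first backslash byte, ignoring character
 * boundaries. Returns -1 if there is none. */
long find_backslash_naive(const char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++)
        if (s[i] == '\\')
            return (long)i;
    return -1;
}

/* Same search, but skipping over two-byte Shift-JIS characters so that
 * a trail byte of 0x5C is not mistaken for a backslash. */
long find_backslash_sjis(const char *s)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t i = 0;
    while (p[i] != '\0') {
        int lead = (p[i] >= 0x81 && p[i] <= 0x9F) ||
                   (p[i] >= 0xE0 && p[i] <= 0xFC);
        if (lead && p[i + 1] != '\0') {
            i += 2;                   /* skip lead and trail byte */
            continue;
        }
        if (p[i] == '\\')
            return (long)i;
        i++;
    }
    return -1;
}
```

For a label consisting of the bytes 0x83 0x5C followed by "Open\n" in catalog notation, the naive search reports a backslash at offset 1, inside the katakana character, while the SJIS-aware search finds the real escape at offset 6.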

Here is an example of a menu with such a string prepended to each entry. The string is a list of Japanese "katakana" characters which were selected to be troublesome. In this case, the strings are displayed correctly because the system was configured correctly when the menu was compiled. When the same menu was built on an incorrectly configured system, the data corruption was so severe that the application would not even start.

Figure 4: Troublesome Wide C Help Menu

The "troublesome" type of test is useful for finding a class of problems that are not common, but are quite difficult to detect by other means. It is not as convenient as the other examples because non-English characters are involved. However, it generally needs to be applied only once, to one component, to reveal whether there are system-wide configuration problems.

The Klingon Locale

tlhIngan Hol Dajatlh'a'
("Do you speak Klingon?")

One persistent complaint we hear is that people feel that they cannot test in a language that they do not understand. Since our test approach claims that it is possible to construct many tests that are "locale-independent" [6], we decided to prove the claim by choosing a language that would be equally difficult for almost anyone in the world.

Once we had found the Klingon version of Hamlet [7] on the Internet, it was simple work to construct a word list for the Klingon language. The Klingon filter substitutes random Klingon words for the sections of the message that we wish to translate. To make the test even tougher, we convert the single-byte characters into double-byte characters as we did with the "Wide C" locale. (We also investigated using the official Klingon Klinzhai characters, but no version of the fonts was available for UNIX.)
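A random-word substitution of this kind might be sketched as follows. The word list here is a tiny stand-in drawn from the quotations in this paper, the function name is hypothetical, and the small linear congruential generator is used only so that a given seed always produces the same output. The sketch omits the conversion to double-byte characters.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Tiny stand-in word list; the real list came from the Klingon Hamlet. */
static const char *const klingon_words[] = {
    "Qapla'", "jagh", "Hol", "tlhIngan", "yIvoq", "SuvwI'"
};
#define KLINGON_WORD_COUNT (sizeof klingon_words / sizeof klingon_words[0])

/* Replace each run of letters in text with a pseudo-randomly chosen
 * word, copying spaces and punctuation unchanged.
 * Returns 0 on success, -1 if out is too small. */
int klingonize(const char *text, char *out, size_t outsize,
               unsigned long seed)
{
    assert(text != NULL && out != NULL && outsize > 0);
    size_t o = 0;
    const char *p = text;
    while (*p != '\0') {
        if (isalpha((unsigned char)*p)) {
            while (isalpha((unsigned char)*p))
                p++;                          /* skip the English word */
            seed = seed * 1103515245UL + 12345UL;
            const char *w = klingon_words[(seed >> 16) % KLINGON_WORD_COUNT];
            size_t n = strlen(w);
            if (o + n >= outsize)
                return -1;
            memcpy(out + o, w, n);
            o += n;
        } else {
            if (o + 1 >= outsize)
                return -1;
            out[o++] = *p++;
        }
    }
    out[o] = '\0';
    return 0;
}
```

Because letters are treated as translatable wherever they occur, this function is meant to run on text from which format specifiers and escapes have already been separated. A repeatable seed is a convenience for testing: a failure seen with one catalog can be reproduced by regenerating exactly the same catalog.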

Figure 5: Klingon Help Menu

This "random word" replacement method was easy to implement, and it avoided the awkward issue of whether Klingons would even provide a Help menu in their applications.

Advantages of Artificial Locales

reH Suvrup tlhIngan SuvwI'
("A warrior is always prepared to fight.")

Artificial locale message catalogs are useful for early detection of the types of bugs that appear in internationalized software. They eliminate serious technical and psychological barriers to adequate testing of I18N applications by providing messages that are easy to use and effective at exposing bugs.

Disadvantages of Artificial Locales

Dujeychugh jagh nIv yItuHQo'
("There is nothing shameful in falling before a superior enemy.")

The "translation" filters mentioned in this paper are general-purpose utilities, and do not take into account special cases of strings that should not be translated. An example of such a string would be a single-byte string that gets sent to a printer interface. Some special arrangement would have to be made to preserve that string's functionality.

These are not real translations, and testing in these locales does not eliminate the need for testing with the actual message strings when they become available. There are always bugs that crop up in the real world that are difficult or impossible to predict with a simulation.

Conclusion

Qapla'
("Success!")

Actual translations of message catalogs arrive too late in the development process to be useful in eradicating many internationalization bugs. Since we can anticipate some common bugs in I18N code, it is a good idea to create artificial message strings so that we can test for and eliminate these bugs as soon as possible. It is an added benefit to make the "translated" message strings as useful, simple and pleasant to work with as possible.

References

reH tay' ghot tuqDaj je
("One is always of his tribe.")

[1] Marc Okrand, The Klingon Way: A Warrior's Guide, Pocket Books, 1996 (Unless otherwise noted, all Klingon quotations come from this source.)

[2] An excellent introduction to the I18N methodology is Thomas McFarland's X Windows on the World, Prentice-Hall, Inc, 1996

[3] Motto of the Klingon Language Institute, http://www.kli.org/

[4] Positional parameters are an extension of the format conversion specifier that allows localizers to change the order in which arguments appear in the output. Positional parameters are discussed in [2], pp 67-69

[5] John Hagerman, "Ze sveedish chef", http://www.almac.co.uk/chef/chef/chef.html

[6] Harry Robinson and Sankar L. Chakrabarti, Testing CDE In Sixty Languages: One Test Is All It Takes, Proceedings of the 14th International Conference on Testing Computer Software, 1997

[7] Nick Nicholas & Andrew Strader, Hamlet, Prince of Denmark (The Restored Klingon Version), 1996, ftp://ftp.kli.org/pub/Text/KSRP/hamlet

Harry Robinson is currently a software test engineer with Microsoft's Intelligent Search Group

Arne Thormodsen is an R&D engineer at Hewlett Packard's Workstation Technology Center
