How to Name String Identifiers in Translation Files

Dimitris Glezos

December 11, 2015


One of the first steps in the internationalization process is also one of the most critical: establishing a naming convention for string identifiers. Identifiers (commonly known as keys) act as placeholders for translated text, providing context for developers and translators. No matter how many languages your project supports, your string identifiers are one of the few components that remain consistent throughout the internationalization process.

Establishing a set of best practices at the start of the internationalization process makes it easier to add new strings and new languages as your project grows. It also helps you identify localized strings in code and maintain consistency when strings are added, modified, or removed.

This post looks to establish a set of best practices for naming string identifiers. While the exact method will vary based on your localization framework, these practices are general enough to apply to any internationalization project.

1. Use Namespaces

Namespaces are used in software to group related objects, methods, and identifiers. In localization, namespaces perform a similar function while providing additional information about the localized string. A website built using the Model-View-Controller pattern, for example, could use namespaces to specify where in the application a particular string appears. The login button in a home page’s login form could use the key “user.login_form.login,” where “user.login_form” defines the location of the string and “login” identifies the actual control. Namespaces can be as simple or as verbose as you like, as long as they identify where the string occurs in your project.
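As a rough sketch of how a namespaced key might be resolved, assume translations are stored as a nested dictionary in which each level is one namespace segment. The catalog contents and the `lookup` helper below are hypothetical and not tied to any particular framework.

```python
# Hypothetical catalog: each nesting level is one namespace segment.
CATALOG = {
    "user": {
        "login_form": {
            "login": "Log in",
            "password_label": "Password",
        },
    },
}


def lookup(catalog, key):
    """Resolve a dotted key such as 'user.login_form.login'."""
    node = catalog
    for segment in key.split("."):
        # Walk down one namespace level per segment.
        if not isinstance(node, dict) or segment not in node:
            raise KeyError(f"Unknown string identifier: {key}")
        node = node[segment]
    if not isinstance(node, str):
        # The key stopped at a namespace instead of an actual string.
        raise KeyError(f"Identifier names a namespace, not a string: {key}")
    return node


print(lookup(CATALOG, "user.login_form.login"))  # prints "Log in"
```

Many frameworks flatten the same structure into dotted keys in a single file. Either way, the namespace tells the reader where the string lives.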

2. Be Descriptive

A descriptive identifier accurately reflects the contents of the underlying string, making it easier for developers to recognize the purpose of the string in code. Consider a user login button with two possible identifiers: “user_login_submit,” or simply “submit.” While both represent the same idea, the first option conveys more information about the purpose of the string without being significantly longer.

3. Be Unique

Each identifier in your project should map to exactly one string. Using the same identifier for multiple strings can lead to unexpected issues such as translations repeating or appearing in the wrong locations. One possible exception to this rule is when using the same translation in two different locations, although this can be better handled by using Translation Memory to autofill text that’s already been translated.
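One way to catch accidental reuse early is a small check that scans every set of strings and reports identifiers that point to more than one text. This is only a sketch: the `find_conflicting_keys` helper and the sample entries are hypothetical, and it assumes each source has already been loaded into a plain dictionary of identifier to text.

```python
from collections import defaultdict


def find_conflicting_keys(sources):
    """Return {key: {text: [source names]}} for keys mapped to more than one text.

    `sources` maps a source name (for example a file name) to its
    {identifier: text} entries.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for source_name, entries in sources.items():
        for key, text in entries.items():
            seen[key][text].append(source_name)
    # Keep only identifiers that were given two or more different texts.
    return {
        key: dict(texts)
        for key, texts in seen.items()
        if len(texts) > 1
    }


# Hypothetical entries: two components both define "user_login_submit".
sources = {
    "login_page": {"user_login_submit": "Log in"},
    "signup_page": {"user_login_submit": "Create account", "signup_title": "Sign up"},
}
for key, texts in find_conflicting_keys(sources).items():
    print(key, "is used for:", texts)
```

Running a check like this as part of a build or review step flags the conflict before translators ever see it.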

There are two approaches to generating identifiers: creating a human-readable ID based on the original string, or creating a computer-generated identifier using a hashing algorithm. Take our user login example from the previous section. The ID “user_login_submit” is effective because it reflects the contents of the string. However, another developer working on another login component could accidentally use the same identifier for a completely different element. Human-readable identifiers are easier to recognize in code, but maintaining uniqueness becomes more difficult as projects get larger.

Hashing algorithms, on the other hand, create computer-generated identifiers that are significantly less likely to collide with one another. Hashed identifiers are often generated by combining multiple attributes of the source string, such as the string itself and a description of its context in the program. This way, barring a rare hash collision, two strings share the same identifier only if their source strings and contexts are exactly the same.
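The following sketch shows one way such an identifier could be derived. The `hashed_identifier` function, the choice of SHA-256, and the 16-character truncation are assumptions made for illustration. Real tools pick their own inputs and algorithms.

```python
import hashlib


def hashed_identifier(source_string, context=""):
    """Derive an identifier from the source string and its context."""
    # Length-prefix the source string so that ("ab", "c") and ("a", "bc")
    # do not produce the same payload.
    payload = f"{len(source_string)}:{source_string}{context}".encode("utf-8")
    # Truncated for readability; a shorter hash raises the collision risk.
    return hashlib.sha256(payload).hexdigest()[:16]


# The same source string in two hypothetical contexts gets two identifiers.
login_id = hashed_identifier("Submit", "user login form")
newsletter_id = hashed_identifier("Submit", "newsletter signup form")
print(login_id, newsletter_id)
```

Because the context is part of the input, the two “Submit” buttons can be translated independently even though their source text is identical.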

4. Carefully Consider Using the Source String

Some localization frameworks recommend using the untranslated string as the string identifier. For instance, gettext will let you use “Hello world!” as both the source string and as the identifier. While this approach may seem simpler, not all localization frameworks fully support its use. For instance, .resx files don’t allow spaces when naming string identifiers.

Source strings also limit your ability to modify your translations. What if the original text changes? Not only do you have to update your other translations, but you also have to change each instance of the identifier throughout your project. Also consider that not all languages use the same word for multiple contexts. For instance, the English word “run” could refer to running a marathon just as much as it could refer to executing code. In Spanish, however, you would have to differentiate between “correr” and “ejecutar.” By using “run” as a common identifier, you’ve limited yourself to a single option, even for languages that may use different words depending on context. In this case, you would either need to change the source string to create two different translations or risk using the wrong translation.
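To make the “run” example concrete, the sketch below uses two context-specific identifiers instead of the shared source string. The catalog layout, the key names, and the `translate` helper are hypothetical.

```python
# Hypothetical catalogs keyed by locale, then by context-specific identifier.
TRANSLATIONS = {
    "en": {"marathon.run": "Run", "script.run": "Run"},
    "es": {"marathon.run": "Correr", "script.run": "Ejecutar"},
}


def translate(locale, key):
    """Look up the text for an identifier in the given locale."""
    return TRANSLATIONS[locale][key]


# English shares one word; Spanish needs a different word per context.
print(translate("en", "marathon.run"), translate("en", "script.run"))
print(translate("es", "marathon.run"), translate("es", "script.run"))
```

With a single “run” identifier, the Spanish catalog would have room for only one of the two words.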

It is possible to use source strings successfully in a localization project. For frameworks where it’s the default option, such as gettext, the original string often acts as a fallback if the translated version can’t be found. It’s somewhat more readable than creating a custom identifier, and it removes a layer between the original text and translated text. Check your internationalization framework for the recommended approach to string identifiers.
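The fallback behavior can be seen with Python’s standard `gettext` module. This sketch assumes no compiled catalog exists for the hypothetical “myapp” domain under a hypothetical “locales” directory, so the lookup falls back to the source string.

```python
import gettext

# "myapp" and "locales" are hypothetical; with fallback=True, a missing
# catalog yields a NullTranslations object instead of raising an error.
translations = gettext.translation(
    "myapp", localedir="locales", languages=["es"], fallback=True
)

# With no Spanish catalog available, the source string is returned as-is.
print(translations.gettext("Hello world!"))
```

The user sees untranslated text, not a raw key or an error, which is one of the practical advantages of using the source string as the identifier.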

5. Stick to a Single Language

If it’s supported by your localization framework, your identifiers should be written in your source language. For instance, using Cyrillic characters for your identifiers when your code is written in English is asking for trouble. Sticking to your default language keeps your identifiers readable while eliminating inconsistencies in your code. For the same reason, avoid characters that your code might interpret as operators, such as single quotes, double quotes, or escape characters.
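A convention like this is easier to keep when it is checked automatically. The sketch below assumes one possible convention: lowercase ASCII letters, digits, and underscores, with dots separating namespaces. The pattern and the `is_valid_key` helper are illustrative and would need to be adapted to your own rules.

```python
import re

# Assumed convention: dot-separated segments, each starting with a lowercase
# ASCII letter and containing only lowercase letters, digits, or underscores.
KEY_PATTERN = re.compile(r"[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)*")


def is_valid_key(key):
    """Return True if the identifier follows the assumed naming convention."""
    return KEY_PATTERN.fullmatch(key) is not None


for candidate in ["user.login_form.login", "вход.кнопка", 'user."login"', "user..login"]:
    print(candidate, is_valid_key(candidate))
```

Only the first candidate passes. The others contain Cyrillic characters, quotes, or an empty namespace segment.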

Building the Foundation for Localization

As one of the first steps in internationalization, creating a naming convention for your identifiers can have a bigger impact on the success of your localization project than you might expect. A successful naming convention can streamline the localization process, but a poor convention can just as easily hinder it. Taking the time to establish a solid convention today will help prevent hurdles further along in the internationalization process.

