When carrying out localization for your application, there are many things to consider. Among these are code changes, choosing a localization team to collaborate with, and which local markets to enter.
In this post, we will focus on local markets while sharing language usage data for both source and target languages. As a reminder, the source language represents the default language of the project.
First, some TL;DR excerpts for people in a hurry:
- English is the preferred source language across projects. This is true regardless of whether we are talking about for-profit or open-source projects.
- French, Spanish, and German are solid local markets across all projects.
- European languages are the core focus of for-profit projects.
- It seems that Brazil and the Netherlands are contributing to new for-profit applications.
And the top ten target languages for for-profit projects are:
- Portuguese (Brazil)
- French (France)
Popular languages on for-profit projects
Data on for-profit projects provide a better view of local markets’ performance. Having this knowledge can help guide you on your first steps with localization. Where are the top-performing local markets? When in doubt, which local markets should you exclude or include in your strategy?
We start with the top ten target languages, as seen below.
Breakdown by for-profit projects count
Breakdown by percentage
In these figures, we see that French is a dominant local market, represented by two locales in the list. We can also note the absence of Asian markets in the top ten list, except Japanese, which is in the top five.
Eight out of the ten target languages are European languages. Native speakers of European languages could be from communities on other continents too. For example, we know that multiple languages are spoken by the US population.
Japanese and Portuguese (Brazil) highlight the dynamic nature of these markets for localization.
Next, we move on to the top ten source languages of for-profit projects. As a reminder, the source language is the default language of a project. Source language is an indicator of the original local market that a project refers to.
As you can see, English is the prevailing source language by a considerable margin. The English locales give some information on the projects’ country of origin: English (United States) is in second place and English (United Kingdom) in third place.
Based on our experience, the source language locale is usually chosen without much deliberation. In some cases, it needs to change to accommodate the company’s localization needs, for example from English (United States) to English (en).
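As a rough illustration of the English (United States) to English (en) change mentioned above, the sketch below reduces a regional locale code to its base language. It assumes locale codes in a `language_REGION` or `language-REGION` shape such as `en_US`; the helper name `base_language` is hypothetical and not part of any Transifex API.

```python
def base_language(locale_code: str) -> str:
    """Return the language part of a locale code, e.g. 'en_US' -> 'en'.

    Hypothetical helper: assumes the language comes first and is separated
    from the region by an underscore or a hyphen.
    """
    # Treat 'en-US' and 'en_US' the same way.
    normalized = locale_code.replace("-", "_")
    # Keep only the part before the first separator.
    return normalized.split("_", 1)[0].lower()


if __name__ == "__main__":
    for code in ("en_US", "en-GB", "pt_BR", "en"):
        print(code, "->", base_language(code))
```

A code without a region, such as `en`, passes through unchanged, so the helper can be applied to a mixed list of locales.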
Finally, Brazil and the Netherlands seem like important markets for innovation. Portuguese (Brazil) and Dutch as source languages highlight the active development in these languages.
Pirate English was first used as a source language in September 2014. Its use as a source language trended in 2019 and 2020, and a few projects are using it as their source in 2021.
Popular languages on open-source projects
Checking the target languages of open-source projects, we can identify the most active local communities. We can also make an assumption about the leading local markets for open-source projects.
The following graphs display the top ten target languages for open-source projects.
Breakdown by open-source projects count
Breakdown by percentage
We can see from these graphs that the top four positions account for 85% of the total open-source project count. Moreover, that share is distributed evenly among the four languages. Portuguese (Brazil), French, Spanish, and German communities are significant for open-source project localization.
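For readers who want to run the same kind of breakdown on their own project data, here is a minimal sketch of how a percentage breakdown and a combined top-N share can be computed from per-language project counts. The function names and the counts in the example are placeholders made up for illustration; they are not the figures behind the graphs in this post.

```python
def percentage_breakdown(counts: dict) -> dict:
    """Map each language to its percentage of the total project count."""
    total = sum(counts.values())
    if total == 0:
        return {lang: 0.0 for lang in counts}
    return {lang: 100 * n / total for lang, n in counts.items()}


def top_n_share(counts: dict, n: int) -> float:
    """Return the combined percentage held by the n largest languages."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    largest = sorted(counts.values(), reverse=True)[:n]
    return 100 * sum(largest) / total


if __name__ == "__main__":
    # Placeholder counts, purely illustrative.
    example_counts = {"lang_a": 40, "lang_b": 30, "lang_c": 20, "lang_d": 10}
    print(percentage_breakdown(example_counts))
    print(top_n_share(example_counts, 2))
```

Comparing the top-N share with the individual percentages shows both how concentrated a list is and how evenly the leading languages split their portion.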
Next, we look at the top ten source languages for open-source projects; the graphs above covered target languages. The source language of an open-source project shows how “global” or “local” the framing of the project is.
An open-source project using English means that it is created with a worldwide reach in mind. More people can use a project in English, and more contributors can help develop an open-source project.
The following graph illustrates the vast gap between English and other source languages.
Source languages breakdown
English is prevailing, capturing the top three positions in its various forms. Like for-profit projects, English variations are indicators of the country of origin. For example, English (United States) usually means that the project was founded in the United States.
The remaining positions go to four European markets (Russian, French, German, and Spanish), two Asian markets (Chinese and Japanese), and one Latin American market, Portuguese (Brazil).
Wrapping it all up
Some trends are present in both for-profit and open-source projects. These similarities highlight a global trend regarding software localization in general. Let’s analyze each trend below.
English is the global language. This is demonstrated by the source language used by both subcategories of projects. Even English (US) and English (UK) locales match across for-profit and open-source projects.
Localization focuses on Western locales, as we see for all projects. The most important local markets are the French-, Spanish-, and German-speaking ones. The lack of localization for Eastern markets could be due to the volume of those markets, which may mean that localization is not yet a priority in those regions.
Brazil is important in software innovation. In both subcategories, Portuguese (Brazil) ranks high in the source language lists. Since we measure usage of the Transifex TMS, we can infer an interest in entering local markets for software created in Brazil.
The Netherlands is an emerging market in for-profit software innovation. The position of Dutch in the top ten for-profit source languages supports this assumption. A report on SaaS companies in the Netherlands backs this up, and localization is playing an important role.
In the next post, we will analyze data from specific verticals of localization. This way, we can gain a better understanding of the localization landscape and its current trends.