On UNIX/Linux systems this setup is known as POSIX [7] locales and is standardized in IEEE Std 1003.1-2017 [3]. A locale can be set for the system as a whole and for individual user accounts, since every user can customize their own working environment. In this article we explain how to determine the current locale setup on Debian GNU/Linux, what its individual settings mean, and how to adapt the system to your needs.
Note that this article is tailored to Debian GNU/Linux Release 10 “Buster”. Unless otherwise stated, the techniques described here also work for its derivatives such as Ubuntu or Linux Mint [8].
What is a locale?
Generally speaking, a locale is a set of values that reflect the nature and the conventions of a country or a culture. These values are stored as environment variables that represent, among other things, the language, the character encoding, the date and time formatting, the default paper size, the country’s currency, and the first day of the week.
As touched on before, there is a general setting known as the ‘default locale’, and a user-defined setting. The default locale works system-wide and is stored in the file /etc/default/locale. Listing 1 displays the default locale on a Debian GNU/Linux system that uses German as the main language and UTF-8 (8-bit Unicode Transformation Format) as the character encoding [11].
Listing 1: The default locale on a German Debian GNU/Linux
Please note that in contrast to Debian GNU/Linux, on some earlier Ubuntu versions the system-wide locale setup is stored at /etc/locale.conf.
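To look at the system-wide setting yourself, a minimal sketch follows. The helper name show_system_locale is hypothetical and not a standard tool; it simply prints the first readable file among the two paths mentioned above.

```shell
#!/bin/sh
# show_system_locale is a hypothetical helper name, not a standard command.
# It prints the first readable file among the given candidates. Without
# arguments it falls back to the two paths mentioned in the text.
show_system_locale() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/default/locale /etc/locale.conf
    fi
    for file in "$@"; do
        if [ -r "$file" ]; then
            echo "# $file"
            cat "$file"
            return 0
        fi
    done
    echo "no locale configuration file found" >&2
    return 1
}

# Usage (uncomment to run):
# show_system_locale
```

The function only reads files, so it does not need root privileges.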
The user-defined settings are stored as a hidden file in your home directory, and the actual files that are evaluated depend on the login shell that you use [6]. The traditional Bourne shell (/bin/sh) [4] reads the two files /etc/profile and ~/.profile, whereas the Bourne-Again shell (Bash) (/bin/bash) [5] reads /etc/profile and ~/.bash_profile. If your login shell is Z shell (/bin/zsh) [9], the two files ~/.zprofile and ~/.zlogin are read, but not ~/.profile unless invoked in Bourne shell emulation mode [10].
Starting a shell in a terminal in an existing session results in an interactive, non-login shell. This may result in reading the following files – ~/.bashrc for Bash, and /etc/zshrc as well as ~/.zshrc for Z shell [6].
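Because several startup files can be involved, it can help to check which of them actually assign locale variables. The following is a minimal sketch; the helper name find_locale_settings is hypothetical, and the file list in the usage comment is taken from the files named above.

```shell
#!/bin/sh
# find_locale_settings is a hypothetical helper name.
# It prints every line in the given files that assigns LANG, LANGUAGE,
# or an LC_* variable, prefixed with file name and line number.
find_locale_settings() {
    for file in "$@"; do
        if [ -r "$file" ]; then
            # -H prints the file name, -n the line number.
            # "|| true" keeps going when a file has no matching line.
            grep -H -n -E \
                '^[[:space:]]*(export[[:space:]]+)?(LANG|LANGUAGE|LC_[A-Z]+)=' \
                "$file" || true
        fi
    done
}

# Usage (uncomment to run); files that do not exist are skipped:
# find_locale_settings /etc/profile ~/.profile ~/.bash_profile ~/.bashrc \
#     ~/.zprofile ~/.zlogin ~/.zshrc
```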
Naming a locale
As explained here [12], the name of a locale follows a specific pattern. The pattern consists of language codes, character encoding, and the description of a selected variant.
A name starts with an ISO 639-1 lowercase two-letter language code [13], or an ISO 639-2 three-letter language code [14] if the language has no two-letter code. For example, it is de for German, fr for French, and cel for Celtic. The code is followed for many but not all languages by an underscore _ and by an ISO 3166 uppercase two-letter country code [15]. For example, this leads to de_CH for Swiss German, and fr_CA for a French-speaking system for a Canadian user likely to be located in Québec.
Optionally, the name continues with a dot . followed by the name of the character encoding, such as UTF-8 or ISO-8859-1, and with an @ sign followed by the name of a variant. For example, the name en_IE.UTF-8@euro describes the setup for an English system for Ireland with UTF-8 character encoding and the Euro as the currency symbol.
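The naming pattern can be illustrated with a small shell function that splits a locale name into its parts. This is only a sketch: the function name split_locale_name and its output format are hypothetical, and it does not validate the codes against the ISO lists.

```shell
#!/bin/sh
# split_locale_name is a hypothetical helper name.
# Pattern: language[_TERRITORY][.codeset][@modifier]
# Note: plain POSIX sh has no local variables, so these are global.
split_locale_name() {
    name=$1
    modifier=''
    codeset=''
    territory=''
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;
    esac
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;
    esac
    case $name in
        *_*) territory=${name#*_}; name=${name%%_*} ;;
    esac
    printf 'language=%s territory=%s codeset=%s modifier=%s\n' \
        "$name" "$territory" "$codeset" "$modifier"
}

split_locale_name en_IE.UTF-8@euro
# prints: language=en territory=IE codeset=UTF-8 modifier=euro
```

Parts that are absent from the name, such as the territory in cel, are printed as empty values.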
Commands and Tools
The number of commands related to locales is relatively low. The first is locale, which only displays the current locale settings. The second is localectl, which can be used to query and change the system locale and keyboard layout settings. To activate a locale, the tools dpkg-reconfigure and locale-gen come into play; see the example below.
Show the locale that is in use
Step one is to figure out the current locale on your system using the locale command as follows:
Listing 2: Show the current locale
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
Please note that Linux distributions other than Debian GNU/Linux may use additional environment variables not listed above. The individual variables have the following meaning:
- LANG: Determines the default locale in the absence of other locale related environment variables
- LANGUAGE: List of fallback message translation languages
- LC_CTYPE: Character classification and case conversion
- LC_NUMERIC: Numeric formatting
- LC_TIME: Date and time formats
- LC_COLLATE: Collation (sort) order
- LC_MONETARY: Monetary formatting
- LC_MESSAGES: Format of interactive words and responses
- LC_PAPER: Default paper size for region
- LC_NAME: Name formats
- LC_ADDRESS: Convention used for formatting of street or postal addresses
- LC_TELEPHONE: Conventions used for representation of telephone numbers
- LC_MEASUREMENT: Default measurement system used within the region
- LC_IDENTIFICATION: Metadata about the locale information
- LC_RESPONSE: Determines how responses (such as Yes and No) appear in the local language (used by Ubuntu but not by Debian GNU/Linux)
- LC_ALL: Overrides all other locale variables (except LANGUAGE)
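The way these variables interact for a single category follows the POSIX precedence: LC_ALL wins if it is set and non-empty, then the category variable itself, then LANG, and otherwise the implementation default (the POSIX locale on glibc systems). A minimal sketch of this lookup follows; the function name effective_locale is hypothetical, and LANGUAGE is left out because it only affects the choice of message translations.

```shell
#!/bin/sh
# effective_locale is a hypothetical helper name.
# Arguments: 1) value of LC_ALL, 2) value of the LC_* category variable,
#            3) value of LANG. Empty strings count as "not set".
effective_locale() {
    if [ -n "$1" ]; then
        echo "$1"
    elif [ -n "$2" ]; then
        echo "$2"
    elif [ -n "$3" ]; then
        echo "$3"
    else
        echo "POSIX"
    fi
}

# Category variable set, LC_ALL empty: the category variable wins over LANG.
effective_locale "" "fr_FR.UTF-8" "de_DE.UTF-8"
# prints: fr_FR.UTF-8

# For the current shell, for example the date and time category:
# effective_locale "${LC_ALL:-}" "${LC_TIME:-}" "${LANG:-}"
```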
List available locales
Next, you can list the available locales on your system using the locale command with its option -a, which is short for --all-locales:
Listing 3: Show available locales
Listing 3 contains two locale settings each for German (Germany) and English (US). The entries C and POSIX are synonymous, and C.UTF-8 is their UTF-8 counterpart; they represent the default settings that are appropriate for data that is parsed by a computer program. The output in Listing 3 is based on the list of supported locales stored in /usr/share/i18n/SUPPORTED.
Furthermore, adding the option -v (short for --verbose) to the call leads to a much more extensive output that includes the LC_IDENTIFICATION metadata about each locale. Figure 1 shows this for the call from Listing 3.
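When scripting, the output of locale -a can be used to check whether a specific locale is available. One detail to keep in mind is that glibc systems typically print the codeset in a normalized form (for example utf8 instead of UTF-8). The sketch below therefore compares names case-insensitively and ignores hyphens, which is a simplification. The function name has_locale is hypothetical.

```shell
#!/bin/sh
# has_locale is a hypothetical helper name.
# It reads a list of locale names on standard input (for example the output
# of "locale -a") and succeeds if the wanted name is in the list.
# Both sides are lowercased and stripped of hyphens before comparing, so
# de_DE.UTF-8 matches de_DE.utf8. This normalization is a simplification.
has_locale() {
    wanted=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    tr 'A-Z' 'a-z' | tr -d '-' | grep -x -F -e "$wanted" >/dev/null
}

# Usage (uncomment to run):
# if locale -a | has_locale de_DE.UTF-8; then
#     echo "de_DE.UTF-8 is available"
# else
#     echo "de_DE.UTF-8 is not available"
# fi
```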
In order to see which locales already exist, and which ones need further help to be completed you may also have a look at the map of the Locale Helper Project [20]. Red markers clearly show which locales are unfinished. Figure 2 displays the locales for South Africa that look quite complete.
Show available character maps
The locale command comes with the option -m that is short for --charmaps. The output shows the available character maps, or character set description files [16]. Such a file is meant to “define characteristics for the coded character set and the encoding for the characters specified in Portable Character Set, and may define encoding for additional characters supported by the implementation” [16]. Listing 4 illustrates this with an extract of the entire list.
Listing 4: Character set description files
Show the definitions of locale variables
Each variable used for a locale comes with its own definition. With the option -k (short for --keyword-name), the locale command displays this definition in detail, one keyword per line. Listing 5 illustrates this for the variable LC_TELEPHONE as it is defined in a German environment: the phone number format, the domestic phone format, the international selection code, the country code (international prefix), and the code set. See the Locale Helper Project [20] for a detailed description of the values.
Listing 5: The details of LC_TELEPHONE
int_select="00"
int_prefix="49"
telephone-codeset="UTF-8"
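Since locale -k prints one keyword=value pair per line, a single value can be picked out with standard text tools. The following sketch assumes that output format; the function name keyword_value is hypothetical.

```shell
#!/bin/sh
# keyword_value is a hypothetical helper name.
# It reads "locale -k" style output (keyword="value" or keyword=number)
# on standard input and prints the value of the requested keyword,
# with surrounding double quotes removed.
keyword_value() {
    sed -n "s/^$1=//p" | sed 's/^"\(.*\)"$/\1/'
}

# Usage (uncomment to run):
# locale -k LC_TELEPHONE | keyword_value int_prefix
```

The keyword is inserted into a regular expression as-is, which is fine for the plain names used by locale but would need escaping for arbitrary input.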
Changing the current locale
Knowledge of the locale becomes necessary as soon as you run a system that comes with a different locale than you are used to, for example a Linux live system. Changing the locale can be done in two ways: reconfiguring the Debian locales package [19], or adding the required locale using the command locale-gen. For option one, running dpkg-reconfigure locales as root opens the text-based configuration dialog shown in Figure 3.
Press the space bar in order to choose the desired locale(s) from the list shown in the dialog box, and choose “OK” to confirm your selection. The next dialog window offers you a list of locales that are available for the default locale. Select the desired one, and choose “OK”. Now, the corresponding locale files are generated, and the previously selected locale is set for your system.
For option two, generating the desired locale is done with the help of the command locale-gen. Listing 6 illustrates this for a French setup:
Listing 6: Generating a French locale
Generating locales...
fr_FR.UTF-8... done
Generation complete.
In order to use the previously generated locale as the default one, run the command in Listing 7 to set it up properly:
Listing 7: Manually setting the locale
As soon as you open a new terminal session, or re-login to your system, the changes are activated.
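The two steps of option two can be sketched as a small script. Several points here are assumptions: update-locale (shipped with the Debian locales package) is used to write the default locale, the helper names run and set_default_locale are hypothetical, and whether locale-gen accepts the locale name as an argument depends on the distribution's version of the script. Where it does not, enabling the entry in /etc/locale.gen first is the usual route. The DRY_RUN switch only prints the commands, so the sketch can be tried without root privileges.

```shell
#!/bin/sh
# run is a hypothetical wrapper: with DRY_RUN=1 it only prints the command,
# otherwise it executes it. The real commands need root privileges.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# set_default_locale is a hypothetical helper name.
# Step 1: generate the locale. Step 2: make it the system default via LANG.
set_default_locale() {
    run locale-gen "$1"
    run update-locale "LANG=$1"
}

# Show what would be executed for a French setup, without changing anything:
DRY_RUN=1 set_default_locale fr_FR.UTF-8
# prints:
# + locale-gen fr_FR.UTF-8
# + update-locale LANG=fr_FR.UTF-8
```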
Compile a locale definition file
The command localedef helps you to manually compile a locale definition file. In order to create a French setting, run the command as follows:
Listing 8: Compile a locale definition
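The tool that compiles locale definitions is localedef. A minimal sketch of such a call follows, assuming that the locale source fr_FR and the UTF-8 character map are installed under /usr/share/i18n. The wrapper name compile_locale is hypothetical; -i names the input definition and -f the character map, and the last argument is the name of the compiled locale.

```shell
#!/bin/sh
# compile_locale is a hypothetical wrapper around localedef.
# Arguments: 1) locale source name, e.g. fr_FR
#            2) character map, e.g. UTF-8
# Writing to the system locale directory normally requires root privileges.
compile_locale() {
    if [ "$#" -ne 2 ]; then
        echo "usage: compile_locale <language_TERRITORY> <charmap>" >&2
        return 2
    fi
    # -i: input (locale definition), -f: character map
    localedef -i "$1" -f "$2" "$1.$2"
}

# Usage as root (uncomment to run):
# compile_locale fr_FR UTF-8    # runs: localedef -i fr_FR -f UTF-8 fr_FR.UTF-8
```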
Conclusion
Understanding locales can take a while, as the setup is influenced by several factors. We explained how to determine your current locale and how to change it properly. Adapting the Linux system to your needs should be much easier from now on.
Links and References
- [1] Locale, Debian Wiki
- [2] ChangeLanguage, How to change the language of your Debian system
- [3] POSIX Locale, The Open Group Base Specifications Issue 7, 2018 edition
- [4] Bourne shell, Wikipedia
- [5] Bourne-Again shell, Wikipedia
- [6] Difference between Login Shell and Non-Login Shell?, StackExchange
- [7] Portable Operating System Interface (POSIX), Wikipedia
- [8] Linux Mint
- [9] Z shell, Wikipedia
- [10] Zsh Shell Builtin Commands
- [11] UTF-8, Wikipedia
- [12] What should I set my locale to and what are the implications of doing so?
- [13] ISO 639-1, Wikipedia
- [14] ISO 639-2, Wikipedia
- [15] ISO 3166, Wikipedia
- [16] Character Set Description Files
- [17] Locale, Ubuntu Wiki
- [19] locales Debian package
- [20] Locale Helper Project