Linux FAQ's & Manuals


Examples

To check your current locale settings, you can use the command ``locale''. For example:

     ~$ locale
     LANG=ja_JP.UTF-8
     LC_CTYPE="ja_JP.UTF-8"
     LC_NUMERIC="ja_JP.UTF-8"
     LC_TIME="ja_JP.UTF-8"
     LC_COLLATE="ja_JP.UTF-8"
     LC_MONETARY="ja_JP.UTF-8"
     LC_MESSAGES="ja_JP.UTF-8"
     LC_PAPER="ja_JP.UTF-8"
     LC_NAME="ja_JP.UTF-8"
     LC_ADDRESS="ja_JP.UTF-8"
     LC_TELEPHONE="ja_JP.UTF-8"
     LC_MEASUREMENT="ja_JP.UTF-8"
     LC_IDENTIFICATION="ja_JP.UTF-8"
     LC_ALL=
     ~$ 

The values of variables listed in double quotes have not been set explicitly; they have just inherited their effective value from LANG or LC_ALL. In the above example, LANG is the only variable that has been set. The others, like LC_CTYPE, are unset (echo $LC_CTYPE will print nothing), but the behaviour with the above locale setting is the same as if all these variables had been set separately. Therefore it is usually enough to set only LANG if you want to use only one language.
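To see at a glance which locale variables are really set in the environment, as opposed to the quoted, inherited values that ``locale'' prints, one can filter the output of ``env''. The following is only a small sketch; the function name explicit_locale_vars is made up for this illustration and is not part of any standard tool.

```shell
# Hypothetical helper: print only the locale variables that are
# actually present in the environment (i.e. exported), one per line.
explicit_locale_vars() {
    env | grep -E '^(LANG|LC_[A-Z_]+)=' | sort
}

explicit_locale_vars
```

With the settings from the first example this should list only the LANG line, because none of the LC_* variables has been set.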

You only need to set more variables if you want to mix different locales. For example, if you want English messages, Japanese input enabled, and everything else defaulting to the settings for German, you can set:

     ~$ export LANG=de_DE.UTF-8
     ~$ export LC_MESSAGES=en_GB.UTF-8
     ~$ export LC_CTYPE=ja_JP.UTF-8
     ~$ locale
     LANG=de_DE.UTF-8
     LC_CTYPE=ja_JP.UTF-8
     LC_NUMERIC="de_DE.UTF-8"
     LC_TIME="de_DE.UTF-8"
     LC_COLLATE="de_DE.UTF-8"
     LC_MONETARY="de_DE.UTF-8"
     LC_MESSAGES=en_GB.UTF-8
     LC_PAPER="de_DE.UTF-8"
     LC_NAME="de_DE.UTF-8"
     LC_ADDRESS="de_DE.UTF-8"
     LC_TELEPHONE="de_DE.UTF-8"
     LC_MEASUREMENT="de_DE.UTF-8"
     LC_IDENTIFICATION="de_DE.UTF-8"
     LC_ALL=
     ~$ 

Note that the values of LC_MESSAGES and LC_CTYPE have no double quotes now because these variables have been set explicitly.

Also note that all values above use UTF-8 encoding. With other encodings it would not be possible to use Japanese and German at the same time.

Take care never to use different encodings at the same time in these locale categories. For example, it is not allowed to use something like

     ~$ export LANG=de_DE.ISO-8859-15@euro
     ~$ export LC_CTYPE=ja_JP.UTF-8

``The Open Group Base Specifications Issue 6'' says about this:

If different character sets are used by the locale categories, the results achieved by an application utilizing these categories are undefined.

(see http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap07.html)
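A rough way to spot such a mixture is to look at the codeset part of every value that ``locale'' prints. The sketch below assumes the usual ``name=value'' output format shown earlier and compares codesets after lowercasing them and dropping hyphens and underscores (an assumption made here so that ``UTF-8'' and ``utf8'' count as the same). The function name distinct_codesets is invented for this example.

```shell
# Hypothetical helper: read `locale` output on stdin and print each
# distinct codeset once, normalized to lower case without "-" or "_".
# Values without a codeset part (e.g. "ja_JP" or an empty LC_ALL)
# are ignored by this sketch.
distinct_codesets() {
    sed -n 's/^[A-Z_]*="\{0,1\}[^."@]*\.\([^"@]*\).*$/\1/p' |
        tr '[:upper:]' '[:lower:]' |
        tr -d '_-' |
        sort -u
}

# More than one line of output suggests that categories use
# different encodings.
locale | distinct_codesets
```

For the forbidden combination shown above, this would print two lines (one for each encoding), which is the situation to avoid.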

Not all combinations of ``language'', ``territory'', ``codeset'', and ``modifier'' are allowed. It is not enough for the locale name to be syntactically correct according to the definition in the last chapter; the locale also needs to exist.

You can quite easily check whether a certain locale exists on your system by executing the command

     LC_CTYPE=<locale-name> locale charmap

This command returns the encoding used by the locale ``<locale-name>''. If such a locale doesn't exist or isn't installed, it returns only ANSI_X3.4-1968 (which basically means ASCII), because non-existing locales fall back to the POSIX locale.

For example:

     ~$ LC_CTYPE=ja_JP.UTF-8 locale charmap
     UTF-8

That is, the locale for ``Japanese'' in ``Japan'' with codeset ``UTF-8'' exists. One can also use Japanese locales with other encodings, for example

     ~$ LC_CTYPE=ja_JP.eucJP locale charmap
     EUC-JP

Omitting the ``.eucJP'' is also allowed:

     ~$ LC_CTYPE=ja_JP locale charmap
     EUC-JP
     ~$ 

Actually ``ja_JP'' is the same locale as ``ja_JP.eucJP''; it is just an alias defined in /usr/share/locale/locale.alias. And there's another one, Japanese with ``Shift_JIS'' encoding:

     ~$ LC_CTYPE=ja_JP.SJIS locale charmap
     SHIFT_JIS

But one cannot just combine ``ja_JP'' with some arbitrary encoding. For example, the following locale does not exist:

     ~$ LC_CTYPE=ja_JP.ISO-8859-15 locale charmap
     ANSI_X3.4-1968

Instead of returning ``ISO-8859-15'' it returns ``ANSI_X3.4-1968'', the same as for the ``POSIX'' locale

     ~$ LC_CTYPE=POSIX locale charmap
     ANSI_X3.4-1968

which shows that the locale ``ja_JP.ISO-8859-15'' does not exist. Of course, it wouldn't make much sense anyway to try using Japanese with ``ISO-8859-15'' encoding. But even combinations which might make sense don't have to exist. For example, the following locale does not exist:

     ~$ LC_CTYPE=de_AT.ISO-8859-15 locale charmap
     ANSI_X3.4-1968

That doesn't mean that you can't use German in Austria with ISO-8859-15 encoding; there are other possible spellings which exist and achieve just that:

     ~$ LC_CTYPE=de_AT@euro locale charmap
     ISO-8859-15
     ~$ LC_CTYPE=de_AT.ISO-8859-15@euro locale charmap
     ISO-8859-15
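The charmap check used in these examples can be wrapped in a small shell function. This is only a sketch: the names charmap_of and locale_seems_installed are invented here, and the test simply compares the result against whatever the POSIX locale reports, so a real locale whose charmap is plain ASCII would be misjudged as missing. LC_ALL is cleared for the call because a non-empty LC_ALL would otherwise override LC_CTYPE.

```shell
# Hypothetical helper: print the charmap that LC_CTYPE=<name> resolves to.
charmap_of() {
    LC_ALL= LC_CTYPE="$1" locale charmap 2>/dev/null
}

# Hypothetical helper: succeed if <name> resolves to something other
# than the POSIX fallback described in the text.
locale_seems_installed() {
    [ "$(charmap_of "$1")" != "$(charmap_of POSIX)" ]
}

for name in ja_JP.UTF-8 ja_JP.ISO-8859-15 de_AT@euro; do
    if locale_seems_installed "$name"; then
        echo "$name: available, charmap $(charmap_of "$name")"
    else
        echo "$name: not available (falls back to POSIX)"
    fi
done
```

Which of the names are reported as available depends on the locales installed on the system where this is run.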

But these examples show that you have to take care to choose a valid setting. The command locale -a gives you a list of all locales known by glibc on your system:

     mfabian@gregory:~$ locale -a
     C
     [...]
     POSIX
     af_ZA
     af_ZA.iso88591
     ar_AE
     ar_AE.iso88596
     ar_AE.utf8
     [...]
     ja_JP
     ja_JP.eucjp
     ja_JP.sjis
     ja_JP.ujis
     ja_JP.utf8
     japanese
     japanese.euc
     japanese.sjis
     [...]

Note that there may be aliases which don't follow the syntax for locale names explained in the last section, like ``japanese'', ``japanese.euc'', and ``japanese.sjis'' in the above output. I recommend not using such aliases, as they cause several problems. For example, language-specific app-default files for X11 programs may not be found when such aliases are used (see ``man XtResolvePathname'' if you want to know why not).

Also note that the codeset part in the output of ``locale -a'' is always lower case and contains neither underscores (``_'') nor hyphens (``-''). For glibc this doesn't matter; as far as glibc is concerned, the following locales all work and are all the same:

     ~$ LC_CTYPE=ja_JP.UTF-8 locale charmap
     UTF-8
     ~$ LC_CTYPE=ja_JP.utf8 locale charmap
     UTF-8
     ~$ LC_CTYPE=ja_JP.u-t_f-_8 locale charmap
     UTF-8
     ~$ 

But that doesn't mean that these spelling differences don't matter; they matter a lot as soon as you want to use X11 applications. X11 has its own ideas about the correct spellings, and one should only use spellings which are accepted by both glibc and X11. In the case of the Japanese UTF-8 locale, the only spelling supported by X11 is ``ja_JP.UTF-8'' (actually UTF-8 is the only officially correct spelling of the UTF-8 encoding).
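Because ``locale -a'' lists codesets in lower case without hyphens or underscores, a locale name has to be rewritten the same way before it can be looked up in that list. The following sketch does this for the codeset part only. It is based on the behaviour observed above rather than on glibc's actual implementation, and the function name normalize_locale_name is made up for this example.

```shell
# Hypothetical helper: rewrite the codeset part of a locale name the
# way it appears in `locale -a` (lower case, no "-" or "_"), leaving
# the language_TERRITORY part and any @modifier untouched.
normalize_locale_name() {
    name=$1
    case $name in
        *.*) ;;                                  # has a codeset part
        *)   printf '%s\n' "$name"; return ;;    # nothing to normalize
    esac
    lang=${name%%.*}         # part before the first dot
    rest=${name#*.}          # codeset plus optional @modifier
    codeset=${rest%%@*}
    case $rest in
        *@*) modifier=@${rest#*@} ;;
        *)   modifier= ;;
    esac
    codeset=$(printf '%s' "$codeset" | tr '[:upper:]' '[:lower:]' | tr -d '_-')
    printf '%s.%s%s\n' "$lang" "$codeset" "$modifier"
}

# Look the normalized name up in the list of installed locales.
if locale -a 2>/dev/null | grep -F -q -x "$(normalize_locale_name ja_JP.UTF-8)"; then
    echo "ja_JP.UTF-8 is listed by locale -a"
fi
```

This only answers the glibc side of the question; whether X11 accepts a given spelling still has to be checked separately.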

To check which spelling is supported by both glibc and X11, one can use ``xterm'':

     ~$ LC_CTYPE=ja_JP.utf9 xterm
     Warning: locale not supported by C library, locale unchanged
     ~$ LC_CTYPE=ja_JP.utf8 xterm
     Warning: locale not supported by Xlib, locale set to C
     ~$ LC_CTYPE=ja_JP.UTF-8 xterm

The warning for the nonsense value ``ja_JP.utf9'' is the message you get when trying to use xterm in a locale which is not even supported by glibc. The warning for the value ``ja_JP.utf8'' is different and shows that this locale is supported by glibc but not by X11. Only the last try with ``ja_JP.UTF-8'' doesn't output any warnings, which shows that this locale is supported by both glibc and X11.

To see a list of locales and spelling variants supported by X11, have a look at the files

     /usr/X11R6/lib/X11/locale/locale.dir
     /usr/X11R6/lib/X11/locale/locale.alias

Only the locales and spelling variants listed in one of these files are allowed.
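Searching these two files by hand can be replaced by a quick grep. The sketch below is a plain fixed-string search, so it is only a first approximation: it knows nothing about the file format and will also match a name that occurs as part of a longer one. The function name x11_lists_locale and its optional directory argument are assumptions made for this example; the default directory is the one given above and may be different on other systems.

```shell
# Hypothetical helper: succeed if <name> occurs anywhere in the X11
# locale.dir or locale.alias file. The second argument overrides the
# directory in which these files are looked up.
x11_lists_locale() {
    name=$1
    dir=${2:-/usr/X11R6/lib/X11/locale}
    grep -F -q -- "$name" "$dir/locale.dir" "$dir/locale.alias" 2>/dev/null
}

for name in ja_JP.UTF-8 ja_JP.utf8; do
    if x11_lists_locale "$name"; then
        echo "$name: listed for X11"
    else
        echo "$name: not found in the X11 locale files"
    fi
done
```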

2005-03-09