diff --git a/Doc/library/locale.rst b/Doc/library/locale.rst
index 426e3a06e1ef11..d48ea04077f366 100644
--- a/Doc/library/locale.rst
+++ b/Doc/library/locale.rst
@@ -34,12 +34,17 @@ The :mod:`locale` module defines the following exception and functions:

    If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
    setting for the *category*. The available categories are listed in the data
-   description below. *locale* may be a string, or an iterable of two strings
-   (language code and encoding). If it's an iterable, it's converted to a locale
-   name using the locale aliasing engine. An empty string specifies the user's
+   description below. *locale* may be a :ref:`string `, or a pair,
+   language code and encoding. An empty string specifies the user's
    default settings. If the modification of the locale fails, the exception
    :exc:`Error` is raised. If successful, the new locale setting is returned.

+   If *locale* is a pair, it is converted to a locale name using
+   the locale aliasing engine.
+   The language code has the same format as a :ref:`locale name `,
+   but without encoding and ``@``-modifier.
+   The language code and encoding can be ``None``.
+
    If *locale* is omitted or ``None``, the current setting for *category* is
    returned.

@@ -345,22 +350,26 @@ The :mod:`locale` module defines the following exception and functions:
    ``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
    ``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.

-   Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
-   *language code* and *encoding* may be ``None`` if their values cannot be
+   The language code has the same format as a :ref:`locale name `,
+   but without encoding and ``@``-modifier.
+   The language code and encoding may be ``None`` if their values cannot be
    determined.
+   The "C" locale is represented as ``(None, None)``.

    .. deprecated-removed:: 3.11 3.15


 .. function:: getlocale(category=LC_CTYPE)

-   Returns the current setting for the given locale category as sequence containing
-   *language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
-   except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
+   Returns the current setting for the given locale category as a tuple containing
+   the language code and encoding. *category* may be one of the :const:`!LC_\*`
+   values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.

-   Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
-   *language code* and *encoding* may be ``None`` if their values cannot be
+   The language code has the same format as a :ref:`locale name `,
+   but without encoding and ``@``-modifier.
+   The language code and encoding may be ``None`` if their values cannot be
    determined.
+   The "C" locale is represented as ``(None, None)``.


 .. function:: getpreferredencoding(do_setlocale=True)
@@ -615,6 +624,61 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
 part of a character class such as letter or whitespace.


+.. _locale_name:
+
+Locale names
+------------
+
+The format of the locale name is platform dependent, and the set of supported
+locales can depend on the system configuration.
+
+On Posix platforms, it usually has the format [1]_:
+
+.. productionlist:: locale_name
+   : language ["_" territory] ["." charset] ["@" modifier]
+
+where *language* is a two- or three-letter language code from `ISO 639`_,
+*territory* is a two-letter country or region code from `ISO 3166`_,
+*charset* is a locale encoding, and *modifier* is a script name,
+a language subtag, a sort order identifier, or other locale modifier
+(for example, "latin", "valencia", "stroke" and "euro").
+
+On Windows, several formats are supported. [2]_ [3]_
+A subset of `IETF BCP 47`_ tags:
+
+.. productionlist:: locale_name
+   : language ["-" script] ["-" territory] ["." charset]
+   : language ["-" script] "-" territory "-" modifier
+
+where *language* and *territory* have the same meaning as in Posix,
+*script* is a four-letter script code from `ISO 15924`_,
+and *modifier* is a language subtag, a sort order identifier
+or custom modifier (for example, "valencia", "stroke" or "x-python").
+Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
+Only UTF-8 encoding is allowed for BCP 47 tags.
+
+Windows also supports locale names in the format:
+
+.. productionlist:: locale_name
+   : language ["_" territory] ["." charset]
+
+where *language* and *territory* are full names, such as "English" and
+"United States", and *charset* is either a code page number (for example, "1252")
+or UTF-8.
+Only the underscore separator is supported in this format.
+
+The "C" locale is supported on all platforms.
+
+.. _ISO 639: https://www.iso.org/iso-639-language-code
+.. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
+.. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
+.. _ISO 15924: https://www.unicode.org/iso15924/
+
+.. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables `_
+.. [2] `UCRT Locale names, Languages, and Country/Region strings `_
+.. [3] `Locale Names `_
+
+
 .. _embedding-locale:

 For extension writers and programs that embed Python
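
The Posix production added in this change, ``language ["_" territory] ["." charset] ["@" modifier]``, can be illustrated with a short sketch. The helper ``split_posix_locale_name`` and its regular expression are hypothetical and not part of the :mod:`locale` module. They assume the two- or three-letter language and two-letter territory described in the new text; real platforms may accept names outside this shape.

```python
import re

# Hypothetical helper for illustration only; not part of the locale module.
# The pattern follows the grammar quoted in the new "Locale names" section:
#   language ["_" territory] ["." charset] ["@" modifier]
_POSIX_LOCALE_RE = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def split_posix_locale_name(name):
    """Return (language, territory, charset, modifier) for *name*.

    Parts that are absent from *name* are returned as None.
    """
    # The "C" locale does not follow the grammar, so handle it separately.
    if name == "C":
        return ("C", None, None, None)
    match = _POSIX_LOCALE_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a Posix-style locale name: {name!r}")
    return match.group("language", "territory", "charset", "modifier")


if __name__ == "__main__":
    for example in ("en_US.UTF-8", "ca_ES@valencia", "sr@latin", "C"):
        print(example, split_posix_locale_name(example))
```

The sketch only splits the string; it does not check that the named locale is installed, which is what :func:`setlocale` reports through :exc:`Error`.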