Internationalization of applications

Since version 0.4.0, the Nagare framework includes a nagare.i18n module that can help you internationalize a Web application.

There are many aspects to take into account when internationalizing an application:

  • extracting the messages (strings) used inside the application and translating them in each language
  • formatting numbers, dates, times, currencies for display or parsing them after user input
  • dealing with timezone calculations
  • displaying the requested pages in the proper language according to the client browser language settings or a setting in the application

All these aspects are addressed by the nagare.i18n module.

Setting up an application for internationalization

First, the Nagare i18n extras should be installed in your virtual environment:

$ easy_install nagare[i18n]

This command installs Babel and pytz, which are required by the nagare.i18n module.

However, if you plan to use the internationalization features of Nagare in your Web application, it is better to declare the i18n extra as a requirement in the setup.py file that is generated when you create the application:

setup(
      ...
      install_requires = ('nagare[i18n]', ...),
      ...
)

Then, Nagare is installed with the i18n extra when you install your application.

When you create a new application with nagare-admin create-app, Nagare automatically creates a setup.cfg file that contains a default configuration for internationalization:

[extract_messages]
keywords = _ , _N:1,2 , _L , _LN:1,2 , gettext , ugettext , ngettext:1,2 , ungettext:1,2 , lazy_gettext , lazy_ugettext , lazy_ngettext:1,2 , lazy_ungettext:1,2
output_file = data/locale/messages.pot

[init_catalog]
input_file = data/locale/messages.pot
output_dir = data/locale
domain = messages

[update_catalog]
input_file = data/locale/messages.pot
output_dir = data/locale
domain = messages

[compile_catalog]
directory = data/locale
domain = messages

This configuration file is used by the distutils commands that are automatically registered with setuptools when Babel is installed. Here is the list of the commands:

Command                                        Effect
python setup.py extract_messages               Extract the messages from the python sources to a .pot file
python setup.py init_catalog -l <lang>         Create a new message catalog (.po file) for a given language, initialized from the .pot file
python setup.py update_catalog [-l <lang>]     Update the message catalog(s) with the new messages found in the .pot file after an extract_messages
python setup.py compile_catalog [-l <lang>]    Compile the message catalog(s) in binary form so that they can be used inside the application

See the Babel setup documentation for a complete reference of the command options.

The catalog files are stored in the data/locale directory by default, which is automatically included in your application source package thanks to the MANIFEST.in file that was generated by nagare-admin create-app. The directory is not created by default, so you have to create it before issuing any of the Babel commands:

$ mkdir data/locale

Marking the messages to translate

The nagare.i18n module provides functions to translate the strings that appear in your code. The most useful function is nagare.i18n._ which returns the translation of the given string as a unicode string:

from nagare import presentation
from nagare.i18n import _

class I18nExample(object):
    pass

@presentation.render_for(I18nExample)
def render(self, h, *args):
    h << _('This is a translated string') << h.br
    h << 'This string is not translated'
    return h.root

app = I18nExample

The nagare.i18n module also provides many more functions that deal with strings:

Function          Effect
_                 Shortcut to ugettext
_N                Shortcut to ungettext
_L                Shortcut to lazy_ugettext
_LN               Shortcut to lazy_ungettext
gettext           Return the localized translation of a message as an 8-bit string encoded with the catalog's charset encoding, based on the current locale set by Nagare
ugettext          Return the localized translation of a message as a unicode string, based on the current locale set by Nagare
ngettext          Like gettext but considers plural forms
ungettext         Like ugettext but considers plural forms
lazy_gettext      Like gettext but with lazy evaluation
lazy_ugettext     Like ugettext but with lazy evaluation
lazy_ngettext     Like ngettext but with lazy evaluation
lazy_ungettext    Like ungettext but with lazy evaluation
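
The plural-aware functions take a singular message, a plural message and a count, which is why their entries in setup.cfg carry the ":1,2" suffix. The following is only a sketch of that calling convention: it uses the Python 3 standard library gettext module and a NullTranslations catalog (which performs no translation) rather than nagare.i18n, so the exact Nagare signatures are an assumption here.

```python
import gettext

# NullTranslations performs no catalog lookup: it returns the message
# unchanged and picks the singular form only when the count equals 1.
catalog = gettext.NullTranslations()


def describe(count):
    # The first two arguments are the singular and plural msgids,
    # the third one is the number used to select the form.
    template = catalog.ngettext('%d file selected', '%d files selected', count)
    return template % count


print(describe(1))
print(describe(3))
```

With a real catalog, the number of plural forms and the rule used to select one come from the Plural-Forms header of the .po file, so the selection is not limited to the English singular/plural rule.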

Note that the lazy_* variants don't fetch the translation when they are called. Instead, Nagare fetches the translation at rendering time, when it encounters a lazy translation. This is especially useful when you want to use translated strings in module or class constants, which are evaluated at definition time and consequently don't have access to the current locale which, as you will see later, is scoped to the request. Here is how you would define a class constant for use in a view:

from nagare import presentation
from nagare.i18n import _, _L

class I18nExample(object):
    # the following line is evaluated when the module is loaded, so _ would not work here
    TITLE = _L("Internationalization Example")

@presentation.render_for(I18nExample)
def render(self, h, *args):
    with h.h1:
        h << self.TITLE
    h << _('This is a translated string') << h.br
    h << 'This string is not translated'
    return h.root

...
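
To make the idea of lazy evaluation concrete, here is a minimal sketch of a lazy string. It is not the Nagare implementation: LazyString, translate, lazy and the current dictionary are hypothetical names introduced for the illustration, and the "catalog" is a plain dictionary.

```python
class LazyString(object):
    """Hypothetical lazy translation: keeps the msgid, translates on demand."""

    def __init__(self, translate, msgid):
        self._translate = translate
        self._msgid = msgid

    def __str__(self):
        # The lookup happens here, each time the string is rendered
        return self._translate(self._msgid)


# Stand-in for the request-scoped locale: empty until a request starts
current = {'catalog': {}}


def translate(msgid):
    return current['catalog'].get(msgid, msgid)


def lazy(msgid):
    return LazyString(translate, msgid)


# Evaluated at import time, before any locale is known
TITLE = lazy('Internationalization Example')

# Later, a request installs a French catalog and the view renders the title
current['catalog'] = {'Internationalization Example': "Exemple d'internationalisation"}
print(str(TITLE))
```

The constant is created once, but every rendering goes through the lookup again, so two requests using different locales can share the same constant.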

Extracting the messages

Babel automatically takes care of extracting the messages from the sources when you run the python setup.py extract_messages command. Specifically, when it extracts the localizable strings from the source files, it only considers the strings enclosed in calls to specific keywords. The list of the supported keywords appears in the setup.cfg file:

[extract_messages]
keywords = _ , _N:1,2 , _L , _LN:1,2 , gettext , ugettext , ngettext:1,2 , ungettext:1,2 , lazy_gettext , lazy_ugettext , lazy_ngettext:1,2 , lazy_ungettext:1,2

All the nagare.i18n functions that deal with strings appear in this list, so you don't have to change this line in most cases. However, if you need to add additional translation functions, don't forget to update the keywords list in this file.

Furthermore, the setup.py file created by nagare-admin create-app contains a line that configures the message extractors that Babel will use in order to extract the messages from your source files:

setup(
      ...
      message_extractors = { 'i18n_example' : [('**.py', 'python', None)] },
      ...
)

By default, Nagare sets up an extractor for the Python files. If you have other files that contain localizable strings (such as .js files), you'll have to add message extractors for them yourself. Please refer to the Babel message extractor documentation to understand how to add message extractors.

If you run the python setup.py extract_messages command on the previous code, you should obtain a messages.pot file that looks like this one:

# Translations template for nagare.examples.
# Copyright (C) 2012 ORGANIZATION
# This file is distributed under the same license as the nagare.examples
# project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2012.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: nagare.examples 0.3.0\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2012-03-05 11:05+0100\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 0.9.6\n"

#: nagare/examples/i18n.py:18
msgid "Internationalization Example"
msgstr ""

#: nagare/examples/i18n.py:25
msgid "This is a translated string"
msgstr ""

All the localizable strings of your application should appear in this file after extraction. Note that the string 'This string is not translated' of the previous example was not included because it was not enclosed in a call to a translation keyword.
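
The keyword-based selection described above can be illustrated with a few lines of standard library code. This is a deliberately simplified sketch and not Babel's extractor: the extract_msgids function and the KEYWORDS set are hypothetical, only single-string keywords are handled (no plural forms), and Python 3.8 or later is assumed for ast.Constant.

```python
import ast

# Single-argument keywords only; the plural keywords are left out of this sketch
KEYWORDS = {'_', '_L', 'gettext', 'ugettext', 'lazy_gettext', 'lazy_ugettext'}


def extract_msgids(source, keywords=KEYWORDS):
    """Return sorted (line, msgid) pairs for string literals passed to a keyword call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        # Accept both `_('...')` and `i18n._('...')`
        name = func.id if isinstance(func, ast.Name) else getattr(func, 'attr', None)
        first = node.args[0]
        if name in keywords and isinstance(first, ast.Constant) and isinstance(first.value, str):
            found.append((node.lineno, first.value))
    return sorted(found)


SOURCE = '''
TITLE = _L("Internationalization Example")
h << _('This is a translated string') << h.br
h << 'This string is not translated'
'''

print(extract_msgids(SOURCE))
```

The third string is a plain literal, not an argument of a keyword call, so it is skipped, which mirrors what happens in the messages.pot file.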

Translating the messages

Once the .pot file is produced (after message extraction), you have to create a message catalog for each language with the help of the init_catalog command, for example:

$ python setup.py init_catalog -l fr
$ python setup.py init_catalog -l en

Then edit the .po files that have just been created and provide the translations for all the strings that do not have one. You can use either a dedicated .po file editor, or a simple text editor to fill in the msgstr line of each msgid. Please refer to the PO Files section of the Gettext Manual for more information about this process.

Finally, compile the catalogs in binary form:

$ python setup.py compile_catalog

If you change the source files and add new messages to translate, you have to update the message catalogs:

$ python setup.py extract_messages
$ python setup.py update_catalog

Then, fill in the missing translations in the .po files and compile the catalogs again:

$ python setup.py compile_catalog
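
The compile_catalog command writes one binary file per language, following the usual gettext layout <directory>/<lang>/LC_MESSAGES/<domain>.mo. Nagare loads these files for you, but a quick way to check a compiled catalog outside of the application is the standard library gettext module. The load_catalog helper below is a hypothetical name used for this sketch, and the default paths are the ones from the generated setup.cfg.

```python
import gettext


def load_catalog(language, localedir='data/locale', domain='messages'):
    # fallback=True returns a NullTranslations object (no translation)
    # instead of raising an error when no compiled catalog is found
    return gettext.translation(domain, localedir=localedir,
                               languages=[language], fallback=True)


catalog = load_catalog('fr')

# Prints the French translation if data/locale/fr/LC_MESSAGES/messages.mo
# exists and contains this message, the original string otherwise
print(catalog.gettext('This is a translated string'))
```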

Formatting numbers, dates, times, currencies and dealing with timezones

The nagare.i18n module also defines functions for formatting numbers, dates, times and currencies, and for timezone calculations, according to the locale of the user. The functions are grouped by category below. See the Nagare i18n.py API documentation for more information on these functions.

Numbers

Function                  Effect
get_decimal_symbol        Return the symbol used to separate decimal fractions
get_plus_sign_symbol      Return the plus sign symbol
get_minus_sign_symbol     Return the minus sign symbol
get_exponential_symbol    Return the symbol used to separate mantissa and exponent
get_group_symbol          Return the symbol used to separate groups of thousands
format_number             Return the given integer number formatted
format_decimal            Return the given decimal number formatted
format_percent            Return the formatted percentage value
format_scientific         Return the value formatted in scientific notation, e.g. 1.23E06
parse_number              Parse the localized number string into a long integer
parse_decimal             Parse the localized decimal string into a float

Currencies

Function               Effect
get_currency_name      Return the name used for the specified currency
get_currency_symbol    Return the symbol used for the specified currency
format_currency        Return the formatted currency value

Dates & timezones

Function                 Effect
get_period_names         Return the names for day periods (AM/PM)
get_day_names            Return the day names in the specified format
get_month_names          Return the month names in the specified format
get_quarter_names        Return the quarter names in the specified format
get_era_names            Return the era names used in the specified format
get_date_format          Return the date formatting pattern in the specified format
get_datetime_format      Return the datetime formatting pattern in the specified format
get_time_format          Return the time formatting pattern in the specified format
get_timezone_gmt         Return the timezone associated with the given datetime object, formatted as a string indicating the offset from GMT
get_timezone_location    Return a representation of the given timezone using the "location format", i.e. the human-readable name of the timezone
get_timezone_name        Return the localized display name for the given timezone
format_time              Return a time formatted according to the given pattern
format_date              Return a date formatted according to the given pattern
format_datetime          Return a datetime formatted according to the given pattern
parse_time               Parse a time from a string
parse_date               Parse a date from a string
to_timezone              Return a localized datetime object
to_utc                   Return a UTC datetime object
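
A common way to use the two timezone conversions is to store datetimes in UTC and to convert them to the user's timezone only for display. The sketch below shows that round trip with the standard library alone. It does not call nagare.i18n: as_utc, as_local and USER_TZ are hypothetical names, and a fixed UTC+1 offset stands in for a real pytz timezone (so no daylight saving rules apply).

```python
from datetime import datetime, timedelta, timezone

# Fixed offset used for the illustration only
USER_TZ = timezone(timedelta(hours=1), 'UTC+01:00')


def as_utc(naive_local, tz):
    """Interpret a naive datetime as local time in `tz` and convert it to UTC."""
    return naive_local.replace(tzinfo=tz).astimezone(timezone.utc)


def as_local(aware, tz):
    """Convert an aware datetime to the timezone `tz` for display."""
    return aware.astimezone(tz)


entered = datetime(2012, 3, 5, 11, 5)  # what the user typed, in local time
stored = as_utc(entered, USER_TZ)      # what gets saved
shown = as_local(stored, USER_TZ)      # what gets displayed back

print(stored.isoformat())
print(shown.isoformat())
```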

Setting the locale of the application

The last step of the internationalization process is to initialize the locale of the application according to a language setting. There are many ways to implement a language setting:

  • use the client browser language settings, i.e. deal with the Accept-Language headers from the HTTP request
  • use a URL prefix, for example /en for English and /fr for French (see RestfulUrl to learn how to set up URLs)
  • use a per-user setting: for example, read the locale setting from a user profile stored in a database
  • use a per-session setting, for example store the language setting as an attribute of a component
  • use a per-application setting, for example provide the language setting in the application configuration file

You may even want to combine two or more of these schemes in your application! As you will see next, Nagare makes it easy to change the locale at runtime, so it doesn't really matter where the language setting comes from.
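
When several schemes are combined, the application needs a priority order between them. A minimal sketch, assuming the language codes have already been read from each source (the choose_language function and the chosen order are hypothetical, not something Nagare prescribes):

```python
def choose_language(available, *candidates):
    """Return the first candidate that the application supports.

    `candidates` are given in decreasing priority; None means "not provided".
    The first available language is used when nothing matches.
    """
    for candidate in candidates:
        if candidate is not None and candidate in available:
            return candidate
    return available[0]


AVAILABLE = ('en', 'fr')

# Priority used here: URL prefix, then session setting, then browser language
print(choose_language(AVAILABLE, None, 'fr', 'en'))
```

Here no URL prefix was given, so the session setting ('fr') wins over the browser language ('en').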

The best place to change the locale is the start_request method of the application object (wsgi.WSGIApp by default), since both the request and the authenticated user (when necessary) are available there. Furthermore, since users may change their language setting between two requests, the locale setting is scoped to the HTTP request by design in Nagare.

The locale can be changed with either the WSGIApp.set_locale method or the WSGIApp.set_default_locale method (see wsgi.py). The WSGIApp.start_request method itself uses WSGIApp.set_locale to install the default locale, which is set with WSGIApp.set_default_locale. Initially, the default locale is one that does not perform any translation: it just returns the original string.

Example 1: using a negotiated browser locale

As an illustration, let's use a negotiated browser locale in the previous example:

from nagare import presentation, component, wsgi, i18n

...

class I18nExampleApp(wsgi.WSGIApp):
    AVAILABLE_LOCALES = (
        ('en',),  # English
        ('fr',),  # French
    )

    def start_request(self, root, request, response):
        # set the default locale to a browser negotiated locale
        self.set_default_locale(i18n.NegotiatedLocale(request, self.AVAILABLE_LOCALES, default_locale=self.AVAILABLE_LOCALES[0]))

        # let nagare install the default locale
        super(I18nExampleApp, self).start_request(root, request, response)


app = I18nExampleApp(lambda: component.Component(I18nExample()))

We first create an application class that inherits from wsgi.WSGIApp so that we can override the default start_request behavior in order to install a negotiated browser locale.

Since we must pass the list of the available locales to the i18n.NegotiatedLocale constructor, we define an AVAILABLE_LOCALES class constant. The available locales should be given as a list of (language, territory) tuples. The territory has been omitted here since we don't use territory variants of the languages yet.

Then, we set the default locale in our new start_request method and let Nagare install the default locale by calling the base start_request method. Note that you should always call the base start_request method since it also installs the security manager and authenticates the user. The i18n.NegotiatedLocale constructor accepts the request as first parameter, the list of the available locales as second parameter and some optional parameters. It examines the Accept-Language headers of the request in order to determine the best locale to use for the current request according to the available locales. The default_locale optional parameter can be used to define the default (language, territory) value when the negotiation fails.

Finally, we use an I18nExampleApp instance as our application instead of the default wsgi.WSGIApp (implied when you provide a component instance as app instead of an application instance).
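
For readers unfamiliar with the Accept-Language header, here is a simplified sketch of what such a negotiation involves: the header lists language tags with optional quality values, and the best supported one is selected. This is not the i18n.NegotiatedLocale implementation. The parse_accept_language and negotiate functions are hypothetical, and territories are ignored when matching.

```python
def parse_accept_language(header):
    """Return the language tags of the header, sorted by decreasing quality."""
    weighted = []
    for position, item in enumerate(header.split(',')):
        parts = item.strip().split(';')
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _sep, value = param.strip().partition('=')
            if name == 'q':
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # The position keeps the header order for equal qualities
            weighted.append((-quality, position, tag))
    return [tag for _quality, _position, tag in sorted(weighted)]


def negotiate(header, available, default):
    """Return the first available (language,) tuple accepted by the browser."""
    for tag in parse_accept_language(header):
        language = tag.split('-')[0]
        for locale in available:
            if locale[0] == language:
                return locale
    return default


AVAILABLE_LOCALES = (('en',), ('fr',))

print(negotiate('fr-FR,fr;q=0.9,en;q=0.8', AVAILABLE_LOCALES, AVAILABLE_LOCALES[0]))
print(negotiate('de,es;q=0.5', AVAILABLE_LOCALES, AVAILABLE_LOCALES[0]))
```

The second call falls back to the default because neither German nor Spanish is available, which corresponds to the role of the default_locale parameter.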

Example 2: using a per-session locale setting

Let's show another example: we will allow the user to change the language setting using links and store the setting in the session in order to persist the change for the next requests. Here is the code:

from nagare import presentation, component, wsgi, i18n
from nagare.i18n import _, _L


class I18nExample(object):
    # the following line is evaluated when the module is loaded, so _ would not work here
    TITLE = _L("Internationalization Example")

    AVAILABLE_LOCALES = (
        (i18n.Locale('en'), u'English'),  # English
        (i18n.Locale('fr'), u'Français'),  # French
    )

    def __init__(self, locale=AVAILABLE_LOCALES[0][0]):
        self.set_locale(locale)

    def set_locale(self, locale):
        # remember the locale to use for the next requests in the session (see I18nExampleApp)
        self.locale = locale

        # we should also initialize the locale dirname since it may not have been set yet
        # we use the current locale dirname, which is set by the WSGIApp.set_locale method
        dirname = i18n.get_locale().dirname
        locale.dirname = dirname

        # change the locale for the current request so that the rendering is performed with the new locale
        i18n.set_locale(self.locale)


@presentation.render_for(I18nExample)
def render(self, h, comp, *args):
    with h.h1:
        h << self.TITLE
    h << _('This is a translated string') << h.br
    h << 'This string is not translated'

    h << h.hr

    # language change
    for locale, label in self.AVAILABLE_LOCALES:
        h << h.a(label, title=label).action(lambda locale=locale: self.set_locale(locale))
        h << " "

    return h.root


class I18nExampleApp(wsgi.WSGIApp):
    def start_request(self, root, request, response):
        # call base implementation
        super(I18nExampleApp, self).start_request(root, request, response)

        if root():
            # reinstall the locale stored in the session
            locale = root().locale
            self.set_locale(locale)


app = I18nExampleApp(lambda: component.Component(I18nExample()))

We moved the AVAILABLE_LOCALES list to the I18nExample component since we'll show a list of language change links in its view. We also added a label to each entry and we use i18n.Locale instances instead of the (language, territory) tuples.

In the associated view, we iterate over the AVAILABLE_LOCALES list in order to display a link for each locale. The link actions call the set_locale method, passing the associated i18n.Locale instance to it.
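
The locale=locale default argument of the lambda in the view is what binds each link to its own locale. Without it, every action would look up the loop variable when it is called, that is after the loop has finished. The following standalone sketch shows the difference; the set_locale function and the chosen list are hypothetical stand-ins for the component method.

```python
chosen = []


def set_locale(locale):
    chosen.append(locale)


late_actions = []
bound_actions = []
for locale in ('en', 'fr'):
    # Looks up `locale` when called: sees the last value of the loop
    late_actions.append(lambda: set_locale(locale))
    # Captures the current value of `locale` as a default argument
    bound_actions.append(lambda locale=locale: set_locale(locale))

late_actions[0]()   # records 'fr', not 'en'
bound_actions[0]()  # records 'en' as intended

print(chosen)
```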

The I18nExample.set_locale method performs two operations. First, it remembers the locale setting as an attribute of the component. Since the component will be pickled into the session after the request, the locale setting is effectively stored in the session and will be back in the next request when the session is unpickled. The locale setting is then used in the I18nExampleApp.start_request method for reinstalling the locale for subsequent requests.

Second, it changes the locale in the current request (where the language change action is received) so that the rendering of the component is immediately affected by the change. However, we can't call I18nExampleApp.set_locale here, because we don't have access to the application instance from the component. Instead, we have to install the locale ourselves by calling i18n.set_locale directly. The dirname attribute of the locale (i.e. the base directory where the locale catalog files are to be found) may not have been initialized yet, because that is the responsibility of the I18nExampleApp.set_locale method. So, we set the dirname attribute of the new locale to the dirname of the currently installed locale, which should have been initialized by the I18nExampleApp.set_locale method at the very beginning of the request.

Finally, I18nExampleApp.start_request calls the set_locale method in order to reinstall the previously saved locale for the active request. It's important to call set_locale after calling the base implementation, because we want to override the default locale that the base implementation installs (remember that the base start_request also initializes the security context and authenticates the user, so it must still be called).

These two examples explain the basics of setting the locale in a Nagare application. Real-world implementations may store the language setting by other means (see the list at the beginning of this section) and use a combination of these two techniques.
