Right, so you’ve decided your Python application shouldn’t be a parochial little hermit that only speaks one language. Good for you. Welcome to the wonderful, occasionally maddening world of making your code play nice with the entire planet. We call this twin-headed beast “i18n” (internationalization: 18 letters between the ‘i’ and the ‘n’) and “l10n” (localization: 10 letters, you get it). i18n is the plumbing: the hooks and architecture that make multiple languages possible. l10n is the actual translation and cultural adaptation. You can’t have the second without the first.

We’re going to use Babel, which is the de facto toolkit for this job in Python. It’s not the only way, but it’s the most comprehensive. It handles message extraction, formatting dates, numbers, and currencies, and generally keeps you from pulling your hair out.

First, add it to your project. You’re using a virtual environment, right? Of course you are.

pip install Babel

The Absolute Core: Translation Strings and The gettext Underpinnings

Python’s i18n/l10n story is built on the decades-old gettext standard, and Babel is its best friend. The core concept is simple: instead of writing strings directly in your code, you wrap them in a function call that looks them up in a translation catalog at runtime.

The magic function is gettext(), but by convention, we alias it to _() because it’s used everywhere. Let’s set that up. You’ll typically configure this at your application’s entry point.

import gettext

# pybabel names its catalogs "messages" unless told otherwise, so the gettext
# domain has to match that file name.
DOMAIN = "messages"

# Tell gettext to look for compiled catalogs (.mo) at
# './locale/<lang>/LC_MESSAGES/messages.mo'
gettext.bindtextdomain(DOMAIN, './locale')
gettext.textdomain(DOMAIN)
_ = gettext.gettext

Now, in your code, you write:

# Bad, parochial hermit code:
print("Welcome to my app!")

# Good, worldly citizen code:
print(_("Welcome to my app!"))

That _("...") string is now a message ID. It’s the key that will be used to look up the translated string. Which brings us to the next step: you can’t just hope translations magically appear. You have to extract them.
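
One detail worth seeing before moving on: when no catalog is found, the message ID itself is what comes back, so untranslated strings degrade to the original text rather than crashing. A minimal sketch using only the standard library; the directory name ./no-such-locale-dir is a placeholder that is assumed not to exist.

```python
import gettext

# Ask for German catalogs in a directory that is assumed not to exist.
# fallback=True makes translation() return a NullTranslations object
# instead of raising FileNotFoundError.
translations = gettext.translation(
    "messages",
    localedir="./no-such-locale-dir",
    languages=["de"],
    fallback=True,
)
_ = translations.gettext

greeting = _("Welcome to my app!")
print(greeting)  # the message ID comes back unchanged
```

This is also why the message ID should be a complete, readable sentence: it doubles as the text users see whenever a translation is missing.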

Extracting Messages: The Pottery Wheel

You need to collect all those _("...") strings from your codebase into a Portable Object Template (.pot) file. This is Babel’s job. You create a babel.cfg file to tell it where to look. It’s usually as simple as:

# babel.cfg
[python: **.py]
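
If "extraction" sounds abstract, here is a rough sketch of the idea using Python's ast module. It is not how Babel implements it, and the helper name find_message_ids is made up for illustration; it only finds plain _("...") calls with a literal string as the first argument.

```python
import ast


def find_message_ids(source, func_name="_"):
    """Return (line number, message ID) pairs for calls like _("...")."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)


# A tiny stand-in for one of your source files.
sample = '''
print(_("Welcome to my app!"))
name = "not a message"
print(_("Goodbye!"))
'''

message_ids = find_message_ids(sample)
for lineno, msgid in message_ids:
    print(f"line {lineno}: {msgid}")
```

Note what gets skipped: the plain string assigned to name is never wrapped in _(), so it never reaches a translator. The real tool does the same job across your whole tree and writes the results out in .pot format.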

Then, you run Babel’s extraction command:

pybabel extract -F babel.cfg -o messages.pot .

This scours your current directory (.) for all the message IDs and creates messages.pot. Open it. It’s a text file full of your strings, ready to be given to a translator. Now, for each language you support, say German (de), you initialize a catalog:

pybabel init -i messages.pot -d ./locale -l de

This creates ./locale/de/LC_MESSAGES/messages.po. The .po file is where the translations actually live. Your translator (or you, with a shaky grasp of Google Translate) edits this file, filling in the msgstr for each msgid.

#: main.py:42
msgid "Welcome to my app!"
msgstr "Willkommen in meiner App!"

Finally, you compile the human-readable .po files into the machine-optimized .mo files that gettext actually uses at runtime:

pybabel compile -d ./locale

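Boom. Now when your application runs with the German locale, it will find and use the compiled ./locale/de/LC_MESSAGES/messages.mo file. With the module-level setup shown earlier, "the German locale" means whatever gettext reads from the LANGUAGE, LC_ALL, LC_MESSAGES, and LANG environment variables, checked in that order.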
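
To see where gettext actually looks, you can ask it directly with gettext.find(). The sketch below builds a throwaway locale directory with an empty placeholder file (find() only checks that the file exists, it does not parse it), so it runs without any real translations. The temporary directory and the placeholder are assumptions made for the demo.

```python
import gettext
import os
import tempfile

with tempfile.TemporaryDirectory() as localedir:
    # Create an empty placeholder where the compiled German catalog would live.
    catalog_dir = os.path.join(localedir, "de", "LC_MESSAGES")
    os.makedirs(catalog_dir)
    open(os.path.join(catalog_dir, "messages.mo"), "wb").close()

    # 'de_DE' has no directory of its own, so the lookup falls back to 'de'.
    german = gettext.find("messages", localedir=localedir, languages=["de_DE"])
    # There is no French catalog at all, so find() returns None.
    french = gettext.find("messages", localedir=localedir, languages=["fr"])

    print(os.path.relpath(german, localedir))
    print(french)
```

The fallback from de_DE to de is why a single generic German catalog is usually enough to start with; regional variants can be added later as their own directories.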

The Gotchas: Pluralization is a Nightmare

Here’s where the simple model breaks down spectacularly: plurals. In English, you have two forms: singular and plural (“1 file” vs. “2 files”). Easy. In Polish? They have three: one for 1, one for numbers ending in 2,3,4 (but not 12,13,14… because of course), and a plural for everything else. Arabic? Six. Six different plural forms.

This is why gettext has a plural-aware function, ngettext(). You must use it. The syntax is: ngettext('singular message', 'plural message', number)

The extraction and .po file process handles this, creating special pluralized entries. Your code only has to call ngettext() with the count; choosing the right form for the language is gettext’s job.

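from gettext import ngettext  # uses the same domain setup as _()

file_count = 5
# WRONG. Don't do this: translators get sentence fragments and no plural forms.
print(_("You have ") + str(file_count) + _(" files"))

# RIGHT. Let gettext handle the plural logic.
print(ngettext("You have {count} file", "You have {count} files", file_count).format(count=file_count))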

The corresponding .po entry will have a msgid, a msgid_plural, and one indexed msgstr line per plural form (msgstr[0], msgstr[1], and so on). This is non-negotiable for professional applications.
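
To make the Polish case concrete, here is the selection rule written out as plain Python. In real use the rule lives in the catalog's Plural-Forms header and ngettext() evaluates it for you; the function name and the FORMS list below are illustrative only.

```python
def polish_plural_index(n):
    """Pick the plural form index for Polish, mirroring its usual Plural-Forms rule."""
    if n == 1:
        return 0  # singular
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # ends in 2, 3, or 4, except 12, 13, 14
    return 2  # everything else


# Illustrative translations, in the order msgstr[0], msgstr[1], msgstr[2].
FORMS = ["{count} plik", "{count} pliki", "{count} plików"]


def files_label(count):
    return FORMS[polish_plural_index(count)].format(count=count)


for count in (1, 2, 5, 12, 22):
    print(files_label(count))
```

Notice that 12 and 22 land in different forms even though both end in 2. No amount of "add an s if count != 1" logic in your own code survives contact with rules like this.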

Beyond Strings: Formatting the World’s Stuff

Localization isn’t just words; it’s formatting. A date written as 04/05/2023 is ambiguous. Is it April 5th or May 4th? It depends on where you are. Babel provides a set of locale-aware formatting functions.

from babel.dates import format_date
from babel.numbers import format_currency, format_decimal
from datetime import date

day = date(2023, 4, 5)  # fixed date so the output below is reproducible
number = 12345.678

print(format_date(day, locale='en_US'))  # -> Apr 5, 2023
print(format_date(day, locale='de_DE'))  # -> 05.04.2023

print(format_currency(number, 'USD', locale='en_US'))  # -> $12,345.68
print(format_currency(number, 'EUR', locale='fr_FR'))  # -> 12 345,68 €

print(format_decimal(number, locale='en_US'))  # -> 12,345.678
print(format_decimal(number, locale='de_DE'))  # -> 12.345,678

This is the real power. You pass in a value and a locale, and it handles the formatting conventions automatically. It’s a crime not to use it for any user-facing data.

The Lazy Gettext Problem for Web Apps

In a web application, the locale is often determined per request, not at the time your code is imported. If you bind _ to the default gettext at import time, it’s fixed for the entire process. You need a way to get a translation function for the current request.

The solution is to build a translations object for each request’s locale and take your _ from that. gettext.translation() caches the catalogs it has already parsed, so calling it on every request is cheap.

from gettext import translation

LOCALE_DIR = './locale'

def get_translations(user_locale):
    # fallback=True returns the untranslated message IDs when no catalog
    # exists for this locale, instead of raising FileNotFoundError.
    return translation('messages', localedir=LOCALE_DIR,
                       languages=[user_locale], fallback=True)

# In your request handler (e.g., in Flask)
def handle_request():
    user_locale = get_locale_from_request()  # your own helper, e.g. returns 'de_DE'
    _ = get_translations(user_locale).gettext

    # Now use _() as normal within this request. It will be in the user's language.
    print(_("Welcome!"))

This pattern is so common that the usual web stacks handle it for you: Flask has the Flask-Babel extension, and Django ships its own translation machinery. Both give you translation functions for your templates and views that are automatically bound to the current request’s locale. Use them. They save you from reinventing a very tricky wheel.

The path to full i18n is long and littered with edge cases (right-to-left text, anyone?), but this is the foundation. Get the gettext wrapping right, respect pluralization, and use locale-aware formatting. Do that, and you’re already ahead of most applications out there.