89.3 locale: Number, Currency, and Date Formatting
Right, let’s talk about making your app stop being so… American. Or British. Or whatever your default is. You’ve probably hard-coded a comma here, a dollar sign there, and called it a day. That works until your first user from Germany sees 1,099 for a price and reads it as one point zero nine nine, not one thousand and ninety-nine. Whoops. That’s where locale comes in: it’s your app’s cultural and linguistic settings, and it’s the single most important tool for not accidentally insulting your users’ number formats.
We’re going to use Python’s locale module. It’s a bit of a cantankerous old beast, but it gets the job done. First, a crucial warning: the locale module is a global state monster. Calling setlocale() changes the behavior for your entire process, every thread included. It’s the kind of design that makes software architects weep into their abstract interfaces, but here we are. We’ll deal with it.
The Golden Rule: Set Your Locale Early (And Hope)
Before you do anything, you must set a locale. The default is usually the C locale, which is about as useful for localization as a typewriter. You don’t want that. You typically want the user’s system default.
import locale

# Try to set to the user's default locale. This is what you want 90% of the time.
try:
    locale.setlocale(locale.LC_ALL, '')
except locale.Error as e:
    print(f"Wow, your system's locale is a mess. Falling back to 'en_US.UTF-8'. Error: {e}")
    locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
Why the try-except? Because locale names are wildly non-portable. On your Mac, '' might resolve to en_US.UTF-8. On your friend’s Linux machine, it might be en_US.utf8. On Windows, it’s a whole other circus of strings like English_United States.1252. The locale module will throw a fit (locale.Error: unsupported locale setting) if the name you pass isn’t installed or isn’t spelled the way the system expects. Note that the fallback name can fail for exactly the same reason. It’s the first of many absurdities you’ll encounter.
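One way to cope with the naming mess is to try several spellings in order and keep the first one the system accepts. The sketch below is only that: `set_first_available` is a hypothetical helper name, and the candidate list is illustrative rather than exhaustive.

```python
import locale

def set_first_available(candidates, category=locale.LC_ALL):
    """Try each locale name in order and return the first one the system accepts.

    Hypothetical helper for this example; it is not part of the locale module.
    """
    for name in candidates:
        try:
            locale.setlocale(category, name)
            return name
        except locale.Error:
            continue  # this spelling is not available here, try the next one
    # Nothing matched: the "C" locale is always available.
    locale.setlocale(category, "C")
    return "C"

# A few common spellings of "German (Germany)" on Linux, macOS, and Windows.
chosen = set_first_available(
    ["de_DE.UTF-8", "de_DE.utf8", "German_Germany.1252", "de_DE"]
)
print(f"Using locale: {chosen}")
```

Returning the accepted name makes it easy to log which spelling actually worked on a given machine.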
Formatting Numbers Like a Human
Now for the fun part. Never use str(float(...)) again.
# How you *think* numbers work (a bad idea)
number = 1234567.89
print(f"{number:,.2f}") # Output: 1,234,567.89
# How the rest of the world might see it
locale.setlocale(locale.LC_ALL, 'de_DE.UTF-8')
formatted = locale.format_string("%.2f", number, grouping=True)
print(formatted) # Output: 1.234.567,89
# Let's break down that format string:
# "%.2f" - The good old-fashioned float format. The locale module hijacks it.
# grouping=True - This is the magic. It adds the thousands separators.
See? In Germany, the comma is the decimal separator and the period is the thousands separator. The locale module handles this swap for you automatically. You just tell it you want grouping, and it uses the correct symbols.
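If you’re curious where those symbols come from, locale.localeconv() exposes the active locale’s conventions as a dictionary. Here is a small sketch that uses the always-available C locale so it runs anywhere; `describe_separators` is a name made up for this example. Swap in a locale such as de_DE.UTF-8 (if it’s installed) to see the values change.

```python
import locale

def describe_separators():
    """Return (decimal point, thousands separator) for the active numeric locale."""
    conv = locale.localeconv()
    return conv["decimal_point"], conv["thousands_sep"]

locale.setlocale(locale.LC_ALL, "C")
print(describe_separators())  # ('.', '') in the C locale: no grouping at all

# The "n" format type is the locale-aware presentation type for format() and f-strings.
print(format(1234567, "n"))  # 1234567 in the C locale
```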
Currency: Where It Gets Political
Formatting currency isn’t just about the symbol; it’s about where the symbol goes. Is it €10 or 10€? And what’s the spacing? Don’t you dare guess.
# Let's stay in Germany for a bit
amount = 1234.56
# This is the big gun: locale.currency()
formatted_currency = locale.currency(amount, grouping=True)
print(formatted_currency) # Output: 1.234,56 €
# Notice the symbol is at the end, with a space. Perfect.
But wait, what if you’re in the US?
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
formatted_currency = locale.currency(amount, grouping=True)
print(formatted_currency) # Output: $1,234.56
Symbol at the front, no space. The currency() function handles all these conventions by reading them from your operating system’s locale database (on Linux, the glibc locale definitions), which someone else maintains so you don’t have to. The catch: the output is only as good as the locales actually installed on the machine.
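One wrinkle worth knowing: in the plain C locale there are no currency conventions at all, and locale.currency() raises ValueError instead of guessing. A defensive wrapper might look like the sketch below. `safe_currency` is a hypothetical name, and falling back to a bare two-decimal number is just one possible policy.

```python
import locale

def safe_currency(amount, international=False):
    """Format `amount` with locale.currency(), or fall back to a plain number.

    Hypothetical wrapper for illustration; the fallback format is an assumption.
    """
    conv = locale.localeconv()
    digits_key = "int_frac_digits" if international else "frac_digits"
    if conv[digits_key] == locale.CHAR_MAX:
        # CHAR_MAX means "not specified": the locale has no currency rules.
        return f"{amount:.2f}"
    try:
        # international=True asks for the ISO 4217 code instead of the symbol.
        return locale.currency(amount, grouping=True, international=international)
    except ValueError:
        return f"{amount:.2f}"

locale.setlocale(locale.LC_ALL, "C")
print(safe_currency(1234.56))  # 1234.56 (fallback path, since "C" has no currency)
```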
Dates: The Ultimate Formatting Nightmare
If you think numbers are bad, dates are a minefield. 04/05/2023: is that April 5th or May 4th? The answer is “yes,” depending on your location. Never, ever format dates manually.
We’ll bring in datetime for this, as locale influences it too.
from datetime import datetime
now = datetime.now()
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
print(now.strftime('%c')) # Output, e.g.: Tue May  2 14:30:22 2023 (exact layout depends on the platform)
locale.setlocale(locale.LC_ALL, 'it_IT.UTF-8')
print(now.strftime('%c')) # Output, e.g.: mar 02 mag 14:30:22 2023 (Italian, with abbreviated day and month names)
# The '%c' format code means "the preferred date and time representation for the current locale".
The key is to use format codes like %x (date only), %X (time only), and %c (both) that are defined by the locale. The alternative is manually specifying every element, which defeats the whole purpose.
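To see the three codes side by side, here is a minimal sketch pinned to the C locale and a fixed timestamp so the result is reproducible. `locale_date_parts` is a hypothetical helper name. Under a real user locale the same calls produce that locale’s own layouts.

```python
import locale
from datetime import datetime

def locale_date_parts(moment):
    """Return the locale's preferred date, time, and combined representations."""
    return {
        "date": moment.strftime("%x"),
        "time": moment.strftime("%X"),
        "both": moment.strftime("%c"),
    }

locale.setlocale(locale.LC_TIME, "C")
fixed = datetime(2023, 5, 2, 14, 30, 22)  # fixed moment so the output is stable
for label, text in locale_date_parts(fixed).items():
    print(f"{label}: {text}")
# In the C locale, "date" is 05/02/23 and "time" is 14:30:22.
# The combined "%c" layout differs between operating systems.
```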
The Critical Pitfall: Threads and Temporary Changes
Remember how I said the locale is global state? This is a disaster waiting to happen in any multi-threaded application. Thread A sets the locale to German, then Thread B gets scheduled and expects English but gets German numbers. Chaos ensues.
The best practice is to set the locale once at startup and pray. If you absolutely must work with multiple locales, you have two less-bad options:
- Use the decimal or babel libraries. The decimal module gives you exact arithmetic for money (though its locale-aware formatting still reads the global locale), and babel is a much more modern and powerful i18n library that doesn’t rely on global state.
- Do it the ugly way: save the current locale, change it, do your formatting, and immediately restore it. It’s a performance hit and prone to errors if an exception occurs, but it works.
def format_currency_for_locale(amount, loc='de_DE.UTF-8'):
    # setlocale() with no second argument only queries the current setting.
    # Unlike getlocale(), it covers every category, so it can be restored as-is.
    original_locale = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, loc)
        return locale.currency(amount, grouping=True)
    finally:
        locale.setlocale(locale.LC_ALL, original_locale)  # Restore it no matter what

print(format_currency_for_locale(1234.56))
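The same save-and-restore dance can be wrapped in a context manager so you can’t forget the restore step. This is a sketch, not a cure: `temporary_locale` and `_locale_lock` are names invented here, and the lock only serializes code that goes through this helper. Any other thread formatting numbers at the same moment will still see the temporary locale.

```python
import locale
import threading
from contextlib import contextmanager

# One lock shared by every locale switch made through this helper.
_locale_lock = threading.Lock()

@contextmanager
def temporary_locale(name, category=locale.LC_ALL):
    """Switch to `name` inside a with-block, then restore the previous setting."""
    with _locale_lock:
        previous = locale.setlocale(category)  # no second argument: query only
        try:
            locale.setlocale(category, name)
            yield
        finally:
            locale.setlocale(category, previous)  # runs even if the body raises

with temporary_locale("C"):
    print(locale.format_string("%.2f", 1234.56, grouping=True))  # 1234.56
```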
It’s clunky, but it keeps the global state pollution contained. Honestly, this is one of those areas where the standard library solution shows its age. For any serious application, after you’ve learned the basics here, I strongly recommend looking at babel. It’s the grown-up solution that doesn’t make you fight with global state. But understanding the built-in locale module is a rite of passage. Now go forth and format correctly.
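For a taste of what that looks like, here is a minimal sketch assuming the third-party Babel package is installed (pip install babel). The locale is passed as an argument to each call, so nothing global changes. `babel_demo` is a hypothetical wrapper, and the exact output depends on the locale data bundled with your Babel version.

```python
from datetime import date

try:
    from babel.dates import format_date
    from babel.numbers import format_currency, format_decimal
except ImportError:  # Babel is third-party and may not be installed
    format_currency = format_decimal = format_date = None

def babel_demo(locale_id="de_DE"):
    """Format a number, a price, and a date for `locale_id` with no global state."""
    if format_currency is None:
        return None  # Babel is not available in this environment
    return {
        "number": format_decimal(1234567.89, locale=locale_id),
        "price": format_currency(1234.56, "EUR", locale=locale_id),
        "date": format_date(date(2023, 5, 2), format="long", locale=locale_id),
    }

print(babel_demo("de_DE"))
print(babel_demo("en_US"))
```

Because each call names its own locale, two threads can format for different users at the same time without stepping on each other.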