Fix Arabic plural rule to match CLDR (#816) #1289

Open
apoorvdarshan wants to merge 1 commit into python-babel:master from apoorvdarshan:fix-arabic-plural-816
Fixes #816.

Problem

babel/messages/plurals.py keeps a hand-maintained table of gettext Plural-Forms rules. Its Arabic (ar) rule has the "many" (form 4) and "other" (form 5) categories swapped relative to Unicode CLDR:

'ar': (6, '(n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 : n%100>=0 && n%100<=2 ? 4 : 5)'),

CLDR assigns many to n % 100 in 11..99 and other to everything else, but the table assigns form 4 to n % 100 in 0..2 and lets 11..99 fall through to form 5. As a result, generated Arabic catalogs pick the wrong plural for any n whose n % 100 is 11–99 (e.g. 11, 50, 99, 111, 999), and likewise for any n ≥ 100 whose n % 100 is 0–2 (e.g. 100, 101, 102). This also disagrees with the CLDR-derived rule that Babel itself ships (Locale('ar').plural_form).
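
To make the mismatch concrete, here is a minimal sketch that evaluates the current table expression next to a hand-written restatement of the CLDR rule as described above. It relies on `gettext.c2py`, an undocumented standard-library helper that compiles a C plural expression. The `cldr_arabic_category` function and the category-to-index order in `CATEGORIES` are assumptions made for illustration, not code from Babel.

```python
from gettext import c2py  # undocumented stdlib helper; compiles a C plural expression

# The rule currently in the table, copied from the description above.
OLD_RULE = ('(n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 : '
            'n%100>=0 && n%100<=2 ? 4 : 5)')

# Assumed correspondence between CLDR categories and gettext forms 0..5.
CATEGORIES = ('zero', 'one', 'two', 'few', 'many', 'other')


def cldr_arabic_category(n: int) -> str:
    """Illustrative restatement of the CLDR Arabic rule for non-negative integers."""
    if n == 0:
        return 'zero'
    if n == 1:
        return 'one'
    if n == 2:
        return 'two'
    if 3 <= n % 100 <= 10:
        return 'few'
    if 11 <= n % 100 <= 99:
        return 'many'
    return 'other'


old_rule = c2py(OLD_RULE)

# Print what the table rule selects next to the CLDR category for each sample.
for n in (3, 11, 50, 99, 100, 102, 111):
    got = CATEGORIES[old_rule(n)]
    want = cldr_arabic_category(n)
    flag = '' if got == want else '  <- mismatch'
    print(f'n={n:>3}: table={got:<5} cldr={want:<5}{flag}')
```

Under these assumptions, only n=3 agrees in this sample. The values with n % 100 in 11..99 come out as other instead of many, and 100 and 102 come out as many instead of other.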

Fix

A one-line change aligns the rule with CLDR:

'ar': (6, '(n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 : n%100>=11 && n%100<=99 ? 4 : 5)'),

This matches the standard gettext Arabic Plural-Forms used by Weblate/Transifex.
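
As a quick illustration of what the corrected expression selects, the sketch below compiles it with the undocumented standard-library helper `gettext.c2py` and prints the `msgstr` index chosen for a few values. The names `NPLURALS`, `plural_forms_header`, and `select_form` are hypothetical and used only for this example. The header string follows the conventional gettext `nplurals=…; plural=…;` layout.

```python
from gettext import c2py  # undocumented stdlib helper; compiles a C plural expression

NPLURALS = 6
# The corrected rule, copied from the line above.
NEW_RULE = ('(n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 : '
            'n%100>=11 && n%100<=99 ? 4 : 5)')

# Conventional gettext Plural-Forms value built from the (nplurals, expression) pair.
plural_forms_header = f'nplurals={NPLURALS}; plural={NEW_RULE};'
print(plural_forms_header)

select_form = c2py(NEW_RULE)

# Values with n % 100 in 11..99 should land on form 4, the remainder on form 5.
for n in (11, 50, 99, 100, 101, 102, 111, 1000):
    print(f'n={n:>4} -> msgstr[{select_form(n)}]')
```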

Testing

  • Added test_get_plural_arabic, which asserts the corrected expression and cross-checks that the hand-maintained rule agrees with Babel's CLDR-derived Locale('ar').plural_form across a range of n (0, 1, 2, 3, 10, 11, 50, 99, 100, 101, 102, 111, 999, 1000). It fails on master and passes with this change.
  • The plural, number, catalog, and message test suites pass (6202 passed). The only failure in a full run is the pre-existing, system-timezone-dependent test_get_timezone_name_misc (issue #935, "test_get_timezone_name_misc fails when system TZ isn't UTC"), which fails identically on master here and is unrelated to this change.
  • ruff is clean on the changed files.
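
The cross-check idea behind the regression test can be sketched without Babel installed. This is not the test added in this PR: it swaps Babel's `Locale('ar').plural_form` for a hand-written `reference_category` function, and it assumes the CLDR categories map to forms 0..5 in the order shown. `find_mismatches` is a hypothetical helper name. The sample values are the ones listed above.

```python
from gettext import c2py  # undocumented stdlib helper; compiles a C plural expression

NEW_RULE = ('(n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 : '
            'n%100>=11 && n%100<=99 ? 4 : 5)')
SAMPLES = (0, 1, 2, 3, 10, 11, 50, 99, 100, 101, 102, 111, 999, 1000)

# Assumed correspondence between CLDR categories and gettext forms 0..5.
CATEGORIES = ('zero', 'one', 'two', 'few', 'many', 'other')


def reference_category(n: int) -> str:
    """Stand-in for the CLDR-derived rule (an assumption, not Babel's implementation)."""
    if n in (0, 1, 2):
        return CATEGORIES[n]
    if 3 <= n % 100 <= 10:
        return 'few'
    if 11 <= n % 100 <= 99:
        return 'many'
    return 'other'


def find_mismatches(expr: str, samples=SAMPLES) -> list:
    """Return the sample values where the gettext expression and the reference disagree."""
    rule = c2py(expr)
    return [n for n in samples if CATEGORIES[rule(n)] != reference_category(n)]


def test_arabic_rule_matches_reference():
    # The corrected expression should agree with the reference on every sample.
    assert find_mismatches(NEW_RULE) == []


test_arabic_rule_matches_reference()
```

Passing the old expression to `find_mismatches` instead gives a non-empty list, which is the sense in which such a check fails on master.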

Disclosure: this change was prepared with the assistance of an AI tool (Claude Code). I verified the rule against Babel's own CLDR data, added and ran the regression test, and ran the suites and linter. I take responsibility for the contribution and will respond to review feedback personally.

The hand-maintained rule in babel/messages/plurals.py had the "many"
(form 4) and "other" (form 5) categories swapped relative to CLDR, so
generated Arabic catalogs selected the wrong plural for any n whose
n%100 is in 11..99 (e.g. 11, 50, 99, 111). Align it with the
CLDR-derived rule that Babel already ships.
Development

Successfully merging this pull request may close these issues.

Inconsistent plural form for Arabic