[glotpress-updates] [GlotPress] #342: Cleanup to locales.php
GlotPress
noreply at wordpress.org
Tue Jul 8 22:37:48 UTC 2014
#342: Cleanup to locales.php
---------------------------------+-----------------
 Reporter:  stuwest              |      Owner:
     Type:  enhancement          |     Status:  new
 Priority:  normal               |  Milestone:
Component:  locale information   |    Version:
Resolution:                      |   Keywords:
---------------------------------+-----------------
Comment (by stuwest):
Thanks for feedback everyone. Replies:
> * We need all of the ISO codes as they are the best reference we have
for matching an Accept-Language browser header to a corresponding WP
translation. (I've been working on this problem as recently as this
morning.)
Interesting. OK. Do we really need all the duplicate codes for that, or
would the two-letter and the three-letter codes be enough? Also, it bugs me
that there are some errors in those (I just noticed one in mi but didn't look
at all of them), so if we're going to leave them perhaps we should clean
those up.
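
To make the matching question concrete, here is a minimal sketch in Python (not GlotPress code) of matching an Accept-Language header against a locale table that carries only a two-letter and a three-letter code per locale. The table entries, the field names `iso2` and `iso3`, and both function names are hypothetical, and the q-value parsing is deliberately simplified.

```python
# Hypothetical locale table: slug -> ISO 639 codes. Not the real locales.php data.
LOCALES = {
    "pt": {"iso2": "pt", "iso3": "por"},
    "mi": {"iso2": "mi", "iso3": "mri"},
    "nl-be": {"iso2": "nl", "iso3": "nld"},
}


def parse_accept_language(header):
    """Return lowercased language tags from the header, highest q-value first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position in the header.
        entries.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(entries)]


def match_locale(header, locales=LOCALES):
    """Return the first locale slug matching a requested tag, or None."""
    for tag in parse_accept_language(header):
        # Exact slug match first (e.g. "nl-be"), then the primary language subtag.
        if tag in locales:
            return tag
        primary = tag.split("-")[0]
        for slug, codes in locales.items():
            if primary in (codes["iso2"], codes["iso3"]):
                return slug
    return None
```

In this sketch a header such as `nl-BE,nl;q=0.8` resolves through the slug alone, and the ISO codes only matter when the browser sends a bare language tag.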
> As to the name of a language, both in English and native, that really
should be a matter for each translation team to decide. To give you an
idea of the kinds of issues we're facing (just examples, not exhaustive);
the pt_PT community, caught in the middle of a very polemic and artificial
spelling reform, refuses to adhere to it (like most of the country) and
still capitalizes the language name (i.e. "Português" and not
"português"). zh-cn is a whole other discussion, as "Chinese" is really a
macro-language and not a specific variant.
> * As Zé has also pointed out, a lot of the native and English names
have history, as in were requested/determined by translators. I'm fine
with trying to move to more accepted representations, particularly in
cases where there was no deliberate decision to deviate from that.
Yeah it's tough to know whether the mishmash of inconsistent naming was a)
carefully thought out following in-depth review of each name, or b) the
result of on-again, off-again focus by volunteers. :)
On Chinese zh, what caught my eye is that zh-cn is fully translated while
zh isn't even in GlotPress (if
http://translate.wordpress.org/projects/wp/dev/zh/default is where I
should look). So yes, it's a macro-language, but not one that we use
consistently, so including it in locales.php seems a distraction.
> * Fallback is a loaded term. CLDR's approach is highly complicated and
it looks like it is simplified significantly here. See
http://www.unicode.org/reports/tr35/#Locale_Inheritance,
http://www.unicode.org/reports/tr35/#LanguageMatching, etc. We also have a
need to introduce variants, such as sr_Latn and concepts like an
"informal" German translation. See also
http://www.unicode.org/reports/tr35/#Likely_Subtags.
Yeah, CLDR's approach IMHO makes sense when there's a detailed, clean
structure for locale codes so you can follow their hierarchical model. But
we've got a ton of locales that aren't even in CLDR, so a simple
one-dimensional fallback seemed a better fit. (I've played with MediaWiki's
similar implementation; see
https://www.mediawiki.org/wiki/Manual:Language#mediaviewer/File:MediaWiki_fallback_chains.svg
for a pretty chart.)
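
As an illustration of what a simple one-dimensional fallback can look like, here is a minimal Python sketch in which each locale names at most one parent. The map contents (including the `de-informal` slug), the `en` default, and the function name are assumptions for the example; it shows neither the patch nor CLDR's inheritance rules.

```python
# Hypothetical one-dimensional fallback map: each locale names at most one parent.
FALLBACKS = {
    "nl-be": "nl",
    "sr-latn": "sr",
    "de-informal": "de",  # made-up slug for an "informal" German variant
}


def fallback_chain(slug, fallbacks=FALLBACKS, default="en"):
    """Return the locales to try in order, ending with the default."""
    chain = []
    current = slug
    # Stop when there is no parent, or when a parent repeats (a cycle in the map).
    while current is not None and current not in chain:
        chain.append(current)
        current = fallbacks.get(current)
    if default not in chain:
        chain.append(default)
    return chain
```

With this map, `fallback_chain("nl-be")` tries `nl-be`, then `nl`, then `en`. A locale that is missing from the map falls straight through to the default, which is the property that matters for locales CLDR does not cover.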
> This patch also has syntax errors. $nl-be is not a valid name for a
variable.
> * Aside from the variable syntax issues johnbillion points out, the
local variables ("object names") *are* actually used outside the class.
See
https://glotpress.trac.wordpress.org/browser/trunk/locales/locales.php?rev=931#L1260.
I didn't code this, talk to Nikolay. :-)
Ah, OK. I should have caught the dash in the object name; I was too excited
about consistency with the slug. On being used outside the class, *cough*
Yoav *cough*.
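
For context on the dash: `-` is parsed as the subtraction operator, so a slug like nl-be cannot be used directly as a variable name. A generator script could derive a safe name from the slug instead. The following Python sketch shows one way to do that; the function name and the replacement rule are assumptions, not what the actual generator script does.

```python
import re


def slug_to_identifier(slug):
    """Map a locale slug to a name that is safe to use as an identifier."""
    # Replace anything that is not a letter, digit or underscore.
    name = re.sub(r"[^0-9A-Za-z_]", "_", slug)
    # Identifiers cannot start with a digit, so prefix those with "_".
    if name and name[0].isdigit():
        name = "_" + name
    return name
```

Under this rule the slug stays `nl-be` while the variable becomes `nl_be`.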
> In general, smaller changes will definitely be easier to review. As in,
tackling all ISO changes in one go, all fallbacks in another pass, all
reordering at once, etc.
>
> Can I ask if there was some kind of impetus for this?
On impetus, it started a month ago when I wanted to make a small change to
allow CLDR country names in some stats reports. A month later, I have a
monster patch that fixes about a dozen different inconsistencies that
caught my eye. You know how it goes. :)
Seriously though, it feels like CLDR could be helpful, esp. with 4.0, and
mostly it's been a great chance for me to explore the code and try to get
caught up on i18n stuff since 4-5 years ago, when I paid a lot of attention
to it for MediaWiki. I want to help.
> Which furthermore illustrates another issue: people in Belgium would
probably rather refer to nl_BE (dutch-as-spoken-in-Belgium) as vls
(Vlaams). Welcome to the languages can of worms :D
Interesting. One of the theoretical benefits of CLDR is that it's
relatively standardized on typical usage. So there still might be
differences of opinion, but for the most part that's up to Unicode to sort
out. Do you buy that argument?
> I'm not sure why $locale->rtl was changed from true to '1'. This should
remain as a boolean.
That's a bug. It's on my list of things to fix in the script that generates
this, but I told myself I'd do it manually and then forgot.
> Some country codes got changed to be uppercase, but a lot stayed the
same. Probably best to keep it as is for compatibility reasons.
I'd thought it was something of a standard that a) language codes are
lowercase and country codes are uppercase, and b) just in case, you should
always use case-insensitive comparisons. Should I not think that?
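
For reference, BCP 47 treats language tags as case-insensitive and only recommends a casing convention (lowercase language, uppercase region, title-case script). Here is a minimal Python sketch of normalizing to that convention before comparing. The function names and the underscore output separator are assumptions for the example, not existing GlotPress helpers.

```python
def normalize_locale(code):
    """Lowercase the language subtag; uppercase regions; title-case scripts."""
    parts = code.replace("_", "-").split("-")
    normalized = [parts[0].lower()]
    for part in parts[1:]:
        if len(part) == 2 and part.isalpha():
            normalized.append(part.upper())   # region, e.g. "BE"
        elif len(part) == 4 and part.isalpha():
            normalized.append(part.title())   # script, e.g. "Latn"
        else:
            normalized.append(part.lower())   # anything else, e.g. a variant
    return "_".join(normalized)


def same_locale(a, b):
    """Compare two locale codes, ignoring case and the separator used."""
    return normalize_locale(a).casefold() == normalize_locale(b).casefold()
```

With this approach, `nl_be`, `NL-BE` and `nl_BE` all compare equal, so changing the stored casing does not break lookups that go through the comparison helper.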
--
Ticket URL: <https://glotpress.trac.wordpress.org/ticket/342#comment:8>
GlotPress <https://glotpress.trac.wordpress.org>
Easy comin', easy goin'