[wp-polyglots] Locale Directory Structure

Morgan Doocy morgan at doocy.net
Mon Feb 28 07:03:42 GMT 2005


On Feb 27, 2005, at 10:25 PM, Ryan Boren wrote:
> On Mon, 2005-02-28 at 00:07 -0500, K Suominen wrote:
>> On Sun, 27 Feb 2005 22:34:06 -0600, Ryan Boren <ryan at boren.nu> wrote:
>>> In order to make automated packaging easier, let's define the locale
>>> directory structure more precisely.
>>
>> The fi_FI locale currently already has fully automated packaging
>> implemented.  It can also be easily extended to cover more locales, by
>> calling make(1) recursively, e.g. from an upper-level directory.
>
> That's good, although I hesitate to ask people to maintain Makefiles.
> One simple script could handle the packaging chores for all locales.
> Makefiles allow each locale to be flexible in how directories and files
> are organized, but that flexibility comes at a bit of a maintenance
> cost.

Nothing against Kim's makefiles, but I don't see them being necessary 
when (if?) we have site-wide automation.

>> The only definitive version of the GPL is the one in English -- it is
>> probably not a good idea to replace the English version in any
>> distribution.
>
> Locales that provide a translation should follow these guidelines:
>
> http://www.gnu.org/licenses/translations.html
>
>> I think at least a couple other translations in addition to Finnish
>> are providing a separate file with a translated GPL for assistance in
>> interpreting the definitive version.  That seems to be what other
>> projects have done as well.  (Maybe it is a GNU guideline, but I
>> haven't checked.)
>
> Sounds like a good policy.

I agree. The translations are "official unofficial" -- meaning they're 
contributions from the international community for informational 
purposes, but aren't legally binding. As long as the translations 
adhere to the guidelines, I think it'd be great to include these.

>> When I was automating the packaging, I thought it would have been much
>> easier to place all files in a single tree, reflecting their proper
>> location in the final distribution.  In other words:
>>
>> readme.html
>> wp-config-sample.php
>> wp-content/themes/default/...
>> wp-includes/languages/fr_FR.po  (not that the .po file needs to be distributed)
>> wp-includes/languages/fr_FR.mo
>>
>> What's the benefit of placing the themes or the message file elsewhere?
>
> There's no need to replicate ugly directory structure.
> Makefiles/scripts can put things where they need to be in the final
> packaging.  Let's use automation to hide the annoying details.

Agreed.
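
To make "automation hides the annoying details" concrete, here is a minimal sketch of the path mapping a packaging script might apply. It assumes a source layout with messages/, theme/, and dist/ directories (as in Ryan's summary further down) and the final locations from Kim's listing. The function name final_path and the choice of "default" as the theme directory are illustrative assumptions, not anything agreed on this list.

```python
from pathlib import PurePosixPath


def final_path(source_path: str) -> str:
    """Map a path in the (assumed) source layout to its place in the package."""
    parts = PurePosixPath(source_path).parts
    top, rest = parts[0], parts[1:]
    if top == "messages":
        # Message catalogs end up next to WordPress's language files.
        return str(PurePosixPath("wp-includes/languages", *rest))
    if top == "theme":
        # Assumption: the localized theme replaces the default theme.
        return str(PurePosixPath("wp-content/themes/default", *rest))
    if top == "dist":
        # Everything under dist/ goes to the root of the package.
        return str(PurePosixPath(*rest))
    raise ValueError("unknown top-level directory: " + top)
```

With a table like this living in one script, translators never have to reproduce the deep wp-includes/... tree by hand.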

>> For Finnish, though, I found it easiest to automate the conversion,
>> eliminating the need to commit different converted versions of the
>> same content into the repository.
>
> I prefer automated conversion too.  Use UTF-8 for source files; use
> iconv to generate other encodings.  Ideally, we would have one UTF-8 po
> for each locale with all other po and mo files being generated.
> Generated files need not be committed, although committing them is
> convenient for those who just want to pick up a ready-to-go mo from the
> repository without getting a full package.  Committing derived objects
> allows us to provide versioned downloads of the end-user shippables
> without having to mess with a separate staging area.  That convenience
> may not justify the repository clutter, however.

I think a staging area would be fabulous. Maybe I'm not seeing 
something, but I think it would be pretty easy to implement too. My 
ideal would be to have an automation script hook into svn commits and 
automatically regenerate fully localized (and iconv'd) packages of 
WordPress (as well as just the individual component files, if desired) 
in a staging area -- say http://wordpress.org/international/ -- where 
they're made available for download, in the most frequently used 
character encodings for that locale.
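
As a rough sketch of the conversion step, the following re-encodes one UTF-8 source into each encoding published for a locale, which is what a loop over iconv would do. The LOCALE_ENCODINGS table and the function name reencode are hypothetical; the real list of encodings per locale would have to come from the translators.

```python
# Hypothetical table: which encodings to publish for each locale.
LOCALE_ENCODINGS = {
    "fi_FI": ["UTF-8", "ISO-8859-1"],
    "ru_RU": ["UTF-8", "KOI8-R", "windows-1251"],
}


def reencode(utf8_bytes: bytes, locale: str) -> dict:
    """Return {encoding: bytes} for every encoding listed for the locale.

    Raises UnicodeEncodeError if the text cannot be represented in a
    target encoding, so a bad conversion never ships silently.
    """
    text = utf8_bytes.decode("utf-8")
    return {enc: text.encode(enc) for enc in LOCALE_ENCODINGS[locale]}
```

A commit hook would call something like this for each changed file and write the results into the staging area.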

>> I also consider the files in the repository to be "source" files.  The
>> automated packaging procedure (in the Makefiles and the top-level
>> build.sh) converts the source files to the correct character set or
>> entity encoding as required.  This minimizes the maintenance.
>
> Yes, agreed.

Me too. See above.

>> However, it also has a second impact: the "source" files should not be
>> distributed "as is."  They should be processed as part of the
>> packaging, as implemented.  For example, the "source" files are in ISO
>> 8859-1, because UTF-8 support still falls short on the UNIX systems I
>> use.  I have set the svn:mime-type property accordingly.
>
> Older UNIX systems do indeed suck.  Newer Linux ones are usually UTF-8
> out-of-the-box.  Regardless, UTF-8 makes the best source because it can
> be converted to just about every other character set.  It is a
> universal donor.

Suggestion to Kim: perhaps you could iconv your files to UTF-8 before 
committing them? I know it'd be a little inconvenient for you on your 
older UNIX, but I'd really rather see all the source files in UTF-8. I 
share Ryan's feelings: UTF-8 is the best source to have.

> Summary (Ryan's ideal world that he will never have):
>
> I prefer having only one UTF-8 po file, one UTF-8 theme, and one UTF-8
> set of dist files with all other encodings being generated by iconv.
> Translators would work only with the UTF-8 "source" files since
> everything else is generated.  If we want to commit generated files,
> they should go in a separate directory structure.  We could have a
> source hierarchy and a derived encoding hierarchy.
>
> Source hierarchy (everything in UTF-8):
>
> trunk/messages/
> trunk/theme/
> trunk/dist/
>
> Generated encoding hierarchy:
>
> trunk/ISO-8859-1/messages/
> trunk/ISO-8859-1/theme/
> trunk/ISO-8859-1/dist/
>
> Translators would work only with the source files.  The files under the
> ISO-8859-1 directory would be generated from the source files.  They are
> committed to the repository merely as a convenience to users wanting to
> selectively retrieve specific files in their preferred encoding.

Like I said above, I'd love to see the repository just house the 
sources, and have the automation generate the downloadables (packaged 
and converted), which would be housed outside of the repository.
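
Here is a minimal sketch of generating the derived hierarchy outside the repository, mirroring messages/, theme/, and dist/ under a directory named after the target encoding. The function name, the TEXT_SUFFIXES set, and the rule of skipping dot-directories (such as .svn) are my assumptions. It also does not rewrite the charset declared in a .po header, which a real tool would need to do.

```python
import shutil
from pathlib import Path

# Assumption: only files with these suffixes are text that needs re-encoding.
TEXT_SUFFIXES = {".po", ".php", ".html", ".css", ".txt"}


def generate_encoding_tree(source_root: Path, output_root: Path, encoding: str) -> list:
    """Mirror source_root into output_root/<encoding>/, re-encoding text files.

    output_root must live outside source_root, otherwise generated files
    would be picked up as sources on the next run.
    """
    written = []
    for src in sorted(source_root.rglob("*")):
        rel = src.relative_to(source_root)
        # Skip directories themselves and anything under a dot-directory.
        if not src.is_file() or any(p.startswith(".") for p in rel.parts):
            continue
        dest = output_root / encoding / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        if src.suffix in TEXT_SUFFIXES:
            text = src.read_bytes().decode("utf-8")
            dest.write_bytes(text.encode(encoding))
        else:
            # Binary files (images, compiled .mo files) are copied unchanged.
            shutil.copyfile(src, dest)
        written.append(dest)
    return written
```

Calling generate_encoding_tree(Path("trunk"), Path("staging"), "ISO-8859-1") would produce staging/ISO-8859-1/messages/ and so on, leaving the repository with UTF-8 sources only.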

> Locale Makefiles are nice, but I think getting everyone to buy into them
> will be difficult.  Makefiles are not pretty things.

Agreed.

> Automation proposal:  Allow locale Makefiles for those who want to
> maintain them.  For those who don't, we'll provide a universal script
> that will work with any properly structured locale.  When packaging
> locales, the Makefile will be used if present.  Otherwise, the universal
> script is used.

Again, nothing against the makefiles, but I'd much prefer a transparent 
server-side solution that would make them unnecessary. Failing that, 
though, I still don't see the need for different locales to do it in 
different ways if we have one universal script.
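
For reference, the dispatch Ryan proposes is small either way. A minimal sketch, assuming the universal script is an executable called package-locale.sh (a made-up name) that takes the locale directory as its only argument:

```python
from pathlib import Path
from typing import List

# Hypothetical name for the universal packaging script.
UNIVERSAL_SCRIPT = "package-locale.sh"


def packaging_command(locale_dir: Path) -> List[str]:
    """Pick the command used to package one locale directory."""
    if (locale_dir / "Makefile").is_file():
        # The locale maintains its own Makefile: run make in that directory.
        return ["make", "-C", str(locale_dir)]
    # Otherwise fall back to the one script shared by all locales.
    return [UNIVERSAL_SCRIPT, str(locale_dir)]
```

The returned list can be handed to subprocess.run() by whatever drives the packaging. Dropping the Makefile branch leaves the single universal path.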

Morgan


