[wp-polyglots] Translation Guidelines / HTML Character Entities
Francesc Hervada-Sala
francesc at hervada.org
Tue Sep 4 06:03:11 GMT 2007
Hi all,
as a newcomer I've spent some time reading the wp-polyglots archives and
found many interesting discussions about encoding .po files as UTF-8 and
the use of HTML character entities.
It seems to me that it is common practice to use HTML character
entities for all special characters in translated messages. On the
other hand, the translation guidelines say that one should avoid using
HTML character entities:
With a few exceptions (noted below), all translations should be
written literally, rather than escaping accented and special
characters with HTML character entities.
Source:
http://codex.wordpress.org/Translating_WordPress#Guidelines_and_requirements
To sum up the trade-offs:
1. .mo files without HTML entities do not work for blogs using
character encodings other than UTF-8 (the latter being the default and
recommended in WP).
2. .mo files with HTML entities do not work for e-mail messages sent
by WordPress.
3. .po files with HTML entities are less translator-friendly and thus
more error-prone.
As Kim Suominen pointed out on March 7th, 2005, the best solution would
be for the WP core to convert UTF-8 into the blog's character encoding
at runtime (both when generating HTML and e-mails). See
http://comox.textdrive.com/pipermail/wp-polyglots/2005-March/000449.html
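To make the idea concrete, here is a minimal sketch in Python of what
such a runtime conversion might look like. It is not WP code: the
function names to_blog_html and to_email_body are hypothetical, and the
fallback to UTF-8 for e-mail is my own assumption about a reasonable
policy.

```python
from typing import Tuple


def to_blog_html(text: str, blog_charset: str) -> bytes:
    """Encode a translated string for an HTML page (hypothetical helper).

    Characters the blog charset cannot represent are emitted as numeric
    character references, which are valid in HTML.
    """
    return text.encode(blog_charset, errors="xmlcharrefreplace")


def to_email_body(text: str, blog_charset: str) -> Tuple[bytes, str]:
    """Encode a translated string for a plain-text e-mail (hypothetical helper).

    Entities are meaningless in plain text, so if the blog charset cannot
    represent the message we fall back to UTF-8 (an assumed policy).
    Returns the encoded body and the charset to declare in the headers.
    """
    try:
        return text.encode(blog_charset), blog_charset
    except UnicodeEncodeError:
        return text.encode("utf-8"), "utf-8"
```

With something like this in the core, the .mo files could stay in plain
UTF-8 and only the output step would need to know the blog's encoding.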
In the translation files I've worked on (Catalan for WP 2.2, 2.2.1 and
2.2.2) I've followed this approach:
* translated strings in .po files contain no HTML character entities
(original strings are of course left untouched, entities included)
* a Perl script I wrote generates an equivalent .po file with HTML
character entities in translated strings
* there are two deployed versions of the WP Catalan translation: the
"normal version" (just for UTF-8 blogs, works fine with e-mail)
and the "html version" (works with all blog character encodings,
produces "ugly" error messages)
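For those who want to try the same workflow, the conversion step can be
sketched roughly as follows. This is not my Perl script but a simplified
Python illustration; the names to_entities and entityize_po are
hypothetical, and it assumes a plain .po layout where every msgstr
continuation line starts with a double quote.

```python
from html.entities import codepoint2name


def to_entities(text):
    """Replace every non-ASCII character with an HTML character entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                            # plain ASCII stays as-is
        elif cp in codepoint2name:
            out.append("&%s;" % codepoint2name[cp])   # named entity, e.g. &agrave;
        else:
            out.append("&#%d;" % cp)                  # numeric fallback
    return "".join(out)


def entityize_po(po_text):
    """Convert only the translated strings (msgstr) of a .po file.

    msgid lines, comments and blank lines are copied unchanged.
    Caveat: the header entry is also a msgstr, so non-ASCII text in it
    (for example a translator's name) is converted as well.
    """
    result = []
    in_msgstr = False
    for line in po_text.splitlines():
        if line.startswith("msgstr"):
            in_msgstr = True          # covers msgstr and msgstr[n]
        elif not line.startswith('"'):
            in_msgstr = False         # msgid, comment or blank line ends the entry
        result.append(to_entities(line) if in_msgstr else line)
    return "\n".join(result)
```

The output would then be compiled to a second .mo file with msgfmt as
usual, while the entity-free .po remains the one that translators edit.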
Do you think this approach could be generalised for all WP localizations?
By the way, I think today's common practice does not follow the
guidelines. We should change one of the two so that they agree.
Cheers,
Francesc Hervada-Sala