[wp-polyglots] Unicode characters instead of entities in POT
Samuel Murray (Groenkloof)
samuel at translate.org.za
Tue Apr 28 14:14:47 GMT 2009
Thomas Scholz wrote:
> This depends on the MIME type...
I think it depends on the browser :-)
> Any other entity might be shown literal.
Is there a list somewhere on the web of which browsers show them literally?
> Some older user agents (Opera 7...) do exactly that.
Earlier versions of Opera implemented the standards very strictly, and
as a result, many web sites did not work correctly in Opera. Then the
Opera people got smart and started implementing what they call "street
HTML", which kicks in if Opera detects that a page is possibly
non-compliant.
> So: If you don’t use real UTF-8, use numeric character references, eg.
> &#8230; not &hellip;.
I think one reason hellip may be used is that it is easy to "read"
what the character is. It is an ellipsis. If numbered codes were used,
translators would not know what the code means unless they used a
look-up table, and volunteer translators tend not to use look-up tables
-- they prefer educated guesswork, which in the case of numbered
entities can be dangerous.
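To illustrate the look-up problem, here is a minimal Python sketch that
converts named entities to numeric references and maps a numeric
reference back to its readable name. It relies on the HTML 4 entity
tables in Python's standard html.entities module; the function names
named_to_numeric and describe_numeric are made up for this example and
are not part of any WordPress or gettext tool.

```python
import re
from html.entities import name2codepoint, codepoint2name


def named_to_numeric(text):
    """Replace named entities such as &hellip; with numeric references."""
    def repl(match):
        name = match.group(1)
        if name in name2codepoint:
            return "&#%d;" % name2codepoint[name]
        # Unknown names are left untouched rather than guessed at.
        return match.group(0)
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


def describe_numeric(ref):
    """Return the HTML 4 entity name for a decimal reference, or None."""
    match = re.fullmatch(r"&#(\d+);", ref)
    if match is None:
        return None
    return codepoint2name.get(int(match.group(1)))


print(named_to_numeric("Read more&hellip; &raquo;"))
print(describe_numeric("&#8230;"))
```

A tool built along these lines could show the translator the readable
name next to the numeric code, so that no guesswork is needed.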
My own opinion is to reduce the "fancy" characters to a minimum.
>> A translator needs a working knowledge of HTML anyway. Replacing or
>> adding verbose descriptions of entities isn't worth it.
> Antithesis: A translator needs basic knowledge of character encoding
> anyway.
This applies to trained, professional translators. Volunteer
translators are often amateurs and have very little training. One has
to be pragmatic -- a logical, easy-to-use system is better than
something which is correct only from a purist's point of view.
> Finding and using UTF-8 capable software shouldn’t be so hard.
Do you know of PO editors that make it easy for translators to type the
raquo and the hellip? Neither PoEdit nor Virtaal does.
Samuel
--
Samuel Murray
samuel at translate.org.za