[wp-polyglots] Unicode characters instead of entities in POT
Thomas Scholz
info at toscho.de
Tue Apr 28 13:42:04 GMT 2009
Nikolay Bachiyski:
> On Tue, Apr 28, 2009 at 16:05, Xavier Borderie <xavier at borderie.net>
> wrote:
>> Just saw this ticket being closed by Nikolay:
>> http://core.trac.wordpress.org/ticket/7099
>>
>> Since the POT (and PO/MO) uses UTF-8, why can't we just use actual
>> Unicode characters rather than their HTML entities equivalent?
>
> There are many editors, which don't support either showing or entering
> these characters. Browsers are a lot smarter. They revert to a basic
> font if the current one can't show the character and don't rely only
> on UTF-8 representation.
This depends on the MIME type: In XHTML (application/xhtml+xml) only five
entities MUST be resolved: &lt;, &gt;, &quot;, &amp; and &apos; (&apos;
should be avoided due to bad support in some HTML user agents). Any other
entity might be shown literally. Some older user agents (Opera 7, early
Gecko derivatives) do exactly that.
So: If you don’t use real UTF-8, use numeric character references, e.g.
&#8230; not &hellip;.
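A minimal sketch of that conversion, assuming Python and its standard
html.entities table (HTML 4 names only). The function name to_numeric_refs
is hypothetical, not part of any WordPress tooling. It leaves the five
predefined XML entities and any unknown names untouched:

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser must resolve; left untouched here.
XML_PREDEFINED = {"lt", "gt", "quot", "amp", "apos"}

# Matches named entities such as &hellip; but not numeric ones like &#8230;
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_refs(text):
    """Replace named HTML entities with decimal numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep predefined or unknown names as-is
        return "&#%d;" % name2codepoint[name]
    return _ENTITY.sub(replace, text)


print(to_numeric_refs("Read more&hellip; &amp; &raquo;"))
# prints: Read more&#8230; &amp; &#187;
```

Something like this could be run over msgid strings before they reach the
POT, if real UTF-8 characters are not an option.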
> A translator needs a working knowledge of HTML anyway. Replacing or
> adding verbose descriptions of entities isn't worth it.
Antithesis: A translator needs basic knowledge of character encoding
anyway. Finding and using UTF-8 capable software shouldn’t be so hard.
This is even mandatory for any language with an ISO-8859-1 incompatible
alphabet (Russian, Tamil etc.).
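To illustrate what "ISO-8859-1 incompatible" means in practice, here is a
small Python sketch (the helper name fits_latin1 is made up for this
example). It simply tries to encode a string as ISO-8859-1:

```python
def fits_latin1(text):
    """Return True if every character in text can be encoded as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


print(fits_latin1("Grüße"))     # German: True
print(fits_latin1("Привет"))    # Russian: False
print(fits_latin1("வணக்கம்"))    # Tamil: False
print(fits_latin1("…"))         # U+2026 ellipsis: False
```

Note that even the typographic ellipsis falls outside ISO-8859-1, so the
question of UTF-8 capable tools comes up for Western European languages
too.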
Thomas
--
Redaktion, Druck- und Webdesign
http://toscho.de
0160/1764727
Twitter: @toscho