Tamil in IDN

Here are three changes that should be implemented for the Tamil language in Internationalized Domain Names (IDN).

(a) Tamil Aytham:
------------------------
On page 4 of the IDN language table TM.pdf,
U+0B83 TAMIL SIGN VISARGA (= aytham) is shown with a dotted circle before ஃ.

The dotted circle indicates that Aytham is a combining letter that can only follow a consonant. In Tamil, however, that property of Visarga has changed, and the Aytham letter can start a word (e.g., the English FA = ஃப in Tamil).

Please remove the dotted circle in the second column for 0B83 on page 4 of the IDN language table TM.pdf.
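
The point about Aytham can be checked against the Unicode character database. The following is a minimal sketch, assuming a rule in the spirit of IDNA2008 that a label must not begin with a combining mark; the helper names are hypothetical, and the reported category depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

AYTHAM = "\u0b83"  # TAMIL SIGN VISARGA (aytham)
VIRAMA = "\u0bcd"  # TAMIL SIGN VIRAMA (pulli), a true combining mark
PA = "\u0baa"      # TAMIL LETTER PA


def is_combining_mark(ch: str) -> bool:
    """True if the character's general category is a mark (Mn, Mc, or Me)."""
    return unicodedata.category(ch).startswith("M")


def can_start_label(label: str) -> bool:
    """Hypothetical check: a non-empty label must not begin with a combining mark."""
    return bool(label) and not is_combining_mark(label[0])


# Show how the interpreter's Unicode data classifies aytham.
print(unicodedata.name(AYTHAM), unicodedata.category(AYTHAM))

# Aytham followed by PA (the "FA" spelling) versus a label starting with virama.
print(can_start_label(AYTHAM + PA))
print(can_start_label(VIRAMA + PA))
```

If aytham is classified as a letter rather than a mark, a check of this kind accepts a label beginning with ஃ, which is consistent with removing the dotted circle from the table.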

(b) Tamil Shrii:
----------------------

Shrii, as in the Madras Tamil Lexicon and in the Unicode table, must be defined with the letter U+0BB6 (SHA) in the IDN registry.
http://www.unicode.org/versions/Unicode5.1.0/images/tamil_chart_alt.html
http://unicode.org/Public/5.1.0/ucd/NamedSequencesProv.txt
TAMIL SYLLABLE SHRII ஸ்ரீ = <0BB6 0BCD 0BB0 0BC0>.

The Shrii (ஸ்ரீ) conjunct glyph should be generated only with the letter SHA (0BB6).

The sequence <0BB8 0BCD 0BB0 0BC0>, with the letter SA (0BB8), should be shown as ஸ‌்ரீ in IDN, and not as the Shrii conjunct.
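
Because the two spellings can look alike on screen, it helps to compare them by code point. The sketch below is illustrative only: the constant and function names are hypothetical, and it shows just that the SHA and SA sequences are distinct strings that Unicode normalization does not merge.

```python
import unicodedata

# The two candidate spellings discussed above, written as explicit code points.
SHRII_WITH_SHA = "\u0bb6\u0bcd\u0bb0\u0bc0"  # SHA + VIRAMA + RA + VOWEL SIGN II
SHRII_WITH_SA = "\u0bb8\u0bcd\u0bb0\u0bc0"   # SA  + VIRAMA + RA + VOWEL SIGN II


def code_points(text: str) -> str:
    """Format a string as space-separated uppercase hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def shrii_base(text: str):
    """Return 'SHA' or 'SA' for the two sequences above, or None for anything else."""
    if text == SHRII_WITH_SHA:
        return "SHA"
    if text == SHRII_WITH_SA:
        return "SA"
    return None


for seq in (SHRII_WITH_SHA, SHRII_WITH_SA):
    print(code_points(seq), shrii_base(seq), unicodedata.name(seq[0]))

# The two spellings remain different strings after NFC normalization.
print(unicodedata.normalize("NFC", SHRII_WITH_SHA)
      == unicodedata.normalize("NFC", SHRII_WITH_SA))
```

A registry that treats the two sequences differently would need to compare code points in this way, since the rendered glyphs alone may not distinguish them.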

(c) Tamil Ksha:
----------------------
Tamil script has evolved KSHA as a non-conjunct form. Hence, all KSHA (க்ஷ) conjuncts should be shown as க்‌ஷ = <0B95 0BCD 200C 0BB7> in IDN.

Note the use of the Zero Width Non-Joiner (U+200C) after the virama to produce the non-conjunct form க்‌ஷ = <0B95 0BCD 200C 0BB7>.
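
The substitution described above can be sketched as a simple string rewrite. This is a minimal illustration, not a registry implementation: the function name is hypothetical, and whether a given registry accepts U+200C in labels is a separate policy question.

```python
import unicodedata

KA = "\u0b95"      # TAMIL LETTER KA
VIRAMA = "\u0bcd"  # TAMIL SIGN VIRAMA (pulli)
SSA = "\u0bb7"     # TAMIL LETTER SSA
ZWNJ = "\u200c"    # ZERO WIDTH NON-JOINER

CONJUNCT_KSHA = KA + VIRAMA + SSA
NON_CONJUNCT_KSHA = KA + VIRAMA + ZWNJ + SSA


def to_non_conjunct_ksha(text: str) -> str:
    """Insert ZWNJ after the virama in every KA + VIRAMA + SSA sequence."""
    return text.replace(CONJUNCT_KSHA, NON_CONJUNCT_KSHA)


def code_points(text: str) -> str:
    """Format a string as space-separated uppercase hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


converted = to_non_conjunct_ksha(CONJUNCT_KSHA)
print(code_points(converted))

# ZWNJ is placed directly after a virama (canonical combining class 9).
print(unicodedata.combining(VIRAMA))
```

The rewrite is idempotent: a string that already contains <0B95 0BCD 200C 0BB7> has no adjacent KA + VIRAMA + SSA run, so applying the function again leaves it unchanged.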

For questions, please contact me. Many thanks.
