Tamil in IDN

In 2005 CE, the Unicode version being considered for Indic IDN was Unicode 3.0. Unicode 5.2 is now the current version, and the following Tamil letters need to be implemented in IDN.

(a) Tamil letter, SHA (U+0BB6)
------------------------------------------

The Tamil letter, SHA (U+0BB6) should be included in IDN.

Shrii, as in the Tamil Lexicon and in the Unicode table, must be defined with the letter SHA (U+0BB6) in the IDN registry.
http://unicode.org/Public/UNIDATA/NamedSequences.txt

TAMIL SYLLABLE SHRII ஸ்ரீ = <0BB6 0BCD 0BB0 0BC0>. This Shrii (ஸ்ரீ) conjunct glyph should only be generated with letter SHA (0BB6).

The sequence <0BB8 0BCD 0BB0 0BC0> with letter SA (0BB8) should be shown as ஸ‌்ரீ visually in IDN. That is, <0BB8 0BCD 0BB0 0BC0> = ஸ‌்ரீ (and, not as shrii conjunct visually).
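The two spellings above differ only in their first code point, which a registry-side check could inspect directly. The following is a minimal sketch, not an actual IDN registry implementation; the function name `shrii_base_letter` is hypothetical, and only the code point sequences are taken from the text.

```python
import unicodedata

# Code point sequences quoted in the text above.
SHRII_WITH_SHA = "\u0bb6\u0bcd\u0bb0\u0bc0"  # SHA + pulli + RA + vowel sign II
SHRII_WITH_SA = "\u0bb8\u0bcd\u0bb0\u0bc0"   # SA + pulli + RA + vowel sign II


def shrii_base_letter(label):
    """Hypothetical helper: report which base letter spells a shrii-like
    cluster in `label`, or None if neither sequence occurs."""
    if SHRII_WITH_SHA in label:
        return unicodedata.name(SHRII_WITH_SHA[0])
    if SHRII_WITH_SA in label:
        return unicodedata.name(SHRII_WITH_SA[0])
    return None


if __name__ == "__main__":
    for seq in (SHRII_WITH_SHA, SHRII_WITH_SA):
        codepoints = " ".join(f"{ord(ch):04X}" for ch in seq)
        print(codepoints, "->", shrii_base_letter(seq))
```

Under the proposal, only the sequence reported as starting with SHA would be rendered as the Shrii conjunct; the SA sequence would be displayed with a visible pulli.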

(b) Tamil OM sign (U+0BD0)
----------------------------------------

If Devanagari OM sign (U+0950) is allowed for IDN, Tamil OM sign (U+0BD0) is needed in IDN also.

To prevent confusability problems, the letter ம் should be blocked after the Tamil OM sign (U+0BD0). Note that when Om is written as a sequence of two letters, i.e., ஓம், the letter ம் will always be present. So, blocking ம் after U+0BD0 will distinguish between the OM sign and the word Om written as a sequence of two letters.

This is similar to the situation in Devanagari script: if the Devanagari OM sign is allowed, the Tamil OM sign should be allowed in IDN as well. While the Grantha loan conjunct Shrii (Section (a)) will be allowed, the Tamil OM sign is also important in the native religions of India.
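The blocking rule for ம் after the Tamil OM sign can be sketched as a simple label check. This is only an illustration under the assumption that the rule is applied to the raw code point string; the function name `violates_om_rule` is hypothetical and not part of any IDN specification.

```python
TAMIL_OM = "\u0bd0"               # Tamil OM sign
MA_WITH_PULLI = "\u0bae\u0bcd"    # letter MA followed by pulli
OM_AS_WORD = "\u0b93\u0bae\u0bcd"  # Om written as two letters: OO, then MA + pulli


def violates_om_rule(label):
    """Hypothetical check: True if MA + pulli directly follows the OM sign."""
    return (TAMIL_OM + MA_WITH_PULLI) in label


if __name__ == "__main__":
    print(violates_om_rule(TAMIL_OM))                  # OM sign alone
    print(violates_om_rule(OM_AS_WORD))                # the two-letter word
    print(violates_om_rule(TAMIL_OM + MA_WITH_PULLI))  # blocked combination
```

With this check, the OM sign on its own and the two-letter word both pass, and only the OM sign immediately followed by ம் is rejected.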

Also, note that a graphic variant of the Tamil OM sign includes the Vel ("spear"). It is a very popular form among Tamils, not just in India but also in Malaysia, Singapore and Sri Lanka. If that glyph (with the vel) is made a requirement for the Tamil OM sign in IDN, it will be easy to distinguish visually even on a small-screen PDA. If samples of the vel-inclusive OM sign glyph are needed, please let us know.

(c) Display of Non-Conjunct K-SSA in Tamil IDN
----------------------------------------------------------------

Tamil script has evolved K-SSA as a non-conjunct form. The Unicode Standard (TUS 5.2) describes displaying the non-conjunct K-SSA by inserting ZWNJ (U+200C, zero width non-joiner) after the pulli. This makes URLs in Tamil script much clearer and easier to understand.

Popular Tamil editors such as NHM writer,
http://software.nhm.in/products/writer
produce nonconjunct forms of ksh:
க்‌ஷ் க்‌ஷ க்‌ஷா க்‌ஷி க்‌ஷீ க்‌ஷு க்‌ஷூ க்‌ஷெ க்‌ஷே க்‌ஷை க்‌ஷொ க்‌ஷோ க்‌ஷௌ

http://www.unicode.org/versions/Unicode5.1.0/

Section 9.6 Tamil, page 325.
"The situation is quite different for Tamil because the script uses very few consonant conjuncts. An orthographic cluster consisting of multiple consonants (represented by <C1, U+0BCD TAMIL SIGN VIRAMA, C2, ...>) is normally displayed with explicit viramas (which are called pulli in Tamil). The conjuncts kssa and shra are traditionally displayed by conjunct ligatures, as illustrated for kssa in Figure 9-13, but nowadays tend to be displayed using an explicit pulli as well.

Figure 9-13. Kssa Ligature in Tamil
க + pulli (U+0BCD) + ஷ ⇒ க்ஷ ksha
To explicitly display a pulli for such sequences, zero width non-joiner can be inserted after the pulli in the sequence of characters."

The conjunct kssa should not be allowed in IDN, and ZWNJ should always be present in the k-ssa series to make the forms non-conjunct.
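One way to picture this requirement is as a normalization step that inserts ZWNJ after the pulli wherever KA + pulli + SSA occurs. The sketch below assumes plain string replacement is sufficient; the function names are hypothetical, and the code is not taken from any registry or IDNA library.

```python
KA = "\u0b95"     # Tamil letter KA
PULLI = "\u0bcd"  # Tamil sign virama (pulli)
SSA = "\u0bb7"    # Tamil letter SSA
ZWNJ = "\u200c"   # zero width non-joiner

CONJUNCT_KSSA = KA + PULLI + SSA
NON_CONJUNCT_KSSA = KA + PULLI + ZWNJ + SSA


def has_conjunct_kssa(label):
    """Hypothetical check: True if the label contains KA + pulli + SSA
    with no ZWNJ after the pulli."""
    return CONJUNCT_KSSA in label


def force_non_conjunct_kssa(label):
    """Hypothetical normalizer: insert ZWNJ after the pulli in every
    KA + pulli + SSA sequence. Already non-conjunct forms are unchanged."""
    return label.replace(CONJUNCT_KSSA, NON_CONJUNCT_KSSA)


if __name__ == "__main__":
    fixed = force_non_conjunct_kssa(CONJUNCT_KSSA)
    print(" ".join(f"{ord(ch):04X}" for ch in fixed))
```

Because the replaced sequence no longer contains KA + pulli + SSA as a contiguous run, applying the normalizer a second time leaves the label unchanged.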

N. Ganesan
