Arabic alphabet

The basis for the Urdu alphabet is the Arabic alphabet:

ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ي

Besides these there are hamza ء and alif maksura ى (U+0649) and various vowel signs.
Letter shapes depend on context; only the isolated shapes are shown here.


Persian alphabet

Persian (Farsi) uses the Arabic alphabet, and adds a few letters, namely
پ (U+067E, p), چ (U+0686, c), ژ (U+0698, j), گ (U+06AF, g):

ا ب پ ت ث ج چ ح خ د ذ ر ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن ه و ی

Here ک (U+06A9) was used instead of ك (U+0643).
Also ی (U+06CC) replaces ي (U+064A).
The difference looks like a matter of font only, but Unicode nevertheless assigns separate code points.
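Because the code points differ, text typed on an Arabic keyboard will not match text typed on a Persian one unless the two pairs are unified first. A minimal sketch of such a mapping follows; the helper name to_persian_letters is hypothetical, and whether this folding is wanted depends on the application.

```python
import unicodedata

# Arabic code points and the Persian code points that replace them,
# as stated in the text above.
ARABIC_TO_PERSIAN = {
    "\u0643": "\u06A9",  # KAF -> KEHEH
    "\u064A": "\u06CC",  # YEH -> FARSI YEH
}

_TABLE = str.maketrans(ARABIC_TO_PERSIAN)


def to_persian_letters(text: str) -> str:
    """Replace Arabic kaf and yeh by their Persian counterparts."""
    return text.translate(_TABLE)


# Print the official Unicode names of both members of each pair.
for arabic, persian in ARABIC_TO_PERSIAN.items():
    print(f"U+{ord(arabic):04X} {unicodedata.name(arabic)} -> "
          f"U+{ord(persian):04X} {unicodedata.name(persian)}")
```

All other characters pass through unchanged, so the function is safe to apply to text that is already in Persian spelling.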


Urdu alphabet

Urdu uses the Persian alphabet, normally in Nastaliq style, with some added characters, namely
ٹ (U+0679, ṭ), ڈ (U+0688, ḍ), ڑ (U+0691, ṛ), each written with a small superscript ط to indicate a retroflex consonant.

The letter ں (U+06BA, nūn-e ğunnah) denotes nasalization (and only occurs in final position). Nasalization elsewhere is indicated by an ordinary ن.

The letter ھ (U+06BE, "double-eyed he", dō-čašmī hē) is used to form 11 digraphs and indicates aspiration:

بھ پھ تھ ٹھ جھ چھ دھ ڈھ ڑھ کھ گھ
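The digraphs are plain two-character sequences: a consonant followed by U+06BE. The sketch below builds the list above and splits a word into letters while keeping each digraph together. The function name letters is an assumption made for illustration, and vowel signs are not handled.

```python
# The eleven consonants that form digraphs, in the order listed above.
ASPIRABLE = ("\u0628\u067E\u062A\u0679\u062C\u0686"
             "\u062F\u0688\u0691\u06A9\u06AF")
DO_CHASHMI_HE = "\u06BE"

# Each digraph is the consonant followed by double-eyed he.
digraphs = [c + DO_CHASHMI_HE for c in ASPIRABLE]


def letters(word: str) -> list[str]:
    """Split a word into letters, treating consonant + U+06BE as one unit."""
    out: list[str] = []
    for ch in word:
        if ch == DO_CHASHMI_HE and out and out[-1] in set(ASPIRABLE):
            out[-1] += ch  # attach the aspiration mark to the consonant
        else:
            out.append(ch)
    return out


print(" ".join(digraphs))
```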

The letter ے (U+06D2, baṛī yē) denotes final e.

The usual lam-alif combination لا is لا in Urdu style.

Alphabet (omitting the digraphs with ھ)

ا ب پ ت ٹ ث ج چ ح خ د ڈ ذ ر ڑ ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن و ہ ی

Here ہ (U+06C1) was used instead of ه (U+0647).

Several letters occur only in borrowed words.

Note the difference in order: here He comes after Vao.
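Sorting by code point does not give this order, since the added letters have higher code points than the basic Arabic ones. A rough sketch of a sort key based on the alphabet as listed above follows. It compares letter by letter only and ignores the digraphs, vowel signs and the letters not in the list, so it is not a full collation.

```python
# The alphabet in the order given above, written as code points.
URDU_ORDER = (
    "\u0627\u0628\u067E\u062A\u0679\u062B\u062C\u0686\u062D\u062E"
    "\u062F\u0688\u0630\u0631\u0691\u0632\u0698\u0633\u0634\u0635"
    "\u0636\u0637\u0638\u0639\u063A\u0641\u0642\u06A9\u06AF\u0644"
    "\u0645\u0646\u0648\u06C1\u06CC"
)
RANK = {ch: i for i, ch in enumerate(URDU_ORDER)}


def urdu_sort_key(word: str) -> list[tuple[int, int]]:
    """Rank each letter by alphabet position; unknown characters sort last."""
    return [(RANK.get(ch, len(RANK)), ord(ch)) for ch in word]


# One word starting with U+062A (teh), one starting with U+067E (peh).
words = ["\u062A\u0627\u0631", "\u067E\u0627\u0646\u06CC"]
by_code_point = sorted(words)
by_alphabet = sorted(words, key=urdu_sort_key)
```

Here by_code_point keeps the word starting with ت first, while by_alphabet puts the word starting with پ first, as the alphabet requires.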

See also Hugo's Urdu alphabet pages. Fonts.

Two-Eyed He

Arabic Heh. Arabic has four shapes for its Heh: isolated, initial, medial and final: ه ههه.
The medial form is written either as a double loop ﻬ or as a zigzag ﮩ, a matter of font.
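The positional shapes can be inspected through the Unicode presentation forms. The sketch below prints name and decomposition for the four forms of U+0647 and for the zigzag glyph, assuming the two medial glyphs shown above are U+FEEC and U+FBA9. Note that Unicode names U+FBA9 as a form of heh goal (U+06C1), not of U+0647.

```python
import unicodedata

# U+FEE9..U+FEEC: isolated, final, initial and medial form of U+0647.
# U+FBA9: assumed to be the zigzag medial glyph shown in the text.
HEH_FORMS = [0xFEE9, 0xFEEA, 0xFEEB, 0xFEEC, 0xFBA9]


def describe(cp: int) -> str:
    """Return code point, Unicode name and compatibility decomposition."""
    ch = chr(cp)
    return (f"U+{cp:04X} {unicodedata.name(ch)} "
            f"[{unicodedata.decomposition(ch)}]")


for cp in HEH_FORMS:
    print(describe(cp))
```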

Urdu He. Urdu commonly uses rather different forms (called "he goal"): ہ ہہہ

The initial form of he goal has an inverted comma below. In the Nastaliq style, the medial form may, or may not, have an inverted comma below.

Examples (images not reproduced): hain, kahte, na, bada.

In the Naskh style, the medial form usually has zigzag form (but one also sees the double loop here).

Urdu double-eyed he. Apart from the four shapes shown above, the usual Nastaliq style has a fifth shape for h called "double-eyed he". It denotes aspiration of the preceding consonant. This double-eyed he has a single "heart" form in all positions.

In Naskh style one uses the Arabic initial and medial Heh for this function. Urdu fonts tend to use the initial shape also in isolated position, and the medial shape also in final position. Many other fonts have four equal shapes or have equal initial and medial shape, and equal final and isolated shape.

(Image not reproduced: equal final and isolated shape.)

(Image not reproduced: medial shape in final position.)

In recent typography, double-eyed he is usually reserved to indicate aspiration. This is not a strict rule, however; usage varies.

(Image not reproduced: double loop medial Heh in (Arabic) Naskh style used for both functions.)


Uyghur alphabet

Uyghur is written with several alphabets. Here only its Arabic-based alphabet is discussed.

Uyghur uses the Persian alphabet, with some additions, namely ڭ (U+06AD, ng), ۋ (U+06CB, w),
and explicitly written vowels: ا (U+0627, a), ە (U+06D5, ä), و (U+0648, o), ۇ (U+06C7, u), ۆ (U+06C6, ö), ۈ (U+06C8, ü), ﯤ (U+06D0, e), ى (U+0649, i), where the consonant y is written ﻱ (U+064A).

The ayn is not used.

In initial and isolated position the vowels are preceded by initial yeh-hamza (U+0626): ئا ئه ئو ئۇ ئۆ ئۈ ئې ئى
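The rule can be sketched as a small function that prefixes yeh-hamza when a word starts with one of the eight vowel letters. The function name write_initial_vowel is hypothetical, and the sketch looks at the first character only, as the rule above is stated.

```python
YEH_HAMZA = "\u0626"

# The eight vowel letters listed above: a, ä, o, u, ö, ü, e, i.
UYGHUR_VOWELS = "\u0627\u06D5\u0648\u06C7\u06C6\u06C8\u06D0\u0649"


def write_initial_vowel(word: str) -> str:
    """Prefix yeh-hamza if the word begins with a vowel letter."""
    if word and word[0] in UYGHUR_VOWELS:
        return YEH_HAMZA + word
    return word


# All eight isolated vowels with their yeh-hamza carrier.
print(" ".join(write_initial_vowel(v) for v in UYGHUR_VOWELS))
```

A word that already starts with yeh-hamza is returned unchanged, so the function can be applied twice without harm.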

Yeh-like shapes. Arabic has a yeh ي that is always dotted, and an alif maksura ى that is never dotted. Regionally, for example in Egypt, final yeh is not dotted, and some fonts do not have dots on final or isolated yeh. Since in Uyghur the dots make the difference between i and y, which is not a matter of font, people often use U+FEF1 to force the dots.

(Unicode has U+06CC for the version of yeh where only initial and medial forms are dotted, as used in Egypt, and in Persian, Urdu, etc., but that does not help against fonts that fail to show dots on final U+064A.)
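Text that uses U+FEF1 to force the dots no longer compares equal to text that uses U+064A. A minimal sketch of a comparison that folds only this one presentation form follows. The helper names are assumptions, and a general compatibility normalization would change more than this.

```python
YEH = "\u064A"                # dotted yeh, the Uyghur consonant y
ALEF_MAKSURA = "\u0649"       # undotted, the Uyghur vowel i
YEH_ISOLATED_FORM = "\uFEF1"  # presentation form used to force the dots


def fold_forced_yeh(text: str) -> str:
    """Map the forced isolated yeh back to the ordinary yeh."""
    return text.replace(YEH_ISOLATED_FORM, YEH)


def same_word(a: str, b: str) -> bool:
    """Compare two words, ignoring the U+FEF1 / U+064A difference."""
    return fold_forced_yeh(a) == fold_forced_yeh(b)
```

The vowel i (U+0649) is deliberately left alone, since in Uyghur it is a different letter from y.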

Heh-like shapes. Unicode character U+06BE represents not only Urdu double-eyed he but also Uyghur H, a rather different character: Uyghur H is not the second half of a digraph, and it takes not one of five shapes but two of the four Heh shapes.

Uyghur (just like Kurdish) needed an E and an H and took the four Arabic forms of Heh, and used two (the isolated and the final ones) for E and two (the initial and the medial ones) for H.

Uyghur E uses U+06D5. Initial E is written preceded by yeh-hamza, so that no initial form is needed. Medial and final E use final Heh. Isolated E uses isolated Heh.

Concerning the distribution of forms of its H, Uyghur has two shapes, one for initial and isolated position and one for medial and final position.

Kurdish H is similar, but uses medial Heh only in medial position, and initial Heh in initial, final and isolated position.
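The distribution of Heh shapes described above can be put in a small table. This is a sketch of the description only, not of any font's behaviour. The assignment of the initial shape to isolated Uyghur H and of the medial shape to final Uyghur H is inferred from the statement that H uses the initial and medial shapes, and the presentation-form code points stand in for the shapes.

```python
# The four Arabic Heh shapes as presentation forms of U+0647.
HEH_SHAPES = {
    "isolated": "\uFEE9",
    "final": "\uFEEA",
    "initial": "\uFEEB",
    "medial": "\uFEEC",
}

# Which Heh shape each letter shows in each position.
# None: no shape needed (initial E is preceded by yeh-hamza).
SHAPE_BY_POSITION = {
    "uyghur_e": {"initial": None, "medial": "final",
                 "final": "final", "isolated": "isolated"},
    "uyghur_h": {"initial": "initial", "medial": "medial",
                 "final": "medial", "isolated": "initial"},
    "kurdish_h": {"initial": "initial", "medial": "medial",
                  "final": "initial", "isolated": "initial"},
}


def glyph(letter: str, position: str):
    """Return the Heh shape used for a letter in a position, or None."""
    shape = SHAPE_BY_POSITION[letter][position]
    return None if shape is None else HEH_SHAPES[shape]
```

The table makes the one difference visible: in final position Uyghur H shows the medial shape and Kurdish H the initial shape.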

See an Uyghur site and fonts and more.