Near the beginning of your html file you will find some code that looks like this:

Your character set might have a name different from "ISO". You have to change the character set declaration to "UTF-8" by replacing the above meta statement by:

A table which gives the standard Unicode as well as its html representation, called the html numeric character reference, is given here. A very useful online tool is UTF Converter, which lets you quickly convert a string of one or more characters to Unicode in various formats.

From the table on page 2 of the Unicode Arabic page, you can check that the hexadecimal Unicode representations of the Urdu letters Alif, Re, Daal, and Vaao are, respectively, 0627, 0631, 062F, and 0648. So suppose in your html document you insert the following:

The characters typed on this keyboard are automatically converted to their Unicode version and placed in the input.
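As a rough illustration of html numeric character references, the Python sketch below converts the four letters Alif, Re, Daal, and Vaao (Unicode 0627, 0631, 062F, and 0648) into their hexadecimal references. The function name `to_numeric_refs` and the exact wording of the meta statement are assumptions made for this example; they are not taken from the original page or from any of the tools mentioned above.

```python
import html


def to_numeric_refs(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal HTML numeric character reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):04X};"
        for ch in text
    )


# Alif, Re, Daal, Vaao: together they spell the word "Urdu".
urdu_word = "\u0627\u0631\u062F\u0648"
refs = to_numeric_refs(urdu_word)
print(refs)  # &#x0627;&#x0631;&#x062F;&#x0648;

# A browser decodes the references back to the same characters;
# html.unescape performs the same decoding here.
assert html.unescape(refs) == urdu_word

# Assumed form of a UTF-8 declaration for the page head
# (a commonly used form, not quoted from the original page).
META_UTF8 = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
print(f"<html><head>{META_UTF8}</head><body><p>{refs}</p></body></html>")
```

Numeric references use only ASCII characters, so a page written this way should survive even an editor that mishandles UTF-8 text, at the cost of being much harder to read in source form.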

A caveat is in order here. To prepare html files, you are likely to use some special editor different from TextEdit. We have seen that, in RichText mode, TextEdit processes Urdu letters correctly, displaying the right form of each letter and connecting the letters appropriately. Other editors, especially the so-called programmer's editors often used to prepare html files, may not do all that. For example, your typed Urdu letters might be displayed in their isolated form from left to right in the order of their entry, without being connected together.

Or worse, your typed input might appear garbled in even more annoying ways! If you are looking for an excellent, free html editor that handles Unicode and UTF-8 well, and displays Urdu text correctly, try Arachnophilia. Of course, the readers of your Web page will be able to see the Urdu text correctly only if their system has been configured for multi-lingual processing and has the Urdu fonts installed. In addition, it might be necessary for your readers to set the viewing option of their web browser for "Unicode (UTF-8)" character encoding. Note that the symbols for summation, integration, root, etc.

In Urdu mathematical notation, the dots of the dotted letters are sometimes omitted. The keyboard suffices only for the casual typing of a few mathematical symbols in a general document. While editing certain documents, you need to change Keyboards quite often. For example, you may be working on a dictionary. Or, you might be preparing TeX files for Urdu documents, and need to continually switch keyboards between Urdu for text and English for special characters and TeX commands.

The standard way for changing Keyboards is to select the desired keyboard in the Input menu at the right end of the Apple menu bar at the top. This is cumbersome and annoying when Keyboard changes are very frequent. So you might like to set up a Keyboard Shortcut for it. MacOS has shortcuts programmed for Keyboard change already. These are: Command-space for toggling between keyboards and Command-shift-space for cycling through the active keyboards. But in MacOS versions If you prefer, you can disable the Spotlight shortcuts and enable the Keyboard shortcuts.

To do this in MacOS In the window that opens, click on the Keyboard Shortcuts tab. Select Spotlight in the left column, and disable the shortcut items that show up in the right column.

Of the two Keyboard shortcuts, Command-space and Command-shift-space, the former is certainly easier to type. If you have only two keyboards activated (say, English and Urdu), then the two shortcuts are equivalent. You can quickly see which keyboards are active by clicking on the Input menu. An icon appears underneath it for each active keyboard. However, if there are more than two active keyboards, then you might like to interchange the shortcuts. You can change a shortcut by double-clicking on it, and typing over it any desired combination of modifier keys Command, Control, Shift, etc.

Often a. So first please make sure that your Mac shows extensions in file names. For this, move into Finder (for example, by clicking in a Finder window, or on the Finder icon in the Dock, or at a point on the screen which is not occupied by an application window). Then on the Menu bar (the one with the Apple icon at the left), click on Finder, then on Preferences, then on the Advanced tab. Now look at the Show all file extensions item. If the check box on its left does not have a check mark, then click on it so that a check mark appears there.

Finally, close the Advanced window. Change their extensions if necessary, ignoring the Finder's complaint that this could render your files dysfunctional. Another problem some people have encountered is that during editing Urdu letters show up isolated rather than connected together in the normal way. This can happen when the editor being used is different from TextEdit or Bean. For example, at present Microsoft Word does not handle the Naskh and Nastaleeq scripts correctly on the Mac. Even in TextEdit, sometimes Urdu letters appear isolated rather than correctly connected.

This is usually due to TextEdit being run in the plain text mode rather than the rich text mode which Urdu editing requires. Now close Preferences , and quit TextEdit. When you restart TextEdit, it will use Rich Text as the default for new documents. A related problem that has troubled some people is that in their Urdu files some letters don't seem to have correct shapes.

For example, the letters "Goal Hay" or "yay" don't connect to the preceding or following letters properly. The culprit in such cases is nearly always the font used. Please let me know if you discover or design other well-behaved fonts for Urdu. Although usually omitted, they are occasionally needed to remove ambiguity or to show the correct pronunciation of a word. In particular, the tashdeed and madd signs and the zer of izaafat combinations are always helpful to the reader of the text. While composing text, you should type such a mark after typing the letter to which it belongs.

The most frequently used marks are: Alif with madd can be typed directly as shift-A. In Naskh fonts it shows up as a little circle, being the "sukun" of Arabic orthography. A complete list of diacritical marks is given earlier with the keyboard images. See the note below about maaroof and majhool sounds.

The form entered by Y does not connect to the next letter. So even a majhool "yay" letter that occurs in the middle of a word should be typed as I. Even though both "yay" letters occurring in this word are pronounced with the majhool sound, the first one has to be entered as I. In Arabic, the letter "yay" has two dots underneath. In Urdu, the two dots are shown only if "yay" appears at the beginning or in the middle of a word, but not when it is the final letter of a word or when it stands alone e.

Noon Ghunna adds a nasal quality to the sound of the vowel preceding it. In the freewheeling, inconsistent way of Urdu orthography, Noon Ghunna is used only at the end of a word.

In the middle of a word, even where Noon Ghunna would be appropriate, Urdu just uses the ordinary Noon. This inconsistency is forced by the circumstance that in the middle of a word, Noon is written as a shosha with a dot above. Without a dot, such a shosha would be visually quite confusing. In some old books, especially Urdu instructional primers, Noon Ghunna was indicated by a tiny crescent-like mark placed above the Noon.

This indicated nasalization both in the middle and at the end of a word. Note that, by contrast, Hindi takes the rational approach of signifying the nasal modification by always placing a special mark above the affected letter. The rules relevant to these forms are the following: The terminal hamza is usually omitted in modern Urdu publications. Only the words of Arabic origin can have a terminal hamza. It is incorrect to append a hamza to the words derived from other languages. An exception is made in Urdu when a hamza is needed for the "izafat" combination; e.

But the hamza used in such combinations is specific to Urdu spelling; in Persian, the same combinations contain only the yay, not the hamza. But as soon as the next letter is typed, the yay disappears, and the correct combination of hamza and the next letter is displayed. To start an isolated subword, this pair should come after an alif, vaao, daal, ray, etc.

The isolated subword condition is important. Otherwise just a medial hamza form key U is to be used. One should be careful in choosing the correct form of "Goal Hay" in "izaafat" combinations. This point is taken up again in the next subsection. In modern Urdu orthography, this letter is used only in combination with some consonant (which precedes it), and its purpose is to modify that consonant's sound to make it an "aspirated letter".

The latter spelling makes absolutely no sense. The urge to reform seems to have been misdirected here to fix what ain't broke. Instead, that urge should have been channeled to popularize the placing of some mark on "Noon" here and in similar cases to indicate that it is the "Noon Ghunna" vowel modifier and not the true consonant "Noon". In general, "Dochashmi Hay" should not be used in any Urdu word that is derived from Arabic or Persian, since these languages do not have aspirated letters.

Aspirated letters can occur only in the words of Indic origin. There is an exception to the rule that "Goal Hay" must be pronounced with an "h" sound. An exception to that exception occurs sometimes, and the terminal "Hay" is actually pronounced as H, not A or E. However, the oddities of Urdu orthography do not end here.

In the words ending in a pronounced Goal Hay which is not isolated but connected to the previous letter, the Hay is often written twice! For example, the word "kah" meaning say! The purpose of doubling the Goal Hay is ostensibly to avoid its being wrongly pronounced as A or E. The reason for writing the Goal Hay twice in this word seems to be just the whim of the scribe rather than any logical need.

In general, you will find that the spelling variation of doubling the Goal Hay is practiced unpredictably and rather inconsistently! Other punctuation symbols such as question mark, exclamation, comma, semicolon, parentheses, brackets, braces, double and single quotation marks, etc.

Punctuation symbols are appropriately reversed or inverted to match the right-to-left flow of text. The Traditional Arabic digit forms are used in Arabic documents. The Eastern Arabic forms are commonly used in Urdu documents with Nastaleeq fonts and in Persian documents with both Naskh and Nastaleeq fonts. An Urdu document produced in a Naskh font looks better when the Traditional rather than Eastern Arabic digits are used.
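The two families of digit forms can be sketched in Python as follows. This example assumes that the "Traditional Arabic" forms correspond to the Unicode range 0660 to 0669 and the "Eastern Arabic" forms to the range 06F0 to 06F9; that mapping, and the names used in the code, are assumptions made for illustration and are not part of the keyboard layout itself.

```python
WESTERN = "0123456789"

# Assumption: "Traditional Arabic" digits are U+0660..U+0669
# (named ARABIC-INDIC DIGIT in Unicode).
TRADITIONAL_ARABIC = "".join(chr(0x0660 + i) for i in range(10))

# Assumption: "Eastern Arabic" digits, common in Urdu and Persian text,
# are U+06F0..U+06F9 (named EXTENDED ARABIC-INDIC DIGIT in Unicode).
EASTERN_ARABIC = "".join(chr(0x06F0 + i) for i in range(10))

TO_TRADITIONAL = str.maketrans(WESTERN, TRADITIONAL_ARABIC)
TO_EASTERN = str.maketrans(WESTERN, EASTERN_ARABIC)


def convert_digits(text: str, table: dict) -> str:
    """Replace Western digits in text using the given translation table."""
    return text.translate(table)


year = "2024"
print(convert_digits(year, TO_TRADITIONAL))  # Traditional Arabic forms
print(convert_digits(year, TO_EASTERN))      # Eastern Arabic forms

# Python's int() accepts any Unicode decimal digits, so the value survives the conversion.
assert int(convert_digits(year, TO_EASTERN)) == 2024
```

Both sets encode the same numeric values; only the code points, and hence the glyphs a font draws for them, differ.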

The number 3. Separating groups of digits by thousand, million, billion. The more traditional separation is by hazaar, laakh (lac), karoR (crore), arab, kharab, etc. The abbreviation equivalent to "A. This move is accompanied by the adoption of British-American style of formatting numbers, that is, using a period for the decimal point and a comma for the thousands separator. The apostrophe ' key on the Urdu-QWERTY keyboard generates the period symbol, and can be typed as the decimal point to go with Western digit characters.

For example, to produce 3. Western numerals with decimals can thus be typed without needing to switch to the English keyboard or to use any modifier keys. People accustomed to Nastaleeq publications will discover that the documents composed in Naskh have spaces and other punctuation separating each pair of adjacent words.

This is the correct and rational approach to word processing, shared by every non-Nastaleeq word processor in the world. Nastaleeq word processors stand alone in suppressing inter-word spaces.

The user, of course, still has to type spaces to signify ends of words, but those spaces are removed and the words follow each other in a continuous stream. Just imagine reading this English page if it did not include any spaces between words. Deciphering such a character stream requires, in essence, that you already know what you are trying to learn! But that's exactly what is expected of you when you are reading a text composed in Nastaleeq.
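The difficulty can be made concrete with a small Python sketch that lists every way a space-free character stream can be split into known words. The function name `segmentations` and the tiny English vocabulary are hypothetical and chosen only for illustration; real Urdu text would need a proper word list.

```python
def segmentations(stream: str, vocabulary: set) -> list:
    """Return every way to split stream into a sequence of words from vocabulary."""
    if not stream:
        return [[]]  # one way to split the empty string: no words at all
    results = []
    for end in range(1, len(stream) + 1):
        head = stream[:end]
        if head in vocabulary:
            # Keep this word and split the remainder recursively.
            for rest in segmentations(stream[end:], vocabulary):
                results.append([head] + rest)
    return results


vocabulary = {"no", "now", "where", "here", "nowhere"}
for words in segmentations("nowhere", vocabulary):
    print(" ".join(words))
# Prints three readings: "no where", "now here", "nowhere"
```

Even this seven-letter stream has three valid readings, and nothing in the stream itself says which one the writer intended.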

Because some Urdu letters e.

When inter-word spaces are used, there is no confusion between any unintended "words" and the intended words because the beginning and end of each intended word is clearly delimited. But in the text edited with current Nastaleeq word processors, the only reason you are able to skip over the unintended "words" is that you already know the intended words, not because the text display is of any help! When computer typesetting of Nastaleeq was first introduced for Urdu in the s, inter-word spaces were actually employed.

The practice of suppressing them is more recent. This unwise retrogression, justified in the name of "tradition and esthetics", is an unnecessary obstacle to anyone trying to learn Urdu. The Nastaleeq script already suffers from too many complexities, obscurities, irregularities, and inconsistencies. It makes no sense to invent more barriers to the accessibility of Urdu. The practice simply prolongs the time it takes students to master the language.

It is also hindering the development of optical character recognition and other important electronic processing technologies for Urdu. Exercise for the reader: Find out what ghatrabood is, and enjoy the story. Not everyone spells English words in Urdu in the same way, but some common conventions are the following: All English consonants have reasonable equivalents in the Urdu alphabet. If an English word begins with the letter "S" that is immediately followed by one of "c", "k", "m", "n", "p", "q", and "t", then the Urdu transliteration often adds an initial "alif" to the normally expected "seen".
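The convention about an initial "S" can be written as a simple check. The Python sketch below only reports whether an English word matches the pattern; the function name is hypothetical, and since the convention is a tendency rather than a fixed rule, a result of True means only that an initial "alif" is often added.

```python
# Letters that, when they follow an initial "s", often trigger an added "alif".
TRIGGER_LETTERS = set("ckmnpqt")


def often_takes_initial_alif(word: str) -> bool:
    """Return True if word starts with "s" followed by one of the trigger letters."""
    w = word.lower()
    return len(w) >= 2 and w[0] == "s" and w[1] in TRIGGER_LETTERS


for word in ["school", "station", "speed", "slow", "sun"]:
    print(word, often_takes_initial_alif(word))
```

Here "school", "station", and "speed" match the pattern, while "slow" and "sun" do not.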

However, this rule (the addition of "alif") is not followed uniformly. Also, the "alif" is never added when the letter following "S" is "l". The transliteration of some English vowels is not phonetically correct, but the practices are too firmly entrenched to do anything about them. Two widely used conventions are the following: The long "a", the open "o", and the diphthongs like "au" and "aw" are generally rendered as "alif".

Other examples are: The diphthong "oi" sometimes gets the same treatment. Thus, "vice" and "voice" as in "Voice of America"! The vowel "e" is usually rendered as "Yeh" with a majhool sound. So if you can read a word written in Hindi, you can spell it in Urdu very easily. But many Urdu writers do not read Hindi and rely on English transliteration of Hindi words to render them into Urdu. This is complicated by some differences in Hindi, English, and Urdu spelling conventions.

As quite a few people are unaware of these conventions, some very poor transliterations of Hindi words are showing up in Urdu publications. It is a pity that many Hindi words that were once part of Urdu's own rich vocabulary are now being misspelled and mispronounced because Urdu has started to isolate itself from its Indic source and heritage. When transliterating Hindi words to Urdu, you need to pay special attention to these cases: Unless modified in some special way, every consonant letter in a Hindi word is supposed to be pronounced as if it is followed by the vowel "a".

In actual practice, this implicit "a" is not pronounced, or is articulated too speedily to be perceptible. But the convention in English transliteration of Hindi words has been to record this sound with an "a" anyway. In Urdu, the standard transliteration practice has been to only represent the actual Hindi letters, and not add this artificial terminal "a". So in Urdu the above words are spelled and pronounced as: In Hindi sometimes two words are joined together without any separator between them. This often happens when the two words are parts of a name. Without vocalization, they are often pronounced even more awfully as "koru" and "pandu"!

But in most cases when "v" occurs at the end of a syllable, it is pronounced as the vowel "o" or "u", and rarely as the consonant "v". Can you tell whether or not the letter "a" in an English transliteration corresponds to a true Hindi vowel "a"? No, because, unfortunately, the English transliteration is too ambiguous to settle that question.

So just look up the word in a Hindi dictionary! The above conventions are illustrated in the following list of Hindi words and their Urdu transliterations: The Ottoman and Modern Turkish languages do not differ much, but their scripts are totally different because Modern Turkish uses a script based on Latin characters. However, since there exists a huge volume of older Turkish publications and manuscripts written in the Ottoman script, this script still remains of interest as an essential scholarly tool and is taught in most departments of Turkish Studies.

It is suitable, for example, for preparing the contents of older texts for linguistic analyses. So with this keyboard layout installed, your computer will be ready for Ottoman Turkish texts. To download and install this keyboard, see the earlier section Installing the Urdu keyboard layout. Also, you will need to install some suitable fonts in order to be able to read, edit, and produce documents in Ottoman Turkish. Some beautiful, freely available fonts are recommended in Installing Urdu Fonts. The fonts and general instructions given here will work on the Windows and Linux machines also.

The following diacritics are used in a standard way: Some diacritics that are used in old Ottoman Turkish texts in non-standard ways are the following: The example below shows a sample text typed in Ottoman Turkish (right), together with its equivalent in Modern Turkish (left). Incidentally, this entire two-column table has been prepared by using nothing else but the Mac OS's built-in TextEdit utility. The Ottoman text part that you see on the right in the table has been typed using the exquisite Lateef font, which can be downloaded and installed as described in the section Installing Urdu Fonts.

Both the typed (right) and the transcribed (left) versions shown below have been taken from the Web site of the Department of Turkish Studies at the University of Michigan. There you can also see the image of the original manuscript in the Osmani calligraphic style (referred to as Nastaleeq in Urdu and Persian calligraphy). For large amounts of typing in Arabic, this is inconvenient and error prone. With this layout, you can type all the Arabic script characters that these languages use.

In addition, it also has keys for several mathematical and technical symbols. It can be downloaded from here. It provides a variety of "combining diacritical marks" to type "extended characters" of Latin script. The keyboard also includes several math symbols, especially those commonly used in logic, set theory, and number theory, and the most frequently used Greek letters. Shift-0 with CAPS on.

