The Vietnamese language is written in a Latin script with diacritics (including tone marks), which requires several accommodations when typing on phones or computers. Software-based input systems let users write Vietnamese on phones or computers, using either software built into the device or third-party software such as UniKey. Telex is the oldest input method devised to encode the Vietnamese language with its tones. Other input methods include VNI (which uses the number keys) and VIQR. The VNI input method is not to be confused with the VNI code page.
Historically, Vietnamese was also written in chữ Nôm, which in recent times is used mainly for ceremonial and traditional purposes and otherwise remains the province of historians and philologists. There have been attempts to type chữ Hán and chữ Nôm with existing Vietnamese input methods, but they are not widespread.[1][2] Sometimes, Vietnamese is typed without tone marks, which Vietnamese speakers can usually infer from context.
There are as many as 46 character encodings for representing the Vietnamese alphabet.[3] Unicode has become the most popular encoding for many of the world's writing systems, due to its broad compatibility and software support. Diacritics may be encoded either as combining characters or as precomposed characters, which are scattered throughout the Latin-1 Supplement, Latin Extended-A, Latin Extended-B, and Latin Extended Additional blocks. The Vietnamese đồng symbol is encoded in the Currency Symbols block.
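The difference between combining and precomposed encodings can be illustrated with a short Python sketch that uses only the standard `unicodedata` module. The helper name `same_text` is hypothetical and introduced here for illustration; the sketch assumes that comparing strings after NFC normalization is acceptable for the application.

```python
import unicodedata

PRECOMPOSED = "\u1ebf"        # ế as one code point (Latin Extended Additional)
COMBINING = "e\u0302\u0301"   # e + combining circumflex + combining acute


def same_text(a, b):
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(PRECOMPOSED == COMBINING)           # False: different code point sequences
print(same_text(PRECOMPOSED, COMBINING))  # True: canonically equivalent
print(len(unicodedata.normalize("NFD", PRECOMPOSED)))  # 3
print(unicodedata.name("\u20ab"))         # DONG SIGN
```

Software that searches, sorts, or compares Vietnamese text generally needs to normalize both sides first, since either form may appear in user input.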
For systems that lack support for Unicode, dozens of 8-bit Vietnamese code pages have been designed.[3] The most commonly used of them were VISCII, VSCII (TCVN 5712:1993), VNI, VPS and Windows-1258.[8][9] Where ASCII is required, such as when ensuring readability in plain text e-mail, Vietnamese letters are often encoded according to Vietnamese Quoted-Readable (VIQR) or VSCII Mnemonic (VSCII-MNEM),[10] though usage of either variable-width scheme has declined dramatically following the adoption of Unicode on the World Wide Web. For instance, support for all above mentioned 8-bit encodings, with the exception of Windows-1258, was dropped from Mozilla software in 2014.[11]
Many Vietnamese fonts intended for desktop publishing are encoded in VNI or TCVN3 (VSCII).[9] Such fonts are known as "ABC fonts".[12] Popular web browsers lack support for specialty Vietnamese encodings, so any webpage that uses these fonts appears as unintelligible mojibake on systems without them installed.
Vietnamese often stacks diacritics, so typeface designers must take care to prevent stacked diacritics from colliding with adjacent letters or lines. When a tone mark is used together with another diacritic, offsetting the tone mark to the right preserves consistency and avoids slowing down saccades.[13] In advertising signage and in cursive handwriting, diacritics often take forms unfamiliar to other Latin alphabets. For example, the lowercase letter I retains its tittle in ì, ỉ, ĩ, and í.[14] These nuances are rarely accounted for in computing environments.
Vietnamese writing requires 134 additional letters (counting both cases) besides the 52 already present in ASCII.[15] This exceeds the 128 additional characters available in a conventional extended ASCII encoding. Although this can be solved by using a variable-width encoding (as is done by UTF-8), other encodings have used a number of approaches to support Vietnamese without doing so.
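The figure of 134 can be reproduced with a small Python sketch. It assumes the usual inventory of 12 vowel letters, each of which can appear unmarked or with one of five tone marks, plus the letter đ, in both cases. The function names are hypothetical.

```python
import unicodedata

BASE_VOWELS = unicodedata.normalize("NFC", "aăâeêioôơuưy")
# No mark, grave, acute, hook above, tilde, dot below.
TONE_MARKS = ["", "\u0300", "\u0301", "\u0309", "\u0303", "\u0323"]


def vowels_and_d():
    """Every toned vowel in both cases, plus đ and Đ (hypothetical helper)."""
    letters = set()
    for vowel in BASE_VOWELS:
        for mark in TONE_MARKS:
            lower = unicodedata.normalize("NFC", vowel + mark)
            letters.add(lower)
            letters.add(lower.upper())
    letters.update(["\u0111", "\u0110"])  # đ and Đ
    return letters


def non_ascii_letters():
    """The subset of those letters that plain ASCII cannot represent."""
    return {ch for ch in vowels_and_d() if ord(ch) > 127}


print(len(non_ascii_letters()))  # 134
```

The count works out as 12 vowels × 6 tone forms = 72 per case, minus the 6 unmarked ASCII vowels, plus đ, giving 67 per case and 134 in total, which is more than the 128 slots left over in an 8-bit code page.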
Many fonts support a subset of the Latin writing system that omits much of the Vietnamese alphabet. Due to the high density of Vietnamese-specific characters in Vietnamese text, Web browsers that implement font substitution reliably produce a ransom note effect when the webpage specifies an inadequate font.
Unicode includes over 10,000 Nôm characters as part of Unicode's repertoire of CJK Unified Ideographs. Of these characters, 10,082 can be found in the CJK Unified Ideographs Extension B block, while the rest are distributed between the CJK Unified Ideographs, CJK Unified Ideographs Extension A, and CJK Unified Ideographs Extension C blocks. A further 1,028 characters, including over 400 characters specific to the Tày language, are encoded in the CJK Unified Ideographs Extension E block. The characters are taken from the Vietnamese standards TCVN 5773:1993 and TCVN 6909:2001 [error for TCVN 6056:1995?], as well as from research by the Han-Nom Research Institute and other groups.[18] All the characters in TCVN 5773:1993 and about 95% of the characters in TCVN 6909:2001 [error for TCVN 6056:1995?] have corresponding codepoints in Unicode 5.1, though TCVN 5773:1993 itself mapped most of its characters to the Private Use Area of Unicode.[19] Unicode 13.0 added two diacritical characters to the Ideographic Symbols and Punctuation block that were commonly used to indicate borrowed characters in chữ Nôm.[20][21]
A purely physical Vietnamese keyboard would be impractical, due to the sheer number of letter-diacritic-diacritic combinations in the alphabet, e.g. ờ, ị. Instead, Vietnamese input relies on formulaic software-based keyboard layouts, virtual keyboards, or input methods (also known as IMEs).
Vietnamese keyboard layouts rely on dead keys to compose letters with diacritics. Most desktop operating systems include a Vietnamese keyboard layout similar to TCVN 6064:1995, a Vietnamese national standard. Previously, typewriters used an AZERTY-based Vietnamese layout (AĐERTY).[25]
The three most common Vietnamese input methods are Telex, VNI, and VIQR. Telex indicates diacritics using letters that are unlikely to appear at the end of a word, while VNI repurposes the number keys or function keys and VIQR repurposes various punctuation marks. The Telex and VIQR conventions originated in an earlier era of telex machines and typewriters, respectively.
Support for these input methods is provided by input method editors (IMEs), which are known in Vietnamese as bộ gõ, literally "peckers" or "percussion" in more general terms. IMEs may be provided by the operating system, installed as a third-party application, installed as a browser extension, or provided by an individual website in the form of a script. Common third-party applications include GoTiengViet, UniKey, VietKey, VPSKeys, WinVNKey, and xvnkb. On Unix-like operating systems, the IBus and SCIM frameworks both support Vietnamese. IME scripts such as AVIM, Mudim, and VietTyping can be found on most Vietnamese message boards, the Vietnamese Wikipedia, and other text-intensive websites. The Vietnamese Web browser Cốc Cốc comes with an input method built-in.
Borrowing a feature common amongst Chinese input methods, some Vietnamese IMEs allow one to skip diacritics altogether and instead, after typing the base letters, the user can select the accented word from a candidate list. In order to provide this autocomplete list, the IME may need to communicate with a Web service. Some IMEs also use candidate lists to allow the user to convert text from the Vietnamese alphabet to chữ Nôm, because there is no one-to-one correspondence between alphabetic words and Nôm characters.
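A candidate list of this kind can be sketched in Python as a lookup keyed by the diacritic-free spelling. This is a toy illustration rather than how any particular IME works: the vocabulary `VOCAB` and the functions `strip_diacritics`, `build_index`, and `candidates` are all hypothetical, and a real IME would typically rank candidates using a dictionary or language model, possibly fetched from a Web service.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(word):
    """Remove tone and vowel marks and map đ to d (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # đ has no canonical decomposition, so it is replaced explicitly.
    return base.replace("\u0111", "d").replace("\u0110", "D")


def build_index(vocabulary):
    """Group accented words under their lowercase unaccented key."""
    index = defaultdict(list)
    for word in vocabulary:
        index[strip_diacritics(word).lower()].append(word)
    return index


def candidates(index, typed):
    """Return the accented words matching what the user typed."""
    return index.get(typed.lower(), [])


# Toy vocabulary, assumed for illustration only.
VOCAB = ["ma", "má", "mà", "mả", "mã", "mạ", "đi", "viết", "việt"]
INDEX = build_index(VOCAB)

print(candidates(INDEX, "ma"))    # six candidates that differ only in tone
print(candidates(INDEX, "viet"))  # two candidates
```

The same structure extends to chữ Nôm conversion in principle, since one alphabetic syllable can map to several candidate characters.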
Typical Vietnamese text contains a high proportion of compound words. Compound words are never hyphenated in contemporary usage, so spell checkers are limited to checking individual syllables unless a statistical language model is consulted.
Vietnamese has rigid spelling rules and few exceptions, so text-to-speech engines may avoid dictionary lookups except when encountering a foreign loan word. TTS engines must account for tones, which are essential to the meaning of any Vietnamese word, e.g. má (mother) is a different word from mà (but).
Internationalized user interfaces are generally unable to use the full complement of Vietnamese pronouns that would be expected in a traditional social setting, even when much is known about the user. Instead, user interfaces typically use generic pronouns such as tôi and bạn, some of which make potentially incorrect assumptions about the user's age and relationship to other users. For example, when a social media platform notifies a user about a younger user, it may refer to the latter in the third person as anh ấy instead of em ấy, leading the user to misinterpret the notification as a reference to someone else.[26]
To type Vietnamese on a computer, you will need a Vietnamese keyboard. In my first post, I mentioned only one way to get one: installing the UniKey software. Back in the day that was the only way, but Windows and Mac now have built-in Vietnamese keyboards as well.
Most computers and smartphones have a Vietnamese keyboard option. The one I use is Vietnamese Telex. Once you get the hang of it, it becomes easier than having a Vietnamese keyboard with all the different tones and special letters.
From time to time, I have the pleasure of hearing a claim so outrageous, it's almost unbelievable. One such claim is that a computer or OS is too slow or laggy to type ordinary documents on. I heard it from two different people on two different occasions.
First, a little bit of clarification. The stories I heard were not about typing English on a computer; they were about typing Vietnamese. Vietnamese is basically Latin with a couple of funny hooks on top of the characters, like this: Tiếng Việt. One of the most popular methods for typing Vietnamese (it has now become the de facto standard) is TELEX. TELEX uses the unmodified Latin keyboard and expresses the hooks through a few simple rules. For example, to get Tiếng Việt, you type Tieengs Vieetj. There, two es make the ê, s turns it into ế, and j turns it into ệ. Creating an IME for Vietnamese is therefore a tedious job, but there are no heuristics involved, so it is possible to make a very high-performance IME. This problem is very much solvable and has been solved many times.
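To show how mechanical these rules are, here is a rough Python sketch of a TELEX converter. It is not how UniKey or any other real IME is implemented, and it makes simplifying assumptions: the tone key must be the last key of the word, the tone placement rule is a simplified one, and it will happily mangle non-Vietnamese words. The names `telex_word` and `telex` are made up for this example.

```python
import unicodedata

# Key pairs that produce a modified letter.
LETTER_RULES = {"aa": "â", "aw": "ă", "ee": "ê", "oo": "ô",
                "ow": "ơ", "uw": "ư", "dd": "đ"}
# Tone keys mapped to combining marks: acute, grave, hook, tilde, dot below.
TONE_KEYS = {"s": "\u0301", "f": "\u0300", "r": "\u0309",
             "x": "\u0303", "j": "\u0323"}
VOWELS = set("aăâeêioôơuưy")
MODIFIED = set("ăâêôơư")


def telex_word(keys):
    """Convert the keystrokes for one word (tone key last, simplified)."""
    tone = TONE_KEYS.get(keys[-1:].lower())
    body = keys[:-1] if tone else keys
    out, i = [], 0
    while i < len(body):
        pair = body[i:i + 2].lower()
        if pair in LETTER_RULES:
            ch = LETTER_RULES[pair]
            out.append(ch.upper() if body[i].isupper() else ch)
            i += 2
        else:
            out.append(body[i])
            i += 1
    if not tone:
        return "".join(out)
    vowel_idx = [k for k, ch in enumerate(out) if ch.lower() in VOWELS]
    if not vowel_idx:
        return keys  # nothing can carry the tone: leave the keys untouched
    # In "qu" and "gi" the second letter belongs to the initial consonant.
    prefix = "".join(out[:vowel_idx[0] + 1]).lower()
    if len(vowel_idx) > 1 and prefix in ("qu", "gi"):
        vowel_idx = vowel_idx[1:]
    modified = [k for k in vowel_idx if out[k].lower() in MODIFIED]
    if modified:
        target = modified[-1]
    elif out[-1].lower() not in VOWELS or len(vowel_idx) == 1:
        target = vowel_idx[-1]
    else:
        target = vowel_idx[-2]
    out[target] = unicodedata.normalize("NFC", out[target] + tone)
    return "".join(out)


def telex(text):
    """Convert space-separated TELEX keystrokes word by word."""
    return " ".join(telex_word(word) for word in text.split(" "))


print(telex("Tieengs Vieetj"))  # Tiếng Việt
```

Everything here is table lookups and a single pass over the word, which is why keystroke handling on its own should not be what makes a machine feel slow.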