The state of Burmese Unicode

This past week, I switched over to a new Android phone, the much-touted Google Nexus 5, a Google-branded smartphone running the latest version of the Android OS, version 4.4 (aka KitKat). I’ve been impressed with the phone’s build and software capabilities, but the real gem I unearthed, while browsing through my music library, was that it has full support for the Burmese language. That’s right, Android 4.4+ now has support for Burmese.

Now, if only there were some Burmese developers who could create a Burmese language keyboard for Android… I did spot some issues, namely with regard to proper line spacing of the characters to ensure the diacritics are fully displayed and not cut off. A screenshot of a Burmese Wikipedia article on the Chrome app below:

This should be exciting news in Burma, not least because Android smartphones dominate the Burmese market, as they do in other emerging markets. And this should do some good in hastening the much-needed disappearance of the Zawgyi font.

I went to the Android Developer website and confirmed that this wasn’t just a fluke: Unicode support for Burmese and even minority languages based on the Burmese script is all but complete!

Now, if only usage of Unicode were more prevalent among the Burmese community… The Unicode block dedicated to Burmese has been around for several years, but among Burmese language websites, Zawgyi, a substandard Burmese font, still reigns as king. Zawgyi websites still outnumber websites based on the Unicode standard, even though the major operating systems, namely Windows 8 (link) and OS X (link), now provide support for Burmese or Myanmar Unicode.

The Unicode block dedicated to Myanmar (Burmese) and related scripts (Mon, Karen, Kayah, Shan, and Palaung, as well as Pali and Sanskrit):

More information on Burmese and related Southeast Asian scripts can be found in The Unicode Standard (link).
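
For readers who want to poke at the block themselves, here is a minimal Python sketch using the standard library's unicodedata module. It assumes only that the main Myanmar block runs from U+1000 to U+109F; the helper names describe and in_myanmar_block are my own, made up for illustration.

```python
import unicodedata

# Main Myanmar block in the Unicode Standard: U+1000 through U+109F.
MYANMAR_BLOCK = range(0x1000, 0x10A0)


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


def in_myanmar_block(ch: str) -> bool:
    """True if the character's code point falls inside the main Myanmar block."""
    return ord(ch) in MYANMAR_BLOCK


# Print a few code points from the block with their official names.
for code_point in (0x1000, 0x101D, 0x1040):
    print(describe(chr(code_point)))
```

Swapping other code points into the loop is an easy way to see which letters belong to Burmese proper and which were added for the related scripts.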

The following are some reasons why Zawgyi is inefficient:

  • Character ordering does not comply with the Unicode standard.
  • Accessibility (search, retrieval, sorting, etc.) is limited, because character ordering can be done numerous ways, and with numerous characters. A favorite example of mine is typing the Burmese numeral ‘0’ (၀) for the Burmese letter ‘wa’ (ဝ).
  • Zawgyi has no support for Sanskrit, Shan, Mon, and other ethnic minority languages based on the Burmese script.
  • Segmentation is nonexistent (i.e. Zawgyi doesn’t recognize a syllable unit, so words can’t segment properly as they would with Unicode, leading to orphans and widows).
  • Zawgyi violates Unicode principles by reserving multiple code points for a single character or diacritic (such as different sizes of the ya-yit depending on the character’s width and different diacritic combinations).
  • Existence of two competing standards ultimately stifles information flow and access to news and information.
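
The ‘0’ versus ‘wa’ mix-up from the list above is easy to demonstrate. The two characters look nearly identical on screen, but they are different code points with different Unicode categories, so a plain text search treats them as unrelated. The sketch below is only an illustration: the helper names kind and found are hypothetical, and the sample syllable is simply ‘wa’ followed by the tall AA vowel sign.

```python
import unicodedata

WA = "\u101D"       # MYANMAR LETTER WA
ZERO = "\u1040"     # MYANMAR DIGIT ZERO
TALL_AA = "\u102B"  # MYANMAR VOWEL SIGN TALL AA


def kind(ch: str) -> str:
    """Return the character's Unicode name and general category."""
    return f"{unicodedata.name(ch)} ({unicodedata.category(ch)})"


def found(query: str, text: str) -> bool:
    """Plain substring search, as a naive search box would do."""
    return query in text


# The same syllable typed with the letter, and typed with the look-alike digit.
typed_with_letter = WA + TALL_AA
typed_with_digit = ZERO + TALL_AA

print(kind(WA))    # a letter (category Lo)
print(kind(ZERO))  # a decimal digit (category Nd)
print(found(typed_with_letter, typed_with_digit))  # False: the code points differ
```

Anyone searching for the correctly spelled syllable will miss every page where the digit was typed instead, which is the accessibility problem in a nutshell.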

There was a time and a place for Zawgyi (Lionslayer provides a good background on Zawgyi), namely when computers lacked support for complex Indic scripts, but that time is over. As major OSes roll out support for the Unicode standard, it’s absolutely necessary that major Burmese language players jump on the bandwagon as well. But at the end of the day, it’s an issue of end user adoption. How do major players relay that message to the ordinary individual, when Zawgyi has been used as a crutch all these years? It’s definitely an uphill battle, especially when it comes to converting content from Zawgyi to Unicode and informing users. This link provides useful tools for anybody interested in getting started.
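
Before any content can be converted, it helps to guess which encoding a piece of text is in. One ordering difference gives a crude signal: in Unicode the vowel sign E (U+1031) is stored after its consonant even though it is drawn to the left of it, whereas Zawgyi stores it in visual order, before the consonant. The Python sketch below flags text in which U+1031 appears at the very start or right after whitespace. The function name looks_like_zawgyi is hypothetical, and this is a heuristic resting on that single assumption, not a real detector or converter; text that never uses this vowel passes through unflagged.

```python
import re

# U+1031 is MYANMAR VOWEL SIGN E. In Unicode storage order it follows the
# consonant it belongs to, so it should never open a run of text.
_E_VOWEL_FIRST = re.compile(r"(?:^|\s)\u1031")


def looks_like_zawgyi(text: str) -> bool:
    """Heuristic guess: True if the E vowel sign starts the text or follows whitespace."""
    return bool(_E_VOWEL_FIRST.search(text))


unicode_order = "\u1000\u1031"  # consonant KA, then vowel sign E
visual_order = "\u1031\u1000"   # vowel sign E typed first, as Zawgyi does

print(looks_like_zawgyi(unicode_order))
print(looks_like_zawgyi(visual_order))
```

A production tool would need far more evidence than one vowel, but even this much shows why the two encodings cannot be mixed safely on the same page.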

Out of curiosity, I conducted a fairly comprehensive survey of 40 Burmese government websites (using Google, primarily searching for websites in the domain) to see which ones have adopted Unicode and which ones are still using Zawgyi. I found that a promising 30% now use Unicode, but the plurality, 45%, use Zawgyi for their Burmese language sites. However, a startling 25% don’t even offer a Burmese version of their websites. I did notice a handful of websites that offer both Unicode and Zawgyi versions, a concession to users who have yet to make the switch.
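
The percentages above follow from the tallies in the inventory further down (12 Unicode, 18 Zawgyi, and 10 with no Burmese version, out of 40 sites). A quick Python check, with variable names of my own choosing:

```python
# Tallies taken from the inventory of surveyed government websites.
counts = {"Unicode": 12, "Zawgyi": 18, "No Burmese version": 10}

total = sum(counts.values())

# Share of each group as a percentage of all surveyed sites.
shares = {label: 100 * n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.0f}%")
```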

There should be a concerted effort by Burmese government agencies to all adopt the Unicode standard, especially while IT penetration is still in its nascency. As Burma’s population of internet and computer users grows, consolidating around the Unicode standard will only become more difficult. This is a perfect opportunity to steer Burmese language users in the right direction.

The official website of the Amyotha Hluttaw (Upper House of the Parliament) gets an F: it renders its Burmese text using Zawgyi, as exemplified by the butchered and corrupted text on my tablet:

The Official Website of the President’s Office receives an A+: I noticed that the website does offer Zawgyi users an option, but the website itself is in full compliance with Unicode.

For the curious, inventory of surveyed websites:
40 (10 with no Burmese language site, 12 Unicode, 18 Zawgyi)

Surveyed websites, grouped by answer pair (Burmese language site available? / Burmese Unicode support?):

  • Y / Y: Burmese language site that uses Unicode
  • N / N: no Burmese language site
  • Y / N: Burmese language site that uses Zawgyi

If any of you have any input or suggestions to make, leave a comment! I’m very eager to hear your thoughts on making the transition to Unicode smoother, or promoting greater usage of Unicode for Burmese.


15 thoughts on “The state of Burmese Unicode”

  1. K Sint says:

    It’s nice to see the adoption of Burmese Unicode. That would mean no longer having to install the Zawgyi fonts for my parents. It is strange that iOS 7 is lacking Burmese Unicode even though the Mac has support. If I visit a web page or Facebook with Burmese text on the iPad, it freezes the app for a few seconds.

    It’s nice to see a new blog post. I love reading it. Do you have any recommendations on reading about Burmese Americans from a sociological or historical background? For some reason, Burmese Americans are not well known among Asian Americans.

    • Aung Kyaw says:

      Thanks, K. Sint! Always glad to hear from readers. Definitely, Google has the clout and user base to really drive Burmese speakers toward Unicode. Hopefully in the next 2 years we’ll have all moved away from Zawgyi.

      As for readings on the Burmese American community, I’m not aware of any particular studies. Joseph Cheah has done some research on the Burmese American Buddhist community, such as “Cultural Identity and Burmese American Buddhists”, but you’ll need academic access to read that paper. I feel that the Burmese American community is way too segmented along religious (Buddhist, Muslim, Christian, etc.), ethnic (Burman, Sino Burmese, Burmese Indian, other ethnic minorities) and cultural lines to have visibility in the AAPI community.

  2. Thaths says:

    “I did spot some issues, namely with regard to proper line spacing of the characters to ensure the diacritics are fully displayed and not cut off.”

    Hi, can you please send me more details (annotated screenshots, unicode text, etc.) that show these issues so that we can fix it in the next version?

  3. Myo Win says:

    Thank you for the great article. I bought a Nexus 7 (2013), which has pure Google Android 4.4.2 on it. I was amazed to see Unicode Burmese already available on my tablet. I have not rooted it or installed the Zawgyi font on it, partly because I agree that the Zawgyi font does not comply with Unicode. But here is the trouble I am running into: I am not able to see Burmese text rendered properly. All I am seeing is text that includes the dotted circle, and the characters are not stacking up properly. Is there any way I can change that, or do you know of any solution? Thank you.

  4. MHK99 says:

    I agree with 50Viss, why not make it 100viss. Ten years ago there was a big push for Myanmar Unicode (or Burmese Unicode if you will). Unfortunately the push did not pick up speed as expected, mainly due to the resistance of the users. People were used to the old way of typing, starting from the old days of Win fonts introduced by Ko Zaw Htut (with my due respect). Zawgyi came out with the force (perhaps with the blessings of the Jedi). Zawgyi was adopted by the users, and Myanmar Unicode, when it came out, was at a disadvantage. Few people would like to use it as the keystrokes are a bit strange, i.e., not exactly as we are taught in kindergarten.

    Somehow we will have to adopt a Unicode compliant version as soon as possible. I do not see similar problems with Thai or Cambodian scripts although they also use the Sanskrit based script. When we do not use Unicode the epub versions of the texts do not render properly (or with great difficulty).

    The government sector should use Unicode fonts in web as well as official correspondence.

    Hopefully, we will see Unicode adopted in the near future.

  5. Khun Tharrnyi says:

    I wanna use the unicode font used in Google. But I don’t know yet! Please help me. Tharlon or Pyidaungsu or so on.

  6. Michael Tan says:

    Hi. I’m not Burmese but my work involves technology in Burma.

    This article is 2 years old, so let me attempt an update.

    Today’s Burma still does not have Google Play full support – many apps will NOT appear if your resident location is determined to be Myanmar. Eg: Viber, one of the most popular apps, cannot be downloaded directly.

    Because of this, the tens of thousands of mobile shops directly install apps onto phones, including keyboards. Most of these apps are old and not updated, and they support Zawgyi only, not Unicode. Can you imagine having apps for a couple of years with no updates?

    This is a vicious cycle. Everyone types in Zawgyi, and Viber conversations and Facebook comments all can’t be read by a Unicode-only phone. Zawgyi support is hacked into phones, especially Samsungs.

    Now, Android 5 comes out and supports Unicode out of the box. Guess what? Everybody hates these phones because they lack Zawgyi support. So Zawgyi is again hacked into those new Android 5 phones if possible.

    I’ve tried various phones (Samsung, LG, Xiaomi, Huawei), and the only phones to date that support both Zawgyi and Unicode out of the box are the new Huawei models. The Samsung Edge 6 with full Unicode support has been cursed to death; IIRC it’s still not possible to hack in Zawgyi support.
