Finally, Burmese joins Google Translate

Today, the official Google Translate blog announced that 10 new languages have been added to Google Translate, among them Burmese:

In India and Southeast Asia, we are adding Malayalam, Myanmar, Sinhala, and Sundanese.

The blog post also notes the challenges in translating Burmese, citing Burmese syntax and font encoding issues (ahem, Zawgyi):

Myanmar (Burmese, မြန်မာစာ) is the official language of Myanmar with 33 million native speakers. Myanmar language has been in the works for a long time as it’s a challenging language for automatic translation, both from language structure and font encoding perspectives. While our system understands different Myanmar inputs, we encourage the use of open standards and therefore only output Myanmar translations in Unicode.

Exciting news, to say the least, because Burmese is the last of the major Southeast Asian languages to be supported. Tagalog was included in one of the first stages, Indonesian and Vietnamese were added in 2008, Thai and Malay in 2009, Lao in 2012, and Khmer in 2013.

Initial impressions

I’m not going to lie. I haven’t been able to recreate a truly bizarre or amusing translation yet, so I’m pleasantly surprised.

Grammar and syntax

Burmese word order is subject-object-verb (SOV), unlike English, which is subject-verb-object (SVO), so I can see how word order can be completely thrown off when sentences are translated into English.

Translating simple sentences (Burmese to English and English to Burmese) produces readable, almost correct translations that are a few words off at most.

Translations of more complex sentences are still remarkably good, considering that the content and gist come through. I expect the translation quality to keep improving.

For example, I translated this paragraph from a Burmese news article (written in colloquial Burmese):

ဘားဆိုင် တာဝန်ရှိသူတွေဖြစ်တဲ့ မြန်မာနိုင်ငံသား နှစ်ဦးနဲ့ နယူးဇီလန် နိုင်ငံသား တဦးတို့ကို ဖမ်းဆီးထားပြီး ဆိုင်ကိုလည်း ယာယီ ပိတ်သိမ်းထားတယ်လို့ ဆိုင်တည်ရှိရာ နယ်မြေပိုင်ဖြစ်တဲ့ ဗဟန်းမြို့နယ် ရဲစခန်း စခန်းမှူး ဒု ရဲမှူး သိန်းဝင်းက သတင်း ထောက်တွေကို ပြောဆိုပါတယ်။

Google Translate translated this as:

Bar, two officials, such as Myanmar and New Zealand citizens were arrested after a temporary closure of the same area is located in Bahan Township police station camp said deputy police chief သိန်းဝင်း support.

A more accurate translation would be:

The bar’s responsible individuals, including 2 Burmese nationals and 1 New Zealand national, have been arrested and the shop has been temporarily closed, police officer, Vice Officer Thein Win, from Bahan Township Police Station, responsible for the shop’s jurisdiction, told the reporters.

Google Translate doesn’t appear to have trouble translating literary/formal Burmese either. In some ways, I can see how formal Burmese could be easier to translate, because its sentences tend to be formulaic, even though they are also quite long and convoluted. I found a press release on the President’s Office website, and its translation was fairly accurate.

Original Burmese text:

နိုင်ငံတော်သမ္မတ ဦးသိန်းစိန်သည် တရုတ်ပြည်သူ့သမ္မတနိုင်ငံ ပေကျင်းမြို့တွင် ကျင်းပခဲ့သည့် APEC CEO ထိပ်သီးအစည်းအဝေးသို့ တက်ရောက်ခဲ့ပြီး ၂၀၁၄ ခုနှစ် နိုဝင်ဘာ ၉ ရက် ညနေပိုင်းတွင် နေပြည်တော်သို့ ပြန်လည်ရောက်ရှိလာခဲ့ရာ နိုင်ငံတော်သမ္မတနှင့်အတူ ခရီးစဉ်တွင် လိုက်ပါသွားခဲ့သည့် ပြန်ကြားရေးဝန်ကြီးဌာန ပြည်ထောင်စုဝန်ကြီး ဦးရဲထွဋ်အား နေပြည်တော် အပြည်ပြည်ဆိုင်ရာ လေဆိပ်၌ တွေ့ဆုံကာ နိုင်ငံတော်သမ္မတ၏ ခရီးစဉ်နှင့်စပ်လျဉ်းသည့် အတွေ့အကြုံများကို မေးမြန်း ဖြစ်ခဲ့သည်။

Google Translate:

The President’s Republic of China in Beijing held APEC CEO Summit attended the 2014 November 9 evening to Naypyidaw back in the coming days, the President, with the tour now was the Ministry of Information, Union partners to Naypyidaw International Airport to meet with the President’s visit regarding experiences questioned the past decade.

A more accurate translation:

President U Thein Sein attended the APEC CEO top-level meeting held in Beijing, People’s Republic of China and returned to Naypyidaw on 9 November 2014 at night. Ministry of Information’s Union Minister U Ye Htut, who had accompanied the President on his journey, answered questions regarding the President’s trip at the Naypyidaw International Airport.

Romanization

Unfortunately, Google Translate stumbles on Burmese romanization. It uses a strange romanization system that seems to be a mishmash of Burglish and scholarly transcription systems (used by English-speaking academia). I think Google Translate’s romanization system is trying to do too much: both replicate pronunciation and retain orthographic integrity.

There’s a feeble attempt to render tones (double vowels and the use of “r” for longer toned words), while there’s also an attempt to retain spelling fidelity (using “s” for သ, spelling out silent consonants, such as “sany” for သည်).

For example, the Burmese sentence for “I love you very much” (မင်းကိုအရမ်းချစ်တယ်) is rendered as the odd-looking mainn ko aaramhkyittaal. It is neither here nor there, but then again, capturing both Burmese pronunciation and spelling in the Roman alphabet is very difficult.

Some comparisons to the “prevailing” romanization systems for Burmese:

မဟာသမိုင်းတော်ကြီးညွန့်ပေါင်း (Burmese)
mahar samine tawkyee nyw an paungg (Google Translate)
Mahā samuiṅʻʺ toʻ krīʺ ññvanʻʹ poṅʻʺ (ALA-LC: American Library Association and Library of Congress)
maha sa muing: tau kri: nywan. paung: (Myanmar Language Commission)
mahar thamine daw gyee nyunt paung (Burglish)
maha thamaingdawgyi nyunt paung (Fifty Viss)

Accessibility

At the bare minimum, the device needs to have Unicode support for Burmese (Windows 8+, Android 4.4+, etc.) in order to render characters correctly.
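If Burmese text shows up as empty boxes, it can help to confirm that the text itself is in the Unicode Myanmar block (U+1000 to U+109F), which would point to a missing font rather than bad input. Here is a minimal Python sketch of that check. The helper name `in_myanmar_block` is my own, and the sample string is just the word မြန်မာ written with escapes:

```python
import unicodedata


def in_myanmar_block(ch: str) -> bool:
    """Hypothetical helper: True if ch is in the main Myanmar block."""
    return "\u1000" <= ch <= "\u109f"


# The word "Myanmar" in Burmese script, written as escapes so the
# example does not depend on the editor's font support.
sample = "\u1019\u103c\u1014\u103a\u1019\u102c"

for ch in sample:
    # Print the code point, its official Unicode name, and the block check.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), in_myanmar_block(ch))
```

This only looks at the main block. The Myanmar Extended blocks used for some other languages of Myanmar are left out to keep the sketch short.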

Mobile use

I checked on my Nexus 5’s Google Translate app, and Burmese does show up as a supported language. Text translation works as it would on the computer.

Google Translate’s mobile site also provides Burmese support.

Keyboard input

For folks without access to Burmese language input methods, Google Translate provides two variants of the most popular Burmese keyboard setups:

မြန်မာဘာသာ Keyboard on Google Translate

မြန်မာဘာသာ (မြန်စံ) Keyboard on Google Translate

Zawgyi input

Interestingly enough, Google Translate appears to translate Zawgyi-encoded text, although Burmese language output is always in Unicode. I took a newspaper headline and plopped it in as Zawgyi and as Unicode, and the end translation was the same.
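How Google Translate tells the two encodings apart isn’t stated in the announcement, so the following is only a rough illustration of why detection is possible at all. Zawgyi reuses the same code points as Unicode but stores some signs in visual order, before the consonant, whereas Unicode always stores them after it. The sketch below (Python, with a made-up helper name `looks_like_zawgyi`) flags text in which a word begins with one of two such signs:

```python
import re

# Assumption (not from Google's announcement): in Unicode storage order the
# vowel sign E (U+1031) and the medial sign U+103B always come after a
# consonant, while Zawgyi stores the glyphs at these code points before it.
# A word that starts with either one is therefore likely Zawgyi, or
# malformed Unicode.
_LEADING_DEPENDENT_SIGN = re.compile(r"(?:^|\s)[\u1031\u103b]")


def looks_like_zawgyi(text: str) -> bool:
    """Hypothetical helper: True if any word starts with a dependent sign."""
    return bool(_LEADING_DEPENDENT_SIGN.search(text))


print(looks_like_zawgyi("\u1014\u1031"))  # consonant first (Unicode order): False
print(looks_like_zawgyi("\u1031\u1014"))  # vowel sign first (Zawgyi order): True
```

This is a deliberately crude heuristic: Zawgyi text that happens not to contain either pattern slips through, so a real detector would need to look at far more evidence than this.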

Take Google Translate for a spin at translate.google.com and try some Burmese translations out! P.S. There’s an “Improve this translation” option as well.
