The Droid's Dharma: Supporting the Tibetan Language on Android

DISCLAIMER: I am by no means an expert on this issue. I am just an enthusiastic hacker with a dream. Also, I don’t read Tibetan, but I enjoy looking at it!

Thanks to the open-source movement and the hard work of many Tibet supporters and typography experts, I am happy to announce that rendering of Tibetan characters is now supported on the most fantastic of smartphones, Google Android!!!

YarlungRaging2.JPG
Tendor’s Yarlung Raging blog viewed on a T-Mobile myTouch3G Android Phone

While it has only a small alphabet, the Tibetan language has been notoriously difficult to support on Mac, Windows and Linux because of complexities in how one character can modify the next. Dedicated academics, volunteers and software engineers have stayed focused on solving this, and the most recent versions of all major operating systems are able to render Tibetan and provide Tibetan character input tools. Google Android is based on Linux, and fortunately it is able to use the GPL-licensed Tibetan Machine Uni font.

However, by default Android has only a small number of fonts built in, and it doesn’t support the easy addition of new fonts or locales. It does, however, have something called the “fallback” font, which is used to render any encoded text that the built-in fonts cannot handle.

What I realized is that you could replace this font with a Tibetan Unicode font compatible with Linux, and that this would then enable Tibetan support in all applications on Android, including the web browser, email apps, instant messaging, and short messaging (SMS), among others.

The steps below outline the technical how-to for Android users.


WARNING: This is not for novices. However, it isn’t rocket science either. Your average neighborhood mobile phone enthusiast should be able to figure out how to do this, and potentially help their friends do it too. Down the road, I hope we can make this process easier and/or Google will allow for the addition of any font to the system.

Step 1: Get root on your Android device. You don’t need to mod your phone with custom firmware; you just need root access to change system fonts. The way to get root changes weekly, by the way, and differs for each type of Android phone, so look for instructions specific to your model.

Step 2: Download the Tibetan Machine Uni font. You can learn more about the variety of Tibetan fonts available here.

Step 3: Make the system font folder writable and back up the existing font
This can be done using the desktop ‘adb’ tool from the SDK or the Android terminal app on the device:

# su
# mount -o remount,rw -t yaffs2 /dev/block/mtdblock3 /system
# chmod 777 /system/fonts
# cd /system/fonts
# mv DroidSansFallback.ttf DroidSansFallback.ttf.bak
# exit
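
One thing to watch with the backup above: if the `mv` is run a second time after the Tibetan font is already in place, it overwrites the original backup with the replacement font. Here is a minimal sketch of a guard against that, assuming the device shell supports functions. The function name `backup_fallback_font` and its optional directory argument are made up for this illustration; on a real device it still needs root and a writable /system, as in the commands above.

```shell
# Sketch: back up the fallback font only if no backup exists yet.
# The directory defaults to the path used in Step 3; pass another
# directory to try the function out safely on a desktop.
backup_fallback_font() {
    font_dir=${1:-/system/fonts}
    font="$font_dir/DroidSansFallback.ttf"
    if [ -e "$font.bak" ]; then
        # A backup is already there: leave it untouched.
        echo "backup already exists, leaving it alone"
        return 0
    fi
    if [ ! -e "$font" ]; then
        echo "no fallback font found in $font_dir" >&2
        return 1
    fi
    mv "$font" "$font.bak"
}
```

Calling `backup_fallback_font` with no argument acts on /system/fonts; calling it with a scratch directory lets you check the behavior before touching the phone.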

Step 4: Write the Tibetan Unicode font as the new fallback font:
Using the desktop ADB tool with the Android device connected via USB:

adb push TibMachUni-1.901b.ttf /system/fonts/DroidSansFallback.ttf

Using on-device terminal app:

# cd /system/fonts
# wget -O DroidSansFallback.ttf http://tinyurl.com/tibfont
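
Before rebooting, it may be worth checking that the file that landed in place is really a font and not, say, an HTML error page from a failed download. The sketch below looks at the first four bytes of the file, which are 00 01 00 00 for a TrueType font and the letters "OTTO" for an OpenType font with CFF outlines. The function name `looks_like_font` is made up for this illustration, and it assumes `od` and `tr` are available in the shell you run it from, which may not be true of every on-device terminal.

```shell
# Sketch: succeed only if the file starts with a known font signature.
looks_like_font() {
    # Read the first four bytes as hex, e.g. "00010000".
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    case "$magic" in
        00010000) return 0 ;;  # TrueType outlines
        4f54544f) return 0 ;;  # "OTTO": OpenType with CFF outlines
        *)        return 1 ;;  # anything else, including a missing file
    esac
}
```

For example, `looks_like_font DroidSansFallback.ttf && echo "looks like a font"` prints the message only when the signature matches. This is just a sanity check on the header; it does not prove the font is complete or valid.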

Step 5: Reboot your Android phone

Step 6: Point your Android browser at http://yarlungraging.blogspot.com, http://lobsangmonlam.org/ or http://tb.tibet.cn to verify the Tibetan font support is properly installed.
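
If the result is not what you hoped for, the backup made in Step 3 lets you put things back. Here is a minimal sketch of the reverse operation, under the same assumptions as Step 3: root access, /system remounted read-write again (it goes back to read-only after the reboot), and a shell that supports functions. The function name `restore_fallback_font` is made up for this illustration.

```shell
# Sketch: put the original fallback font back from the Step 3 backup.
# The directory defaults to /system/fonts; pass another one to test safely.
restore_fallback_font() {
    font_dir=${1:-/system/fonts}
    font="$font_dir/DroidSansFallback.ttf"
    if [ ! -e "$font.bak" ]; then
        echo "no backup found at $font.bak" >&2
        return 1
    fi
    # Copy rather than move, so the backup survives for another attempt.
    cp "$font.bak" "$font"
}
```

A reboot is needed afterwards, just as in Step 5, before the original font is picked up again.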

What’s Next

Two big steps from here… this is a call to action for Android developers out there:

  • Develop a one-click app that can install a Tibetan (or any other third-party language) font on any rooted device
  • Port an existing Java-based Tibetan input utility to Android as an Input Method Editor, so that there is a way to write emails, SMS messages and blog posts in Tibetan characters.

Many thanks to the authors and developers behind the following posts, upon whose work this effort was based:
karuppuswamy.com: How to change fonts in Android?
karuppuswamy.com: Mounting /system partition in read-write mode in Android
android-devs.com: Adding Additional Language Fonts to Android

53 comments

  1. Nathan this is amazing news for all Tibetans!!! We've been waiting for the phone that can type Tibetan unicode and it's finally here. Can't wait to see how this innovation will change our world and that of many others. Looking forward to more updates.

  3. I often read your blog and always find it very interesting. Thought it was about time I let you know… Keep up the great work

    regards
    david swin
    ______________________________________________

  5. I will bookmark and continue reading your blog in the future! Thanks a lot for the informative post!

    Thanks
    jenny martin
    ______________________________________________

  7. Thanks, Nathan. I just installed this on my Nexus, and it works to bring up Tibetan characters, but unfortunately they don't combine properly when stacking. Since the Tibetan combining rules for stacks are fairly arbitrary, stacks of glyphs are generally replaced by a single compound glyph when rendering. The result from this Android modification is indeed readable as Tibetan, but the letters run on top of each other improperly, and it's difficult to read. This must be due to the particular Unicode engine running on Android. It's difficult to get this working properly, since Tibetan has one of the most complex rendering systems of any Unicode language. Does anyone have any ideas on how to fix this? I might look into this if I have time, but unless Google comes out with a patch, I'm pretty sure it's going to require a custom kernel.

  9. Hold on. As you can see from your screenshot, Android is *not* applying the OpenType lookups in the font and so is not forming Tibetan stacks properly. The same thing applies to other complex non-Latin scripts (e.g. Devanagari, Arabic, Tamil, Thai, Sinhala, Bengali and so on) ~ conjunct ligatures are not formed properly on Android.

    In Windows this is normally handled by Uniscribe, on Linux by Pango, in OpenOffice by ICU, and in Adobe applications by CoolType. According to Mike Reed, CTL (Complex Text Layout) is not yet supported. [See: <http://groups.google.com/group/skia-discuss/bro

    One way round this would be to use GB/T20524-2006 encoding for Tibetan, which uses Unicode PUA characters for pre-composed Tibetan stacks so you don't need OpenType lookups. On Windows you can use Andrew West's BabelPad to do the conversion from normal Unicode to GB/T20524-2006 encoding. The Jomolhari Tibetan font supports both normal Unicode and GB/T20524-2006 encoding. But that is not a real solution. Better to encourage people to start bugging Google to add CTL support to Android soon.

    -Chris

    (co-developer of Tibetan Machine Uni font & developer of Jomolhari Tibetan font)

  10. Chris, thanks for the clarification. The funny thing is that a number of Tibetans I showed it to didn't mention a thing! I suppose it takes someone versed in both the language and the technicality of the rendering to find the real bugs with a solution. Either that, or they were just being nice.

    Mostly I just wanted to get the conversation started, and see what the known and unknown issues are. I have the technical capability to make this happen, but not the linguistic expertise. Glad to make the connection.

  11. Sounds good – if you can make this happen that would be great ~ not only for Tibetans but for the users of all complex scripts, which is a large portion of humanity.

    Actually, because of the way Tibetan is encoded in the Unicode / ISO 10646 Standard, despite its complexity, it is probably more straightforward to handle than a lot of other complex scripts. What needs to be done to enable CTL support is to incorporate Pango <http://www.pango.org/>, HarfBuzz <http://www.freedesktop.org/wiki/Software/HarfBuzz> or the relevant part of ICU in Skia. I contributed in one way or another to the Tibetan rendering parts of all of these ~ as well as providing information to Microsoft when they implemented Tibetan rendering and collation and contributing to the Tibetan encoding in the Unicode standard itself.

    Pango, HarfBuzz and ICU are by now all well-tested libraries that have been deployed in major applications for quite some time ~ so a lot of the difficult work, including working out the right lookups to apply for each script, has already been done, and adding CTL support to Android should not require any particular knowledge about individual scripts. For Arabic and Hebrew you also need to handle RTL text – but most of the code for that should also be in these libraries. Vertical scripts like Mongolian may also present particular problems. Another thing that is necessary is implementing proper line-breaking rules for each script.

    Pango / HarfBuzz might be a lot lighter than ICU which is a huge library.

    For a simple test of Tibetan rendering, point your browser at: <http://www.library.gov.bt/IT/complexscript.html…> – There you will find two example graphics: the first looks like what you get for a complex Tibetan script conjunct when CTL support is not enabled, and the second is what the same conjunct should look like when CTL for Tibetan is supported and enabled. Just below these is a box labeled “browser test” where the same Tibetan script text should render with a real font.

    There are also pages and pages of Tibetan script content on that site which you can get to by clicking on the flag at the top right of any page.

    – Chris

  15. Yes, I pointed Chris Fynn at this thread, since he's such an expert in this area, and I knew he'd bring a lot to the discussion.

    I've done some experiments, and it looks like the GB/T20524-2006 encoding approach he suggested does indeed work to render Tibetan properly on Android. I'm working on some code to render a web page properly in Tibetan, by dynamically rewriting the Unicode before displaying it in a WebView, and I'll let you know when I've got a fully working example. Thanks for providing the impetus to get this working right!

  17. Great work, Tom!

    Aside from the browser, what about building a Tibetan supported eBook reader? Perhaps you could reach out to one of the popular eBook reader apps and ask them to support Tibetan using the code you've provided.

  19. Hi Tom and Nathan. It is really good news for me to see someone working on CTL support on Android. I would like to enable support for “Nafees Nastaleeq”, a style of Urdu script, on Android. It's a very complex font that requires very detailed processing, more than Arabic or probably even Tibetan, and I am looking forward to the option of porting Pango to Android. For that I need guidance on how to proceed. First of all I need to know the exact path of the current rendering process on Android (which uses FreeType). Is there any tool I can use to graphically display the source code layout, to help me see the code flow and the API calls? Can you please provide a step-by-step procedure for how to go about this? Thanks.

  20. You should replace DroidSans.ttf with a Tibetan or Dzongkha font. Two of my Tibetan friends have succeeded with it, because DroidSans.ttf is the font used for English, and Tibetan fonts also contain the English alphabet. What still confuses them a lot is how to make Tibetan or Dzongkha fonts render properly on Android phones.

  21. Hi, I would like to add Tibetan support to MultiLing Gingerbread Keyboard.
    Please suggest a compact keyboard layout in the following format:

    q w e r t y u i o p
    a s d f g h j k l
    z x c v b n m

    shifted
    Q W E R T Y U I O P
    A S D F G H J K L
    Z X C V B N M

  22. A few observations I had after doing this:

    1. If you use the suggestion below and replace DroidSans.ttf instead of DroidSansFallback.ttf, you won't lose the Korean, Japanese or Chinese fonts.

    2. Sprint, my U.S. carrier, does not allow 16-bit character encoding for SMS. So if you want to read text messages (sending works fine) from iPhones or other Droids with a Tibetan font, don't use Sprint. Alternatively, you can use applications like Google Talk for sending messages back and forth.

    3. Doing this is really a lot easier than it seems. Thanks Nathan!

  23. I recently got an early version of the ZTE Open phone running Firefox OS and found it was far easier to get this phone running with Tibetan script, font and keyboard layout than with Android. A patch has been submitted to Mozilla (Firefox) to include Dzongkha (Tibetan script) support in the operating system so that nothing needs to be installed by the user. Tibetan and Dzongkha dictionaries also work well.

    Smart phones running Firefox OS are scheduled to be released in the Indian market for $25 equivalent (about ₨ 1,500) early next year – so these will be affordable to ordinary Tibetans.

  24. That is really great news! I have some good contacts at Mozilla, so let me know if they aren’t responsive.

  25. Hi Nathan, thanks for supporting us Tibetans. I would like to ask something. Reading the comments, I see that people mention a problem with the rendering of Tibetan characters: the characters do not combine properly, since Tibetan fonts have complex character rendering, and it was suggested that this might be due to the Unicode engine running on Android. Nathan, can you suggest a solution for rendering Tibetan fonts on Android? That would be appreciated.
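
To make the stacking issue raised in the comments a little more concrete: in Unicode, a Tibetan stack is stored as a base letter followed by one or more "subjoined" consonants (U+0F90 to U+0FBC), and it is the rendering engine plus the font's OpenType lookups that turn that sequence into a single stacked shape. The sketch below only counts subjoined consonants in UTF-8 text, as a rough way to see how much of a text depends on stacking. The function name `count_subjoined` is made up for this illustration, and it assumes `od`, `tr`, `grep -E -o` and `wc` are available, so it is meant for a desktop shell rather than the phone.

```shell
# Sketch: count Tibetan subjoined consonants (U+0F90..U+0FBC) in UTF-8
# text read from standard input. In UTF-8 these code points are the
# byte sequences E0 BE 90 through E0 BE BC.
count_subjoined() {
    # Dump the input as space-separated hex bytes on one line.
    hex=$(od -An -v -tx1 | tr -s ' \n' ' ')
    # Print each matching three-byte sequence on its own line and count them.
    n=$(printf '%s\n' "$hex" | grep -E -o 'e0 be (9[0-9a-f]|a[0-9a-f]|b[0-9a-c])' | wc -l)
    echo $((n))
}

# The stack "sgra": base letter U+0F66, then subjoined U+0F92 and U+0FB2,
# written as octal UTF-8 bytes so the snippet does not depend on the
# encoding of the file it is pasted into.
printf '\340\275\246\340\276\222\340\276\262' | count_subjoined   # prints 2
```

Three code points go in, but a reader expects to see one stacked glyph. Without Complex Text Layout support the three shapes are simply drawn one after another, which matches the overlapping letters described above.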
