Planet SMC

January 01, 2017

Santhosh Thottingal

Detailed font reports using fontreport tool

The Google i18n team has developed a tool for creating detailed reports on fonts. The tool, named fontreport, produces a multi-page PDF showing the Unicode coverage of the font, the glyphs it contains, the OpenType features it supports, the available ligatures, and glyph substitutions. Optionally, the tool can also create plain text reports. The PDF is generated using TeX.

Manjari font report generated using fontreport tool

I found it very useful for creating reports on the dozen fonts I maintain with the Swathanthra Malayalam Computing community. Here are the reports it created:

Font reports (PDF):

  1. Rachana Regular
  2. Rachana Bold
  3. Meera
  4. Manjari Regular
  5. Manjari Bold
  6. Manjari Thin
  7. Dyuthi
  8. Chilanka
  9. Karumbi
  10. AnjaliOldLipi
  11. Keraleeyam
  12. Uroob

by Santhosh Thottingal at January 01, 2017 12:24 PM

Swanalekha input method documentation

Sayahna Foundation has prepared single-page documentation of the Swanalekha input method in PDF format. Since I often get emails about the key mapping, I am sharing it here.


Swanalekha Malayalam input method documentation

by Santhosh Thottingal at January 01, 2017 10:02 AM

December 31, 2016

Santhosh Thottingal

Books read in 2016

In 2016, the books I abandoned midway probably outnumber the ones I finished. I set aside many non-fiction books that I had picked up out of interest in the subject, either when the reading became difficult or when the content did not live up to expectations. I had hoped to read at least 20 books, but could not complete that many. At the same time, beyond the world of books, there was no shortage of internet and social media reading. It seems internet readers prefer notes that can be taken in at a glance in a minute or two. For longer pieces, unless the main points are given as a blurb to help quick reading, readers will skip them.

The books I read are listed below. Most of them were ebooks.
1. Elon Musk: Inventing the Future by Ashlee Vance
2. ബുക്‌സ്റ്റാള്‍ജിയ by P.K. Rajasekharan
3. Mr. Penumbra’s 24-Hour Bookstore by Robin Sloan
4. When Breath Becomes Air by Paul Kalanithi
5. Sapiens: A Brief History of Humankind by Yuval Noah Harari
6. The Vegetarian by Han Kang
7. Gujarat Files by Rana Ayyub
8. Stories of Your Life and Others by Ted Chiang
9. The Sceptical Patriot: Exploring the Truths Behind the Zero and Other Indian Glories by Sidin Vadukut
10. The Drunkard’s Walk: How Randomness Rules Our Lives by Leonard Mlodinow
11. കണ്ണ് സൂത്രം by Vinod Krishna
12. Homo Deus: A Brief History of Tomorrow by Yuval Noah Harari
13. What If?: Serious Scientific Answers to Absurd Hypothetical Questions by Randall Munroe
14. Feynman’s Rainbow: A Search for Beauty in Physics and in Life by Leonard Mlodinow
15. Bad Science by Ben Goldacre (some more pages remaining)
16. The Intelligence Paradox: Why the Intelligent Choice Isn’t Always the Smart One by Satoshi Kanazawa
17. Shady Characters: The Secret Life of Punctuation, Symbols & Other Typographical Marks by Keith Houston
18. Arms and the Man by George Bernard Shaw

by Santhosh Thottingal at December 31, 2016 09:46 AM

November 15, 2016

Rajeesh K Nambiar

Improvement in converting video/audio files with VLC

VLC Media Player has long been able to convert video/audio files into the various formats it supports. There is a dedicated “Convert/Save” menu for converting one or more files at once into a different format, with limited ‘editing’ features such as specifying a start time, caching options etc. It is quite useful for basic editing/cropping of multimedia files.

As an example, one of the easiest ways to create a custom iPhone ringtone is to create a “.m4r” (AAC format) file exactly 40 seconds long. It is a matter of selecting your favourite music file and doing a “Convert/Save” with an appropriate “Profile”. A “Profile” specifies the video/audio encoding to be used, and can be easily customized by selecting different audio and video codecs.

The options “Caching time”, “Play another media synchronously” (think of adding a different sound track to a video clip) and “Start time” can be specified under the “Show more options” button, and even more advanced functionality is available through the “Edit Options” line. Internally, all the options specified on this line are passed to the converter.

One thing was lacking in this “Convert/Save” dialog though: the GUI offered no way to specify a “Stop Time” akin to the “Start Time” (it can be specified manually in “Edit Options”, but you need to calculate the time in milliseconds). The convert dialog of the VLC 2.x series looks as follows; notice the lack of “Stop time”:


Bugged by this minor annoyance, I set out to add the missing “Stop time” functionality. Going through the VLC codebase, it was a relief to see that the converter backend already supports the “:stop-time=” option (akin to “:start-time=”). It was then a matter of adding “Stop Time” to the GUI and properly updating the “Edit Options” line when the user changes the value.
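As a rough illustration of what the dialog has to compute, the sketch below builds the two option strings from a start offset and a clip length. The function name clip_options is hypothetical and not part of VLC; only the “:start-time=” and “:stop-time=” option names come from the text above. The sketch is deliberately unit-agnostic: pass both values in whatever unit the option line expects.

```python
def clip_options(start, length):
    """Return option strings for a clip that begins at `start` and lasts `length`.

    Both values must be in the same time unit; the function only adds them.
    """
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    stop = start + length
    return [f":start-time={start}", f":stop-time={stop}"]


# A clip of length 40 starting at offset 30, as in the ringtone example.
print(" ".join(clip_options(30, 40)))
```

The point of the GUI change is the same addition: the user picks the times, and the option line is kept in sync without any manual arithmetic.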

A working patch was then sent to the vlc-devel mailing list for review and feedback. After 5 rounds of review and constructive feedback from Filip Roséen, the code was cleaned up (including existing code) and is now committed to the master branch. This functionality should be available to users in the upcoming 3.0 release. Screenshot below:



Tagged: hacking, vlc

by Rajeesh at November 15, 2016 06:05 PM

November 09, 2016

Santhosh Thottingal

Electronic Design and Automation Lab: Migration to Scilab

The third semester B.Tech Electronics and Communication Engineering programme of Kerala Technological University (KTU) has a lab course on electronic design and automation. The course aims to introduce the students to various electronic design and simulation tools like SPICE, MATLAB and HDL.


I had proposed migrating the MATLAB section of this lab course to Scilab, a popular open source alternative to MATLAB. The migration is now complete, with support from FOSSEE. The source code of the experiments is now available here. A detailed lab manual will be made available soon.



by Kavya Manohar at November 09, 2016 02:30 PM

November 07, 2016

Santhosh Thottingal

September 30, 2016

Santhosh Thottingal

Proposal for Malayalam language subtags for orthography variants rejected

The Internet Engineering Task Force (IETF) – Languages is responsible for the registration of language tags, subtags and script variants. These registered language tags are used in a wide set of internet standards and applications to identify and annotate language uniquely.


Recently Sascha Brawer (currently working at Google) submitted a proposal to register two new language subtags for Malayalam to denote orthography variations. Malayalam orthography reached a diverging moment in history when the Kerala government decided on a script reformation in 1971. The decision was meant to adapt Malayalam orthography to the typewriters and typesetting devices of the time, which could not accommodate the wide character set Malayalam then used.

So, the proposal was to introduce two subtags as follows:

  1. ml-puthiya: Reformed Malayalam orthography. Malayalam that is written in the orthography of the 1971 reform. In Malayalam (transcribed to English), the term for this variant is “puthiya lipi”.
  2. ml-pazhaya: Traditional Malayalam orthography. Malayalam that is written using the orthographic conventions that were in place before the 1971 reform. In Malayalam (transcribed to English), the term for this variant is “pazhaya lipi”.

Sascha Brawer correctly explained the missing part in this classification:

According to my contact, this reform was a continuum; the Kerala government order of 1971 did not immediately affect the common practice. Instead, the transition from traditional to reformed has happened over the period of 20-30 years. There is a lot of variation in the specifics for any year one could pick in the last century.

Again according to my contact, there is a common overall understanding among Malayalam speakers that the orthography of the language has moved from ‘traditional’ to ‘reformed.’ However, my contact did not know of an authoritative reference that would describe this transition in more detail.

I replied to the proposal as follows:

Mathrubhumi daily uses a mixed orthography: except for the ു sign, it mostly follows the traditional writing style with many conjuncts and stacked ligatures.
Malayala Manorama daily follows a style closer to the reformed orthography and avoids many ligatures.

[…] This is true, there is no definition or authoritative reference about these differences. And that is my concern. Given a set of printed samples from, say, today’s newspaper in Malayalam, one cannot say this is ‘new’ (puthiya) or this is ‘old’ (pazhaya). The contemporary Malayalam usage is a mixed one. It borrows some reforms from the 1971 order and some from the practices that existed before.

The reason for the mixed mode is that the main intention behind the 1971 reformation was to make Malayalam ‘usable’ with the typewriters and composing machines of the time. As technology progressed and these limitations vanished, nothing stopped people from using types similar to what they would write with pen on paper. Modern OpenType technology completely removed this limitation, and many modern and famous Malayalam typefaces use this ‘old’/ml-pazhaya style.

So defining two variants, ml-puthiya and ml-pazhaya, without a clear way to distinguish one from the other, while a wide range of unnamed variants exist, is concerning. […]

Later, Michael Everson, the registrar for IETF language tags, said he was rejecting the proposals.

For a Malayalam subtag to be approvable, it really should refer to an orthographic standard. So far it appears that there isn’t anything very precise for either the traditional or the newer spelling to be specified, so it would be best to reject this now (rather than extending it little by little) until revised proposals with solid references can be put forward.

by Santhosh Thottingal at September 30, 2016 03:07 AM

September 07, 2016

Balasankar C

SMC/IndicProject Activities- ToDo List


So, M.Tech is coming to an end and I should probably start searching for a job soon. Still, it seems I will have a bit of free time from mid-September. I have some plans about the areas where I should contribute to SMC/Indic Project. As of now, the bucket list is as follows:

  1. Properly tag versions of fonts in SMC GitLab repo - I had taken over the package fonts-smc from Vasudev, but haven’t done any update on that yet. The main reason was fontforge being old in Debian. Also, I was waiting for some kind of official release of new versions by SMC. Since the new versions are already available in the SMC Fonts page, I assume I can go ahead with my plans. So, as a first step I have to tag the versions of fonts in the corresponding GitLab repo. Need to discuss whether to include TTF file in the repo or not.
  2. Restructure LibIndic modules - Those who were following my GSoC posts will know that I made some structural changes to the modules I contributed in LibIndic. (Those who don’t can check this mail I sent to the list). I plan to do this for all the modules in the framework, and to co-ordinate with Jerin to get REST APIs up.
  3. GNOME Localization - GNOME Localization has been dead for almost two years now. Ashik has shown interest in re-initiating it and I plan to do that. I first have to get my committer access back.
  4. Documentation - Improve documentation about SMC and IndicProject projects. This will be a troublesome and time-consuming task, but I would still like our tools to have proper documentation.
  5. High Priority Projects - Create a static page about the high priority projects so that people can know where and how to contribute.
  6. Die Wiki, Die - Initiate porting Wiki to a static site using Git and Jekyll (or any similar tool). Tech people should be able to use git properly.

Knowing myself better than anyone else does, I understand there is every chance of this being a “Never-being-implemented-plan” (that is, ആരംഭശൂരത്വം, enthusiasm that lasts only at the start :D), but I still intend to do this in an easy-first order.

September 07, 2016 04:47 AM

August 23, 2016

Anwar N

GSoC 2016 IBus-Braille-Enhancement Project - Summary

First of all, my thanks to Indic Project and Swathanthra Malayalam Computing (SMC) for accepting this project. Hats off to my mentors Nalin Sathyan and Samuel Thibault. The project was awesome, and I believe I did the best I could despite having no prior experience.

Project Blog :

Now let me outline what we have done during this period.

Braille-Input-Tool (The on-line version)
  Just like Google transliteration or Google Input Tools online. This is required because it is completely operating system independent, and it is a modern method that never forces the user to install an additional plugin or a specific browser. The user might use it from temporary places such as an internet cafe. It is written using jQuery and HTML, and works well on GNU/Linux, Microsoft Windows, Android etc.

See All Commits :
Test with following link :

IBus-Braille enhancements
See All Commits :

1 IBus-Braille integrated with Liblouis : The Liblouis software suite provides an open-source braille translator, back-translator and formatter for a large number of languages and braille codes. So maintaining and shipping separate braille maps (located at /share/ibus-sharada-braille/braille) with ibus-braille is a bad idea. With this change, we completely adapted IBus-Braille to use Liblouis. The conversion is done a whole word at a time instead of letter by letter, i.e. the conversion happens after writing direct braille Unicode and pressing space.
Commit 1 :
Commit 2 :
Commit 3 :

See Picture of Ibus-Braille preferences given below

2 8-Dot braille enabled : There are languages with more than 64 characters, which cannot be handled with the 64 combinations of 6 dots. Music notations like “Abreu” and LAMBDA (Linear Access to Mathematics for Braille Device and Audio Synthesis) use the 8-dot braille system. Unicode supports 8-dot braille.
Commit 1 :

See key/shortcut page picture of ISB preferences dot setting

3 Dot 4 issue solved : In IBus-Braille, when typing in Bharati braille languages such as Malayalam or Hindi, we had to use 13-4-13 to get the letter ക്ക (Kka). But according to the braille standard, one should press 4-13-13 to get it, which forces beginners to do extra learning before they can start typing. Through this project we solved this issue, and a conventional-braille-mode switch is provided in preferences to switch between the two.

Commit :

4 Facility to write direct Braille Unicode added : Now one can use IBus-Braille to type braille dot patterns directly with the key combination. The output may be sent to a braille embosser, an impact printer that renders text as tactile braille cells.
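To make the link between dot numbers and Unicode concrete, here is a minimal sketch of the mapping. It relies only on the layout of the Unicode Braille Patterns block, where U+2800 is the blank cell and dot n sets bit n-1. The function name dots_to_braille is hypothetical and is not taken from the IBus-Braille code.

```python
BRAILLE_BASE = 0x2800  # U+2800 is the blank braille pattern


def dots_to_braille(dots):
    """Map an iterable of dot numbers (1-8) to one Unicode braille character."""
    code = BRAILLE_BASE
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot number out of range: {dot}")
        code |= 1 << (dot - 1)  # dot n corresponds to bit n-1
    return chr(code)


print(dots_to_braille([1, 3]))        # a 6-dot cell with dots 1 and 3
print(dots_to_braille([1, 3, 4, 6]))  # a 6-dot cell with dots 1, 3, 4 and 6
print(dots_to_braille([7, 8]))        # dots that exist only in 8-dot braille
```

Because dots 7 and 8 are just two more bits, the same mapping covers both 6-dot and 8-dot cells.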

Commit :

5 Three to Six for one-handed users : A three-key implementation which uses a delay factor between key presses. For example, 13 followed by 13 with a delay less than the delay factor (e.g. 0.2) will give X. If the delay is longer, the output would be KK. To type a letter whose combination has only dots 4, 5 and 6, one has to press the "t" key first. The key and the Conversion-Delay can be adjusted from preferences.
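The delay rule can be sketched as follows. This is only an illustration of the idea described above, not the IBus-Braille implementation: the function name combine_chords, the (timestamp, dots) input format and the use of seconds are all assumptions, and the "t" prefix key for cells that use only dots 4, 5 and 6 is not modelled.

```python
def combine_chords(presses, delay_factor=0.2):
    """Merge three-key chords into six-dot cells.

    `presses` is a list of (timestamp_seconds, dots) pairs, where dots is a
    set drawn from {1, 2, 3}. A chord that follows the previous one within
    `delay_factor` seconds is shifted to dots 4-6 and merged into that cell.
    """
    cells = []
    last_time = None
    open_cell = False  # True while the last cell can still receive dots 4-6
    for timestamp, dots in presses:
        if open_cell and timestamp - last_time < delay_factor:
            cells[-1] |= {dot + 3 for dot in dots}
            open_cell = False
        else:
            cells.append(set(dots))
            open_cell = True
        last_time = timestamp
    return [sorted(cell) for cell in cells]


# Two quick chords form one cell with dots 1, 3, 4, 6 (X in English braille).
print(combine_chords([(0.0, {1, 3}), (0.1, {1, 3})]))
# The same chords typed slowly stay separate: dots 1, 3 twice (K, K).
print(combine_chords([(0.0, {1, 3}), (0.5, {1, 3})]))
```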

Commit :

6 Arabic language added
Commit :

7 Many bugs solved
Commit :
others are implied

Project Discourse :
IBus-Sharada-Braille (GSoC 2014) :

Plugins for firefox and chrome
    This plugin, once installed, works with every text entry on web pages, with no need for copy and paste. The extensions are written in JavaScript.
See All Commits :

Modifications that are still desirable are as follows

1 Announce extra information through Screen Reader: When the user expands an abbreviation, or when a contraction of more than 2 letters is substituted, the screen reader does not announce it. We have to write an Orca (screen reader) plugin for IBus-Braille

2 A UI for Creating and Editing Liblouis Tables

3 Add support for more Indic Languages and Mathematical Operators via liblouis

Braille-input-tool (online version)
Liblouis integration
Conventional Braille, Three Dot mode and Table Type selection
Chrome Extension
Direct braille unicode typing
Eight dot braille enabled

by Anwar N at August 23, 2016 04:39 AM

August 22, 2016


GSoC — Final Report!

So finally it’s over. Today is the last date for submission of the GSoC project. This entire ride was a lot informative as well as an…

by malayaleecoder at August 22, 2016 08:06 PM

Sreenadh T C

It’s a wrap!

“To be successful, the first thing to do is to fall in love with your work — Sister Mary Lauretta”

by Sreenadh T C at August 22, 2016 07:01 AM

August 21, 2016

Arushi Dogra

GSoC Final Report

It's almost the end of the GSoC internship. From zero knowledge of Android to writing a proposal, getting the proposal selected and finally working on the project for 3 months, it was a great experience for me! I have learned a lot and I am really thankful to Jishnu Mohan for mentoring me throughout.

Contributions include :-

All the tasks mentioned in the proposal were discussed and worked upon.

I started by designing the layouts. The task was to make Santali Olchiki and Soni layouts for the keyboard. I looked at the code of the other layouts to get a basic understanding of how phonetic and inscript layouts work. Snapshot of one of the views of the Santali keyboard:

Screen Shot 2016-08-21 at 6.53.03 PM

Language Support Feature 
While configuring languages, the user is prompted about the locales that might not be supported by the phone.

Screen Shot 2016-08-21 at 6.33.25 PM

Adding Theme Feature
Feature is added at the setup to enable user to select the keyboard theme

Screen Shot 2016-08-21 at 6.49.21 PM

Merging AOSP code
After everything mentioned in the proposal had been looked at, Jishnu gave me the job of merging the AOSP source code into the keyboard. The current keyboard doesn’t have the changes that were released along with the Android M code drop, because of which the target SDK is not 23. There are a few errors yet to be resolved and I am working on them 😀

Overall, it was a wonderful journey and I will always want to be a contributor to the organisation as it introduced me to the world of open source and opened a whole new area to work upon and learn more.
Link to the discourse topic :

Thank You!  :D

by arushidogra at August 21, 2016 01:29 PM

August 17, 2016

Balasankar C

GSoC Final Report


It is finally the time to wind up the GSoC work on which I have been buried for the past three months. First of all, let me thank Santhosh, Hrishi and Vasudev for their help and support. I seem to have implemented, or at least proved the concepts that I mentioned in my initial proposal. A spell checker that can handle inflections in root word and generate suggestion in the same inflected form and differentiate between spelling mistakes and intended modifications has been implemented. The major contributions that I made were to

  1. Improve LibIndic’s Stemmer module. - My contributions
  2. Improve LibIndic’s Spell checker module - My contributions
  3. Implement relatively better project structure for the modules I used - My contributions on indicngram

1. Lemmatizer/Stemmer


My initial work was on improving the existing stemmer that was available as part of LibIndic. The existing implementation was a rule based one that was capable of handling single levels of inflections. The main problems of this stemmer were

  1. General incompleteness of rules - Plurals (പശുക്കൾ), Numerals(പതിനാലാം), Verbs (കാണാം) are missing.
  2. Unable to handle multiple levels of inflections - (പശുക്കളോട്)
  3. Unnecessarily stemming root words that look like inflected words - (ആപത്ത് -> ആപം following the rule of എറണാകുളത്ത് -> എറണാകുളം)

The above mentioned issues were fixed. The remaining category is verbs which need more detailed analysis.
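The three problems above map onto a small rule engine: a rule table, repeated application for multiple levels of inflection, and an exception list for root words that only look inflected. The sketch below is a toy illustration under those assumptions. The rules, the exception set and the function name stem are hypothetical and are not taken from LibIndic; real Malayalam morphology needs far more rules and care.

```python
# Hypothetical suffix rules: (inflected suffix, replacement).
RULES = [
    ("ോട്", ""),    # strip a case ending
    ("ക്കള", ""),   # strip a plural marker exposed by the previous rule
    ("ത്ത്", "ം"),  # the എറണാകുളത്ത് -> എറണാകുളം pattern from the text
]

# Root words that merely look inflected and must be left alone.
EXCEPTIONS = {"ആപത്ത്"}


def stem(word, rules=RULES, exceptions=EXCEPTIONS, max_levels=5):
    """Strip suffixes repeatedly until no rule applies or a root is reached."""
    for _ in range(max_levels):
        if word in exceptions:
            break
        for suffix, replacement in rules:
            if word.endswith(suffix) and len(word) > len(suffix):
                word = word[: -len(suffix)] + replacement
                break
        else:
            break  # no rule matched, so treat the word as a root
    return word


print(stem("എറണാകുളത്ത്"))  # one level of inflection
print(stem("പശുക്കളോട്"))    # two levels, handled by looping
print(stem("ആപത്ത്"))        # protected by the exception list
```

The loop is what gives multi-level handling, and checking the exception list before every pass is what stops a root word from being stemmed by a rule that happens to match its ending.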

Long Version

A demo screencast of the lemmatizer is given below.

So, comparing with the existing stemmer algorithm in LibIndic, the one I implemented as part of GSoC shows considerable improvement.

Future work

  1. Add more rules to increase grammatical coverage.
  2. Add more grammatical details - Handling Samvruthokaram etc.
  3. Use this to generate sufficient training data that can be used for a self-learning system implementing ML or AI techniques.

2. Spell Checker


The second phase of my GSoC work involved making the existing spell checker module better. The problems I could identify in the existing spell checker were

  1. It could not handle inflections in an intelligent way.
  2. It used a corpus that needed inflections in them for optimal working.
  3. It used only Levenshtein distance for finding suggestions.

As part of GSoC, I incorporated the lemmatizer developed in phase one to the spell checker, which could handle the inflection part. Three metrics were used to detect suggestion words - Soundex similarity, Levenshtein Distance and Jaccard Index. The inflector module that was developed along with lemmatizer was used to generate suggestions in the same inflected form as that of original word.
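For readers unfamiliar with two of these metrics, here is a minimal sketch of Levenshtein distance and a Jaccard index over character bigrams, plus a toy ranking function. It is not the LibIndic code: the names levenshtein, jaccard and suggest, the choice of bigrams, and the way the two scores are combined are assumptions made for illustration. Soundex is left out because it is language specific. Python strings are sequences of code points, so for Malayalam text this counts code point edits rather than visible letters.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def bigrams(word):
    return {word[i:i + 2] for i in range(len(word) - 1)}


def jaccard(a, b):
    """Size of the shared bigram set divided by the size of the combined set."""
    set_a, set_b = bigrams(a), bigrams(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def suggest(word, dictionary, limit=3):
    """Rank candidates by edit distance, breaking ties with bigram overlap."""
    ranked = sorted(dictionary,
                    key=lambda cand: (levenshtein(word, cand), -jaccard(word, cand)))
    return ranked[:limit]


print(suggest("helo", ["world", "help", "hello"]))
```

Using more than one metric matters when edit distance ties: here the bigram overlap decides which of two equally distant candidates is listed first.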

Long Version

A demo screencast of the spell checker is given below.

3. Package structure

The existing modules of libindic had an inconsistent package structure that gave no visibility to the project. Also, the package names were too general and didn’t convey the fact that they were meant for Indic languages. So, I suggested and implemented the following changes

  1. Package names (of the ones I used) were changed to libindic-<module>. Examples would be libindic-stemmer, libindic-ngram and libindic-spellchecker. So, users will easily understand that a package is part of the libindic framework, and thus meant for Indic text.
  2. Namespace packages (PEP 420) were used, so that import statements of libindic modules will be of the form from libindic.<module> import <language>. So, the visibility of the project ‘libindic’ is increased considerably.
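The effect of namespace packages can be seen in a small self-contained experiment. The sketch below builds two separately located distributions that share one top-level name and then imports from both. It uses implicit namespace packages; whether libindic uses exactly this mechanism is an assumption, and the namespace nsdemo_libindic and the function build_demo are hypothetical names chosen so that nothing clashes with an installed libindic.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo(root, namespace="nsdemo_libindic"):
    """Create two separately located distributions sharing one namespace."""
    for module in ("stemmer", "ngram"):
        package = Path(root) / f"dist-{module}" / namespace / module
        package.mkdir(parents=True)
        # The namespace directory itself deliberately has no __init__.py.
        (package / "__init__.py").write_text(f"NAME = {module!r}\n")
        sys.path.insert(0, str(package.parent.parent))


with tempfile.TemporaryDirectory() as tmp:
    build_demo(tmp)
    importlib.invalidate_caches()
    # Both imports succeed although the modules live in different directories.
    stemmer = importlib.import_module("nsdemo_libindic.stemmer")
    ngram = importlib.import_module("nsdemo_libindic.ngram")
    print(stemmer.NAME, ngram.NAME)
```

This is what lets each module be packaged and released on its own while users still import everything from a single top-level name.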

August 17, 2016 04:47 AM

August 16, 2016

Anwar N

IBus-Braille Enhancement - 3

 A hard week passed!

1 Conventional Braille Mode enabled : With this we solved the dot-4 issue, and now one can type using braille without any extra knowledge

commit 1 :

2 Handle configuration parser exceptions : A corrupted ISB configuration file could prevent it from starting, so I solved this with proper exception handling

commit 2 :

3 Liblouis integration : I think our dream is about to come true! But we are still struggling with vowel substitution in the middle of words.
commit 3 :
commit 4 :
commit 5 :

by Anwar N at August 16, 2016 08:35 PM

August 09, 2016

Sreenadh T C

What now?

“Now that the basic aim was fulfilled, what more can we work on, given there is almost half a month to GSoC Submission!”

by Sreenadh T C at August 09, 2016 01:56 PM

Anwar N

IBus-Braille Enhancement - 2

Hi, this week I was fighting with my final semester exams, and they are over! Also within this week I added the facility for typing direct braille Unicode.

Instead of converting to Unicode, I added it as a new language so that one can later edit and use it.

by Anwar N at August 09, 2016 03:40 AM

July 31, 2016


Sreenadh T C

July 30, 2016

Rajeesh K Nambiar

Convert iPhone contacts to vCard

On a recent troubleshooting attempt, I lost all the contacts in my Android phone. It had also received a recent update which took away the option to import contacts from another phone via bluetooth.
I still had some contacts in the old iPhone, but now that mass transfer via bluetooth is gone, it was a question of manually sending each contact in vCard format to the Android phone. That means I should probably find a less dreadful way to get the contacts back.

Here is one way to extract contacts en-masse from iPhone into popular vCard format. The contact and address details in iOS are stored by AddressBook application in a file named ‘AddressBook.sqlitedb’ which is an sqlite database. The idea is to open this database using sqlite, extract the details from a couple of tables and convert the entries into vCard format.

Disclaimer: the iPhone is an old 3GS running iOS 6 and it is jailbroken. If you attempt this, your mileage may vary. The required tools are usbmuxd (especially libusbmuxd-utils) and sqlite, with the prerequisite that an openssh server is running on the jailbroken iPhone.

  1. Connect iPhone via USB cable to the Linux machine. Run iproxy 2222 22 to connect to the openssh server running on the jailbroken phone. iproxy comes with libusbmuxd-utils package.
  2. Copy the address book sqlite database from the phone:
       scp -P 2222 mobile@localhost:/var/mobile/Library/AddressBook/AddressBook.sqlitedb .
     Instead of steps 1 and 2 above, it might be possible to copy this file using the Nautilus (gvfs-afc) or Dolphin (kio_afc) file manager, although I’m not sure if the file is accessible.
  3. Extract the contact and address details from the sqlite db (based on this forum post):
       sqlite3 -cmd ".out contacts.txt" AddressBook.sqlitedb "select ABPerson.prefix, ABPerson.first, ABPerson.last, ABPerson.organization, c.value as MobilePhone, h.value as HomePhone, he.value as HomeEmail, w.value as WorkPhone, we.value as WorkEmail, ABPerson.note from ABPerson left outer join ABMultiValue c on c.record_id = ABPerson.ROWID and c.label = 1 and c.property = 3 left outer join ABMultiValue h on h.record_id = ABPerson.ROWID and h.label = 2 and h.property = 3 left outer join ABMultiValue he on he.record_id = ABPerson.ROWID and he.label = 2 and he.property = 4 left outer join ABMultiValue w on w.record_id = ABPerson.ROWID and w.label = 4 and w.property = 3 left outer join ABMultiValue we on we.record_id = ABPerson.ROWID and we.label = 4 and we.property = 4;"
  4. Convert the extracted contact details to vCard format (the fields are numbered in the order of the select above, so the work email is $9 and the note is $10):
       cat contacts.txt | awk -F\| '{print "BEGIN:VCARD\nVERSION:3.0\nN:"$3";"$2";;;\nFN:"$2" "$3"\nORG:"$4"\nEMAIL;type=INTERNET;type=WORK;type=pref:"$9"\nTEL;type=CELL;type=pref:"$5"\nTEL;TYPE=HOME:"$6"\nTEL;TYPE=WORK:"$8"\nNOTE:"$10"\nEND:VCARD\n"}' > Contacts.vcf
  5. Remove the empty content lines left behind when a contact does not have all the different fields:
       sed -i '/.*:$/d' Contacts.vcf

Now simply transfer the Contacts.vcf file containing all the contact details to the Android phone’s storage and import the contacts from there.
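For those who prefer a script to the awk and sed steps, the conversion can be sketched in Python. This is an alternative sketch, not what I actually ran: the names FIELDS, row_to_vcard and convert are made up for illustration, the field order is assumed to match the select statement above, and the code assumes no field contains a "|" or a line break. It does not escape commas or semicolons inside values.

```python
FIELDS = ["prefix", "first", "last", "organization", "mobile_phone",
          "home_phone", "home_email", "work_phone", "work_email", "note"]


def row_to_vcard(line):
    """Turn one pipe-separated row into a vCard 3.0 entry, skipping empty fields."""
    values = line.rstrip("\n").split("|")
    values += [""] * (len(FIELDS) - len(values))  # pad short rows
    c = dict(zip(FIELDS, values))
    full_name = " ".join(part for part in (c["first"], c["last"]) if part)
    card = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{c['last']};{c['first']};;{c['prefix']};",
        f"FN:{full_name}",
    ]
    optional = [
        ("ORG", c["organization"]),
        ("TEL;TYPE=CELL", c["mobile_phone"]),
        ("TEL;TYPE=HOME", c["home_phone"]),
        ("TEL;TYPE=WORK", c["work_phone"]),
        ("EMAIL;TYPE=INTERNET,HOME", c["home_email"]),
        ("EMAIL;TYPE=INTERNET,WORK", c["work_email"]),
        ("NOTE", c["note"]),
    ]
    # Emitting only non-empty fields replaces the sed clean-up step.
    card += [f"{name}:{value}" for name, value in optional if value]
    card.append("END:VCARD")
    return "\n".join(card)


def convert(lines):
    """Convert all non-blank rows and join the cards into one .vcf payload."""
    return "\n".join(row_to_vcard(line) for line in lines if line.strip()) + "\n"


print(convert(["Dr|Jane|Doe|Acme|111||||jane@example.com|friend"]))
```

To use it on real data, read the lines of contacts.txt, pass them to convert and write the result to Contacts.vcf.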

Tagged: hacking, linux, mac

by Rajeesh at July 30, 2016 09:16 AM

Anwar N

IBus-Braille Enhancement - 1

This week I forked the IBus-Braille project from the SMC GitLab repository and added two things.

1 Eight-dot braille enabled. Now one can add languages with 8 dots. The default keys are Z for dot 7 and period for dot 8. These can be remapped using preferences.

2 Arabic language added and tested with users

See commits :

by Anwar N at July 30, 2016 03:23 AM

July 26, 2016

Arushi Dogra

Updates on work

My next task was to filter the layouts by language instead of showing all of them. My first idea was to filter based on locale: instead of ACTION_INPUT_METHOD_SUBTYPE_SETTINGS we can use ACTION_LOCALE_SETTINGS. The problem was that it gave a list of all the locales in the system instead of the locales in our app, so I dropped this idea. I then decided to create a list and let users select from it, but there was no way to connect that to the enabled system subtypes. I was stuck on this for quite some time. We ditched the plan and moved on to the “Theme selection” task.

I am currently working on the Theme Selection task. I have successfully added the step, and I am now working on adding it as a fragment instead of a whole activity. After I am done with this, I will move on to adding the images of the themes. I will hopefully complete this task by the weekend.

Also, after a meeting with the mentor, it was decided that after this task I will work on merging the AOSP source code into the keyboard, as the current keyboard doesn’t have the changes that were released along with the Android M code drop, because of which the target SDK is not 23. So my next task will be merging the AOSP code, which will bring the benefit of runtime permissions. 😀

by arushidogra at July 26, 2016 12:34 AM

July 25, 2016

Balasankar C

4 Days. 22 Hours. LaTeX.

Heya folks,

One of the things I love doing is teaching others what I know. Though it is a clichéd line, I know from experience that when we teach others, our own knowledge expands. From 10 students, you often get 25 different doubts, and at least 5 of them will be ones you have never thought of yourself. In that way, teaching drives our curiosity to find out more.

I was asked to give a LaTeX training for B.Tech students as a bridge course (happening during their semester break. Poor kids!). The usual scenario is faculty taking the class and we PG students assisting them. But since all the faculty members were busy with their own subjects’ bridge courses, and LaTeX was more of an additional skill that the students need for report preparation in their next semesters, I was asked to take it with the assistance of my classmates. At first, I was asked to take a two-day session for third year IT students. But later, the HOD decided that both CS and IT students should have that class, and guess what - I had to teach for four days. Weirdly, the IT class was split across two non-consecutive dates - Monday and Wednesday. So, I didn’t have to take class for four consecutive days, but only three. :D

The syllabus I followed is as follows:

  • Basic LaTeX – Session I
    1. Brief introduction about LaTeX, general document structure, packages etc.
    2. Text Formatting
    3. Lists – Bullets and Numbering
  • Graphics and Formulas – Session II
    1. Working with Images
    2. Tables
    3. Basic Mathematical Formulas
  • Academic Document Generation (Reports and Papers) – Session III
    1. Sectioning and Chapters
    2. Header and Footer
    3. Table of Contents
    4. Adding Bibliography and Citations
    5. IEEETran template
  • Presentations using Beamer – Session IV

As (I, not the faculty) expected, only half of the students came (classes during the semester break; I was surprised that even half came!). Both the workshops - for CS and IT - went smoothly without any major issues or hindrances. Students didn’t hesitate much to ask doubts or tips on how to do stuff that I didn’t teach (unfortunately, I didn’t have time to go off-syllabus, so I directed them to the Internet. :D). Comparing the two groups, CS students were more lively and interactive but took some time to grasp the concepts. IT students, even though rather silent, learned things fast.

By Friday, I had completed 4 days, around 22 hours of teaching and that too non-stop. I was tired each day after the class, but it was still fun to share the stuff I know. I would love to get this chance again.

IT Batch


CSE Batch


July 25, 2016 12:00 AM

July 24, 2016

Sreenadh T C

Developing the Language Model

Finally, I can start the work towards Milestone — 2, which is completing the development of Language Model for Malayalam. Time to…

by Sreenadh T C at July 24, 2016 06:13 AM

July 23, 2016

Santhosh Thottingal

Manjari Font

I am happy to announce the new Malayalam font I have been designing for the past several months. The new font is named “Manjari”.

Malayalam blogpost:

Malayalam script is known for its curly characters with beautiful loops. Encoded in Unicode around 2001, it is relatively new to the digital age. The script has been evolving from rectangle-shaped to oval-shaped types of varying proportions. Popular usage favours oval/ellipse-shaped curves, mainly because writing with stencils or pens demanded less sharp corners. The character and ligature shapes have also been changing gradually towards shapes that are easy to write with pens. The Manjari font takes that to another level by smoothing all curves to the maximum.

The curves are constructed along spiral segments. The resulting shapes are extra smooth. The perfection of the curves resulted in whitespaces that acquired beautiful leaf and drop shapes between the bowls and loops of the script. This is illustrated in the specimen. The spiral smoothness of the curves is complemented by rounded terminals, which give a very soft feeling to the eyes.

The design of the curves in Manjari is theoretically based on the PhD thesis by Raph Levien, “From Spiral to Spline: Optimal Techniques in Interactive Curve Design”. Inconsolata, the humanist monospace Latin font by Levien himself, known for its clean lines and elegant design, is based on this theory.

Normal, Bold and Thin style variants are available. This is the first Malayalam Unicode font with a thin style variant.


The curve strokes in Manjari were drawn in Inkscape using the spiral library written by Raph Levien himself, and OpenType feature compilation was done using FontForge. The font will be released in the next few days. SVGs, scripts and source are available at 

Orion Champadiyil prepared some illustrations using the font. You can see them on our font download page.

News reports


by Santhosh Thottingal at July 23, 2016 06:29 AM

July 21, 2016

Balasankar C

Kerala State IT Policy - A Stakeholder Consultation

Heya folks,

Last Saturday, that is 16th July, I attended a meeting regarding the upcoming Kerala State IT Policy. It was a stakeholder consultation organized by DAKF, Software Freedom Law Centre and Ernakulam Public Library Infopark branch. The program was presided over by Prasanth Sugathan of SFLC (I had met him during Swatanthra, when I helped Praveen in the Privacy track) and was inaugurated by M. P. Sukumaran Nair, advisor to the Minister of Industries. The agenda of the meeting was to discuss the suggestions that need to be submitted to the Government before they draft the official IT policy, which will be in effect for the next few years. I attended the meeting representing Swathanthra Malayalam Computing. Even though the meeting had a small audience, some of the key topics were brought into the mix.

Professor Jyothi John, retired principal of Model Engg. College, discussed using MOOCs to improve the education standard of the State. He also talked about improving the industry-academia-research relationship, which is in a pathetic state as of now. I was asked to say a few words. But since SMC hadn’t taken any official stand or prepared points for the meeting, I talked about my own views on the issue. Obviously, my topics were more focused on language computing, digital empowerment of the language, and how FOSS should be the keystone of the IT policy. I also mentioned the E-Waste problem that Anivar had discussed the other day on the WhatsApp group.

Me Talking | PC: Sivahari

Mr. Joseph Thomas, the president of FSMI, also talked about the importance of FOSS in the IT policy (Kiran Thomas had some pretty strong disagreements with it. :D ). Following that, Babu Dominic from BSNL talked about their success stories with FOSS and how the project was scrapped by the government. There were some brilliant insights from Satheesh, who is a social entrepreneur now and once ran an IT-based company.

Following that, the meeting took the form of a round table discussion where interesting points regarding E-Waste and the money-saving nature of FOSS (Microsoft has been targeting institutions, not home users, for pirated copies) were raised by Mr. Bijumon, Asst Professor of Model Engg College. Mr. Jayasreekumar, who is a journalist, talked about the important issue that the downtrodden, the people in the lower socio-economic belt, were not part of the discussion, and about the picture of the digital divide that this paints. We have to seriously increase the diversity of participants in these meetings, as a large part of the population has no representation in them. Such meetings will be fruitful only if the sidelined communities, who should also benefit from this policy, are brought together to participate in them.

The general theme of the meeting pointed towards how the IT policy should focus more on the internal market, and how it should help entrepreneurs compete with foreign competitors, at least in the domestic market.

News Coverage | PC: Deshabhimani

More and more meetings of this nature are a must, if the state is to advance in the domain of IT.

July 21, 2016 12:00 AM

July 20, 2016

Anwar N

work progress in browser-addon

About two months have passed. We did a lot of testing on the online braille-input tool, and rearranged some widgets for user comfort. In recent weeks we made good progress on both the Firefox and Chrome browser addons. But we still suffer from a great problem with these addons: the plugins are not working in the Google chat and Facebook chat entries. We are seeking a solution...

by Anwar N at July 20, 2016 08:15 PM

July 19, 2016

Balasankar C

GSoC Update: Week #7 and #8


The last two weeks saw less coding and more polishing. I was fixing the LibIndic modules to use namespace packages (PEP 420) to obtain the libindic.module structure. In the stemmer module, I introduced the namespace package concept and it worked well. I also made the inflector a part of the stemmer itself. Since the inflector's functionality was heavily dependent on the output of the stemmer, it made more sense to make the inflector a part of the stemmer rather than an individual package. Also, I made the inflector language-agnostic, so that it accepts a language parameter as input during initialization and selects the appropriate rules file.
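The namespace-package idea can be tried out without installing anything. What follows is a minimal sketch, not the actual LibIndic layout: it builds two throwaway directories, each holding one portion of a shared namespace (no top-level `__init__.py`), and imports both. The namespace name `libindic_demo` and the `NAME` attribute are placeholders chosen for this illustration so that the demo cannot clash with an installed `libindic`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Placeholder namespace; stands in for "libindic" in this illustration only.
NAMESPACE = "libindic_demo"


def make_portion(root: Path, module: str) -> None:
    """Create <root>/<NAMESPACE>/<module>/__init__.py.

    Note that <root>/<NAMESPACE>/ itself gets NO __init__.py: that is
    what makes it a PEP 420 namespace package.
    """
    package_dir = root / NAMESPACE / module
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(
        "NAME = {!r}\n".format(module), encoding="utf-8"
    )


def demo() -> tuple:
    """Import two modules that live in separately 'installed' directories."""
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        make_portion(Path(first), "stemmer")
        make_portion(Path(second), "spellchecker")
        sys.path[:0] = [first, second]
        try:
            importlib.invalidate_caches()
            stemmer = importlib.import_module(NAMESPACE + ".stemmer")
            spellchecker = importlib.import_module(NAMESPACE + ".spellchecker")
            return stemmer.NAME, spellchecker.NAME
        finally:
            # Undo the path and module-cache changes made for the demo.
            sys.path.remove(first)
            sys.path.remove(second)
            for name in list(sys.modules):
                if name == NAMESPACE or name.startswith(NAMESPACE + "."):
                    del sys.modules[name]


if __name__ == "__main__":
    print(demo())
```

Both imports succeed even though the two modules sit in different directories, which is what lets each module be shipped as its own distribution under a common prefix.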

In the spellchecker also, I implemented the namespace concept and removed the bundled packages of stemmer and inflector. Other modifications were needed to make the tests run with this namespace concept, such as fixing coverage to pick up the change. On the coding side, I added weights to the three metrics so as to generate suggestions more efficiently. I am thinking about formulating an algorithm to make the comparison of suggestions and root words more efficient. Also, I may try handling spelling mistakes in the suffixes.

This week, I met with Hrishi and discussed the project. He is yet to go through the algorithm and comment on it. However, he suggested splitting the languages out into separate files to make the code cleaner (just importing these split language files). He was OK with the work so far, as he tried out the web version of the stemmer.
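A weighted combination of several similarity metrics can be sketched as follows. This is an illustration only: the post does not give the actual weights or the scoring formula, so the weight values, the function names and the assumption that each metric is already scaled to the range 0 to 1 are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical weights; the real values used in the spellchecker are not stated.
WEIGHTS: Dict[str, float] = {"soundex": 0.5, "levenshtein": 0.3, "jaccard": 0.2}


def combined_score(scores: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of per-metric similarity scores (each in 0..1)."""
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total


def rank(candidates: Dict[str, Dict[str, float]],
         limit: int = 5) -> List[Tuple[str, float]]:
    """Return the best `limit` candidate words, highest combined score first."""
    scored = [(word, combined_score(s)) for word, s in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:limit]


if __name__ == "__main__":
    example = {
        "candidate_a": {"soundex": 1.0, "levenshtein": 0.8, "jaccard": 0.6},
        "candidate_b": {"soundex": 0.0, "levenshtein": 0.9, "jaccard": 0.7},
    }
    for word, score in rank(example):
        print(word, round(score, 3))
```

Dividing by the sum of the weights keeps the result in the 0 to 1 range even if the weights are later tuned and no longer add up to one.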

Hrishi testing out the spellchecker

July 19, 2016 12:47 AM

July 16, 2016


GSoC Progress — Week 6 & 7

Alright, for the past two weeks, me and my mentor have been trying a lot to call the varnam library in Java. First we went on trying to…

by malayaleecoder at July 16, 2016 06:49 AM

Sreenadh T C

Mentioning the huge contributions

“ In open source, we feel strongly that to really do something well, you have to get a lot of people involved. — Linus Torvalds ”

by Sreenadh T C at July 16, 2016 06:14 AM

Santhosh Thottingal

A short story of one lakh Wikipedia articles

At Wikimedia Foundation, I am working on a project to help people translate articles from one language to another. The project started in 2014 and went to production in 2015.

Over the last year, a total of 100,000 new articles were created across many languages. A new article gets translated every five minutes, with 2000+ articles translated per week.

The 100000th Wikipedia page created with Content Translation is in Spanish, for the song ‘Crying, Waiting, Hoping’.

I designed the technical architecture and continue to be the main developer. I am so proud to be part of a project that has contributed this much to free knowledge.

Related blog posts in Wikimedia blog:

Content translation tool hits milestone with one hundred thousand articles

Wikipedia’s coverage of essential vaccines is expanding

Content Translation tool has now been used for 50,000 articles

Semi-automated content translation is coming to Scandinavian Wikipedias

Naradanews Malayalam published a note on this

A new article every five minutes; Wikipedia increases its multilingual presence

by Santhosh Thottingal at July 16, 2016 05:29 AM

July 15, 2016

Santhosh Thottingal

FOSS migration of electronic circuit simulation lab

My proposal for migrating basic electronic circuit simulation lab to the FOSS tool eSim has been approved. The source code and documentation of experiments can now be downloaded from here.

eSim is an open source EDA tool for circuit design, simulation, analysis and PCB design. eSim is developed by FOSSEE (Free and Open Source Software for Education) – an initiative of MHRD, Govt. of India. FOSSEE promotes the migration of labs in educational institutions from proprietary tools to FOSS only ones through lab migration projects.

I am really happy to have become a part of this project. You can read my previous post on eSim usage here.

by Kavya Manohar at July 15, 2016 02:41 PM

July 11, 2016

Arushi Dogra

Update on work

The week started with continuing the task of detecting supported locales. I was facing some problems initially. I first tried to change the contents of a static file at runtime, which I later realised couldn’t be done. So, as directed by my mentor, I changed the approach and decided to prompt the user at setup time about which languages might not be supported by the phone.
It looks something like this:


Unfortunately my system crashed, and the later part of my time went into formatting the laptop, taking a backup, installing the OS and setting up the project again. Then I went home for 3 days for my parents' wedding anniversary.

My next task: improving the setup wizard. Since the user might not be interested in all the languages, instead of showing all the layouts at once, we are planning to first ask the user to choose the language and then the corresponding layout in it. I have to discuss this task more with Jishnu.

by arushidogra at July 11, 2016 07:16 PM

July 08, 2016


The Democracy of Trolls – on Manorama News's Niyanthrana Rekha

I took part, representing ICU, in the discussion on the topic “The Democracy of Trolls” on Manorama News's Niyanthrana Rekha. V. T. Balram, Uzhavoor Vijayan, V. V. Rajesh, Ardra Nambiar and Subhash Nair also took part.

by Hrishi at July 08, 2016 07:46 PM

July 06, 2016

Anwar N

Braille-Input-Tool : The final touch


In these two weeks we did a lot of testing with users and made many additions according to their needs. The first one is key reassigning. As you know, there are many keyboard variants, and users also like to set their own keys instead of using f, d, s, j, k and l. But this makes it necessary to save user preferences, so we did this using jStorage. It's working fine.

Key reassigning can be done by clicking the "Configure Keys" button, which pops up a set of entries where the user can remap the keys. A Restore option is also provided there.
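The remap, save and restore cycle described above can be sketched in a few lines. This is not the tool's code: the tool itself is a web page that stores preferences with jStorage, whereas this Python sketch uses a JSON file as a stand-in. The function names are hypothetical, and the default key-to-dot assignment is an assumption based on the usual six-key braille layout.

```python
import json
from pathlib import Path
from typing import Dict

# Assumed default: keys f, d, s, j, k, l stand for braille dots 1 to 6.
DEFAULT_KEYS: Dict[str, int] = {"f": 1, "d": 2, "s": 3, "j": 4, "k": 5, "l": 6}


def load_keys(path: Path) -> Dict[str, int]:
    """Return the saved key map, or the defaults if nothing was saved."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return dict(DEFAULT_KEYS)


def reassign(keys: Dict[str, int], dot: int, new_key: str) -> Dict[str, int]:
    """Return a new map in which `new_key` produces `dot`."""
    if keys.get(new_key, dot) != dot:
        raise ValueError("key {!r} is already used for another dot".format(new_key))
    updated = {key: d for key, d in keys.items() if d != dot}
    updated[new_key] = dot
    return updated


def save_keys(path: Path, keys: Dict[str, int]) -> None:
    """Persist the user's key map."""
    path.write_text(json.dumps(keys), encoding="utf-8")


def restore_defaults(path: Path) -> None:
    """Forget the saved preferences so the defaults apply again."""
    if path.exists():
        path.unlink()
```

Rejecting a key that is already bound to another dot keeps the map one-to-one, so every dot always has exactly one key.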

New and Save are the basic things that should be provided by an online editor.

Changing font color, font size and background color is very important for partially sighted people. To keep the page accessible, we chose a combobox containing a list of major colors instead of providing a graphical color picker.

Various bugfixes

Braille-Input-Tool :
See all commits :

by Anwar N at July 06, 2016 09:09 PM

July 05, 2016

Balasankar C

GSoC Update: Week #5 and #6


The last two weeks were spent mostly on getting a basic spellchecker module to work. In the first week, I tried to polish the stemmer module by organizing tags for different inflections in an unambiguous way. These tags were to be used in the spellchecker module to recreate the inflected forms of the suggestions. For this purpose, an inflector module was added. It takes the output of the stemmer module and reverses its operations. Apart from that, I spent time testing the stemmer module and made many tiny modifications, like converting everything to a single encoding, using Unicode always, and above all changing the library name to an unambiguous one: libindic-stemmer (the old name was stemmer, which was way too general).
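The relationship between a tagging stemmer and an inflector that undoes it can be sketched like this. It is a toy illustration, not the libindic-stemmer code: the single rule, the tag name `LOC` and both function names are made up for the example.

```python
from typing import List, Tuple

# (inflected suffix, root suffix, tag). One illustrative rule only;
# the tag name is hypothetical.
RULES: List[Tuple[str, str, str]] = [
    ("ത്ത്", "ം", "LOC"),
]


def stem(word: str) -> Tuple[str, List[str]]:
    """Strip one known suffix and record the tag of the rule that fired."""
    tags: List[str] = []
    for suffix, replacement, tag in RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            word = word[: len(word) - len(suffix)] + replacement
            tags.append(tag)
            break
    return word, tags


def inflect(root: str, tags: List[str]) -> str:
    """Reverse the stemmer: re-apply the tagged rules in opposite order."""
    by_tag = {tag: (suffix, replacement) for suffix, replacement, tag in RULES}
    word = root
    for tag in reversed(tags):
        suffix, replacement = by_tag[tag]
        if not word.endswith(replacement):
            raise ValueError("rule {} does not apply to {}".format(tag, word))
        word = word[: len(word) - len(replacement)] + suffix
    return word


if __name__ == "__main__":
    root, tags = stem("എറണാകുളത്ത്")
    print(root, tags, inflect(root, tags))
```

Because the tags say which rule produced the stem, the same tags can be applied to a suggested root word to rebuild the inflected form the user was typing.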

In the second week, I forked the spellchecker module, converted the directory structure to match the one I've been using for other modules and added a basic building-testing-integration setup with the pbr-testtools-Travis combination. Also, I implemented the basic spell checking and suggestion generation system. As in the stemmer, marisa_trie was used to store the corpus. Three metrics were used to generate suggestions: Soundex similarity, Levenshtein distance and Jaccard's index. With that, I got my MVP (Minimum Viable Product) up and running.
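For readers unfamiliar with two of the metrics named above, here is a minimal sketch of Levenshtein distance and the Jaccard index. It is not the spellchecker's implementation: the choice of character bigrams for the Jaccard sets is an assumption, and both functions work on Unicode code points, whereas a real Indic implementation may prefer whole syllables. Soundex is left out because it needs language-specific sound tables.

```python
from typing import Set


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning `a` into `b`."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def bigrams(word: str) -> Set[str]:
    """Set of adjacent character pairs in `word`."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def jaccard(a: str, b: str) -> float:
    """Size of the bigram intersection divided by the size of the union."""
    set_a, set_b = bigrams(a), bigrams(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(jaccard("night", "nacht"))
```

Levenshtein distance is smaller for closer words while the Jaccard index is larger, so one of them has to be inverted or normalised before the metrics are combined.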

So, as of now, spell checking and suggestion generation works. But, it needs more tweaking to increase efficiency. Also, I need to formulate another comparison algorithm, one tailored for Indic languages and spell checking.

On a side note, I also touched the indicngram module, ported it to support Python 3 and reformatted it to match the proposed directory structure that I have been using for other modules. A PR has been created and I am waiting for someone to accept it.

July 05, 2016 01:57 PM

July 03, 2016

Santhosh Thottingal

Dr. B. M. Hegde – on the interview published in Mathrubhumi Weekly

Last week's issue of Mathrubhumi Weekly (issue 16) carries an interview with Dr. B. M. Hegde. It is the cover story. Reading past the introduction, which presents him as a world-renowned people's doctor who holds that health is the health of society, what I found were spurious arguments that my limited scientific knowledge can never accept. I am astonished that someone who received the Padma Bhushan, served as Vice-Chancellor of Manipal University, and has numerous degrees and years of medical experience presents science in this way.

Under cover of topics that anyone can readily accept, such as financial exploitation in the health sector and unhealthy trends in doctor-patient relationships, Hegde rejects modern medicine and its achievements wholesale. His observations, which call into question the scientific education he received and the positions he has held, are no different from the claims of our local quacks and pseudoscience advocates.

Some excerpts from the interview are given below.

“When I was a student, a patient who had a heart attack would be given oxygen. It was discovered later that heart attack patients need only a reduced level of oxygen (Hypoxia). How many people have we killed by giving oxygen?” According to the Wikipedia article about him, Dr. Hegde trained in cardiology at Harvard Medical School under the Nobel laureate Bernard Lown. What does Dr. Hegde mean by this grossly oversimplified statement about giving oxygen? Just because there have been treatment studies that regulate oxygen levels, does it befit his education to say that less oxygen is enough and that patients were killed by oxygen? What are ordinary readers supposed to understand from it?

He describes the virtues of coconut oil. He says that doctors in America prescribe coconut oil for Alzheimer's and lifestyle diseases. Apparently nobody took it seriously when Dr. Hegde said this in the sixties. “Coconut oil can be put in the mouth of even a newborn; it gets digested in the mouth itself.” It is because Malayalis were clever enough to give up coconut oil that diseases increased. I have also read that he says in several other places that newborns can be given coconut oil instead of breast milk. I am not going into the substance of this claim, but for anyone who knows how scientific conclusions are reached and the methods by which they gain acceptance, such “single-remedy” claims are not easy to take at face value. Should he not also know how much responsibility is required when presenting such claims to the public?

“The scientific world is keeping secret the revolutionary knowledge that cells repair themselves.” Dr. Hegde has revealed this secret in issue 16 of Mathrubhumi Weekly.

“Organ transplantation is not a solution to any disease. It is those who learned the old science, that the body works mechanically like a car, who continue with transplantation.”

Dr. Hegde, who says “we need never divide science into East and West”, says in the very next sentence: “Western science has gained dominance in every field today; it will collapse before long.” If someone feels like asking whether Dr. Hegde himself believes in his profession, can they be blamed?

“In the old days, temples gave holy water in silver vessels; it would contain nanoparticles that are kept under the soil for as long as twelve years.” Like quantum mechanics, nanoscience is another term added to lend weight to the subject being discussed. Behind it lies the belief that listeners are unlikely to know about it. It is a technique that Dr. Gopalakrishnan and others use well. What follows from water containing nanoparticles? Does this mean anything more than saying that water contains grains of soil or algae? “Our nano silver solution costs only ten rupees; it does not destroy beneficial microbes, nor does it let pathogens multiply.” He also says that it has not been released because he was not willing to pay a bribe to get the patent. He also claims to have invented an electromagnetic device to prevent heart attacks. That has not received a patent either. Bribery is again the problem.

“By scientifically analysing ancient customs, we can solve many health problems.” He too chanted this mantra, which is recited by everyone from the Prime Minister to the local single-remedy healers. In a way, this statement can be taken as a one-sentence summary of the interview. But there is a warning that the interview will continue in the coming issues.

“When you are healthy, never go to a doctor for a check-up.” He speaks against health check-ups.

“If doctors say heart block, never be afraid; every human being has blocks.” “Even school-going children have blocks; all the American soldiers who were shot down in the Vietnam and Korean wars had blocks!” The part about the blocks of soldiers who were shot dead I am slightly inclined to believe. When laser bypass surgery was performed on selected people, some of them were merely made unconscious. “But they were in perfect health. Those who had the surgery had health problems. This proves beyond doubt that it is the placebo effect that cures disease.” It is not clear whether it was perfectly healthy people who were made unconscious in this way and led to believe that surgery had been performed.

“Merely being optimistic that no disease will occur is enough to prevent cancer cells in the body from multiplying.” Can self-help books be used for cancer treatment?

“The arrogance born of ignorance creates all health problems. Nobody in the world can buy health with money.”

“If you study evolutionary biology, you will be convinced that Darwin and Mendel are outdated and that humans evolved from microbes.”

“Taking the principles of physics to make aircraft and missiles, and spending one lakh crore in the name of science and technology to go to the moon, is mere gratification of the ego. It brings no benefit to the ordinary people who form the majority.” What is the message he gives to India's science students and the public?

There is more. But all of this is what we have heard countless times from our local quacks, the Vadakkancherys and the likes of Gopalakrishnan. At a time when serious discussions about preventive vaccines are under way, it is worrying that Mathrubhumi, which spread regressive ideas on that subject, is itself providing a platform for such pseudoscientific claims. When I remember that Dr. Hegde is described as a Padma Bhushan-winning scientist, a holder of numerous medical degrees and someone who has occupied many high positions, I feel both ashamed of and worried about science education in India and those who are called, and honoured as, scientists. Will we keep on calling those who perform thulabharam for a rocket and shatrusamhara pooja for a fighter aircraft scientists?

‘Nirmukta’ analysed Hegde's articles some time ago –


This note was later published in Narada news Malayalam

Hegde's interview: a critical reading

by Santhosh Thottingal at July 03, 2016 11:07 AM

June 26, 2016


GSoC Progress — Week 4 & 5

Ooh boy, half way through GSoC and lot to be done. Finally we decided to do the entire project in Android Studio so that the later…

by malayaleecoder at June 26, 2016 09:40 PM

Sreenadh T C


Are you referring to the bash command used in the experiment I described or about the ruby scripts from my previous posts. I used the…

by Sreenadh T C at June 26, 2016 05:56 AM

June 25, 2016

Arushi Dogra

Weekly Blog

I was given the task of detecting whether a language is supported by the keyboard or not. On my phone Punjabi is not supported, so I did all the testing with that. Whenever a language is not supported, it is displayed as blank, and that gave me an idea of how to work on this issue. I created the bitmap for all the characters of the language and compared it with an empty bitmap. If the language was not supported, it had an empty bitmap and I declared it as not supported.

What I have to improve: currently the check runs every time the keyboard opens. I will change it so that it checks all languages during the setup wizard and stores the result.
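The blank-bitmap idea and the plan to run it once at setup can be sketched as follows. This is not the Indic Keyboard code, which is an Android app: here a bitmap is just a grid of pixel values, the `render` argument stands in for whatever draws a character on the device, and all names are hypothetical. The sketch treats a language as unsupported when every sample character renders blank, which is one possible reading of the description above.

```python
from typing import Callable, Dict, Iterable, List

Bitmap = List[List[int]]  # rows of pixel values; 0 means nothing was drawn


def is_blank(bitmap: Bitmap) -> bool:
    """True when no pixel in the bitmap was drawn."""
    return all(pixel == 0 for row in bitmap for pixel in row)


def language_supported(sample_chars: Iterable[str],
                       render: Callable[[str], Bitmap]) -> bool:
    """A language counts as supported if any sample character leaves ink."""
    return any(not is_blank(render(ch)) for ch in sample_chars)


def check_at_setup(samples: Dict[str, str],
                   render: Callable[[str], Bitmap]) -> Dict[str, bool]:
    """Run the check once for every language and return a result to store."""
    return {lang: language_supported(chars, render)
            for lang, chars in samples.items()}


if __name__ == "__main__":
    blank = [[0, 0], [0, 0]]
    inked = [[0, 1], [1, 0]]

    def fake_render(ch: str) -> Bitmap:
        # Pretend the device only has glyphs for ASCII letters.
        return inked if ch.isascii() else blank

    print(check_at_setup({"english": "abc", "punjabi": "ਪੰਜ"}, fake_render))
```

Storing the dictionary returned by `check_at_setup` means the keyboard only has to look a language up when it opens, instead of rendering characters again.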

My task for next week is to run the check for all languages in the setup wizard and to mark, in the language list, the languages that cannot be supported, so that the user knows.

by arushidogra at June 25, 2016 01:35 PM

June 23, 2016

Balasankar C

GSoC Update: Week #3 and #4


[Sorry for the delay in the post]

I spent the last two weeks mainly testing the stemmer module and the defined rules. While doing so, I found that a rule-based model has many issues, because different types of inflection applied to different parts of speech can yield the same inflected form. This can be solved only by a machine learning algorithm that incorporates a morphological analyzer, and is hence out of the scope of my proposal. So I decided to move forward with the stemmer.

I tried to incorporate handling of inflections of verb - like tense change - using rules and was able to do a subset of them. Rest of the forms need more careful analysis and I've decided to get the system working first and then optimize it.

I've also decided to tag the rules so that a history of stemming can be preserved. The stemmer will now generate the stem as well as the tags of rules applied. This metadata can be useful to handle the problem of same letter being inflected to different forms that I faced while developing VibhakthiGenerator.

I spent some time cleaning up the code further and setting up local testing tools such as a CLI and a web interface.

The PR was accepted by Vasudev and the changes are currently a part of the indicstemmer codebase.

BTW, it is time for the Midterm evaluations of GSoC 2016, where the mentors evaluate the progress of the students and give a pass/fail grade to them. Also, the students get to evaluate the mentors, communication with them and their inputs. I have already completed this and am waiting for my mentor to finish it. Hopefully, everything will go well.

June 23, 2016 03:04 PM

June 21, 2016

Anwar N

Bug fixes on Online Braille-Input-Tool

The first month is over and the webpage is almost finished. We went through many bugs in the last week. Sathyaseelan mash and Balaram G really helped us find the bugs. One of the crucial bugs, and one that was not easy to detect, was with map initialization. We took a lot of time to find and fix it. Another one was with the insertion of text in the middle. The following are the names of the other commits:

CapsLock(G) and Beginning-Middle switch(Alt)
Simple mode checkbox
Word and letter deletion enabled
Abbreviation enabled

by Anwar N at June 21, 2016 05:45 AM

June 18, 2016

Sreenadh T C

Milestone 1 : Conquered

We look far ahead and call it the ‘future’, but fail to realize that the coming step is the closest milestone for the future.

by Sreenadh T C at June 18, 2016 06:59 AM

June 17, 2016

Arushi Dogra

Working with the layouts!

I started with making the designs of the layouts. The task was to make Santali Olchiki and Soni layouts for the keyboard. I looked at the code of the other layouts to get a basic understanding of how they were working.

Soni Layout
It took some time to understand how the transliteration codes were working. I made changes in the ime submodule for the layout. I messed up the locale names and fixed that later. The changes were merged! Then I updated the submodule on the Indic keyboard branch.

Santali Olchiki Layout

Previously I made the inscript layout of Santali Olchiki, but after discussion with the mentor it was decided to work on the phonetic layout, as it can fit in a smaller layout and is thus easier to type with too. I made the design of the keyboard, wrote the code for it and tested it on the device. It is coming out fine too.

After that I explored various keyboard apps to see their setup wizards.

My task for the next week is to detect whether a language is supported by the system or not. I am planning to do it by checking whether typing a character gives an empty result. I will look for other ways too. I will report on the progress in the next blog post.


by arushidogra at June 17, 2016 12:18 PM

June 14, 2016

Sreenadh T C

Hi shaun,

Am working with CMU Sphinx toolkit that has recognizer libraries written in C. I am focussing on adding Language model and Acoustic model…

by Sreenadh T C at June 14, 2016 07:02 PM

June 13, 2016


GSoC Progress — Week 2 & 3

First of all, apologies for skipping the last week’s post. Last two weeks were somewhat a bit rocky :P

by malayaleecoder at June 13, 2016 06:57 PM

June 11, 2016

Anwar N

Basic Online Braille-Input Tool


Three weeks have passed. After developing basic Chrome and Firefox extensions, we moved on to developing a braille-input webpage where one can type in the six-key way. To achieve this we went through a lot of things such as Ajax, jQuery, JSON, Apache web server, etc. The most frequently referred links are given at the end of this post. Even though my mentor is also new to web-based development, he always suggests that I keep it as ideal as possible. The concept of map switching between the beginning, middle and contraction lists was a bit difficult to understand at first; later I realized that's the way it should be. Finally, when we were requesting space to host the web page, another mentor from my organization, Akshay S Dinesh, gave us a hint about the facility in GitHub itself to host it. So we did it with little effort, even though we faced a jQuery download problem and a contraction file listing problem.

Source Code :

Now one can try it using the following link

Now we have to implement Abbreviations, Simple-Mode, Open, New, Save, and options to change font, font size, background and foreground color, etc., as done in Sharada-Braille-Writer.

Referred links :

by Anwar N at June 11, 2016 12:11 PM

How would be the browser extensions

Hi All,

Yes, the community bonding period is over and the coding period has started. My mentor and I are really happy to announce that during this community bonding period we made the basic Chrome and Firefox extensions that show how it's going to be!! Once again, thanks to my mentor and the varnam project. The code is hosted on GitHub under the name braille-browser-addons.

Repository URL :

To test it in firefox do the following steps
1 - git clone
2 - cd braille-browser-addons/firefox/
3 - jpm run -b /usr/bin/firefox
4 - Go to and right click on text entry,from the context menu select Enable from braille sub menu.
5 - Try typing l(fds), k(fs), a(f), b(fd), c(fj)

To test it in chrome
1 - git clone
2 - Open chrome browser
3 - Go to settings and select extensions
4 - Check Developer mode
5 - Click Load unpacked extensions
6 - Choose chrome folder from braille-browser-addons
7 - Go to and right click on text entry,from the context menu select Enable from braille sub menu.
8 - Try typing l(fds), k(fs), a(f), b(fd), c(fj)
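The test inputs listed in the steps above follow the standard six-dot braille cell. The sketch below shows how a chord of simultaneously pressed keys can be turned into a letter. It is written in Python for illustration only (the addons themselves are browser extensions), the function name is hypothetical, and the key-to-dot assignment is inferred from the five examples given above together with the standard braille patterns for a, b, c, k and l.

```python
from typing import Dict, FrozenSet, Iterable

# Inferred from the examples: f, d, s are dots 1-3; j, k, l are dots 4-6.
KEY_TO_DOT: Dict[str, int] = {"f": 1, "d": 2, "s": 3, "j": 4, "k": 5, "l": 6}

# Standard braille dot patterns for the five letters used in the examples.
DOTS_TO_LETTER: Dict[FrozenSet[int], str] = {
    frozenset({1}): "a",
    frozenset({1, 2}): "b",
    frozenset({1, 4}): "c",
    frozenset({1, 3}): "k",
    frozenset({1, 2, 3}): "l",
}


def decode_chord(keys: Iterable[str]) -> str:
    """Translate a set of keys pressed together into a letter.

    The order in which the keys were pressed does not matter; unknown
    combinations come back as "?".
    """
    dots = frozenset(KEY_TO_DOT[key] for key in keys)
    return DOTS_TO_LETTER.get(dots, "?")


if __name__ == "__main__":
    for chord in ("fds", "fs", "f", "fd", "fj"):
        print(chord, "->", decode_chord(chord))
```

Using a frozenset as the lookup key is what makes the chord order-independent: pressing s, d and f in any order yields the same set of dots.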


An article that I read from OpenSourceForU May 2016
Courtesy  : CHMK Library, University Of Calicut

Thank You,

by Anwar N at June 11, 2016 12:01 PM

June 10, 2016

Sreenadh T C

June 06, 2016

Arushi Dogra

Weekly Update

This week started with the successful gradle build of the project on my system. The build was successful with gradle version 2.13, SDK version 22 and build tools version 22.0.1. After that I deployed the app on the emulator and on my phone. I am currently working on making Santali Olchiki and Soni layouts for the keyboard.

by arushidogra at June 06, 2016 11:15 AM

June 05, 2016

Balasankar C

GSoC Update: Week #1 and #2


The last two weeks of GSoC mostly involved working on the Stemmer module of the proposal. I had discussions with Hrishi and Vasudev regarding the directory structure that I will be using for the stemmer. I proposed .. format because it gives more visibility to libindic. Since both of them agreed (I will be converting all the existing modules to this format after GSoC), I first ported the existing indicstemmer module to this directory structure. With directions and suggestions from Vasudev and Hrishi, we (Jerin, who is working on Sandhi Splitter, and I) set up pbr as the packaging tool, testtools as the testing framework, Travis CI for continuous integration and tox for local automation and testing. The development environment may be summarized as follows

Work on Stemmer

There are several problems with the existing stemmer implementation. One is the high count of false positives. This is because in Malayalam there exist root words which have the structure of an inflected word. An example is ആപത്ത്, which looks similar to എറണാകുളത്ത്. The former is a root word whereas the latter is an inflected word. So, based on the stemmer rule ത്ത്=ം, which is used to handle എറണാകുളത്ത്, ആപത്ത് will also get stemmed to ആപം. This is a false positive. What we need is a root word corpus that contains the possible root words in Malayalam (well, we need some crowd sourcing to update it), and to check the input word against it so as to detect whether it is a root word.

Another problem with the existing stemmer is that it is unable to handle multiple levels of inflection. An example is അതിലേക്ക് (into that) : അതിൽ + ഏക്ക് : അത് (that) + ഇൽ (in) + ഏക്ക് (to). We need to implement a multiple suffix stripping algorithm that will handle it. I wrote an iterative suffix stripping algorithm, that continues suffix stripping and transformation until a root word is encountered or a mis-hit on the rules occurs.

Since linear list obviously is the least optimal solution for storing a large dataset, and tries are good for storing textual data, I decided to go with a Trie for storing the root word corpus. Tailoring a data structure that will suit my need is one of the last tasks of my GSoC proposal. So, I used an existing trie implementation - marisa trie that is available as a Python library.
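The two fixes described above, checking against a root word corpus and stripping suffixes repeatedly, can be sketched together. This is a toy illustration and not the libindic stemmer: the three rules and the three-word corpus are made up from the examples in this post, a plain Python set stands in for the marisa trie, and the function name and `max_passes` limit are hypothetical.

```python
from typing import List, Set, Tuple

CHILLU_L = "\u0d7d"   # the letter ൽ
SIGN_I = "\u0d3f"     # the vowel sign ി
VIRAMA = "\u0d4d"     # the sign ്

# Stand-in for the root word corpus (a trie in the real module).
ROOT_WORDS: Set[str] = {"ആപത്ത്", "എറണാകുളം", "അത" + VIRAMA}

# (inflected suffix, root suffix); illustrative rules only.
RULES: List[Tuple[str, str]] = [
    ("ത്ത്", "ം"),
    ("ലേക്ക്", CHILLU_L),            # ...ലേക്ക് -> ...ൽ
    (SIGN_I + CHILLU_L, VIRAMA),     # ...ിൽ -> ...്
]


def stem(word: str, max_passes: int = 10) -> str:
    """Strip suffixes until a root word is reached or no rule applies."""
    for _ in range(max_passes):
        if word in ROOT_WORDS:
            return word              # known root: stop, avoids false positives
        for suffix, replacement in RULES:
            if word.endswith(suffix) and len(word) > len(suffix):
                word = word[: len(word) - len(suffix)] + replacement
                break
        else:
            return word              # mis-hit: no rule matched
    return word


if __name__ == "__main__":
    for sample in ("ആപത്ത്", "എറണാകുളത്ത്", "അതിലേക്ക്"):
        print(sample, "->", stem(sample))
```

The corpus check runs before any rule is tried, so a root word that merely looks inflected is returned untouched, while a word with several suffixes loses one per pass until it lands on a root.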

I have added tests for all the 7 vibhakthis, the ‘and’ form of a word (അവനും, രാമുവും), some of the plural forms (കാളകൾ) etc., and the coverage as of now is 100%.

As always, the code and update is available at GitHub repo.

June 05, 2016 04:04 AM

May 29, 2016


GSoC Progress — Week 1

A week into GSoC has come to an end and it’s been a wonderful learning experience.

by malayaleecoder at May 29, 2016 05:50 PM

Sreenadh T C

Focusing on the Basics

Excellence is achieved by the mastery of the fundamentals

by Sreenadh T C at May 29, 2016 09:25 AM

May 28, 2016

Santhosh Thottingal

When Breath Becomes Air – Paul Kalanithi

I read this book after my friends recommended it and wrote the following note on Goodreads:

I am not rating this book. Not because the book is bad. It is well written, and I read it in a single day. It is just that it gave me a lot of pain, and I would not recommend any of my friends to go through that pain. The reading feels very personal, like watching a friend – the author of the book – suffering. I had friends who went through very difficult illnesses and are no longer with us. I have close relatives who are going through a similarly difficult time.

That is about rating and recommendation.

But I am thankful to my friends who introduced this book to me. In just some 100 pages, a friend appears in my life and leaves after a very short visit, presenting a lot of perspectives about life and its struggles. He earns respect very quickly, and earns your tears towards the last pages. I literally skipped the last few pages just because I did not want to.

by Santhosh Thottingal at May 28, 2016 05:49 AM

Gujarat Files – Rana Ayyub

Fear – that would be the single word I can use to describe my feeling after reading the book. My belief in the Indian judicial and parliamentary system is not a firm one. This book shakes it badly. I am nobody to judge the facts revealed in the book, but what makes me afraid is that there is very little chance that these facts will be verified by the current Indian political and legal system.

A lot of respect for the author and her courage for this work.

by Santhosh Thottingal at May 28, 2016 05:41 AM

May 27, 2016

Arushi Dogra

Update on work

In the initial weeks of Community Bonding, I learned Java and Android and made a sample app on Android.

Then I started working on building Indic Keyboard on my system. The first problem I faced was the Gradle version. Then Maven was not working behind the proxy network, which lengthened the build process. I am new to Android, so many errors came up and it took some time to resolve them. I am currently stuck on an exception in processDebugResources. My work for the week includes the Santali and Soni keyboard layouts. Building Indic Keyboard on my system is taking longer than I expected.

by arushidogra at May 27, 2016 03:50 PM

May 24, 2016

Anwar N

The initial work

Hi All,
It is really good for me to be engaged with free software activities that interest me. I felt great to learn new aspects of life. Also, as a newcomer to SMC and the Indic project, I am getting a lot of new experiences such as Internet Relay Chat (#smc-project).

As I mentioned earlier, my first target is to create the web interface. As a first and easy step I tried to create a webpage that shows a textarea in which the user can type the braille way. Since I had no previous experience and a lot of things to do immediately, I asked my mentor for a hint. He said it can be done with JavaScript, and told me to learn more about the Apache2 webserver. After learning how IBus-Braille works and doing some experiments, I implemented a basic mechanism to test the possibility, and it worked! The code is as follows.

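    <textarea id="brailletextarea" cols=60 rows=30 onKeyPress="KeyPress(event)" onKeyUp="KeyRelease(event)"></textarea>
    <script>
    // Keys pressed since the last release, e.g. "dfs".
    var items = "";

    // The body of KeyPress did not survive in this post. Assumed here:
    // collect the pressed key and keep it out of the textarea.
    function KeyPress(event) {
      items = items + event.key;
      event.preventDefault();
    }

    function case_insensitive_comp(strA, strB) {
      return strA.toLowerCase().localeCompare(strB.toLowerCase());
    }

    // Sort the collected keys so that "fd" and "df" match the same entry,
    // then append the letter for that combination.
    function KeyRelease(event) {
      var brailletextarea = document.getElementById('brailletextarea');
      if (items != "") {
        items = items.split("");
        items = items.sort(case_insensitive_comp);
        items = items.join("");

        if (items == "f") { brailletextarea.value = brailletextarea.value + "a"; }
        if (items == "df") { brailletextarea.value = brailletextarea.value + "b"; }
        if (items == "fs") { brailletextarea.value = brailletextarea.value + "k"; }
        if (items == "dfs") { brailletextarea.value = brailletextarea.value + "l"; }
        if (items == "fj") { brailletextarea.value = brailletextarea.value + "c"; }
        if (items == " ") { brailletextarea.value = brailletextarea.value + " "; }

        // Assumed: start a fresh combination after each release.
        items = "";
      }
    }
    </script>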
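
The same mapping can also be written as a lookup table instead of a chain of `if` checks. Below is a small Python sketch of that idea, separate from the web page itself; the function name `chord_to_text` is made up, and the table holds only the six combinations shown in the code above.

```python
# Sorted key combination -> output text, as in the snippet above.
CHORDS = {
    "f": "a",
    "df": "b",
    "fs": "k",
    "dfs": "l",
    "fj": "c",
    " ": " ",
}


def chord_to_text(keys):
    """Map the keys pressed together to a letter; unknown chords give ''."""
    chord = "".join(sorted(keys.lower()))  # "fd" and "df" become the same chord
    return CHORDS.get(chord, "")


print(chord_to_text("sfd"))
```

Adding a new letter then means adding one entry to the table.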

Then I was searching the internet for how extensions are created. At that time my mentor Nalin gave me a good hint: the varnam project. Varnam is a cross-platform open source transliterator for Indian languages which has browser extensions hosted on GitHub. I cloned the codebase of varnam-browser-addon, read it, and understood that it is the best place to start the project. Later I also realized that the varnam project, done by Navaneeth, is part of Swathanthra Malayalam Computing (SMC).

by Anwar N at May 24, 2016 06:41 AM

May 23, 2016

Balasankar C

GSoC Community Bonding - VibhakthiGenerator


So the community bonding period of Google Summer of Code 2016 comes to an end today. During this period, the participants were expected to mingle with the community, get to know its basic workflow, make small contributions to the projects undertaken by the community, and understand the coding standards, deployment models, version control schemes, the communication model used etc.


I have been contributing to the community since before GSoC, in the form of localization, packaging etc. Still, as a contribution to the community during the GSoC Community Bonding period, I have written a small library in Python that can be used to generate different Vibhakthi forms of Malayalam words (like രാമൻ -> രാമന്റെ, രാമനെ, രാമനോട്, രാമനാൽ etc). The project was inspired by Santhoshettan's similar project using the jQuery.i18n library. The code is available in my personal repo now and I will be pushing it to the organization's repo soon. I am yet to make it follow libindic's directory structure.

The library currently uses a rule based approach to generate the Vibhakthi forms, which is not 100% accurate. It fails on words ending with letters (usually Chillu characters) whose base forms are ambiguous. For example, the base form of a word ending with ർ can be either ര or റ. Words ending with ർ that have a similar structure give different results when the same vibhakthi is applied.

Example :
അനിവർ + സംബന്ധിക = അനിവറിന്റെ
മലർ + സംബന്ധിക = മലരിന്റെ
കൗരവർ + സംബന്ധിക = കൗരവരുടെ
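
A rough sketch of both points, the working rule for ൻ and the ambiguity for ർ. It is not the code of the library: the function names and the rule table are made up for illustration, and the suffixes are taken from the examples in this post.

```python
CHILLU_N = "\u0d7b"   # ൻ
CHILLU_RR = "\u0d7c"  # ർ
NA = "\u0d28"         # ന
RA = "\u0d30"         # ര
RRA = "\u0d31"        # റ

# Hypothetical rule table for words ending in ൻ: what follows the base ന.
N_SUFFIXES = {
    "sambandhika": "\u0d4d\u0d31\u0d46",  # ന്റെ, as in രാമന്റെ
    "prathigrahika": "\u0d46",            # നെ, as in രാമനെ
    "samyojika": "\u0d4b\u0d1f\u0d4d",    # നോട്, as in രാമനോട്
    "prayojika": "\u0d3e\u0d7d",          # നാൽ, as in രാമനാൽ
}

# Sambandhika endings seen in the ർ examples: ിന്റെ and ുടെ.
RR_SUFFIXES = ["\u0d3f\u0d28\u0d4d\u0d31\u0d46", "\u0d41\u0d1f\u0d46"]


def vibhakthi_forms(word):
    """Forms for words ending in ൻ; None when this sketch has no rule."""
    if word.endswith(CHILLU_N):
        base = word[:-1] + NA  # the chillu ൻ goes back to its base form ന
        return {name: base + suffix for name, suffix in N_SUFFIXES.items()}
    return None


def sambandhika_candidates(word):
    """For words ending in ർ a rule cannot choose, so list every option."""
    if not word.endswith(CHILLU_RR):
        return []
    return [word[:-1] + base + suffix
            for base in (RA, RRA) for suffix in RR_SUFFIXES]


print(vibhakthi_forms("രാമ" + CHILLU_N))
print(sambandhika_candidates("മല" + CHILLU_RR))
```

For a word ending in ർ the sketch can only return four candidates. Picking അനിവറിന്റെ, മലരിന്റെ or കൗരവരുടെ needs knowledge about the word itself, which the rules do not have.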

(Thanks to Santhoshettan for the following info) This shows a drawback of the rule based method: we need to develop a method where the word etymology is also considered. That can be expected once Machine Learning techniques become clearer and more usable for Malayalam. For now, since no such library exists, I follow the concept of "99% is better than 0%" and guess the library is worth using until we can find something better. I plan to use this library during the development of the spell checker (I will post its proposal in detail soon), which I have to dig into more. For now, you can try out the library at this online demo. I will be happy if you can test it and report issues, suggestions etc.


Also, to handle inflections, the spellchecker should have a stemming phase in between. So I have read the stemmer code of libindic and found that it is actually halfway between a stemmer and a lemmatizer (and I intend to keep it as such). However, the existing method produces many false positives and has to be made more accurate. I intend to follow the rule based approach for it, with an option to crowd source the root word corpus. This will be the first phase of the project.

May 23, 2016 09:41 PM

Santhosh Thottingal

Introductory Workshop on Version Control Systems

The IEEE student branch of College of Engineering, Chengannur is running a commendable initiative: a week-long student quality improvement programme. I was invited to give an introductory workshop on version control systems as part of ISQIP 2016.

It was a good experience to be with a group of enthusiastic youngsters. The workshop covered the need for version control systems and demonstrated version control using git. The slides of the presentation are here.

by Kavya Manohar at May 23, 2016 01:54 PM

May 22, 2016


Community Bonding Period comes to an end

Alright, so the almost one month long Community Bonding Period is over. During this time I went through the codebase of libvarnam…

by malayaleecoder at May 22, 2016 07:22 PM