Planet SMC

September 20, 2021

Rajeesh K Nambiar

A new set of OpenType shaping rules for Malayalam script

TL;DR: research and development of a completely new set of OpenType layout rules for Malayalam traditional orthography.

Writing OpenType shaping rules is hard. Writing OpenType shaping rules for advanced (complex) scripts is harder. Writing OpenType shaping rules without causing any undesired ligature formations is even harder.


The shaping rules for SMC fonts abiding by v2 of the Malayalam OpenType specification (the mlm2 script tag) were written and polished in large part by me over many years, fixing shaping errors and undesired ligature formations. Even so, some hard-to-fix bugs remained. Driven by the desire to fix such difficult bugs in RIT fonts, and by the copyright fiasco, I set out to write a simplified set of OpenType shaping rules for Malayalam from scratch. Two major references helped in that quest: (1) a radically different approach I had tried a few years ago, but failed, with the mlym script tag (aka Windows XP era shaping); (2) a manuscript by R. Chithrajakumar of Rachana Aksharavedi, who culled and compiled the ‘definitive character set’ for Malayalam script. The idea of the ‘definitive character set’ is that it contains all the valid characters in a script and none of the (invalid) characters not in the script. By that definition, I wanted to create the new shaping rules in such a way that they do not generate any invalid characters (e.g. one with a detached u-kar). In short: it shouldn’t be possible to accidentally generate broken reformed-orthography forms.

Fig. 1. Samples of Malayalam definitive character set listing by R. Chithrajakumar, circa 1999. Source: K.H. Hussain.

“Simplify, simplify, simplify!”

Henry David Thoreau

It is my opinion that a lot of the complexity in Malayalam shaping comes from the fact that the Indic OpenType shaping specification largely follows Devanagari, which in turn was adapted from ISCII, which has (in my limited understanding) its roots in the component-wise metal type design of ligature glyphs. Many half, postbase and other shaping rules have their lineage there. I have also heard similar concerns about this complexity expressed by others, including Behdad Esfahbod, the FreeFont maintainer, et al.


As K.H. Hussain once rightly noted, the shaping rules were creating many undesired/unnecessary ligature glyphs by default, and additional shaping rules (complex contextual lookups) were written to avoid/undo those. A better, alternative approach would be: simply don’t generate undesired ligatures in the first place.

“Invert, always invert.”

Carl Gustav Jacob Jacobi

Around December 2019, I set out to write a definitive set of OpenType shaping rules for the traditional script set of Malayalam. Instead of relying on many different lookup types such as pref, pstf, blwf, pres, psts and a myriad of complex contextual substitutions, the only type of lookup required was akhn — because the definitive character set contains all the ligatures of Malayalam, and those glyphs are designed in the font as single glyphs — no component-based design.
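As a rough sketch of what this looks like in AFDKO feature-file syntax (the glyph names below are illustrative placeholders, not the actual names used in RIT fonts), each valid conjunct becomes a single akhn ligature substitution, and nothing outside that list can be formed:

```fea
languagesystem mlm2 dflt;

# Every valid conjunct in the definitive character set is one ligature
# substitution to a single precomposed glyph; no half/postbase forms.
feature akhn {
    # ka + virama + ssa -> the single glyph for ക്ഷ
    sub ka virama ssa by kssa;
    # na + virama + rra -> the single glyph for ന്റ
    sub na virama rra by nrra;
} akhn;
```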

The draft rules were written in tandem with the RIT-Rachana redesign effort and tested against different shaping engines such as HarfBuzz, Allsorts, XeTeX, LuaHBTeX and DirectWrite/Uniscribe for Windows. Windows, being Windows (and also the maintainer of the OpenType specification), indeed did not work as expected per the specification: its implementation clearly special-cases the pstf forms of യ (Ya, 0D2F) and വ (Va, 0D35). To make a single set of shaping rules work with all these shaping engines, the draft rules were slightly amended, et voilà — it worked in all applications and OSes that use any of these shaping engines. It was decided to drop support for the mlym script tag, which was deprecated many years ago, and support only the mlm2 specification, which fixed many irreparable shortcomings of mlym. One notable shaping engine which doesn’t work with these rules is the Adobe text engine (Lipika?), but Adobe applications have recently switched to HarfBuzz. That covers all major typesetting applications.

Testing fonts developed with this new set of shaping rules for Malayalam confirmed that they do not generate any undesired ligatures in the first place. In addition, compared to the previous shaping rules, the new set gets rid of 70+ lines of complex contextual substitutions and other rules, while remaining easy to read and maintain.

Old vs new shaping rules in Rachana
Fig. 3. Old vs new shaping rules in RIT Rachana.

Application support

This new set of OpenType layout rules for Malayalam is tested to work 100% with the following shaping engines:

  1. HarfBuzz
  2. Allsorts
  3. DirectWrite/Uniscribe (Windows shaping engine)

And GUI toolkits/applications:

  1. Qt (KDE applications)
  2. Pango/GTK (GNOME applications)
  3. LibreOffice
  4. Microsoft Office
  5. XeTeX
  6. LuaHBTeX
  7. Emacs
  8. Adobe InDesign (with HarfBuzz shaping engine)
  9. Adobe Photoshop
  10. Firefox, Chrome/Chromium, Edge browsers


In addition, the advantages of the new shaping rules are:

  1. Adheres completely to the concept of the ‘definitive character set’ of the language/script: it generates all valid conjunct characters and does not generate any invalid conjunct character.
  2. The same set of rules works fine, without adjustments/reprogramming, for ‘limited character set’ fonts. A ‘limited character set’ may not contain conjunct characters as extensively as the ‘definitive character set’; yet it would always have characters with reph and u/uu-kars formed correctly.
  3. Reduced complexity and maintenance (no complex contextual lookups, reverse chaining etc.). Write once, use in any fonts.
  4. Open source, libre software.

This new OpenType shaping rules program was released to the public along with RIT Rachana a few months ago, and it is also used in all other fonts developed by RIT. It is licensed under the Open Font License for anyone to use and integrate into their fonts; please ensure the copyright statements are preserved. The shaping rules are maintained at the RIT GitLab repository. Please create an issue in the tracker if you find any bugs, or send a merge request if you make any improvement.

by Rajeesh at September 20, 2021 05:30 AM

May 08, 2021

Rajeesh K Nambiar

Letsencrypt certificate renewal: Nginx with reverse-proxy

Let’s Encrypt revolutionized SSL certificate management for websites in a short span of time — it directly improved the security of users of the world wide web by (1) making it very simple for administrators to deploy SSL certificates to websites and (2) making the certificates available free of cost. To appreciate their effort, compare it with the hoops one had to jump through to obtain a certificate from a certificate authority (CA) earlier, and how much money and energy one would have to spend on it.

I make use of Let’s Encrypt in all the servers I maintain(ed), and in the past I used the certbot tool to obtain & renew certificates. Recent versions of certbot are only available as a snap package, which is not something I’d want to, or be able to, set up in many cases.

Enter acme. It is a shell script that works great. Installing acme also sets up a cron job, which automatically renews the certificate for the domain(s) near its expiration. I recently set up nginx as a reverse proxy to a Lexonomy service, with acme for certificate management. The cron job is supposed to renew the certificate on time.

Except it didn’t. A few days ago I received a notification about the imminent expiry of the certificate. I searched the interweb quite a bit, but didn’t find a simple enough solution (“make the proxy service redirect the request”…). What follows is the troubleshooting and a solution; maybe someone else will find it useful.


acme was unable to renew the certificate, because the HTTP-01 authentication challenge requests were not answered by the proxy server to which all traffic was being redirected. In short: how does one renew Let’s Encrypt certificates on an nginx reverse-proxy server?

Certificate renewal attempt by acme would result in errors like:

# --cron --home "/root/" -w /var/www/html/
[Sat 08 May 2021 07:28:17 AM UTC] ===Starting cron===
[Sat 08 May 2021 07:28:17 AM UTC] Renew: ''
[Sat 08 May 2021 07:28:18 AM UTC] Using CA:
[Sat 08 May 2021 07:28:18 AM UTC] Single domain=''
[Sat 08 May 2021 07:28:18 AM UTC] Getting domain auth token for each domain
[Sat 08 May 2021 07:28:20 AM UTC] Getting webroot for domain=''
[Sat 08 May 2021 07:28:21 AM UTC] Verifying:
[Sat 08 May 2021 07:28:24 AM UTC] Verify error:Invalid response from https://my.domain.org/.well-known/acme-challenge/Iyx9vzzPWv8iRrl3OkXjQkXTsnWwN49N5aTyFbweJiA [NNN.NNN.NNN.NNN]:
[Sat 08 May 2021 07:28:24 AM UTC] Please add '--debug' or '--log' to check more details.
[Sat 08 May 2021 07:28:24 AM UTC] See:
[Sat 08 May 2021 07:28:25 AM UTC] Error renew my.domain.org.


The key error to notice is

Verify error:Invalid response from [NNN.NNN.NNN.NNN]

Sure enough, the resource .well-known/acme-challenge/… is not accessible. Let us make it accessible without going through the proxy server.


First, create the directory if it doesn’t exist, assuming the web root is /var/www/html:

# mkdir -p /var/www/html/.well-known/acme-challenge

Then, edit /etc/nginx/sites-enabled/ and, before the proxy_pass directive, add the .well-known/acme-challenge/ location pointing to the correct location in the web root. Do this in both the HTTPS and HTTP server blocks (doing it in only one didn’t work for me).

server {
    listen 443 default_server ssl;
    server_name;
    location /.well-known/acme-challenge/ {
        root /var/www/html/;
    }
    location / {
        proxy_pass http://myproxyserver;
        proxy_redirect off;
    }
}

server {
    listen 80;
    listen [::]:80;
    server_name;
    location /.well-known/acme-challenge/ {
        root /var/www/html/;
    }
    # Redirect everything else to HTTPS
    location / {
        return 301 https://$server_name$request_uri;
    }
}

Make sure the configuration is valid, and reload the nginx configuration:

nginx -t && systemctl reload nginx.service
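Before retrying the renewal, it helps to confirm that a token file dropped into the webroot is actually served at the challenge URL. A minimal sketch, using a temporary directory as a stand-in for /var/www/html so it can be tried anywhere (the commented curl line is what you would run on the real server, with a hypothetical domain):

```shell
# Create a throwaway token the way acme does, under a stand-in webroot.
webroot=$(mktemp -d)
mkdir -p "$webroot/.well-known/acme-challenge"
echo ok > "$webroot/.well-known/acme-challenge/test-token"
cat "$webroot/.well-known/acme-challenge/test-token"
# On the real server, fetch it through nginx to make sure the
# acme-challenge location block answers (hypothetical domain):
#   curl -i http://my.domain.org/.well-known/acme-challenge/test-token
```

If curl returns the token over plain HTTP without being redirected to the proxy, the HTTP-01 challenge will succeed.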

Now, try to renew the certificate again:

# --cron --home "/root/" -w /var/www/html/
[Sat 08 May 2021 07:45:01 AM UTC] Your cert is in  /root/ 
[Sat 08 May 2021 07:45:01 AM UTC] Your cert key is in  /root/ 
[Sat 08 May 2021 07:45:01 AM UTC] v2 chain.
[Sat 08 May 2021 07:45:01 AM UTC] The intermediate CA cert is in  /root/ 
[Sat 08 May 2021 07:45:01 AM UTC] And the full chain certs is there:  /root/ 
[Sat 08 May 2021 07:45:02 AM UTC] _on_issue_success


by Rajeesh at May 08, 2021 10:22 AM

January 01, 2021

Rajeesh K Nambiar

Panmana: new Malayalam body text font

Rachana Institute of Typography starts the new year 2021 with the release of a new body-text Malayalam Unicode font named ‘Panmana’.

Fig. 1: ‘Panmana’ font specimen.

The font is named after and dedicated to Prof. Panmana Ramachandran Nair, who steadfastly championed the original script of Malayalam. It is designed by K.H. Hussain with inputs from Ashok Kumar and CVR, with font engineering by Rajeesh (your correspondent); it is maintained by RIT.

‘Panmana’ is released under the Open Font License, free to use and share. TrueType and web fonts can be downloaded from the website. A flyer about the font is available. If you spot any issues, please report them in the source repository.

by Rajeesh at January 01, 2021 04:02 AM

December 26, 2020

Rajeesh K Nambiar

RIT Rachana: a classic typeface reimagined

It was around 2006 that I started reading and writing Malayalam (my native language) text widely on the computer, thanks to Unicode and the proliferation of Malayalam blogs. It was also at the same time that I noticed Malayalam text was not ‘shaped’ correctly in many cases on my primary operating system — GNU/Linux. A number of Unicode fonts were available under libre licenses, of which I liked Rachana the most.

To cut to the chase: a few years later, I ended up co-maintaining Rachana, trying to fix all the known bugs and succeeding to a large extent; among many other things.

In 2020, with new insights into the design metrics of Malayalam fonts, the designer of Rachana — K.H. Hussain — redrew all the glyphs of Rachana, completely overhauled the Bold variant and freshly designed the Italic & BoldItalic styles. All fonts in the new typeface contain more than 1100 glyphs, covering the entire set of Malayalam characters encoded in Unicode version 13.0 and all conjuncts/ligatures in the definitive character set of Malayalam traditional orthography. The Latin glyphs are adapted from TeX Gyre Schola with express permission from GUST. This makes the fonts suitable for typesetting contemporary text, novels, poetry, scholarly works, Sanskrit text, the Bible, archaic books and everything in between.

Fig. 1: RIT Rachana glyph redesign samples.

Not satisfied with the existing solutions for some remaining shaping bugs in Rachana, I researched and ventured to try a radically different approach to complex text shaping rules for traditional-script Malayalam fonts. Following the v2 version of the Indic OpenType specification (mlm2), a completely new set of shaping rules was written from scratch. Though it was a bit of a struggle to get Uniscribe/Windows with its idiosyncrasies to shape correctly, and Adobe InDesign needs this fix, it proved to be a great success. The new set of rules fixes all known shaping bugs to my knowledge. The development of this shaping rule program is a blog entry for another time.

A comparison of problematic shaping combinations can be found in Fig. 2. Note that the two “സ്വാതന്ത്ര്യം” differ in the order of their code points (“ര്യ” vs “യ്ര”), and their shaping should accordingly be different.

Fig. 2: RIT Rachana improved shaping of problematic conjuncts.

In the process, the build script and test cases were also written from scratch.

The result is a new font named RIT Rachana, released under libre Open Font License free to download and use by individuals, designers, organizations, institutions, government departments and media houses.

Fig. 3: RIT Rachana variants/styles.

All four variants of RIT Rachana can be downloaded from the website for desktop and web usage. If you notice any issues, report them at the source repository.

The typeface is the fruit of months of labour by many, including the designers and developers, and the early users and testers and their feedback: especially Ashok Kumar, CVR, and the Sayahna typesetters and sysadmins.

by Rajeesh at December 26, 2020 07:25 AM

November 11, 2020

Rajeesh K Nambiar

New packages in Fedora/EPEL: screenkey & python-secure_cookie

TL;DR: Two software packages — screenkey and python-secure_cookie — are available in the Fedora and EPEL repositories.


Screenkey is a tool that displays the keys one types on the screen. It is quite useful for screen recording/casting for video tutorials and such. I use it particularly to record tutorial sessions on Vim, where keystrokes are important.

Fig. 1: Screenkey in action. Source: screenkey.


The Python module secure-cookie — which provides secure session and cookie management — was split out of the Werkzeug WSGI module as of version 1.0. Odoo depends on python-werkzeug and currently keeps a vendored copy of the functionality in 14.0; they haven’t migrated to secure-cookie mostly because many distros, including Arch and Fedora — which have a reputation for shipping the latest software — hadn’t packaged secure-cookie yet.

I have packaged both pieces of software for Fedora & EPEL, and they will be hitting the release repositories soon.

by Rajeesh at November 11, 2020 10:50 AM

November 09, 2020

Rajeesh K Nambiar

HarfBuzz shaping engine in InDesign

Since the release of Ezhuthu, I have received a few reports that in Adobe InDesign/Photoshop some matras appear outside the margin and the ു‘u’/ൂ‘uu’ matras appear disjoint from the base conjunct/consonant. The first issue is worked around in the font (in version 1.1); but the second issue cannot be worked around per se.

Fig. 1: InDesign shaping issues. Source: Abdul Azeez Vengara, CC-BY-SA.

Adobe products use their own shaping engine known as ‘Lipika’ for advanced text layout. The ‘world ready composer’ uses it by default when text in complex scripts such as Malayalam (and other Indic scripts) is used.

Lipika has various issues in properly shaping advanced conjunct forms and has its own quirks. Certain issues can be worked around in fonts, but others cannot. There were reports that Adobe products might eventually integrate the gold standard of shaping engines — the libre software HarfBuzz.

Since mid July 2020, HarfBuzz shaping engine can be used instead of lipika shaper in Adobe InDesign. To enable it, follow these steps:

  1. Download this file: HarfbuzzOverride.js
  2. Copy it to the ../Scripts/Scripts Panel folder under the InDesign root folder
  3. Close InDesign first. Open InDesign and go to Window → Utilities → Scripts
  4. Double-click on HarfbuzzOverride.js to enable the HarfBuzz shaper
  5. Use the traditional script Malayalam fonts from RIT with perfect advanced text shaping.
  6. If you have already laid out text, you may need to reapply the style/font to see the effect.

Fig. 2: Ezhuthu font with perfect advanced text shaping.

by Rajeesh at November 09, 2020 04:36 AM

November 01, 2020

Rajeesh K Nambiar

Announcing ‘Ezhuthu/എഴുത്ത് ’ — a handwriting/script style font for Malayalam

November 1st marks the birth of Kerala, the southernmost state in India, with around 35 million people speaking its language, Malayalam. To celebrate it, Rachana Institute of Typography is announcing the release of ‘Ezhuthu/എഴുത്ത്’ — a handwriting/script style Unicode font with traditional orthography.

Fig. 1. Ezhuthu font specimen.

The glyphs are drawn by the famed calligrapher Narayana Bhattathiri. The hand-drawn characters were turned into vector graphics and then transformed into font shapes; this typography work was done by Hussain KH, who led the Rachana Aksharavedi movement and designed popular fonts such as Rachana, Meera, Meera Inimai etc. The fine-tuned typeface needs OpenType shaping to correctly shape and render advanced conjunct character formations in Malayalam. The OpenType feature development, integration and technical infrastructure were worked on by me, Rajeesh. Ashok Kumar and CVR made key contributions to the font development.

A PDF document specimen is available at

Ezhuthu is made available under libre license — Open Font License, which makes it free to download, use, distribute and enhance without restrictions by the general public, designers, institutions and government departments. We are proud to add a unique font to the collection of freely available Malayalam fonts. If you find any issues, report those in the source repository.

by Rajeesh at November 01, 2020 06:29 AM

October 26, 2020

Rajeesh K Nambiar

Malayalam fonts: Beyond Latin font metrics

This year’s annual international conference organized by the TeX Users Group — TUG2020 — was held completely online due to the raging pandemic. At TUG2020, I presented a talk on some important Malayalam typeface design factors and considerations.

The idea of the talk and its articulation originated with K.H. Hussain, designer of well-known fonts such as Rachana, Meera, Meera Inimai, TNJoy etc. In a number of discussions that ensued, the idea was developed, and it was later presented at TUG2020.

The opening keynote of TUG2020 was delivered by Steve Matteson, about the design of the Noto fonts. He mentioned that Noto was originally envisaged as a single font containing all Unicode scripts; but that was changed for a couple of reasons: (1) the huge size of the resulting font and (2) the design of many South/South-East Asian characters not fitting well within its Latin font metrics.

This second point set up the stage nicely for my talk, in which we argued that a paradigm shift from established Latin font metrics is necessary in designing and choosing font metrics for Indic scripts, in particular with Malayalam as a case study.

Indic scripts have abundant conjunct characters (basic characters combined to form a distinct shape). The same characters may join ‘horizontally’ (e.g. ത്സ/thsa) or ‘vertically/stacked’ (e.g. സ്ത/stha); and the Malayalam script in particular has plenty of stacked conjuncts even in comparison with other Indic scripts. This peculiarity also makes the glyph design of fonts challenging — balancing aesthetics, legibility/readability and leading/line spacing. Specifically, following the usual x-height/cap-height/ascender/descender metrics used in Latin fonts puts a lot of constraints on the design of stacked conjuncts. We propose to break away from these conventional metrics and adopt different proportions for the above- and below-base glyphs (even if they are the same character, e.g. സ in the double conjunct സ്സ), still conforming to the aesthetics of the script while managing legibility and leading.

Fig. 1: Malayalam stacked conjuncts beyond conventional Latin font metrics.

Details of this study, argument and proposal can be found in the slides of the presentation available at the program details as well as the recorded talk now available on TUG YouTube channel.

TUG2020 presentation.

The conference paper, edited by Barbara Beeton and Karl Berry, will be published in the next issue of the TUGboat journal.

by Rajeesh at October 26, 2020 10:21 AM

September 20, 2020

Rajeesh K Nambiar

Okular 20.08 — redesigned annotation tools

Last year I wrote about some enhancements made to Okular’s annotation tool, and in one of those posts Simone Gaiarin commented that he was working on redesigning the annotation toolbar altogether. I was quite interested and had also been thinking of ‘modernizing’ the tool — only, I had no idea how much work it would be.

The existing annotation tool works, but it had some quirks and many advanced options which were documented pretty well in the Handbook but not obvious to a casual user. For instance, if the user wanted to highlight some part of the text, she selected (single-clicked) the highlighter tool and applied it to a block of text. When another part of the text was to be highlighted, you’d expect the highlighter tool to apply directly; but it didn’t ‘stick’ — the tool was unselected after highlighting the first block of text. There is an easy way to make an annotation tool ‘stick’: instead of a single click to select the tool, simply double-click, and it persists. Another instance is the ‘Strikeout’ annotation, which is not displayed by default but can be added to the tools list.

Simone, with lots of input, testing and reviews from David Hurka, Nate Graham, Albert Astals Cid et al., has pulled off a magnificent rewrite of Okular’s annotation toolbar. To get an idea of the amount of work that went into this, see this Phabricator task and this Invent code review. The result of many months of hard work is truly modern, easy-to-explore-and-use annotation support. I am not aware of any other libre PDF reader with such good annotation features.

Annotation toolbar in Okular 20.08.

Starting from the left, default tools are: Highlight (brush icon), Underline (straight line) and Squiggle (wobbly line), Strike out, Insert text (Typewriter), Inline note, Popup note, Freehand drawing and Shapes (arrows, lines, rectangles etc.). The line thickness, colour, opacity and font of the tools can be customized easily from the drawer. Oh, and the selected annotation tool ‘sticks’ by default (see the ‘pin’ icon at the right end of toolbar).

When upgrading to Okular 20.08 from a previous version, it preserves the customized annotation tools created by the user and makes them available under ‘Quick annotations’; these can be quickly applied using Alt+n (Alt+1, Alt+2 etc.) shortcuts. It did reset my custom shortcut keys for navigation (I use the Vim keys gg to go to the first page and G to go to the last page), which had to be added back manually.

Custom tools (Quick annotations) can be applied with short cuts.

Here is the new toolbar in action.

by Rajeesh at September 20, 2020 08:27 AM

May 19, 2020

Rajeesh K Nambiar

Complex text shaping fixed in Konsole 20.08

Konsole was one of the few terminal emulators with proper complex text shaping support. Unfortunately, complex text (including Malayalam) shaping was broken around KDE Applications release 18.08 (see upstream bug 401094 for details).

Broken Malayalam text shaping in Konsole 20.04

Mariusz Glebocki fixed the code in January this year, and I tested it to work correctly. There’s a minor issue of glyphs with deep vertical components being cut off (notice the rendering of “സ്കൂ”), but otherwise the shaping and rendering are good. The patches are merged upstream and will be part of the KDE Applications Bundle 20.08.

Proper Malayalam text shaping in Konsole 20.04 with shaping fixes.

If you don’t want to wait that long, I have made a 20.04 release with the fixes on top, available for Fedora 31 & 32 in this COPR.

by Rajeesh at May 19, 2020 09:44 AM

February 17, 2020

Sreenadh T C

Eulogy to my best friend from childhood

This probably is a very late eulogy. It also means it took me this long to find the nerve to put together words without breaking down or losing my composure.

So here goes the story of two little friends who “are” brothers (from two close families) for a lifetime.

Photo by sudip paul from Pexels

I was almost a year old when he arrived (Hari Krishnan, referred to as Kichus from here on).

We both grew up sharing toys, getting new dresses together for Onam, buying crackers together for Vishu, fighting for penalties and 6s (and of course making amends the very next day).

I took for granted that this kid is going to be with me forever, see me graduate high school, class 10th, class 12th, see me become an Engineer. But destiny as we call it had other plans.

Kichu was diagnosed with Blood Cancer at the age of 16.

Little did we know that, we started losing each other way before he even turned 16.

Let me tell you how I remained helpless while everything around me pulled me down into a rabbit hole.

I was off for school that morning, to attend my last half-yearly exam. Something was off that day from the time I woke up. My mom was acting weird and she was in a hurry to push me off for school. Given this was my exams, I felt this urge was justified. So I walk away from the gate and I could see my mom peeking through the kitchen window, making sure I was not being stopped by anyone to tell me what had happened. I somehow reach the bus-stop, and the bus that comes on time every day was nowhere to be seen.

I could see my mom outside my house now, with her neck extended, looking out for me, checking if I safely got into the bus. At this point I knew something wasn’t right. We all knew Kichu wasn’t gonna make it, coz I saw him a month before this day.

He had his fair share of chemo done by then, and had lost all his hair. It was hard for me to face him and look him in the eyes that were searching for a bit of hope, coz he knew it all the way.

How did he know you may ask.

Coz he had seen his elder sister take the same path to death when he was probably 6.

Every time I saw him, he was holding on to that smile making sure his mom never saw him suffer the pain he had within. Kichu was strong and he asked me to be strong alongside him and keep my shit together.
He wanted to do so much thing, and he had very little time.
We started swimming lessons, we went for painting class, he got himself a gaming PC and we played NFS all day or till he was tired.
He couldn’t play any more for his heart was weak, but he watched me score goals. Even when he was cheering from the sidelines, I was hoping for that one day when I could celebrate another goal with him.

So the bus finally arrived and I am off for school. Mom is relieved for the time being.

I write my exam, thinking about what had happened in the morning. I walk back home in the afternoon, and I open the front door.

I could see my mom had cried the whole day, and her eyes were so red and dry. I could see my grandma numb and looking at me with those helpless eyes.

Mom finally said: “ശ്രീ, നമ്മടെ കിച്ചു പോയെടാ / Kichu is no longer with us”

I don’t remember anything but just one answer. I asked mom if she was hiding this from me in the morning.


I felt so much anger and pain, I wanted to smash the front window glass. I went straight to my room upstairs, shut the door from behind, grabbed a pillow, and bit it like an angry dog and screamed for a long time as far as I can remember.

I had to be strong.

How can I be strong, when I hear the friend I thought I had for a lifetime, had gone away for ever.

How can I be strong when the last image of him I had in my head was of the kid who hoped to live a healthy funny life.

How can I be strong when I could not even say a final goodbye. I couldn’t even see his body for one last time.

Days, weeks, and months pass by. I wanted to accept the reality, but till this very day, I wake up on most days empty and have all these thoughts about how we grew up as brothers.

Chemistry paper was out after valuation, and I still remember the then chemistry teacher asking me in-front of the whole class about what was wrong with me. She wasn’t expecting me to do this poorly in the exam for she knew my mom who also happens to teach the same subject.

As much as I wanted to shout to the whole class that I had just lost my best friend, I kept quiet with my head down. I felt so much pain that day that I tore up the answer paper and threw it away on my way back home. I don’t think any of my classmates knew about the whole Kichu scene.

I was scared to talk about it and have only told this to a close friend of mine, once. I am still scared and it hurts hell to write this draft, which I don’t know if I would be able to publish.

Almost 10 years have gone by, and when I look back at my childhood, at least I can still picture the little kid with a bright smile who always had my back.

This is a Eulogy for you buddy:

You have shown me the courage to fight with hope. You are my brother, I miss you very badly and I’ll always carry you with me. I wish you could see me now. I wasn’t ready for you to go yet, but you left me with no other choice. Growing up into adulthood without you was hard, and I’m still finding it hard to believe it’s been 10 years since you left.

I feel so proud and honored to have shared the kind of brotherhood and love we had for each other for 16 years, but how I wish I could get more of those.

I know you tried hard and I understand why you had to give up. I know you faked a lot of smiles towards the end, but I know you did it for a reason. If there is something life has taught me from all that you went through, it’s that “there are some people who always find a reason to make others happy, even when they know that they are dying”. I don’t really believe in afterlife and stuff. All the people who know me should now know why I gave up on the concept of God, for God wasn’t there when I needed him. I don’t trust someone who doesn’t show up when you need them to. So God, for me, died with Kichu.

Love you, my brother.


NB. This post is for remembering my friend and also to help me let go of some of the lingering pain and heaviness of heart. This post doesn’t really tell half the pain I still have, and I could never write something that does. For people who have been in my shoes or are currently in them, please find strength by holding on to the good memories of the ones you lost.

by Sreenadh T C at February 17, 2020 09:14 PM

January 24, 2020

Rajeesh K Nambiar

Odoo in a root-less container

The main workstation, now running Fedora 31 and devoid of any trace of python2, meant I had to either spin up a virtual machine (which I happily did in the past using qemu and kvm [no libvirt or GNOME Boxes]) or get my hands dirty with containers this time, to develop on Odoo [1] version 10, which depends on python2. Faced with the challenge^Wopportunity, I started to learn to use containers.

I had never tried docker, even though I am familiar with the technology and at times wanted hands-on experience with it. Fast forward: podman and buildah came along with the possibility to run root-less containers, and they’re available in Fedora.


Install and setup podman, optionally buildah. Consult documentation at Red Hat developer blog [2] posts [3].

$ su -c "dnf install -y podman buildah"
#Make sure your user is present in subuid and subgid
$ su -c "usermod --add-subuids 10000-75535 $(whoami); \
  usermod --add-subgids 10000-75535 $(whoami)"
#Log out and log back in for the changes to take effect for the normal user.

Setup and run postgresql using podman. The documentation [4] on docker hub and at Red Hat [5] will help, also Dan Walsh’s post [6]. You’d want persistent storage for database and the application.

#Create a persistent storage location for DB
$ su -c "mkdir -p /var/container/pgsql10/data"
#Make sure to give ownership of the directory to the 'postgres' user
#as seen from the _container_. Inside the container, 'postgres' has
#id '26', which maps to id '10025' on the host.
$ su -c "chown -R 10025:10025 !$"  # directory created in previous step
#As normal user
$ podman run -d --name pg10 -e POSTGRESQL_USER=odoo \
  -e POSTGRESQL_PASSWORD=odoopassword -e POSTGRESQL_ADMIN_PASSWORD=postgrespassword \
  -e POSTGRESQL_DATABASE=postgres -p 9432:5432 \
  -v /var/container/pgsql10/data:/var/lib/pgsql/data \
  -m=1g rhscl/postgresql-10-rhel7
#Check logs
$ podman logs -f pg10
#Connect the database and grant privileges to 'odoo' user
$ psql -U postgres -h <host-ip> -p 9432 -d postgres

These steps warrant some comments. To set up persistent storage for the database, create a directory and give ownership of it to the corresponding user in the container (refer to [6] for details). The user id for chown should be the host-mapped id of the user within the container: for example, if the id of the postgres user inside the container is 26, that same user will usually appear as id 10025 on the host.
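To see this mapping on a particular system, the following commands can help (a sketch; the exact ranges depend on the entries added to /etc/subuid and /etc/subgid earlier):

```shell
# Show the subordinate uid/gid ranges allotted to the current user
grep "$(whoami)" /etc/subuid /etc/subgid

# Enter the root-less user namespace and print its uid map:
# container uid 0 maps to your own uid, and container uids 1..65536
# map into the subordinate range, so container uid 26 ('postgres')
# appears as host uid 10025 when the range starts at 10000.
podman unshare cat /proc/self/uid_map
```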

Next you can pull the postgresql docker image and run it. The environment variables given with the -e option are passed into the container. The -p option creates a port mapping between host and container. The -v option provides a volume (persistent storage) mapping between a host directory and a container directory. The -m option provides the memory restriction required for postgres to auto-tune. If everything goes well, a container named pg10 is created and runs as a daemon process. Check the status using podman ps -a or the logs using podman logs -f pg10.

We are running postgresql with user odoo, and this user should be able to create databases. Log in to the database as the superuser postgres with the password specified in POSTGRESQL_ADMIN_PASSWORD, connecting to the IP address of the host machine (localhost doesn’t work) on host port 9432. Then grant the CREATEDB privilege to the odoo user.
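A sketch of that grant step, using the port mapping and credentials from above (the host IP is whatever address your machine has; localhost does not work here):

```shell
# Allow the 'odoo' role to create databases; CREATEDB is a role
# attribute in PostgreSQL, granted via ALTER USER / ALTER ROLE.
psql -U postgres -h <host-ip> -p 9432 -d postgres \
  -c 'ALTER USER odoo WITH CREATEDB;'
```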

Once postgresql is running successfully, let us create another container to run odoo. We also want the odoo container to use the database server running in the pg10 container, connected using the same POSTGRESQL_USER. There are a couple of ways to connect to another container — one is using host port mapping and the other is using a pod; see [7] for details. I chose the first option. Before running the container, create volume mappings for the configuration and addons directories.

$ mkdir -p $HOME/odoo10_conf
$ cat > $HOME/odoo10_conf/odoo.conf << EOF 
; podman postgresql communication using port mapping
db_host =
db_port = 9432
db_user = odoo
db_password = odoopassword
addons_path = /mnt/extra-addons
data_dir = /var/lib/odoo
EOF
#Create and run odoo container
$ podman run -d -v $HOME/odoo10_conf:/etc/odoo \
  -v /opt/odoo/addons/odoo10:/mnt/extra-addons -p 9010:8069 \
  --name odoo10  odoo:10

We need to be able to control and pass the Odoo configuration from the host system. Create a directory, place the configuration file in it, and map it to /etc/odoo/odoo.conf in the container. Similarly, you would do addons development on your host machine, so map the addons directory, which the container expects at /mnt/extra-addons. The host port 9010 is mapped to container port 8069 used by Odoo.

That’s it.

Connect to Odoo by going to localhost:9010 and build your next application.

Oh — you can stop the container using podman stop pg10 and start using podman start odoo10 etc.

Update (15-Feb-2020)

What if you would like to run both these containers in a pod to provide easier network access between them? This might be desired for various reasons: for example, the IP address of the host machine may change, and yet you want to access the database from the Odoo container without adjusting the IP in db_host.

The solution is to put both database and application (Odoo) containers in a single “pod”. A new pod can be created using podman pod create --infra -p hostport:containerport <podname> and while creating containers using podman run, pass the <podname> as podman run --pod <podname> .... It is important to specify all the ports you need to access from the host while creating this pod — it is not possible to add port mappings afterwards. Since I’d only need to access Odoo from the host, it would suffice to specify the port mapping -p 7069:8069. In short:

$ podman pod create --infra -p 7069:8069 --name odb
$ podman run -d --pod odb --name pg10 -e ...
$ # Make following changes to odoo.conf file
# db_host = localhost
# db_port = 5432
$ podman run -d --pod odb --name odoo10 ...


  1. Odoo
  2. Red Hat developer blog, Introduction to podman
  3. Red Hat developer blog, Podman and buildah for docker users
  4. Docker hub, PostgreSQL 10 on CentOS 7
  5. Red Hat documentation, Software collections docker images — PostgreSQL
  6. Dan Walsh, Does root-less podman make sense?
  7. Red Hat, Configuring container networking with podman

by Rajeesh at January 24, 2020 12:29 PM

January 01, 2020

Balasankar C

FOSS contributions in 2019


I have been interested in the concept of Freedom - both in the technical and social ecosystems - for almost a decade now. Even though I am not a hardcore contributor or anything, I have been involved in it for a few years now - as an enthusiast, a contributor, a mentor, and above all an evangelist. Since 2019 is coming to an end, I thought I would note down what all I did last year as a FOSS person.


My job at GitLab is that of a Distribution Engineer. In simple terms, I have to deal with anything that a user/customer may use to install or deploy GitLab. My team maintains the omnibus-gitlab packages for various OSs, docker image, AWS AMIs and Marketplace listings, Cloud Native docker images, Helm charts for Kubernetes, etc.

My job description is essentially the above mentioned tasks only, and as part of my day job I don’t usually have to write any backend Rails/Go code. However, I also find GitLab a good open source project and have been contributing a few features to it over the year. A few main reasons I started doing this are

  1. An opportunity to learn more Rails. GitLab is a pretty good project to do that, from an engineering perspective.
  2. Most of the features I implemented are the ones I wanted from GitLab, the product. The rest are technically simpler issues with less complexity (related to the point above, regarding getting better at Rails).
  3. I know the never-ending dilemma our Product team goes through to always maintain the balance of CE v/s EE features in every release, and to prioritize appropriate issues from a mountain of backlog in each milestone. In my mind, it is easier for both them and me if I just implement something rather than ask them to schedule it to be done by a backend team, so that I can enjoy the feature. To note, most of the issues I tackled already had the Accepting Merge Requests label on them, which meant Product was in agreement that the feature was worth having, but there were issues with more priority to be tackled first.

So, here are the features/enhancements I implemented in GitLab, as an interested contributor in the selfish interest of improving my Rails understanding and to get features that I wanted without much waiting:

  1. Add number of repositories to usage ping data
  2. Provide an API endpoint to get GPG signature of a commit
  3. Add ability to set project path and name when forking a project via API
  4. Add predefined CI variable to provide GitLab FQDN
  5. Ensure changelog filenames have less than 99 characters
  6. Support notifications to be fired for protected branches also
  7. Set X-GitLab-NotificationReason header in emails that are sent due to explicit subscription to an issue/MR
  8. Truncate recommended branch name to a sane length
  9. Support passing CI variables as push options
  10. Add option to configure branches for which emails should be sent on push

Swathanthra Malayalam Computing

I have been a volunteer at Swathanthra Malayalam Computing for almost 8 years now. Most of my contributions are towards various localization efforts that SMC coordinates. Last year, my major contributions were improving our fonts build process to help various packaging efforts (well, selfish reason - I wanted my life as the maintainer of Debian packages to be easier), implementing CI based workflows for various projects and helping in evangelism.

  1. Ensuring all our fonts build with Python3
  2. Ensuring all our fonts have proper appstream metadata files
  3. Add an FAQ page to Malayalam Speech Corpus
  4. Add release workflow using CI for Magisk font module


I have been a Debian contributor for almost 8 years, became a Debian Maintainer 3 years after my first stint with Debian, and have been a Debian Developer for 2 years. My activities as a Debian contributor this year are:

  1. Continuing maintenance of fonts-smc-* and hyphen-indic packages.
  2. Packaging of the gopass password manager. This has been going on very slowly.
  3. Reviewing and sponsoring various Ruby and Go packages.
  4. Help GitLab packaging efforts, both as a Debian Developer and a GitLab employee.

Other FOSS projects

In addition to the main projects I am a part of, I contributed to a few FOSS projects last year, either out of personal interest or as part of my job. They are:

  1. Calamares - I initiated and spearheaded the localization of Calamares installer to Malayalam language. It reached 100% translated status within a month.
  2. Chef
    1. Fix openSUSE Leap and SLES detection in Chef Ohai 14
    2. Make runit service’s control commands configurable in Chef Runit cookbook
  3. Mozilla - Being one of the Managers for Malayalam Localization team of Mozilla, I helped coordinate localizations of various projects, interact with Mozilla staff for the community in clarifying their concerns, getting new projects added for localization etc.


I also gave a few talks during 2019 on various FOSS topics that I am interested/knowledgeable in. The list and details can be found on the talks page.

Overall, I think 2019 was a good year for the FOSS person in me. Next year, I plan to be more active in Debian because from the above list I think that is where I didn’t contribute as much as I wanted.

January 01, 2020 06:00 AM

November 25, 2019

Rajeesh K Nambiar

Public statement by Rachana Institute of Typography on the copyright/credit issue of SMC and RIT fonts

About us

We — KH Hussain, CV Radhakrishnan, PK Ashok Kumar and KV Rajeesh — are the copyright holders of TN Joy font. Many of us have worked on free/libre/open source software for years in our spare time and contributed code, design, fonts, documentation, localization and financial support to various free software projects. Our contributions can be found easily on the Web and elsewhere.

A copyright/‘credit’ issue

Immediately after the font ‘TN Joy’ was released to the public by Rachana Institute of Typography (RIT) on 2-Oct-2019, Santhosh Thottingal raised a question in a forum with a large enough number of participants to qualify as a public discussion:

@rajeeshknambiar there are lot (sic) of contributions from me, Kavya in the build scripts, tests, and feature files in Consider giving credit.

On 14-Oct-2019, Santhosh followed up again.

@rajeeshknambiar did not reply to my request for giving credits in their font.


Ask hussain sir to give credits for font testing and building framework. Crediting anivar alone is not enough.

To which Rajeesh responded on 19-Oct-2019, agreeing to discuss the issue with all the copyright holders of TN Joy:

“Noted. I will try to take it up for discussion and let you know.”

On 29-Oct-2019, Santhosh again followed up:

അങ്ങനെ എഴുതുകയും ചെയ്യുകയും ചെയ്ത ഫോണ്ടിന്റെ കാര്യങ്ങൾക്ക് ക്രെഡിറ്റ് കിട്ടിയില്ലെന്നാണ് പറയുന്നത് അനിവർ:) sundar, and janayugam fonts. ഇതിൽ രാജാജിയുടെ ഹെൽപ്പൊന്നും വേണ്ട. even @rajeeshknambiar can just fix it

[Translation: What Anivar is saying is that credit was not given for the fonts that were thus written and made :) sundar, and janayugam fonts. Rajaji’s help is not needed for this. even @rajeeshknambiar can just fix it]


During the first week of Nov-2019, at the summit organized by Kerala Media Academy, all the copyright holders of the TN Joy font met and discussed the issue raised by Santhosh.

As free software developers and users, it was never our intention to violate copyright or appropriate credit of another free software developer’s work. Not only in intention: we have strived to achieve that in all our projects through our acts. So this accusation came as a surprise to us, and we decided to take a deeper look at how this issue originated and what the root cause is, in order to address it properly.

We did a detailed analysis and documented the following details.

Technical background

  1. A Malayalam Unicode font has two essential parts — the Glyphs (അക്ഷരരൂപങ്ങൾ) and the OpenType shaping lookup rules. Unlike Latin fonts, both of these are necessary for proper shaping. The final TTF/OTF/WOFF2 file contains both the Glyphs and the OpenType shaping rules, making a Malayalam Unicode font usable software. Without either, such software is not usable.
Figure 1: Malayalam text without shaping (left) and with correct shaping (right).
  2. The Malayalam OpenType features (GSUB and GPOS ‘lookup rules’) used in the font ‘TN Joy’ developed by Rachana Institute of Typography (RIT) are adapted from those of the font ‘Sundar’, which in turn are adapted from the feature file of ‘Rachana’.
  3. Many have contributed over the years to the development of the feature file of Rachana, including the original author Hussain KH, Suresh P, Santhosh Thottingal, Rajeesh KV, Kavya Manohar et al. [1].
  4. Hussain KH invented and implemented the glyph naming conventions (‘k1’ for ‘ക’, ‘xx’ for ‘്’ etc., instead of names like ‘uni0D15’), which made font feature coding highly comprehensible and much easier to maintain. This naming scheme is followed by all fonts maintained by Swathanthra Malayalam Computing (SMC) and RIT. This was also the naming scheme in the fonts developed by ATPS, and when it was pointed out that those fonts were derived from SMC’s, the immediate change made was renaming the glyphs and lookup rules [2, 3, 4].
  5. Rajeesh is the original author of the lookup rules of SMC’s fonts for the revised ‘mlm2’ OpenType specification for Malayalam, and made it possible to support both the ‘mlym’ and ‘mlm2’ specifications in a single font. This resulted in a single font working well both with Windows XP and Pango/Qt4 era applications and with Uniscribe and HarfBuzz era applications [5].
  6. In 2015, Santhosh split the comprehensive lookup rules from the Fontforge SFD file of Rachana into a separate feature file, but the copyright statements were not preserved [6]. It is our opinion that removing copyright statements is a violation of copyright law (hence a crime) and immoral in the free software world. This is also the root cause of the missing copyright in the OpenType lookup rules and build script of the fonts in question.
  7. These same lookup rules are used and adapted by subsequent fonts developed by SMC and RIT, such as Chilanka, Manjari, Sundar, Gayathri, TN Joy etc. Rajeesh did not claim credit or copyright when Manjari or Gayathri was released.

RIT’s statement

With this background,

  1. Fonts developed, maintained and distributed by both SMC and RIT, specifically their OpenType lookup rules + fontforge based build tool + test cases, are at the heart of this issue. This was caused by the change introduced by Santhosh in [6].
  2. The copyright holders of the TN Joy font were made aware of such a ‘credit’ issue — the definition of which Santhosh has not clarified, and which in RIT’s understanding is sufficient and limited to ‘copyright’. Thanks for bringing to light such a potential legal and moral risk that affects the users and organizations using these fonts.
  3. RIT would like to acknowledge the copyright of Santhosh Thottingal and Kavya Manohar for the development of ‘Sundar’ and ‘TN Joy’ in the areas of the lookup rules, the ‘build script’ and the comprehensive ‘test file’. RIT is willing to add the missing copyright notices to these files;

and RIT asked Santhosh to consider:

  1. Preserve the copyright of the original authors of the ‘lookup rules’ and Naming convention (notation for Glyphs) in all these fonts. The copyright and license statement should read:

“Copyright: Digitized data copyright (c) 2004–2005 Rachana Akshara Vedi (Chitrajakumar R, Hussain KH, Gangadharan N, Vijayakumaran Nair, Subash Kuraiakose), (c) 2006–2016 Hussain KH, Suresh P, Santhosh Thottingal, Rajeesh K Nambiar, Swathanthra Malayalam Computing ( This file is licensed under OFL  1.1.”

  2. The Fontforge based ‘build script’ added by Santhosh, used to generate TTF/OTF/WOFF/WOFF2 files, is adapted from that of the Amiri font by Khaled Hosny [7] without preserving copyright or attribution. RIT requests crediting the original author[s] of this tool. It is our opinion that removing copyright statements from free software code is illegal and immoral. It is also hypocritical when a person who asserts his own credit does this to another well-known and respected free software developer.
  3. Test cases in the ‘test file’ were contributed by various contributors; RIT requests adding the attribution of such contributors to the extent possible (Kavya Manohar, Santhosh Thottingal, Rajeesh KV). Santhosh has responded to this request saying “test cases were mainly prepared by Kavya and no need to have attribution”, but RIT firmly believes the copyright statements of the contributors must be added.
  4. The original author of the ‘mlym.sty’ file [8] to typeset Unicode Malayalam using XeTeX is Suresh P; it was enhanced by Rajeesh KV with inputs from Hussain KH. Due to frequent requests on how to typeset Unicode Malayalam, in 2013 Rajeesh wrote a wiki page [9] with basic details, which was later extended by other developers with instructions to install and set up XeTeX packages. This wiki article was later extended by Santhosh by adding material from Wikipedia. The article was then copied and published on Santhosh’s blog [10] without attributing the authors, and [10] is frequently given by Santhosh as the first response to the general public asking for documentation on how to typeset Malayalam using XeTeX. It is shockingly hypocritical that plagiarism is practised by a well known free software developer who asserts his own credit without any respect to others’ copyright or credit. RIT would like Santhosh to either: (a) redact [10] and redirect to [9] instead, or (b) credit the original authors in [10].

RIT  stopped the analysis and investigation of Santhosh’s claim at this point, as we have identified the root cause of missing copyrights and these are the important topics directly affecting RIT  developers.


RIT tried to resolve the issue in private discussion with Santhosh Thottingal, but unfortunately it did not succeed. Santhosh has not agreed to reinstate the copyright statements of the original authors. Santhosh did not respond to many of the pointed questions we raised and deflected others. Santhosh also refused to clarify what he means by ‘credit’ despite repeated pointed questions. Santhosh withdrew his claim for credit in one of the emails; it is possible that he could change his mind at any time and the issue could resurface. This shrouds the fonts by SMC and RIT in Fear, Uncertainty and Doubt (which corporate proprietary companies successfully used against free software for years) and puts all the individual users, organizations and developers using these fonts under legal risk and moral ambiguity.


  1. RIT has added proper copyright statements to all the software used in building its fonts, viz. ‘Sundar’ and ‘TN Joy’ [11,12].
  2. RIT believes that our primary responsibility is towards the individual and institutional users of our fonts and the developers depending on our tools; they should be able to use our fonts and tools without any legal risk or moral ambiguity. RIT, to the best of its knowledge, has fulfilled that responsibility and strives to keep doing so.
  3. RIT also understands that, as with any issue in the free software world, the community will be divided, and that is a painful thing. RIT requests the community to carefully consider all the facts before making a choice.

This will be the final public statement of RIT on the copyright issue raised by Santhosh Thottingal.


  • KH Hussain
  • CV Radhakrishnan
  • PK Ashok Kumar
  • KV Rajeesh


  1. Rachana font commit history, URL…
  2. Kathir font licensing issue (1), 2014, URL…
  3. Kathir font licensing issue (2), 2014, URL…
  4. ATPS  fonts licensing issue, 2015, URL…
  5. Introducing and integrating ‘mlm2’ OpenType shaping rules, 2013, URL…
  6. Split Glyphs and OpenType shaping rules, 2015, URL…
  7. Amiri font build tool, URL…
  8. XeTEX Malayalam style file for ‘Logbook of an Observer’, 2012, URL…
  9. Typesetting Malayalam using XeTEX, SMC  Wiki page history, 2013, URL…
  10. 2014,…
  11. Sundar font, reinstate copyright and license statements, 2019, URL…
  12. TN Joy font, reinstate copyright and license statements, 2019, URL…

Profile of the signatories

  • KH Hussain
    Library and information scientist by training and profession; font designer and developer of several fonts including Rachana, Meera, Meera Inimai, TN Joy, RSugathan, Janayugom, Keraleeyam, Uroob, etc.; free software activist who released all his fonts under the Open Font License. Played an important role in the migration of the Janayugom daily to free software based production technologies.
  • CV Radhakrishnan
    Free software activist and TeX programmer, one of the founders of the Free Software Foundation of India and Indian TeX Users Group. Organized two annual meetings of the TeX Users Group in Trivandrum in 2002 and 2011. Wrote several packages (libraries) in LaTeX and released under free license (LPPL) at Comprehensive TeX Archive Network (CTAN).
  • PK Ashok Kumar
    Typesetter by profession and training, has four decades of extensive experience in typesetting right from the age of metal typefaces through digitized typesetting including TeX and LaTeX. Free content activist and principal tester for fonts developed by RIT, played a major role in the migration of production of Janayugom daily using free software.
  • KV Rajeesh
    Free software developer and user. Fedora project developer since 2008 and KDE  developer since 2011. Font maintainer and language computing contributor to Swathanthra Malayalam Computing since 2008. Member of Indic testing team for HarfBuzz. Google Summer of Code mentor. Contributes to various free software projects including Qt, GNOME, VLC, Odoo, Fontforge, SILE, ConTeXt, Okular, etc.

by Rajeesh at November 25, 2019 04:48 AM

November 15, 2019

Rajeesh K Nambiar

On data encoding and complex text shaping

As part of the historical move of the Janayugom newspaper migrating to a completely libre software based workflow, Kerala Media Academy organized a summit on self-reliant publishing on 31-Oct-2019. I was invited to speak about Malayalam Unicode fonts.

The summit was inaugurated by Fahad Al-Saidi of Scribus fame, who was instrumental in implementing complex text layout (CTL). Prior to the talks, I got to meet the team who made it possible to switch Janayugom’s entire publishing process to a free software platform — Kubuntu based ThengOS, Scribus for page layout, Inkscape for vector graphics, GIMP for raster graphics, CMYK color profiling for print, new Malayalam Unicode fonts with traditional orthography, etc. It was impressive to see that the entire production fleet was transformed, the team was trained, and the newspaper is printed every day without delay.

I also met Fahad later and was pleasantly surprised to realize that he already knew me from my open source contributions. We had a productive discussion about Scribus.

My talk was on data encoding and text shaping in Unicode Malayalam. The publishing industry in Malayalam is by and large still trapped in ASCII, which causes numerous issues now, and many are still not aware of Unicode and its advantages. I tried to address that in my presentation with examples; the preface of my talk thus filled half of the session, while the second half focused on font shaping. Many in the industry seem to be aware that Unicode and traditional Malayalam orthography can be used in computers now; but many in academia still have not realized it, as was evident from the talk of the moderator of the discussion, who is the director of the school of Indian languages. There was a lively discussion with the audience in the Q&A session. After the talk, a number of people gave me feedback and requested that the slides be made available.

Slides on data encoding and complex text shaping are available under CC-BY-NC license here.

by Rajeesh at November 15, 2019 07:37 AM

October 07, 2019

Rajeesh K Nambiar

WatchData PROXKey digital signature using emSigner in Fedora 30

TL;DR — go to Howto section to make WatchData PROXKey work with emSigner in GNU/Linux system.


Hardware tokens with digital signatures are used for filing various financial documents on Govt of India portals. The major tokens supported by eMudhra are WatchData ProxKey, ePass 2003, Aladdin, Safenet, TrustKey, etc. Many of these hardware tokens come (in CDROM image mode) with drivers and utilities to manage the signatures, unfortunately only for the Windows platform.

Failed attempts

Sometime in 2017, I tried to make these tokens work for signing GST returns under GNU/Linux using the de-facto pcsc tool. I got a WatchData PROXKey, which doesn’t work out-of-the-box with pcsc. Digging further brings up this report, and it seems the driver is a spinoff of the upstream (LGPL licensed) one, but no source code is made available, so there is no hope of using these hardware tokens with upstream tools. The only option is depending on vendor provided drivers, unfortunately. There are some instructions by a retailer to get this working under Ubuntu.

Once you download and install that driver (ProxKey_Redhat.rpm), it does a few things — installs a separate pcsc daemon named pcscd_wd, the driver CCID bundles and certain supporting binaries/libraries. A drawback of such custom driver implementations is that different drivers clash with each other, as each one provides a different pcscd_wd binary and their installation scripts silently overwrite existing files! To avoid any clashes with this pcscd_wd daemon, disable the standard pcscd daemon with systemctl stop pcscd.service.

Plug in the USB hardware token and, to your dismay, observe that it spews the following error messages in journalctl:

Oct 06 09:16:51 athena pcscd_wd[2408]: ifdhandler.c:134:IFDHCreateChannelByName() failed
Oct 06 09:16:51 athena pcscd_wd[2408]: readerfactory.c:1043:RFInitializeReader() Open Port 0x200001 Failed (usb:163c/0417:libhal:/org/freedesktop/Hal/devices/usb_device_163c_0417_serialnotneeded_if1)
Oct 06 09:16:51 athena pcscd_wd[2408]: readerfactory.c:335:RFAddReader() WD CCID UTL init failed.

This prompted me to try different drivers, mostly from the eMudhra repository — including eMudhra Watchdata, Trust Key and even ePass (there were no *New* drivers at this time) — but none of them seemed to work. Many references were to Ubuntu, so I tried various Ubuntu versions from 14.04 to 18.10; they didn’t yield a different result either. At this point, I put the endeavour on the back burner.

A renewed interest

Around September 2019, KITE announced that they would start supporting government officials using digital signatures under GNU/Linux, as most Kerala government offices now run on libre software. KITE have made the necessary drivers, signing tools and manuals available.

I tried this on a (recommended) Ubuntu 18.04 system, but the pcscd_wd errors persisted and the NICDSign tool couldn’t recognize the PROXKey digital token. However, their installation methods gave me a better idea of how these drivers are supposed to work with the signing middleware.

A couple of days ago, with a better understanding of how these drivers work, I figured they should also work on a Fedora 30 system (which is my main OS), and set out for another attempt.

How to

  1. Remove all the wdtokentool-proxkey, wdtokentool-trustkey, wdtokentool-eMudhra, ProxKey_Redhat and suchlike drivers, if installed, to start from a clean slate.
  2. Download WatchData ProxKey (Linux) *New* driver from eMudhra.
  3. Unzip and install the wdtokentool-ProxKey-1.1.1 RPM/DEB package. Note that this package installs the TRUSTKEY driver (under /usr/lib/WatchData/TRUSTKEY/lib/), not the ProxKey driver (under /usr/lib/WatchData/ProxKey/lib/) — and it seems the ProxKey token only works with the TRUSTKEY driver!
  4. Start pcscd_wd.service by systemctl start pcscd_wd.service (only if not auto-started)
  5. Plug in your PROXKey token. (journalctl -f would still show the error message, but — lesson learned — this error can be safely ignored!)
  6. Download emsigner from GST website and unzip it into your ~/Documents or another directory (say ~/Documents/emSigner).
  7. Ensure port 1585 is open in the firewall settings: firewall-cmd --add-port=1585/tcp --zone=FedoraWorkstation (adjust the firewall zone if necessary). Repeat the same command with --permanent added to make the change persist across reboots.
  8. Go to ~/Documents/emSigner in shell and run ./ (make sure to chmod 0755, or double-click on this script from a file browser).
  9. Login to GST portal and try to file your return with DSC.
  10. If you get the error Failed to establish connection to the server. Kindly restart the Emsigner when trying to sign, open another tab in the browser window, go to https://localhost:1585, and then try signing again.
  11. You should be prompted for the digital signature PIN and signing should succeed.

It is possible to use this digital token in Firefox too (via Preferences → Privacy & Security → Certificates → Security Devices → Load, with the Module filename set to the driver library under /usr/lib/WatchData/TRUSTKEY/lib/), as long as the key is plugged in. Here again, you can safely ignore the error message unable to load the module.
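If OpenSC is installed, pkcs11-tool offers a quick way to check that the token is reachable through the vendor module (a sketch; the exact .so filename under the TRUSTKEY lib directory depends on the driver version, so <module>.so below is a placeholder):

```shell
# List the slots exposed by the WatchData PKCS#11 module; the plugged-in
# token should show up with its label. Replace <module>.so with the
# library installed under /usr/lib/WatchData/TRUSTKEY/lib/.
pkcs11-tool --module /usr/lib/WatchData/TRUSTKEY/lib/<module>.so --list-slots
```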

by Rajeesh at October 07, 2019 08:19 AM

August 21, 2019

Sreenadh T C

How Dockup tracks online status of remote agents using Phoenix Presence

“Is our agent online? Let’s ask Phoenix Presence!”

Dockup is a tool that helps engineering teams spin up on-demand environments. We have a UI that talks to several agents which are installed on remote servers. The UI sends commands to these agents over WebSocket connections using Phoenix channels.

What if agent went down?

The commands to spin up and manage environments are sent over to agents running on remote servers. For this to work, we need to make sure our agents are online and ready to receive the commands. In order to do this, we need to keep track of agents assigned to our users and also show the agent’s online status in the UI.

Our first implementation

In the UI, we show whether the agent for the organization is online and ready to receive commands. The first implementation was an old school synchronous “ping” to the agent behind a Retry module, where we ask the agent for a “pong” and relay that back to our UI. This has a problem.

Suppose the agent went down due to some unexpected error on the remote server, or suppose the organization has not yet been configured with a proper agent. If the user now opens the page that shows the agent status, the request would block until the “ping” to the agent times out. Unfortunately this takes some time and makes for terrible UX. No user wants to stare at an empty loading screen, only to find out that their agent is actually down!

Using Phoenix Presence

Phoenix Presence is a feature which allows you to register process information on a topic and replicate it transparently across a cluster. It’s a combination of both a server-side and client-side library which makes it simple to implement. A simple use-case would be showing which users are currently online in an application.

If we can track the online statuses of users in chat rooms, it should be possible to track the online statuses of our agents too. That’s exactly what we did, and here’s a step-by-step guide on how to do it.

Firstly, we need to add Phoenix Presence under the App supervision tree as explained in the official docs.

We then configure our agents channel to use Presence to track the agents that connect to Dockup. After this, we can simply ask Presence whether an agent is present in our app!

Let’s add a function that tells the user if their agent is up, and call that function in our template to render the status in the UI.
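The shape of this setup can be sketched in Python purely for illustration (Dockup’s actual implementation uses Phoenix Presence in Elixir; the class and method names below are invented for the sketch):

```python
# Toy stand-in for Phoenix Presence: agents register themselves when
# their socket connects, so "is it online?" becomes a dictionary lookup
# instead of a synchronous ping with a timeout.
class PresenceRegistry:
    def __init__(self):
        self._online = {}

    def track(self, agent_id, meta=None):
        # Called when an agent joins its channel topic.
        self._online[agent_id] = meta or {}

    def untrack(self, agent_id):
        # Called when the agent's connection drops.
        self._online.pop(agent_id, None)

    def online(self, agent_id):
        # O(1) lookup: no network round-trip, no long timeout.
        return agent_id in self._online


registry = PresenceRegistry()
registry.track("agent-1", {"org": "acme"})
print(registry.online("agent-1"))  # True
print(registry.online("agent-2"))  # False
```

The key design point is that the agent pushes its presence once, on connect, and the UI only ever reads local state.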

Why this is great

By the time the user actually visits the settings page, Presence already knows whether that specific agent has joined the topic. Since this is a very basic key-value lookup, it is going to be super quick. We no longer need to play ping-pong with the agent to know its presence!

Earlier, this page would, in the worst-case scenario, take around 30–50 seconds to render, simply because the agent was down.

[info] Received GET /settings
[info] Sent 200 response in 40255.44ms

Using Presence, the response time came down to around 40–50ms, or even lower.

[info] Received GET /settings
[info] Sent 200 response in 17.86ms
[info] Received GET /settings
[info] Sent 200 response in 49.65ms
[info] Received GET /settings
[info] Sent 200 response in 65.62ms
[info] Received GET /settings
[info] Sent 200 response in 31.12ms
[info] Received GET /settings
[info] Sent 200 response in 51.73ms

The most interesting thing about solving this issue for us was that the PR that went in was tiny (just +40/-1), but the impact it had was significant, something we’ve seen time and again with Elixir!

How Dockup tracks online status of remote agents using Phoenix Presence was originally published in Dockup on Medium, where people are continuing the conversation by highlighting and responding to this story.

by Sreenadh T C at August 21, 2019 06:42 PM

July 31, 2019

Sreenadh T C

How to run E2E tests on on-demand environments

On-demand environments for running end-to-end tests

Be more confident about your code changes by adding end-to-end tests that run for each deployment you create on Dockup.

End-to-end testing is a technique used to verify the correctness of an application’s behavior when it works in integration with all its dependencies.
Running end-to-end tests has become increasingly complicated over time as companies embrace service-oriented architecture and monoliths turn into microservices.

In this blog post, we’ll see how to use Dockup to automatically spin up on-demand environments to run end-to-end tests for every pull request.

We will be explaining this based on Cypress, but you can follow similar steps to configure your favorite E2E tool to run alongside Dockup deployments.

For ease of understanding, let’s use a simple VueJS app that implements a TodoMVC.

We will keep this source under a common project folder, say todomvc-app/, and also create another folder, say todomvc-app/e2e/, where we will write our tests. Cypress test specs are kept under a sub-directory called "cypress", and we will have a cypress.json file inside our e2e folder. See more on how Cypress tests are written in their docs.

Once you have your test specs ready, we need to add a Dockerfile and the whole directory structure would look something like this:

|---- src/
| |---- index.html
| |---- app.js
|---- e2e/
| |----cypress/
| | |---- fixtures/
| | |---- integration/
| | | |---- add_todo_spec.js
| | | |---- mark_todo_spec.js
| | |---- plugins/
| | |---- support/
| |
| |---- cypress.json
| |---- Dockerfile
|---- package.json
|---- Dockerfile

Since the test cases will run against several deployments, we will keep the baseUrl config value for Cypress set to an initial dummy URL, and then override it with environment variables. This is documented by Cypress here.
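For illustration, a minimal cypress.json could carry such a placeholder baseUrl (the URL here is a made-up dummy):

```json
{
  "baseUrl": "http://localhost:8080"
}
```

At run time, Cypress gives environment variables prefixed with CYPRESS_ precedence over the config file, so each deployment can point the tests at its own public endpoint.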

Container for the actual todo-app

Assuming that you have already added the container for the actual app while creating a Dockup Blueprint for your todomvc-app (as shown in the figure above), we will add a new container holding the image source details. If you are new to Dockup Blueprints, head over here to read more about creating one.

Take care with the Dockerfile path here, as this is the one that resides inside our e2e folder.

We will also have to add the CYPRESS_BASE_URL env variable for Cypress to receive a public endpoint for the deployment. This can be done using the Environment Variable Substitution feature (refer to DOCKUP_PORT_ENDPOINT_ ) in Dockup.

The Cypress container would exit with the overall number of failed tests as its exit code.

Container form for e2e

That is all you need to do to have a working cypress end-to-end test running alongside each of the deployments.

Since containers inside a Dockup deployment spin up when they are ready and not sequentially, you will need a shell script that waits for the UI endpoint to be live before you start to run tests. The script can simply fail when the endpoint is not live, upon which the Dockup container would restart.
set -x
set -e
echo "Checking if the endpoint for testing is ready..."
response=$(curl --write-out "%{http_code}" --silent --output /dev/null "$CYPRESS_BASE_URL")
if [[ $response != 200 ]]; then
  # Endpoint not live yet: exit so Dockup restarts the container
  exit 1
fi
# Endpoint is up: run the test suite
cypress run

Cypress has its own Docker images configured to run on several CI tools, which you can use on Dockup as well without many changes. All you have to do is put the cypress/ folder at the same level as the Dockerfile, since their images look for it in the root directory; as soon as the container spins up, the cypress run command would run. However, this is not recommended on Dockup for the reason mentioned above. Instead, have the script take care of running the cypress command once the endpoint is up.

Now you can go ahead and deploy this blueprint and have it run the E2E tests for you. Your containers should spin up and the e2e tests should start running. While you wait for them to complete, you can also take a look at the logs.

Image builds are ready

A successful deployment with your e2e tests passed would look something like this.

E2E test has passed, and hence the container has a success check

Checks also send updates to GitHub if the deployments are triggered by PRs.

An example of how Dockup sends updates to GitHub PRs

In the case of Cypress, it exits with a non-zero exit code when there are failures, and thus the container also fails, indicating that there are failed test cases.

Not using Dockup already? Click here to start for free.

How to run E2E tests on on-demand environments was originally published in Dockup on Medium, where people are continuing the conversation by highlighting and responding to this story.

by Sreenadh T C at July 31, 2019 08:43 AM

How to create on-demand environments for WordPress

Spin up on-demand staging environment to test out your custom plugins and themes for WordPress

Setting up a staging environment and maintaining it for every WordPress theme or plugin project can be very daunting. Quite often, when website developers work on design implementations or content creators try to add articles to their website, they seek approval from team members more often than one can imagine. This can be a tedious amount of work and also time-consuming if the team is limited by the availability of staging environments.

Dockup helps you mitigate this problem by providing on-demand staging environments for your WordPress site. Your changes are automatically made available across your team as and when you update them, while letting you concentrate on the design or the article.

In this article, we’ll see how you can use Dockup to automatically spin up on-demand copies of your WordPress site.

How can Dockup help?

Dockup can automatically spin up a staging environment every time you open a PR for your WordPress site. This way, you will have an environment ready at your disposal, with all the changes from the PR. All you have to do is push code, test your changes to the theme, and perhaps show it to your team.

You can also deploy your branches manually on Dockup. This can be super useful when a non-tech team member wants to test how the site looks for any commit or branch. Let’s see how to set this up.

Setting up Dockup

Assuming you have prior knowledge of how and where themes and plugins fit in WordPress, let me quickly set up a sample plugin for the sake of this documentation. If you don’t have a current project, follow along with the next step to get a simple source code base we can deploy on Dockup to test things.

This WordPress plugin will append a line to each post that we create. This can be used to add a thanks or goodbye message at the end of each post.

Setup the project folder as below:

  • We have a root project folder called “ending-line-wp-plugin”
  • A file with the plugin code, “ending-line-wp-plugin/ending-line/ending-line.php”
  • Dockerfile for building images for Dockup
Project structure

Copy paste the following code in the ending-line.php file:

Now let’s dockerise this. It’s pretty straightforward: all you have to do is copy the plugin folder into a plugins folder inside “wp-content”.

FROM wordpress:php7.3-apache
WORKDIR /var/www/html
COPY ending-line/ wp-content/plugins/ending-line/

Note that we are not using any scripts to start a MySQL server before running the actual WordPress server. Dockup lets us spin up both containers separately, and we will connect them using environment variables.

Let’s create a Dockup Blueprint for this project and see how we can stage our plugin project.

We will need two containers here:

  1. MariaDB
  2. The GitHub source from which we will build the image.

Make sure that you have set the env variables for the database container. Refer to the ones below for a start.

Container for database

Note that we are using GitHub as the source here, but you can also use a pre-built Docker image of the plugin or theme you are developing. Also, double-check the env variables your project might need; for this one, we need three. To know which env variables are supported by WordPress, head over here.

Some important env variables are:


In case you are wondering what the DOCKUP_SERVICE is, please read more about Environment Variable Substitution available in Dockup.

Container for WordPress server

Looks good, let’s try deploying this environment.

Successfully deployed WordPress

Now, you have a staging environment ready for you to test the plugin you just wrote!

Don’t forget to activate your plugin from the WordPress admin panel. If you are following this sample app, remember to add the text to be appended to every post via Settings > Ending Line Plugin.

Want Dockup on-demand environments for your WordPress sites? Click here to get started.

How to create on-demand environments for WordPress was originally published in Dockup on Medium, where people are continuing the conversation by highlighting and responding to this story.

by Sreenadh T C at July 31, 2019 08:41 AM

How to create on-demand environments for Jekyll Blogs

How to create on-demand environments for Jekyll sites

“See how your Jekyll site and articles look like before you publish them.”

Want to see how your site will turn out before you publish? Just open a PR on your repo and Dockup will spin up a live site for you!

Assuming that you have a Jekyll blog in place, let’s see how we can dockerise it and create a Dockup Blueprint. Here’s the Jekyll site we’ll use: Minima.

We’ll add a couple of files to the root directory:

  1. Dockerfile to build the docker image of our site.
  2. nginx.conf to serve the static site using Nginx.

Create Blueprint

Now that we have a Dockerfile added to the source, let’s create a Dockup Blueprint.

Make sure you have configured your GitHub account with Dockup. If you haven’t done so yet, head over to Dockup Settings.
Container for Jekyll

And that’s all! Wasn’t that easy?

Now every time you open a pull request, Dockup will stage that branch and give you a new deployment as shown below:

Deployed successfully

You can follow similar steps to create on-demand environments for other static site generators, say for e.g. Hugo.

Would you like to test it out on your blog? Click here to get started.

How to create on-demand environments for Jekyll Blogs was originally published in Dockup on Medium, where people are continuing the conversation by highlighting and responding to this story.

by Sreenadh T C at July 31, 2019 08:40 AM

July 03, 2019

Rajeesh K Nambiar

SMC Malayalam fonts updated in Fedora 30

The Fedora package smc-fonts has a set of Malayalam fonts (AnjaliOldLipi, Kalyani, Meera, Rachana, RaghuMalayalamSans and Suruma) maintained by SMC. We used to package all these fonts from a single zip file. These fonts were last updated in 2014 for Fedora, leaving them at version 6.1.

Since then, a lot of improvements were made to these fonts: glyph additions and corrections, OpenType layout changes, a fontTools-based build system, separate source repositories for each font, etc. There were lengthy discussions on the release management of the fonts, which was partially the reason the fonts were not updated in Fedora. Once it was agreed to use a separate version number for each font, and a continuous build+release system was put in place at GitLab, we could ensure that fonts downloaded from the SMC website were always the latest version.

To reflect the updates in Fedora, we had to decide how to handle the monolithic source package at version 6.1 versus the new individual releases (e.g. Rachana is at version 7.0.1 as of this writing). In a discussion with Pravin Satpute, we agreed to obsolete the existing fonts package and give each font its own package.

Vishal Vijayaraghavan kindly stepped up and did the heavy lifting of creating the new packages, and we now even build the ttf font file from the source. See RHBZ#1648825 for details.

With all that in place, in Fedora 30 all these fonts are at their latest versions; for instance, see the Rachana package. The old package smc-fonts no longer exists; instead, each individual package, such as smc-rachana-fonts or smc-meera-fonts, can be installed. Our users will now be able to enjoy the improvements made over the years, including updated Unicode coverage, new glyphs, improved existing glyphs, and much better OpenType shaping.

by Rajeesh at July 03, 2019 07:02 AM

June 08, 2019

Santhosh Thottingal

Markov chain for Malayalam

I have been trying to generate a Markov chain for Malayalam content. A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event (Wikipedia). For natural language, it represents a probabilistic model of words: the probability that one word can come after another. Such a model can be prepared by feeding a large amount of text to a system that learns the probability of each word transition.

For Malayalam, I used the SMC Malayalam corpora. I used the markovchain python library as the tool to build the model. I had to do some bug fixes and customization to get it working for Malayalam, but the developer of the library was generous to merge my pull requests.

A Markov chain is not interesting to a general user since, as such, it does not provide any direct benefit. But it is a foundation for many applications like speech recognition, handwriting recognition, automatic text generation etc. Mainly, it is used as a tool that predicts the next word given a prompt word. So I built a web application and a web API that predict the next Malayalam word.
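The core idea can be sketched in a few lines of plain Python (this is an illustrative toy, not the markovchain library used for the actual model; the tiny corpus below is made up):

```python
import random
from collections import defaultdict

def build_chain(words):
    """First-order Markov chain: map each word to the words that follow it."""
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def predict_next(chain, word):
    """Candidate next words for a prompt word, most frequent first."""
    candidates = chain.get(word, [])
    return sorted(set(candidates), key=candidates.count, reverse=True)

def generate(chain, start, length=10):
    """Generate text by repeatedly sampling a next word."""
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

# A made-up two-sentence corpus; the real model was trained on the SMC corpora.
corpus = "നാളെ മഴ പെയ്യും നാളെ വെയിൽ ഉണ്ടാകും".split()
chain = build_chain(corpus)
print(predict_next(chain, "നാളെ"))  # the two words seen after നാളെ in the corpus
```

Prediction is just a table lookup over observed transitions, which is why the web API can answer instantly.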

The application and its source code are available online.

Another interesting application is automatic text generation. Some sample texts generated:

നാളെ വീണ്ടും ഉപേക്ഷിയ്ക്കപ്പെടുകതന്നെയായിരിക്കില്ലേ അവരുടെ ഉല്പന്നങ്ങളെക്കുറിച്ചുള്ള വിശദാംശങ്ങൾ പ്രസിദ്ധീകരിക്കാനായി കമ്പനിയെ സമ്മതിപ്പിക്കാൻ നമുക്കാകുന്നുണ്ടു്. ചിലപ്പോൾ സമൂഹവുമായി സഹകരിച്ചും നമ്മുടെ കമ്പ്യൂട്ടറുകളിലും ഡിജിറ്റൽ.’

നാളെ കാലത്തു കുറച്ചു വെള്ളം കോരിയൊഴിച്ചു കുടം നിറച്ചു കഞ്ഞിയുണ്ടായി. അതു വരുമ്പോൾ കുട്ടികളുടെ ഒന്നും ചേർന്നു തന്നെ. വരികളോർമ്മിച്ച് ആസ്വദിച്ച് കൊണ്ടുള്ള കഞ്ഞിയോ പുഴുക്കോ ആയിരുന്നു വലിയൊരു കൂട്ടത്തിന്റെ വിലാപത്തിന്റെ സംഗീതികതന്നെയായി മാറുകയാണ് ഈ.

ഇനിയും വല്ലതും തിന്നുകയും ചെയ്തതിന്റെശേഷം കൊട്ടാരംവക ആനയെ അതുവരെ ഇവിടെ വന്നു തുടങ്ങി. എങ്കിലും നിന്റെ കമ്പ്യൂട്ടറിനെ അനുഗ്രഹിക്കുന്നു കുട്ടീ. വികസിപ്പിക്കാവുന്ന ടെക്സ്റ്റ് ബുക്കായി ഉപയോഗിക്കാവുന്ന തരത്തിൽ അതിനെപറ്റി സങ്കൽപ്പിക്കാൻ സാധ്യമല്ല. അതുകൊണ്ട്, ആസന്നമായിരിക്കുന്നുവെന്ന് എല്ലാ ജില്ലകളിലും കളക്ടർമാരുടെ നേതൃത്വത്തിൽ നടത്തിയ നിക്ഷേപവുമാണു്, അല്ലാതെ മറ്റൊരു സുഖം. സ്കൂൾജീവിതം കഴിഞ്ഞപ്പോൾ അതു് നിങ്ങളുടെ പിന്തുണ ഉറപ്പാക്കാനായിട്ടില്ല. ഇതു് നിസ്സാരകാര്യമല്ല. ഭാരതി എയർടെൽ സീറോയുടെ ഭാഗമായി ബിയർ പാർലറിന്റെ ചുമരിടിച്ചു തകർത്താണ് സർജെന്റ് ഐസക്കും, കൂട്ടാളികളും ചെക്കോസ്ലോവാക്യൻ മണ്ണിൽ പിറക്കണമെന്ന് ജനിക്കാനിരിക്കുന്ന പെൺകുഞ്ഞ് ഭീതി കലർന്ന വാർത്തകൾ വിശ്വസിച്ച് ഈ വിവരങ്ങൾ നിങ്ങൾക്കു നശിപ്പിക്കാം, തോല്പിക്കാനാവില്ല എന്ന ചോദ്യം 3: പ്രോലിറ്റേറിയന്മാർ എക്കാലത്തുമുണ്ടായിരുന്നില്ലെന്നല്ലേ ഇതിന്റെ ഏഴിരട്ടിയുണ്ടെന്നോർക്കുക. ചുറ്റോടുചുറ്റുമുള്ള കടലോരങ്ങളുടെ ചാരുത മുതൽ അവസാനംവരെ അവന്റെ കചക്കയറിന്മേൽ കെട്ടി ചിലപ്പോഴൊക്കെ നമ്മളെ ഭയപ്പെടുത്തുന്നതാണെന്നു് നാം ഭൂമിക്കുചുറ്റും മണിക്കൂറിൽ 1600 – ൽ കൂടുതൽ.

Have fun!

by Santhosh Thottingal at June 08, 2019 04:23 AM

Updated web interface for mlmorph

The web interface of the Malayalam morphology analyser (mlmorph) has been updated. The new web application is written in Vue.js using the Vuetify UI framework; the backend is Flask. Source code is available in the project repository. The updated interface covers:

Morphology analysis
Morphology generator
Named entity recognition
Number spellout

by Santhosh Thottingal at June 08, 2019 03:48 AM

June 05, 2019

Santhosh Thottingal

Chilanka version 1.400 released

A new version of the Chilanka typeface is available now. Version 1.400 is available for download from SMC’s font download and preview site.

For users, there are not many changes, but the source and build system got a major upgrade.

  • Source code updated to UFO format from the FontForge SFD format. This allows working with modern font editors.
  • Use cubic beziers for the master design and generate OTF along with TTF. The original drawings for Chilanka used cubic beziers.
  • fontmake is used for building the TTF and OTF, similar to the latest font projects by SMC.
  • fontbakery is used for tests; all tests pass now.
  • Added a few important missing Latin glyphs, reported by fontbakery.

by Santhosh Thottingal at June 05, 2019 02:58 PM

May 26, 2019

Santhosh Thottingal

Lexicon Curation for Mlmorph

One of the key components of mlmorph is its lexicon. The lexicon contains the root words categorized as nouns, verbs, adjectives, adverbs etc. These are combined with morphological rules to generate the vocabulary of Malayalam. I collected an initial lexicon of about 100,000 words from various sources such as Wikipedia, CLDR and many targeted web crawls. One problem with such collected words is that they often contain spelling mistakes. Secondly, classifying these words is not possible without the tedious task of a person going through each and every word.

So, I was thinking of a solution which consists of:

  • A crawler or multiple targeted crawlers looking for candidate words. For example, I can write a script to go through the entire Malayalam Wikipedia dump and look for words that are most probably nouns, inflected nouns, or words derived from nouns. This is possible with some kind of pattern matching. For example, a word ending with -യുടെ, -ിന്റെ, -ിൽ, or -യെ is most probably a noun (we don’t know whether it is a pronoun, place name or person name; that requires human curation). A word ending with -ക്കുക, -ച്ചു, -ട്ട്, or -ിരുന്നു is most probably a verb.
  • A database and an application that help a person quickly approve the prediction, remove misspelled words, edit words to correct mistakes, and choose the correct POS tag.
  • A set of scripts that take the curated words into the mlmorph lexicon. Also, as mlmorph learns new root words, the database will require a refresh, since mlmorph starts recognizing words related to the newly learned words.
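A first cut of such suffix-based pattern matching can be sketched with plain Python string operations (the suffix lists come from the post; the tag labels and function names are my own):

```python
import re

# Suffix patterns from the post: words ending in these are most probably
# (inflected) nouns or verbs. The "?" marks them as guesses needing curation.
NOUN_SUFFIXES = ("യുടെ", "ിന്റെ", "ിൽ", "യെ")
VERB_SUFFIXES = ("ക്കുക", "ച്ചു", "ട്ട്", "ിരുന്നു")

def guess_pos(word):
    """Guess a candidate POS tag for a word based on its suffix."""
    if word.endswith(NOUN_SUFFIXES):
        return "noun?"
    if word.endswith(VERB_SUFFIXES):
        return "verb?"
    return None

def candidates(text):
    """Yield (word, guessed tag) pairs from a text, for human curation."""
    for word in re.findall(r"[\u0d00-\u0d7f]+", text):  # Malayalam block
        tag = guess_pos(word)
        if tag:
            yield word, tag

print(guess_pos("മരത്തിന്റെ"))  # noun?
print(guess_pos("പഠിച്ചു"))  # verb?
```

Anything this guesser flags still goes through the curator application; the tag is only a prediction to speed up human review.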

Over the last few days, I have been working to implement this system. Interestingly, I was also learning and practicing Vue.js. I was amazed by the productivity it gives for quickly building clean and fast modern web applications, so I decided to use it for my curator application. For the database, I found Firebase with Vuefire to be a perfect fit. Vuetify helped with quick UI styling. Without writing any specific code for database management, I got the whole system working.

Screenshot of the lexicon curator application. The two words shown here are misspelled, so I can quickly remove them. The prediction for both words is Verb.

The mobile-friendly application allows me to do this otherwise tedious task as a leisure activity. After adding some user authentication, I will make it public and share it with some friends. The source code for the curator and the mlmorph scripts is available in their repositories.

by Santhosh Thottingal at May 26, 2019 09:18 AM

May 19, 2019

Rajeesh K Nambiar

Okular: another improvement to annotation

Continuing with the addition of line terminating style for the Straight Line annotation tool, I have added the ability to select the line start style also. The required code changes are committed today.

Line annotation with circled start and closed arrow ending.

Currently it is supported only for PDF documents (and poppler version ≥ 0.72), but that will change soon — thanks to another change by Tobias Deiminger under review to extend the functionality for other documents supported by Okular.

by Rajeesh at May 19, 2019 01:40 PM

May 07, 2019

Rajeesh K Nambiar

Okular: improved PDF annotation tool

Okular, KDE’s document viewer has very good support for annotating/reviewing/commenting documents. Okular supports a wide variety of annotation tools out-of-the-box (enable the ‘Review’ tool [F6] and see for yourself) and even more can be configured (such as the ‘Strikeout’ tool) — right click on the annotation tool bar and click ‘Configure Annotations’.

One of the annotation tools my colleagues and I frequently wanted to use is a line with an arrow to mark an indent. Many PDF annotation programs have this tool, but Okular was lacking it.

So a couple of weeks ago I started looking into the source code of Okular and Poppler (the PDF library used by Okular) and noticed that both of them already have support for the ‘Line Ending Style’ of the ‘Straight Line’ annotation tool (internally called the TermStyle). Skimming through the source code for a few hours and adding a few hooks, I could add an option to configure the line ending style for the ‘Straight Line’ annotation tool. Many line ending styles are provided out of the box, such as open and closed arrows, circle, diamond etc.

An option to the ‘Straight Line’ tool configuration is added to choose the line ending style:

New ‘Line Ending Style’ for the ‘Straight Line’ annotation tool.

Here’s the review tool with ‘Open Arrow’ ending in action:

‘Arrow’ annotation tool in action.

Once happy with the outcome, I’ve created a review request to upstream the improvement. A number of helpful people reviewed and commented. One of the suggestions was to add icon/shape of the line ending style in the configuration options so that users can quickly preview what the shape will look like without having to try each one. The first attempt to implement this feature was by adding Unicode symbols (instead of a SVG or internally drawn graphics) and it looked okay. Here’s a screen shot:

‘Line End’ with symbols preview.

But it had various issues: some symbols are not available in Unicode, and localizing these strings without some context would be difficult. So, for now, it was decided to drop the symbols.

For now, this feature works only on PDF documents. The patch is committed today and will be available in the next version of Okular.

by Rajeesh at May 07, 2019 01:40 PM

March 28, 2019

Rajeesh K Nambiar

Meera font updated to fix issue with InDesign

I have worked to make sure that the fonts maintained at SMC work with both the mlym (Pango/Qt4/Windows XP era) OpenType specification and the mlm2 (Harfbuzz/Windows Vista+ era) specification in the same font. These have also been tested in the past (around 2016) with Adobe applications, which use their own shaping engine (neither Harfbuzz nor Uniscribe, though the internet tells me there are plans to use Harfbuzz in the future).

Some time ago, I received reports that typesetting articles in Adobe InDesign using Meera font has some serious issues with Chandrakkala/Halant positioning in combination with conjuncts.

When the Samvruthokaram/Chandrakkala ് (U+0D4D) follows a consonant or conjunct, it should be placed at the ‘right shoulder’ of the consonant/conjunct. But in InDesign (CC 2019), it appears incorrectly on the ‘left shoulder’. This incorrect rendering is highlighted in the figure below.

Wrong chandrakkala position before consonant in InDesign.

The correct rendering should have the Chandrakkala appearing at the right of the consonant, as in the figure below.

Correct chandrakkala position after consonant.

This issue manifested only in Meera, not in other fonts like Rachana or Uroob. Digging deeper, I found that only Meera has a Mark-to-Base GPOS positioning lookup rule for the Chandrakkala. This was done (instead of adjusting the left bearing of the Chandrakkala glyph) to make it appear correctly on the ‘right shoulder’ of the consonant. Unfortunately, InDesign seems to get this wrong.

To verify, shaping involving the Dot Reph ൎ (U+0D4E), which is also engineered as a Mark-to-Base GPOS lookup, was checked. And sure enough, InDesign gets it wrong as well.

Dot Reph position (InDesign on left, Harfbuzz/Uniscribe on right)

The issue has been worked around by removing the GPOS lookup rules for the Chandrakkala, and the fix was tested with Harfbuzz, Uniscribe and InDesign. I have tagged a new version 7.0.2 of Meera, which is available for download from the SMC website. As this issue affected many users of InDesign, hopefully this update brings them much joy and lets them use Meera again. Windows/InDesign users should make sure that previous versions of the font are uninstalled before installing this version.

by Rajeesh at March 28, 2019 08:38 AM

March 14, 2019

Rajeesh K Nambiar

New package in Fedora: python-xlsxwriter

XlsxWriter is a Python module for creating files in xlsx (MS Excel 2007+) format. It is used by certain python modules some of our customers needed (such as OCA report_xlsx module).

This module is available on PyPI but was not packaged for Fedora. I decided to maintain it in Fedora and created a package review request, which was helpfully reviewed by Robert-André Mauchin.

The package, providing python3 compatible module, is available for Fedora 28 onwards.

by Rajeesh at March 14, 2019 09:42 AM

March 10, 2019

Santhosh Thottingal

LibreOffice Malayalam spellchecker using mlmorph

A few months back, I wrote about the spellchecker based on the Malayalam morphology analyser. I have also been trying to integrate that spellchecker with LibreOffice. It is not yet ready for any serious usage, but if you are curious and would like to help with its further development, please read on.

Blog post on the spellchecker approach and plan

Current status

The LibreOffice spellchecker for Malayalam is available in its source repository. You can get the code using git clone or download the master version as a zip file.

You need LibreOffice 4.1 or later. Latest version is recommended. In the source code directory, run make install to install the extension.

Open libreoffice writer, add some Malayalam text. Make sure to select the language as Malayalam by choosing it from the menu or bottom status bar. You should see the spelling check in action… if everything goes as expected 😉

LibreOffice language settings, You can see mlmorph listed.
Spellchecker in action- libreoffice writer.

How can you help?

Theoretically, the extension should work on non-Linux platforms as well, but I have not tested it. The extension needs python3 and python-hfst on the operating system, but python-hfst is not available for 64-bit Python installations on Windows. If you test and get the extension working, please add documentation, and let me know if anything is missing that would make the installation easier.

As the mlmorph project gets wider support for Malayalam vocabulary, the quality of the spellchecker improves automatically.

by Santhosh Thottingal at March 10, 2019 10:16 AM

Malayalam Named Entity Recognition using morphology analyser

Named Entity Recognition, the task of identifying and classifying real-world objects such as persons, places, and organizations in a given text, is a well-known NLP problem. For Malayalam, several research papers have been published on this topic, but none is functional or reproducible research.

The morphological characteristics of Malayalam have always been a challenge in solving this problem. When named entities appear in an inflected or agglutinated complex word, the first step is to analyse such words and arrive at the root words.

As the Malayalam morphology analyser is progressing well, I attempted to build a first version of Malayalam NER on top of it. Since mlmorph gives POS tagging and analysis, there is not much left to do in NER: we just need to look for tags corresponding to proper nouns and report them.
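The approach can be sketched as follows; the analyse callable and the "<proper-noun>" tag below are stand-ins for illustration (mlmorph’s real API and tag names may differ):

```python
def find_named_entities(words, analyse, entity_tag="<proper-noun>"):
    """Report words whose morphological analysis carries a proper-noun tag.

    `analyse` is any callable mapping a word to a list of analysis strings.
    The tag string is an assumption; mlmorph's actual tag names may differ.
    """
    entities = []
    for word in words:
        if any(entity_tag in analysis for analysis in analyse(word)):
            entities.append(word)
    return entities

# A toy analyser standing in for mlmorph, for illustration only.
fake_analyses = {
    "കേരളത്തിൽ": ["കേരളം<proper-noun><locative>"],
    "മഴ": ["മഴ<noun>"],
}
analyse = lambda word: fake_analyses.get(word, [])
print(find_named_entities(["കേരളത്തിൽ", "മഴ"], analyse))  # ['കേരളത്തിൽ']
```

Note how the analyser does the heavy lifting: even the inflected form കേരളത്തിൽ is reported, because its analysis reduces it to the proper-noun root.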

You can try the system online.

Malayalam named entity recognition example using mlmorph

Known Limitations

  • The recognition is limited by the current lexicon of mlmorph. To recognize out-of-lexicon entities, a POS guesser would be needed. But this is a general problem not limited to NER: a morphology analyser should also have a POS guesser. In other words, as mlmorph improves, this system also improves automatically.
  • Currently the recognition is at the word level. But sometimes entities are written as multiple consecutive words. To resolve that, we will need to write a wrapper on top of the word-level detection system.
  • The current system is a JavaScript wrapper on top of the mlmorph analyse API. I think NER deserves its own API.

by Santhosh Thottingal at March 10, 2019 09:25 AM

March 02, 2019

Santhosh Thottingal

Scribus gets hyphenation support for 11 Indian languages

Support for hyphenation in 11 Indian languages is now available in Scribus, the desktop publishing system. Two years back I wrote about how Malayalam hyphenation support was added to Scribus. Later, I filed a bug to add support for more Indian languages. That is now fixed.

Scribus has a new way to download and use these hyphenation dictionaries. You can now use this feature right away in your installed scribus. The languages with hyphenation support are the following:

  • Malayalam
  • Tamil
  • Telugu
  • Kannada
  • Marathi
  • Hindi
  • Bengali
  • Gujarati
  • Assamese
  • Panjabi
  • Odia

How to Add Hyphenation Dictionary?

Navigate to Windows -> Resources in the menu bar. You will see a window like the one below. You may want to press “Update Available List”. Then you can see all the languages with hyphenation dictionaries available. Select the download checkbox and press the “Download” button. The dictionary will be installed on your system.

Scribus Resource Manager

How to use?

  • Start a new document. Add text frames and content. You may need narrow columns to create word-breaking contexts.
  • Select the text and set an appropriate (Unicode) font for your language. Make sure the language is set to your preferred language.
  • In the hyphenation properties, set the hyphenation character as blank; otherwise visible hyphens will appear.
  • Set the text justified.
  • From the menu, Extras -> Hyphenate text. Done.
Hyphenated two column content

How does it work?

The resource manager based hyphenation dictionaries are an easier way to add new hyphenation support. Earlier, these files had to be added to the Scribus source code. Now these files are defined on the Scribus server, which maps each language to the file to download. So if I update the dictionaries in the GitHub repo, a new installation will pick up the updated files.

Reporting issues

If you find any issues in the hyphenation rules, you can file at

by Santhosh Thottingal at March 02, 2019 04:49 AM

February 21, 2019

Santhosh Thottingal

Gayathri – New Malayalam typeface

Swathanthra Malayalam Computing is proud to announce Gayathri – a new typeface for Malayalam. Gayathri is designed by Binoy Dominic, with OpenType engineering by Kavya Manohar and project coordination by Santhosh Thottingal.

This typeface was financially supported by Kerala Bhasha Institute, a Kerala government agency under the cultural department. This is the first time SMC has worked with the Kerala Government to produce a new Malayalam typeface.

Gayathri is a display typeface, available in Regular, Bold and Thin variants. It is licensed under the Open Font License. The source code, including the SVG drawings of each glyph, is available in the repository. Gayathri is available for download from

Gayathri has soft, rounded terminals, strokes of varying thickness and good horizontal packing. Gayathri has a large glyph set supporting Malayalam traditional orthography, which is the new trend in contemporary Malayalam. With a total of 1124 glyphs, Gayathri also has basic Latin coverage. All Malayalam characters defined up to Unicode 11 are supported.

There are not many Malayalam typefaces designed for titles and large displays. We hope Gayathri will fill that gap.

This is also the first typeface by Binoy Dominic. He has proved his lettering skills in his profession as a graphic designer, working on branding with Malayalam content for his clients.

Binoy prepared all glyphs as SVGs, and our scripts converted them to UFO sources. TruFont was used for small edits. Important glyph information like bearings and names was defined in a YAML configuration. Build scripts generated valid UFO sources, and fontmake was used to build the OTF output. Of course, there were a lot of cycles of design fine-tuning. GitLab CI was used for running the build chain and testing. Fontbakery was used for quality assurance. The UFO Normalizer and UFO Lint tools were also part of the build system.

by Santhosh Thottingal at February 21, 2019 06:40 AM

February 08, 2019

Santhosh Thottingal

How to setup DNS over TLS using systemd-resolved

DNS over TLS is a security protocol that forces all connections with DNS servers to be made securely using TLS. This effectively keeps ISPs from seeing which websites you access.

For GNU/Linux distributions using systemd, you can set this up easily by following the steps below.

First, edit /etc/systemd/resolved.conf and change the value of DNSOverTLS as follows:


Now, configure your DNS servers. You need to use DNS servers that support DNS over TLS; Cloudflare DNS and Google DNS, for example, both support it. To configure them you can use the Network Manager graphical interface.
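For reference, the relevant part of /etc/systemd/resolved.conf would look roughly like this (a sketch; the DNS= addresses shown are Cloudflare's public resolvers and are only an example, substitute your preferred provider):

```ini
[Resolve]
# Servers that support DNS over TLS (example: Cloudflare public resolvers)
DNS=1.1.1.1 1.0.0.1
# "opportunistic" uses TLS when the server supports it; "yes" enforces it
DNSOverTLS=opportunistic
```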

Then restart the systemd-resolved using:

sudo systemctl restart systemd-resolved

You are done. To check whether the settings are correctly applied, you can try:

$ resolvectl status
       LLMNR setting: no
MulticastDNS setting: no
  DNSOverTLS setting: opportunistic

If you really want to see how the DNS resolution requests happen, you can use Wireshark and inspect port 53, the usual DNS port. You should not see any traffic on that port. Instead, if you inspect port 853, you will see the DNS over TLS requests.

by Santhosh Thottingal at February 08, 2019 05:36 AM

January 15, 2019

Santhosh Thottingal

Wikipedia turns eighteen. And four hundred thousand translations

Today is Wikipedia’s eighteenth birthday. English Wikipedia, with 5.8 million articles, and Malayalam Wikipedia, with around sixty thousand articles, continue their journey amid many limitations and challenges.

Wikipedia exists in 292 languages, but the proportion of content is not the same across them. For the last four years at the Wikimedia Foundation, my main job has been leading the technology behind the system that translates articles between languages with the help of machine translation and other tools.

Yesterday, the number of new articles added with the help of this system reached four hundred thousand.

by Santhosh Thottingal at January 15, 2019 06:57 AM

January 13, 2019

Santhosh Thottingal

Swanalekha input method now available for Windows and Mac

The Swanalekha transliteration based Malayalam input method is now available on Windows and Mac platforms. Thanks to Ramesh Kunnappully, who wrote the Keyman implementation.

I wrote this input method in 2008. In those days, SCIM was the popular input method framework for Linux. Later it was rewritten for m17n and used with either IBus or Fcitx. A few years later, this input method was made available on Android using Indic Keyboard. Last year, due to requests from Windows and Mac users, Chrome and Firefox extensions were prepared. Thanks to SIL Keyman, we have now made it available on those operating systems as well.

With this, Swanalekha Malayalam becomes an input method you can use on all operating systems and phones.

Detailed documentation and downloads are available on the Swanalekha website. Source code: A small video illustrating the installation, configuration and use on Windows 10 is given below.

Update: The keyboard is now served by Keyman from their website, and the number of supported platforms has also increased.

Download options from

by Santhosh Thottingal at January 13, 2019 04:22 AM

January 09, 2019

Rajeesh K Nambiar

Smarter tabular editing with Vim

I happen to edit tabular data in LaTeX format quite a bit. Being scientific documents, the table columns are (almost) always left-aligned, even for numbers. That warrants carefully crafted decimal and digit alignment on such columns containing only numbers.

I also happen to edit the text (almost) always in Vim, and just selecting/changing a certain column is not easily doable (like in a spreadsheet). If there are tens of rows that need manual digit/decimal alignment, it gets even more tedious. There must be another way!

Thankfully, smarter people already figured out better ways (h/t MasteringVim).

With that neat trick, it is much more palatable to look at the tabular data and edit it. Even then, though, it is not possible to search & replace only within a column using Visual Block selection. The Visual Block (^v) sets marks from the column of the first row to the column of the last row, so any :'<,'>s/.../.../g would replace any matching text in between (including in other columns).

To solve that, I’ve figured out another way. It is possible to copy the Visual Block alone and paste other content over it (though cutting it and pasting would not work as you might think). Thus, the plan is:

  • Copy the required column using Visual Block (^v + y)
  • Open a new buffer and paste the copied column there
  • Edit/search & replace to your need in that buffer, so nothing else would be unintentionally changed
  • Select the modified content as Visual Block again, copy/cut it and come back to the main buffer/file
  • Re-select the required column using Visual Block again and paste over
  • Profit!
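The steps above can be sketched as the following keystroke sequence (the substitute command in step 3 is only a hypothetical example of a column-local edit):

```vim
" 1. In the main file, block-select the column: CTRL-V, move, then y to yank
" 2. Open a scratch buffer and paste:  :new  followed by  p
" 3. Edit freely; any substitution is now confined to this buffer, e.g.:
"      :%s/foo/bar/g
" 4. Block-select the edited column again: CTRL-V, move, then y
" 5. Go back to the main buffer (CTRL-W p), re-select the original
"    column with CTRL-V, and press p to paste over it
```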

Here’s a short video of how to do so. I’d love to hear if there are better ways.

Column editing in Vim
Demo of column editing in Vim

by Rajeesh at January 09, 2019 11:44 AM

December 23, 2018

Santhosh Thottingal

Ten years of code

It has been ten years since I started engaging in free software development related to language computing. It was around 2008 that I became active in this area and started setting aside time for various projects. Here I have visualized my contributions over the last ten years, based on the code on GitHub.

Generated using for my github username santhoshtr

Each cell here is a day. On days with a green cell, I did some kind of activity: writing code, filing bug reports, reviewing others’ code and so on. The count increases as the colour goes from light green to dark green.

Like a diary, I can read the sweet and the bitter of my life in it. The long gaps seen at various points are travels, or personal breaks, good or bad. 2016 was quite bad in this respect. The gap in April 2013 marks my wedding. At one point I also did a challenge of doing something for 100 days without a break (github streak); it is visible from September 2014 onwards.

One thing I can be proud of is that as my career progresses, I am able to contribute more and more to engineering. People working in IT generally know that after the first ten years, most move from engineering-type work to management-type work. I did not choose that path.

After I joined the language technology team at the Wikimedia Foundation in 2011, the amount of code I wrote for the public increased greatly. At the same time, I engaged in Malayalam language related work on weekends and in other free time. That is why Saturdays and Sundays also appear green in this graph.

Another thing to be proud of is that in my profession, whenever I had to write code for the public, I was able to do it as free software. That is, I have not hidden even a single line of code. Every contribution I made is laid open, with its rationale, in a way anyone can inspect, learn from and use at any time. That is free software.

If you are a Malayali, you probably use the results of at least some of this work in your daily life in some way. At the same time, several things I wrote in the early days failed to move beyond a technology experiment into useful software. But all of that naturally became lessons for later.

by Santhosh Thottingal at December 23, 2018 03:11 PM

December 19, 2018

Balasankar C

DebUtsav Kochi 2018


It has been quite some time since I wrote about anything. This time, it is DebUtsav. When it comes to full-fledged FOSS conferences, I am usually an attendee or at most a speaker. I have given some sporadic advice and suggestions to a few in the past, but that was it. This time, however, I played the role of an organizer.

DebUtsav Kochi is the second edition of Debian Utsavam, the celebration of Free Software by the Debian community. We didn’t name it MiniDebConf because we wanted the conference to be not just Debian-specific but to include general FOSS topics too. This is specifically because our target audience isn’t yet Debian-aware enough for a Debian-only event. So DebUtsav Kochi had three tracks: one for general FOSS topics, one for Debian talks and one for hands-on workshops.

As a disclaimer, the descriptions of the talks below are what I gained from my interactions with the speakers and attendees; I wasn’t able to attend as many talks as I would’ve liked because I was busy with organizing.

The event was organized by the Free Software Community of India, whom I represented, along with the Democratic Alliance for Knowledge Freedom (DAKF) and Student Developer Society (SDS). Cochin University of Science and Technology was generous enough to be our venue partner, providing us with the necessary infrastructure for conducting the event as well as accommodation for our speakers.

The event spanned two days, with a registration count of around 150 participants. Day 1 started with a keynote session by Aruna Sankaranarayanan, affiliated with OpenStreetMap. She has also been associated with the GNOME Project, Wikipedia and Wikimedia Commons, and was a lead developer of the Chennai Flood Map that was widely used during the floods that struck the city of Chennai.

Sruthi Chandran, a Debian Maintainer from Kerala, gave a brief introduction to the Debian project: its ideologies and philosophies, the people behind it, the process involved in the development of the operating system, etc. An intro to DebUtsav, how it came to be, and the planning and organization involved in conducting the event, was given by SDS members.

After these common talks, the event split into two parallel tracks: FOSS and Debian.

In the FOSS track, the first talk was by Prasanth Sugathan of the Software Freedom Law Centre, about the need for Free Software licenses and ensuring license compliance in projects. In parallel, Raju Devidas discussed the process behind becoming an official Debian Developer, what it means, and why it matters to have more and more developers from India.

After lunch, Ramaseshan S introduced the audience to Project Vidyalaya, a free software solution for educational institutions to manage and maintain their computer labs using FOSS rather than conventional proprietary solutions. Shirish Agarwal shared a general idea of the various teams in Debian and how everyone can contribute to these teams based on their interests and abilities.

Subin S introduced some nifty little tools and tricks that make the Linux desktop cool and improve the productivity of users. Vipin George talked about the possibility of using Debian as a forensic workstation, and how it can be made more efficient than its proprietary counterparts.

Ompragash V from Red Hat talked about using Ansible for automation tasks and its advantages over similar tools. Day 1 ended with Simran Dhamija talking about Apache Sqoop and how it can be used for data transformation and other related use cases.

In the afternoon session of Day 1, two workshops were conducted in parallel with the talks. The first one was by Amoghavarsha on reverse engineering, followed by an introduction to machine learning using Python by Ditty.

We also had an informal discussion with a few of the speakers and participants about the Free Software Community of India, the services it provides, how to make more people aware of those services, and how to get more maintainers for them. We also discussed the necessity of self-hosted services, onboarding users to them smoothly, and evangelizing these services as alternatives to their proprietary, privacy-abusing counterparts.

Day 2 started with a keynote session by Todd Weaver, founder and CEO of Purism, which develops privacy-focused laptops and phones. Purism also develops PureOS, a Debian derivative that consists of Free Software only, with further privacy-enhancing modifications.

On day 2, the Debian track focused on a hands-on packaging workshop by Pirate Praveen and Sruthi Chandran that covered the basic packaging workflow, the flow of packages through the various suites like Unstable, Testing and Stable, and the structure of packages. It then moved on to the actual process of packaging, guiding the participants through packaging a JavaScript module used by the GitLab package in Debian. Participants were introduced to tools like npm2deb, lintian and sbuild/pbuilder, and to the various Debian-specific files and their functions.

In the FOSS track, Biswas T shared his experience in developing a website that was heavily used during the Kerala floods for effective collaboration between authorities, volunteers and the public. It was followed by Amoghavarsha’s talk on his journey from Dinkoism to Debian. Abhijit AM of COEP talked about how Free Software may be losing against Open Source and why that may be a problem. Ashish Kurian Thomas shared some *nix tools and tricks that can be productivity boosters for GNU/Linux users. Raju and Shivani introduced Hamara Linux to the audience, along with the development process and the focus of the project.

The event ended with a panel discussion on how Debian India should organize itself to conduct more events, spread awareness about Debian and other FOSS projects, and prepare for a potential DebConf in India in the near future.

The number of registrations and the enthusiasm of the attendees give positive signs of the probability of having a proper MiniDebConf in Kerala, followed by a possible DebConf in India, for which we have bid. Thanks to all the participants and speakers for making the event a success.

Thanks to FOSSEE, Hamara Linux and GitLab for sponsoring the event and thus enabling us to actually do this. And also to all my co-organizers.

A very special thanks to Kiran S Kunjumon, who literally did 99% of the work needed for the event to happen (as you may recall, I am good at sitting in a chair and planning, not actually doing anything. :D).

Group photo

December 19, 2018 06:00 AM

November 25, 2018

Santhosh Thottingal

Malayalam morphology analyser – First release

I am happy to announce the first version of the Malayalam morphology analyser.

After two years of development, I tagged version 1.0.0

In this release

In this release, mlmorph can analyse and generate Malayalam words using the defined morpho-phonotactical rules, based on a lexicon. We have a test corpus of fifty thousand words, and 82% of the words in it are recognized by the analyser.

A Python interface is released to make using the library very easy for developers. The library is available in – Installing it is very easy:

pip install mlmorph

It avoids all the difficulties of compiling the SFST formalism and installing the required hfst and sfst packages.

For detailed Python API documentation and the command line utility, refer


There are a lot of known limitations in the current release. I plan to address them in future releases.

  • Expand the lexicon further: The current lexicon was compiled by testing various texts and adding the missing words found in them. Preparing the coverage test corpus also helped to increase the lexicon. But it still needs more improvement.
  • Many commonly used language-specific constructs consisting of multiple conjunctions and adjectives are not well covered. Some examples are മറ്റൊരു, പിന്നീട്, അതുപോലെത്തന്നെ, എന്നതിന്റെ etc.
  • Optimizing the weight calculation: As the lexicon grows, many rarely used words can become alternate parts in the agglutination of words. For example, പാലക്കാട് can have an analysis of പാല്, അക്ക്, ആട്. Even though this is grammatically correct, it should get less preference than പാലക്കാട്<proper noun>.
  • Standardization of POS tags: mlmorph has its own POS tag definitions. These tags need documentation with examples. I tried to use Universal Dependencies as much as possible, but it is not enough to cover all the tags required for Malayalam.
  • Documentation of the formalism and tutorials for developers: So far I am the only developer on the project, which I am not happy about. The learning curve for this project is too steep to attract new developers, and an above-average understanding of Malayalam grammar is a difficult requirement too. I am planning to write some tutorials to help new developers join.


The project is meaningful only when practical applications are built on top of it.

by Santhosh Thottingal at November 25, 2018 10:55 AM

October 24, 2018

Rajeesh K Nambiar

Powerline git dirty status without powerline_gitstatus

With git-prompt it is possible to display the dirty state (when a tracked file is modified) by setting the env variable GIT_PS1_SHOWDIRTYSTATE=true. Powerline can display the status of a git repository, such as the number of commits ahead/behind, the number of modified files etc., using the powerline_gitstatus module. Unfortunately, Fedora doesn’t have it packaged. I did some digging and found that there is colour highlighting for branch_dirty, and that the powerline.segments.common.vcs.branch function (which displays the current branch name) takes two parameters to modify its behaviour. Modify the shell theme /etc/xdg/powerline/themes/shell/default.json under the left segment (because only left works in the shell) as follows:
    {
        "function": "powerline.segments.common.vcs.branch",
        "args": {"ignore_statuses": ["U"], "status_colors": true},
        "priority": 20
    }
The branch will now be highlighted if a tracked file is modified (ignore_statuses = ["U"] causes untracked files to be ignored). Clean repository:
Clean repo
Once a tracked file is modified:
Dirty repo

by Rajeesh at October 24, 2018 05:22 AM

September 27, 2018

Santhosh Thottingal

Malayalam Script LGR rules for public review

The Malayalam and Tamil Root Zone Label Generation Rules for Internationalized Domain Names have been released for public comments. See the announcement from ICANN. This was drafted by the Neo-Brahmi Script Generation Panel (NBGP), of which I am also a member.

Your comments on the proposal for the Malayalam Script Label Generation Rules for the Root Zone (LGR [XML, 18 KB] and supporting documentation [PDF, 998 KB]) can be submitted at the feedback form till Nov 7 2018.

My earlier blog post on Internationalized Top Level Domain Names in Indian Languages has some detailed information about this.

by Santhosh Thottingal at September 27, 2018 11:53 AM

September 08, 2018

Santhosh Thottingal

Malayalam spellchecker – a morphology analyser based approach

My first attempt to develop a spellchecker for Malayalam was in 2007, using Hunspell and a word-list based approach. It was not successful because of the rich morphology of Malayalam. Even though I prepared a manually curated list of 150K words, it was nowhere near covering the practically infinite words of Malayalam. For languages with productive morphological processes in compounding and derivation, capable of generating dictionaries of infinite length, a morphology analysis and generation system is required. Since my efforts towards building such a morphology analyser are progressing well, I am proposing a finite state transducer based spellchecker for Malayalam. In this article, I will first analyse the characteristics of Malayalam spelling mistakes and then explain how an FST can be used to implement the solution.

What is a spellchecker?

A spellchecker is an application that tells whether a given word is spelled correctly in the language or not. If the word is not spelled correctly, the spellchecker often gives possible alternatives as suggestions to correct the misspelled word. A word can be spellchecked independently or in the context of a sentence. For example, in the sentence “അസ്തമയസൂര്യൻ കടലയിൽ മുങ്ങിത്താഴ്ന്നു”, the word “കടലയിൽ” is spelled correctly if considered independently. But in the context of the sentence, it is supposed to be “കടലിൽ”.

The correctness of a word is tested by checking whether that word is in the language model. The language model can simply be a list of all known words in the language. Or it can be a system which knows what a word in the language looks like and tells whether a given word is such a word. In the case of Malayalam, we saw that a finite dictionary is not possible. So we will need a system which is ‘aware’ of all words in the language. We will see how a morphology analyser can be such a system.

If the word is misspelled, the system needs to give corrections. To generate correctly spelled words from a misspelled word form, an error model is needed. The most common error model is Levenshtein edit distance. In the edit distance algorithm, the misspelling is assumed to be a finite number of operations applied to the characters of a string: deletion, insertion, change, or transposition. The number of operations is known as the ‘edit distance’. Any word from the known list of words in the language within a minimal distance is a candidate for suggestion. Peter Norvig explains such a functional spellchecker in his article “How to Write a Spelling Corrector?”.

There are multiple problems with the edit distance based correction mechanism:

  • For a query word, we can calculate the number of candidate words we need to generate and test after applying the four operations. For a word of length n, an alphabet of size a and an edit distance d=1, there will be n deletions, n-1 transpositions, a*n alterations, and a*(n+1) insertions, for a total of 2n+2an+a-1 terms at search time. In the case of Malayalam, a is 117 if we consider all encoded characters in Unicode version 11. If we remove all archaic characters, we still need about 75 characters. So, for edit distance d=1, a=75, for a word with 10 characters: 2*10+2*75*10+75-1 = 1594, and much larger for larger d. So you will need to do 1594 lookups (spellchecks) in the language model to get possible suggestions.
  • The concept that the 4 edit operations are the cause of all spelling mistakes is not accurate for Malayalam. There are many common spelling mistakes in Malayalam that are 3 or 4 edit distance from the original word. Usually, edit distance based corrections won’t go beyond d=2, since the number of candidates increases.
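To make the 1594 figure concrete, here is a small self-contained Python sketch (mine, not from the original article) that generates the edit-distance-1 candidate list in the style of Norvig's spelling corrector and counts the terms:

```python
def edits1(word, alphabet):
    """All strings at edit distance 1 from `word` (duplicates included)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]                       # n
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]  # n-1
    replaces = [L + c + R[1:] for L, R in splits if R for c in alphabet]     # a*n
    inserts = [L + c + R for L, R in splits for c in alphabet]               # a*(n+1)
    return deletes + transposes + replaces + inserts

# A stand-in alphabet of size 75 and any 10-character word:
alphabet = [chr(0x0D00 + i) for i in range(75)]
word = "0123456789"
n, a = len(word), len(alphabet)
# n + (n-1) + a*n + a*(n+1) == 2n + 2an + a - 1 == 1594
assert len(edits1(word, alphabet)) == 2 * n + 2 * a * n + a - 1
```

Each of those 1594 candidates would need a language-model lookup, which is what makes this approach costly for a large alphabet like Malayalam's.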

The problems with hunspell based spellchecker and Malayalam

Hunspell has compounding support, but it is limited to two levels. Malayalam can have more than 2 levels of compounding, and sometimes the agglutinated word is also inflected. Hunspell has an affix dictionary and suffix mapping system, but it is too limited to support complex morphology like Malayalam’s. With the help of Németh László, the Hunspell developer, I had explored this path, but abandoned it due to many limitations of Hunspell and the lack of programmatic control over the morphological rules.

Nature of Malayalam spelling mistakes

Malayalam uses an alphasyllabary writing system. Each letter you write corresponds to the grapheme representation of a phoneme. In a broader sense, Malayalam can be considered a language with one-to-one grapheme-to-phoneme correspondence. Whereas in English and similar languages, letters might represent a variety of sounds, or the same sounds can be written in different ways. The way a person learns to write a language strongly depends on the writing system.

In Malayalam, since there is one and only one set of characters that can correspond to a syllable, confusion of letters does not happen. For example, in English, Education, Ship, Machine and Mission all have the sh sound [ʃ], so a person can mix up these combinations. But in Malayalam, if it is the sh sound [ʃ], then it is always ഷ.

Because of this, the spelling mistakes produced by the four edit operations (deletion, insertion, change, or transposition) may not be an accurate classification of errors in Malayalam. Let us try to classify and analyse the spelling mistake patterns of Malayalam.

  1. Phonetic approximation: The 1:1 grapheme-to-phoneme correspondence is the theory. But because of this, inaccurate utterance of syllables will cause incorrect spellings. For example, ബൂമി is a relaxed way of reading ഭൂമി, since it is relatively effortless. Since the relaxed pronunciation is normal, sometimes people think that they are writing the wrong way and will try to correct it unnecessarily; പീഢനം->പീഡനം is one such example.
    • Consonants: Each consonant in Malayalam has aspirated, unaspirated, voiced and unvoiced variants. It is very usual to get mixed up between them.
      • Aspirated and unaspirated mix-up: An aspirated consonant can be mistakenly written as an unaspirated consonant, for example ധ -> ദ, ഢ -> ഡ. Similarly, an unaspirated consonant can be mistakenly written as an aspirated consonant, for example ദ -> ധ, ഡ -> ഢ.
      • Voiced and voiceless mix-up: Voiced consonants like ഗ, ഘ can be mistakenly written as the voiceless forms ക, ഖ. And vice versa.
      • Gemination of consonants is often relaxed or skipped in speech, hence it appears in writing too. Gemination in Malayalam script is done by combining two consonants using virama. നീലതാമര/നീലത്താമര is an example of this kind of mistake. There are a few debatable words too, like സ്വർണം/സ്വർണ്ണം, പാർടി/പാർട്ടി. Another way of indicating consonant stress is by using unaspirated consonant + virama + aspirated consonant. The pairs അദ്ധ്യാപകൻ/അധ്യാപകൻ, തീർഥം/തീർത്ഥം, വിഡ്ഡി/വിഡ്ഢി are examples.
      • Hard and soft variant confusion. Examples: ശ/ഷ, ര/റ, ല/ള
    • Vowels: Vowel elongation or shortening, gliding vowels and semi-vowels are the cause of vowel related mistakes in writing.
      • Each vowel in Malayalam can be a short vowel or a long vowel. The local dialect can lead people to use one for the other; ചിലപ്പൊൾ/ചിലപ്പോൾ is one example. Since many input tools place the short and long vowel forms on very close keystrokes, errors are easy to make. In the Inscript keyboard, short and long vowels are in the normal and shift positions. In transliteration based input methods, a long vowel is often typed by repeating keys (i, ii for ി, ീ).
      • The vowel ഋ is close to റി or റു in pronunciation. Example: ഋതു/റിതു. The vowel sign of ഋ, while appearing with a consonant, is close to ്ര. Examples: ഗൃഹം/ഗ്രഹം, ഹൃദയം/ഹ്രുദയം.
      • The gliding vowels ഐ, ഔ get confused with their constituent vowels. കൈ/കഇ/കയ്, ഔ/അഉ/അവ് are examples.
      • In Malayalam, there is a tendency to use എ instead of ഇ because of the reduced effort. Examples: ചിലവ്/ചെലവ്, ഇല/എല, തിരയുക/തെരയുക. Due to the wide usage of these variants, it is sometimes very difficult to say that one word is wrong. See the discussion about ‘Standard Malayalam’ at the end of this essay.
    • Chillus: Chillus are pure consonants. A consonant + virama sequence sometimes has no phonetic difference from a chillu, for example the കല്പന/കൽപന, നിൽക്കുക/നില്ക്കുക combinations. The chillu ർ is sometimes confused with the ഋ sign; examples are: പ്രവർത്തി/പ്രവൃത്തി. The chillu form of മ, namely ം, can appear as anuswara or as ma + virama forms. Examples: പംപ, പമ്പ. But it is not rare to see പംമ്പ for this. Sometimes the anuswara gets confused with ന്, and പമ്പ becomes പന്പ. There were a few buggy fonts that used ന്+പ for the മ്പ ligature too.
  2. Weak phoneme-grapheme correspondence: Due to the historic or evolutionary nature of the script, Malayalam also has some phonemes which have a weak relationship with their graphemes.
    • ഹ്മ/മ്മ as in ബ്രഹ്മം/ബ്രമ്മം, ന്ദ/ന്ന as in നന്ദി/നന്നി, ഹ്ന/ന്ന as in ചിഹ്നം/ചിന്നം are some examples where what you pronounce is not exactly the same as what you write.
    • റ്റ, ന്റ – these two highly used conjuncts deviate heavily from their letters and pronunciation. While writing with a pen, people don’t make many mistakes, since they just draw the shape of these ligatures; but while typing, one needs to know the exact key sequence, and they get confused. Common mistakes for these conjuncts are ററ, ൻറ, ൻറ്റ, ൻററ.
  3. Visual similarity: While using visual input methods, such as handwriting based ones or some onscreen keyboards, either the users or the input tool makes mistakes due to visual similarity.
    • ൃ and ്യ often get confused.
    • ജ്ഞ, ഞ്ജ is one very common sequence where people are confused: ആദരാജ്ഞലി/ആദരാഞ്ജലി.
    • ത്സ, ഝ is another combination.
    • Handwriting based input methods like the Google handwriting tool are known for recognizing the anuswara ം as zero, English o, O etc.
    • When people don’t know how to insert the visarga ഃ, they use the very similar colon key : on the keyboard instead. Example: ദുഃഖം/ദു:ഖം
    • ള്ള, the geminated form of ള, is very similar to two adjacent ള. This kind of mistake is very frequent among people who studied Malayalam inputting informally. Two adjacent റ is a similar mistake for റ്റ.
    • The informal, trial-and-error based Malayalam inputting training also introduced some other mistakes such as using open parenthesis ‘(‘ for àµ�à´°, closing parenthesis ‘)’ for à´¾ sign.
  4. Ambiguity due to regional dialect: A good example is the insertion of യ് in verbs: കുറക്കുക/കുറയ്ക്കുക, ചിരിക്കുക/ചിരിയ്ക്കുക; also in nominal inflections: പൂച്ചയ്ക്ക്/പൂച്ചക്ക്. The usage of the samvruthokaram to distinguish between a pure consonant and a stressed consonant at the end of a word is another highly debated topic. For example, അവന്/അവനു്/അവനു are all common forms, even though the usage of ു് has declined after the script reformation. But since the script reformation was not an absolute transformation, it still exists in usage.
  5. Spaces: Malayalam is an agglutinative language. Words can be agglutinated, but nothing prevents people from inserting spaces and writing the parts as separate words. This has to be done carefully, though, since it can alter the meaning. An example is "ആന പുറത്ത് കയറി", "ആനപ്പുറത്ത് കയറി", "ആനപ്പുറത്തുകയറി", "ആനപ്പുറത്തു കയറി". Another example: "മലയാള ഭാഷ", "മലയാളഭാഷ". Here there is no valid word "മലയാള": the anuswara at the end gets deleted only when the word joins with ഭാഷ as an adjective. A morphology analyser can correctly parse "മലയാളഭാഷ" as മലയാളം<proper-noun><adjective>ഭാഷ<noun>. But since the language already broke this rule and many people use spaces liberally, a spellchecker needs to handle these cases.
  6. Slip of the finger: Accidental insertion or omission of key presses is the most common reason for spelling mistakes. For alphabetic languages this type of error is mostly addressed already, and the same accidental slips happen in Malayalam too. For Latin-based languages we can analyse the QWERTY layout and do optimized checks for these issues; but since Malayalam input uses another level of mapping on top of QWERTY (InScript, phonetic, transliteration), such analysis is not easy. So, in general, we can expect random characters, or the omission of some characters, in the query word. An accidental space insertion poses an extra challenge: it splits the word in two, and if spellchecking is done one word at a time, we will miss it.

I must add that the above classification is not based on a systematic study of any test data that I can share. Ideally, such a classification should be done with a real sample of Malayalam written on paper and on computers, manually checked for spelling mistakes, with the mistakes listed and their patterns analysed. That exercise would be very beneficial for spellcheck research. In my case, ever since I released my word-list based spellchecker, noticing spelling errors on the internet (mainly social media) has been an obsession, and sometimes I even pointed out spelling mistakes to the authors, which was not always a pleasant experience. The above list is based on my observations of such patterns.

Malayalam spelling checker

To check whether a word is a valid, known, correctly spelled word, a simple lookup using the morphology analyser is enough: if the analyser can parse the word, it is correctly spelled. Note that the word can be agglutinated at arbitrary levels and inflected at the same time.

Out of lexicon words

Compared to a finite word list, the FST-based morphology analyser and generator covers a much larger number of words through its generation system based on morpho-phonotactics. For a discussion of this, see my previous blog post about the coverage test. Since every language's vocabulary is a dynamic system, it is still impossible to cover 100% of the words all the time: new words enter the language every now and then, and there are nouns for places, people, products and so on that are not in the analyser's lexicon. These words will be reported as unknown by the spellchecker, and an unknown word is interpreted as a misspelled word. This is a known problem; but since a spellchecker is usually driven by a human user, its severity depends on whether the spellchecker is ignorant of many commonly used words. Most spellcheckers provide an "add to dictionary" option to mitigate this.
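The "add to dictionary" escape hatch can be sketched as a thin overlay on the analyser lookup. This is a minimal illustration, not the real implementation: `analyser_knows` and the sample words are stand-ins for the actual FST-based analyser.

```python
# Sketch of "add to dictionary": a user dictionary overlaid on the
# analyser lookup. analyser_knows() is a toy stand-in for the real
# morphology analyser.
user_dictionary = set()

def analyser_knows(word):
    # Stand-in lexicon; the real system would parse the word instead.
    return word in {"മലയാളം", "ഭാഷ"}

def is_correct(word):
    return analyser_knows(word) or word in user_dictionary

print(is_correct("കൊച്ചി"))      # a place name missing from the lexicon
user_dictionary.add("കൊച്ചി")    # the user's "add to dictionary" action
print(is_correct("കൊച്ചി"))      # no longer flagged
```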

As part of the morphology analyser, expanding the lexicon is a never-ending task. As the lexicon grows, the spellchecker improves automatically.

Malayalam spelling correction

To provide spelling suggestions, the FST-based morphology analyser can be used. This is a three-step process:

  1. Generate a list of candidate words from the query word. The words in this list may themselves be incorrect. The candidates are generated from patterns we defined based on the nature of spelling mistakes: we scan the query word for common error patterns and apply the fix for each pattern that matches. Since there are dozens of patterns, we get many candidate words.
  2. From the candidate list, find the correctly spelled words using the spellcheck method described above. This results in a very small number of words; these are the probable replacements for the misspelled query word.
  3. Sort the candidate words so that the most probable suggestion comes first. For this we can rank the suggestion strategies: a very common error pattern gets a high priority in step 1, so its suggestions appear first in the candidate list. A more sophisticated approach would use a frequency model for the words, so that candidates which are very frequent in the language appear first.
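The three steps above can be sketched as follows. The confusion patterns, their priorities and the toy lexicon are illustrative stand-ins for the real error model and the FST-based analyser:

```python
# A sketch of the three-step suggestion pipeline.
CONFUSION_PATTERNS = [
    # (wrong, right, priority) - a lower priority value means a more common error
    ("ന്ന", "ന്ദ", 0),   # നന്നി -> നന്ദി
    ("മ്മ", "ഹ്മ", 1),   # ബ്രമ്മം -> ബ്രഹ്മം
    ("ന്ന", "ഹ്ന", 2),   # ചിന്നം -> ചിഹ്നം
]

KNOWN_WORDS = {"നന്ദി", "ബ്രഹ്മം", "ചിഹ്നം"}  # stand-in for the analyser lexicon

def is_known(word):
    """Step 2: in the real system this is a morphology-analyser parse."""
    return word in KNOWN_WORDS

def suggest(word):
    """Steps 1 and 3: generate candidates per pattern, keep the known
    ones, and rank them by the priority of the producing pattern."""
    candidates = []
    for wrong, right, priority in CONFUSION_PATTERNS:
        start = word.find(wrong)
        while start != -1:
            candidates.append((priority, word[:start] + right + word[start + len(wrong):]))
            start = word.find(wrong, start + 1)
    ranked, seen = [], set()
    for _, candidate in sorted(candidates):
        if is_known(candidate) and candidate not in seen:
            seen.add(candidate)
            ranked.append(candidate)
    return ranked

print(suggest("നന്നി"))  # the highest-priority pattern yields നന്ദി
```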

One thing I observed with the above approach is that, in reality, the candidate list left after all the steps is most of the time just one or two words for Malayalam. This makes step 3 less relevant. At the same time, an edit-distance based approach would have generated more than five candidate words for each misspelled word, and those candidates would be very diverse, meaning they need not be related to the intended word at all. The following images illustrate the difference.
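To see why edit-distance candidates are so diverse, consider how many distance-1 variants even a short word has. This is an illustration, not part of the original system:

```python
# Count the distinct edit-distance-1 variants of a word: deletions,
# transpositions, replacements and insertions over a given alphabet.
def edits1(word, alphabet):
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return (deletes | transposes | replaces | inserts) - {word}

# Even a tiny 10-letter toy alphabet produces over a hundred variants
# for a 5-letter word; Malayalam's full character inventory is far bigger.
print(len(edits1("abcde", "abcdefghij")))
```

Almost all of these variants are unrelated junk that still has to be filtered, which is why the pattern-based generator stays so much more focused.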

Spelling suggestion from the morphology analyser based system.
Spelling suggestions from edit distance based candidates

Context sensitive spellchecking

Usually, spellchecking and suggestion are done one word at a time. But if we know the context of the word, the spellchecking becomes more useful. The context is usually the words before and after the word in question. An example from English is "I am in Engineer": the word "in" is correct on its own, but within this context it is wrong. To mark "in" wrong and provide "an" as a suggestion, one approach is an n-gram model of parts of speech for the language: in simple words, a model of what kind of word can appear between words of known kinds. If we build this model for a language, it will surely tell us that the locative "in" before "Engineer" is rare or was never seen before.
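A minimal sketch of the idea, with a hypothetical POS lexicon and made-up trigram counts (not the actual implementation):

```python
# Context checking with a part-of-speech n-gram model. The POS tags
# and the trigram counts below are toy, hypothetical data.
POS = {"I": "PRON", "am": "AUX", "in": "ADP", "an": "DET", "Engineer": "NOUN"}

TRIGRAM_COUNTS = {                 # toy counts "observed in a corpus"
    ("PRON", "AUX", "DET"): 120,   # "I am an ..."
    ("AUX", "DET", "NOUN"): 95,    # "... am an Engineer"
}

def context_ok(prev_word, word, next_word):
    """True if the POS trigram around `word` was seen in the corpus."""
    trigram = (POS[prev_word], POS[word], POS[next_word])
    return TRIGRAM_COUNTS.get(trigram, 0) > 0

print(context_ok("am", "in", "Engineer"))  # (AUX, ADP, NOUN) unseen: flag it
print(context_ok("am", "an", "Engineer"))  # (AUX, DET, NOUN) seen: accept it
```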

The Standard Malayalam or lack thereof

How do you determine which is the "correct" or "standard" way of writing a word? Malayalam has lots of orthographic variants for words: some were introduced to the language as genuine mistakes that later became common words (രാപ്പകൽ/രാപകൽ, ചിലവ്/ചെലവ്), some are phonetic simplifications (അദ്ധ്യാപകൻ/അധ്യാപകൻ, സ്വർണ്ണം/സ്വർണം), some are old spellings (കർത്താവ്/കൎത്താവു്), and so on. A debate about the correctness of these words will hardly reach a conclusion. For our case, this is more an issue of selecting the words in the lexicon: which ones to include, which to exclude? It is easy to treat these debates as a blocker for the progress of the project and give up: "well, these things have not been decided by the academics so far, so we cannot do anything until they make up their minds".

I did not want to end up in that deadlock, so I decided to be liberal about the lexicon: if people are commonly using a word, it is a valid word the project needs to recognize as far as possible. That is the very liberal definition I have. I leave the standardization discussion to the linguists who care about it.

The news report from Mathrubhumi daily in 2007 about my old spelling checker

Back in 2007, when I developed the old Malayalam spellchecker, these debates came up. Dr. P Somanathan, who helps me a lot these days with this project, wrote about the issue of Malayalam spelling inconsistencies: "ചരിത്രത്തെ വീണ്ടെടുക്കുക" and "വേണം നമുക്ക് ഏകീകൃതമായ ഒരെഴുത്തുരീതി".


  1. A Data-Driven Approach to Checking and Correcting Spelling Errors in Sinhala. Asanka Wasala, Ruvan Weerasinghe, Randil Pushpananda, Chamila Liyanage and Eranga Jayalatharachchi [pdf]. This paper discusses phonetic-similarity based strategies to create a word list, instead of the edit-distance approach.
  2. Finite-State Spell-Checking with Weighted Language and Error Models—Building and Evaluating Spell-Checkers with Wikipedia as Corpus. Tommi A Pirinen, Krister Lindén [pdf]. This paper outlines the use of finite-state transducer techniques to address the issue of the infinite dictionary of morphologically rich languages, with Finnish as the example language.
  3. My own Malayalam morphology analyser project is the foundation of the spellchecker.
  4. The common Malayalam spelling mistakes and confusables are presented in great depth by the renowned linguist and author Panmana Ramachandran Nair in his books 'തെറ്റില്ലാത്ത മലയാളം', 'തെറ്റും ശരിയും', 'ശുദ്ധ മലയാളം' and 'നല്ല മലയാളം'.
  5. Improving Finite-State Spell-Checker Suggestions with Part of Speech N-Grams. Tommi A Pirinen, Miikka Silfverberg and Krister Lindén [pdf]. This paper discusses the context-sensitive spellchecker approach.

Where can I try the spellchecker?

If you are curious about the implementation of this approach, please refer to the project. Since the implementation is not complete, I will write a new article about it later. Thanks for reading!

A screenshot of the Malayalam spellchecker in action. Along with incorrect words, some correct words are marked as misspelled too. This is because the morphology analyser is incomplete; as it improves, more words will be covered.

by Santhosh Thottingal at September 08, 2018 09:41 AM

August 11, 2018

Santhosh Thottingal

Malayalam morphology analyser – status update

For the last several months, I have been actively working on the Malayalam morphology analyser project. In case you are not familiar with the project, my introduction blog post is a good start. I was always skeptical about the approach, and the whole project looked very ambitious. But now I am fairly confident that the approach is viable. I am making good progress, so here are some updates.

Analyser coverage statistics

Recently I added a large corpus so that I can frequently monitor the percentage of words the analyser can parse. The corpus was selected from two large chapters of ഐതിഹ്യമാല, some news reports, an essay on art, and my own technical blog posts, to get some diversity in the vocabulary.

Total words: ~16000
Analysed words: 10532
Time taken: 0.443 seconds

This is very encouraging. Achieving 66% coverage for a morphologically rich language like Malayalam is no small task. From my reading, Turkish and Finnish, languages with a similar complexity of morphology, achieved about 90% coverage. Increasing the coverage further may be harder than getting this far, so I am planning some frequency analysis on the words the analyser cannot parse, to find patterns to improve.
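The coverage figure is simply the analysed fraction of the corpus; the total here is taken as roughly 16000, per the timing note below:

```python
analysed = 10532          # words the analyser could parse
total = 16000             # approximate corpus size
coverage = analysed / total * 100
print(f"coverage: {coverage:.0f}%")
```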

The performance aspect is also notable: once the automaton is loaded into memory, analysis or generation is very fast. You can see that ~16000 words were analysed in under half a second.


From the very beginning, the project has been test driven. I now have 740 test cases for various word forms.

The transducer

The compiled transducer is now 6.2 MB. The transducer is written in SFST-PL and compiled using SFST. It used to be compiled using HFST, but HFST is now severely broken for SFST-PL compilation, so I switched to SFST. The compiled transducer is, however, read using the HFST Python binding.

Fst type: SFST
Arc type: SFST
Number of states:
Number of arcs:
Number of final states:

The Lexicon

The POS-tagged lexicon I prepared comes from various sources like Wiktionary, Wikipedia (based on categories) and CLDR. While developing, I had to improve the lexicon several times, since none of these sources are fully accurate. Wiktionary in particular introduced a large number of archaic or Sanskrit terms into the lexicon. As of today, the following table illustrates the lexicon status:

Person names
Place names
English borrowed nouns
Language names(nouns)
Affirmations and negations

As you can see, the lexicon is not that big; it is especially limited for proper nouns such as personal and place names. I think the verb lexicon is in much better shape. I need to find a way to expand this further.

POS Tagging

There is no agreed standard on the POS tagging schema to be used for Malayalam, but I refused to let this become a blocker for the project: I defined my own POS tagging schema and worked on the analyser. The general disagreement is about naming, which is trivial to fix using a tag-name mapper. The other issue is the classification of features, where I found no elaborate schema that can cover Malayalam.

I started referring to the Universal Dependencies (UD) tag set and provided links to its pages from the web interface. But UD is also missing several tags that Malayalam requires. So far I have defined 85 tags.


The main challenge I am facing is not technical but linguistic. I am often challenged by my limited understanding of Malayalam grammar, especially the grammatical classifications: I find it very difficult to reach an agreement after reading several grammar books. These books were written over a span of 100 years, and I miss a common thread in their approach to Malayalam grammar analysis. Sometimes a logical classification was not even the author's purpose. Thankfully, I get help from Malayalam professors whenever I am stuck.

The other challenge is that I have hardly got any contributors to the project beyond some bug reports. There is a big entry barrier to this kind of project: SFST-PL is not something everybody is familiar with. I need to write some simple examples for others to practice with and join.

I found that practical applications built on top of the morphology analyser attract more people. For example, the number spellout application I wrote caught the attention of many. I am excited about the upcoming spellchecker I have been working on recently; I will write about its theory soon.

by Santhosh Thottingal at August 11, 2018 12:43 PM

August 10, 2018

Santhosh Thottingal

How to customize Malayalam fonts in Linux

Nowadays GNU/Linux distributions like Ubuntu, Debian and Fedora come with pre-configured fonts for Malayalam: for the sans-serif family it is Meera, and for serif it is Rachana. If you would like to change these fonts, there is no easy way to do so with the configuration tools in GNOME or KDE: they provide a general font selector for the whole desktop, but not for a given language.

The advantage of setting these preferences at the system level is that you don't need to choose the fonts at the application level: you don't need to set them for Firefox, Chrome, etc., since all of them will follow the system preferences. We will use fontconfig for this.

First, create a file named ~/.config/fontconfig/conf.d/50-my-malayalam.conf. If the folders for this file do not exist, just create them. Add the following content to the file.

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<!-- Malayalam (ml) -->
<match target="font">
        <test name="lang" compare="contains">
                <string>ml</string>
        </test>
        <test name="family">
                <string>sans-serif</string>
        </test>
        <edit name="family" mode="prepend" binding="strong">
                <string>Manjari</string>
        </edit>
</match>

<match target="font">
        <test name="lang" compare="contains">
                <string>ml</string>
        </test>
        <test name="family">
                <string>serif</string>
        </test>
        <edit name="family" mode="prepend" binding="strong">
                <string>Rachana</string>
        </edit>
</match>
<!-- Malayalam (ml) ends -->
</fontconfig>


Save the file and you are done. You can check whether the default font for Malayalam has changed using the following command:

$ LANG=ml_IN fc-match

It should list Manjari. The code we added to the file is not complicated: you can see that we are setting the sans-serif font preference for the ml (Malayalam) language to Manjari, and the serif preference to Rachana. You are free to change these to whatever fonts you prefer.

Note that you may need to close and reopen your applications for the preference to apply.

You may choose any of the SMC fonts, download and install it, and use the above configuration with it.

by Santhosh Thottingal at August 10, 2018 04:09 PM

July 29, 2018

Santhosh Thottingal

Young people's dignity of labour, and labour societies (യുവാക്കളുടെ തൊഴിലഭിമാനവും തൊഴിൽ സൊസൈറ്റികളും)

This is a note about a crisis faced by the young people of our land, and about an idea that could be a solution to it.

In our land there are plenty of young people engaged in jobs that demand special skills: various kinds of wage labour, driving, farm work, painting, construction, mechanic work and so on. Almost all of them are in the unorganized sector. Most are young men who have not obtained a government or private job, or who lack the education needed to obtain one. Young women, on the other hand, typically continue their education until marriage and then settle into family life. Those aged between twenty and thirty-five now face a new challenge, about which the Samakalika Malayalam weekly recently published a detailed study report ("നിത്യഹരിത വരൻമാർ", രേഖാചന്ദ്ര, Samakalika Malayalam, July 16). The study finds that, widely across the Malabar region, young men of this kind remain unmarried.

The reason is the cultural lack of interest among young women's families in the workers mentioned above. Nobody is willing to give young women in marriage to men without a government or private-company job. The article has details of new phenomena such as the "കുടക് കല്യാണം". Caste and horoscopes stand in the way more than ever before, and in rural areas the moral police leave little room for love marriages. Even when these young men take up such jobs and try to give the young women of their own households more education, those women then seek only men with better jobs, putting the men in crisis once again.

The problem described above feeds a growing aversion to manual labour. The ego called social status is slowly leading to a situation where nobody can be found for these essential jobs, and as the general level of education in society rises, that ego only grows stronger. I fear that an unhealthy social order will gradually emerge from this. Young women in particular, under pressure from their families, are confined to a very narrow selection space of job possibilities; our social situation has reached a point where it does not allow them to take up the jobs mentioned above. This is where the migrant workers found their opportunities.

In a society like ours, with few secular public platforms, the challenge of keeping this youthful workforce politically conscious is growing. The possibility of apoliticism becoming the default choice among the young must be resisted by all means.

The problems summarized so far call for a social movement among these young people. The aims would be:

  • Build social recognition for every kind of unorganized work, manual or otherwise. Do not shackle the human potential of the young with misconceptions and social constraints.
  • Bring such workers into the organized sector and make them politically conscious. Organize secular spaces.
  • Provide vocational training, and encourage healthy reforms in the existing occupations. Make these jobs attractive.
  • Extend the social mobilization that Kudumbashree achieved further into the young generation.

The idea I would like to propose for this is "labour societies" (തൊഴിൽ സൊസൈറ്റികൾ). The rough outline is as follows:

  • The societies act as a meeting point between workers and those who need workers.
  • Young people register there, along with their skills.
  • Those registered with such societies wear uniforms and name tags and carry proper work-safety clothing and equipment (to overcome the social stigma).
  • Anyone can look for workers through these societies; there is no need to go and ask around in person. With a little help from technology these connections can be made quickly. Overall, with an appointment system and the like, the aim is to rewrite the feudal-era master-labourer relation, and with it the notion of high and low jobs.
  • The societies can fix wage rates, and their members will be aware of labour rights.

In Western countries the capitalist system has already begun implementing this idea; Amazon Services is an example. Such "online apps", like Uber and Airbnb, will soon arrive in our land too. But they will have no goals beyond the exploitation built into the employer-worker relationship. My hope is that the people of Kerala enter that space early, with social and political goals.

by Santhosh Thottingal at July 29, 2018 10:08 AM

July 15, 2018

Santhosh Thottingal

The many forms of ചിരി ☺️

This is an attempt to list all the forms of the Malayalam word ചിരി (meaning: ☺, smile, laugh). For those unfamiliar with Malayalam: it is a highly inflectional Dravidian language. I am actively working on a morphology analyser (mlmorph) for the language, as outlined in one of my previous blog posts.

I prepared this list as a test case for the mlmorph project to evaluate its grammar-rule coverage, so I thought of listing it here as well with brief comments.
1. ചിരി
ചിരി is a noun, so it can take all the nominal inflections.

2. ചിരിയുടെ
3. ചിരിക്ക്
4. ചിരിയ്ക്ക്
5. ചിരിയെ
6. ചിരിയിലേയ്ക്ക്
7. ചിരികൊണ്ട്
8. ചിരിയെക്കൊണ്ട്
9. ചിരിയിൽ
10. ചിരിയോട്
11. ചിരിയേ

There is a plural form
12. ചിരികൾ

A number of agglutinations can happen at the end of the word with affirmatives, negations, interrogatives, etc.; for example ചിരിയുണ്ട്, ചിരിയില്ല, ചിരിയോ. For now I am ignoring all agglutinations and listing only the inflections.
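The nominal inflections above can be sketched as stem + ending, with the glide യ് inserted before vowel-initial endings after an ി-final stem like ചിരി. This is a simplified, hypothetical rule set; mlmorph's actual sandhi rules are much richer:

```python
# Toy nominal inflection for an ി-final stem: a glide യ is inserted
# before endings that begin with a vowel sign. The suffix list is a
# simplified stand-in for the full nominal paradigm.
NOMINAL_SUFFIXES = ["ുടെ", "െ", "ിൽ", "ോട്", "കൾ"]  # genitive, accusative, locative, sociative, plural
VOWEL_SIGNS = "ുെിോ"

def inflect(stem):
    forms = []
    for suffix in NOMINAL_SUFFIXES:
        if suffix[0] in VOWEL_SIGNS:
            forms.append(stem + "യ" + suffix)  # ചിരി + ുടെ -> ചിരിയുടെ
        else:
            forms.append(stem + suffix)        # ചിരി + കൾ -> ചിരികൾ
    return forms

print(inflect("ചിരി"))
```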

ചിരിക്കുക is the verb form of ചിരി.
13.  ചിരിക്കുക

It can have the following tense forms
14. ചിരിച്ചു
15. ചിരിക്കുക
16. ചിരിക്കും

A concessive form for the word
17. ചിരിച്ചാലും

This verb has the following aspects
18. ചിരിക്കാറ്
19. ചിരിച്ചിരുന്നു
20. ചിരിച്ചിരിയ്ക്കുന്നു
21. ചിരിച്ചിരിക്കുന്നു
22. ചിരിച്ചിരിക്കും
23. ചിരിച്ചിട്ട്
24. ചിരിച്ചുകൊണ്ടിരുന്നു
25. ചിരിച്ചുകൊണ്ടേയിയിരുന്നു
26. ചിരിച്ചുകൊണ്ടേയിരിക്കുന്നു
27. ചിരിച്ചുകൊണ്ടിരിക്കുന്നു
28. ചിരിച്ചുകൊണ്ടിരിക്കും
29. ചിരിച്ചുകൊണ്ടേയിരിക്കും

There are a number of mood forms for the verb ചിരിക്കുക
30. ചിരിക്കാവുന്നതേ
31. ചിരിച്ചേ
32. ചിരിക്കാതെ
33. ചിരിച്ചാൽ
34. ചിരിക്കണം
35. ചിരിക്കവേണം
36. ചിരിക്കേണം
37. ചിരിക്കേണ്ടതാണ്
38. ചിരിക്ക്
39. ചിരിക്കുവിൻ
40. ചിരിക്കൂ
41. ചിരിക്ക
42. ചിരിച്ചെനെ
43. ചിരിക്കുമേ
44. ചിരിക്കട്ടെ
45. ചിരിക്കട്ടേ
46. ചിരിക്കാം
47. ചിരിച്ചോ
48. ചിരിച്ചോളൂ
49. ചിരിച്ചാട്ടെ
50. ചിരിക്കാവുന്നതാണ്
51. ചിരിക്കണേ
52. ചിരിക്കേണമേ
53. ചിരിച്ചേക്കാം
54. ചിരിച്ചോളാം
55. ചിരിക്കാൻ
56. ചിരിച്ചല്ലോ
57. ചിരിച്ചുവല്ലോ

There are a few inflections with adverbial participles
58. ചിരിക്കാൻ
59. ചിരിച്ച്
60. ചിരിക്ക
61. ചിരിക്കിൽ
62. ചിരിക്കുകിൽ
63. ചിരിക്കയാൽ
64. ചിരിക്കുകയാൽ

The verb can also act as an adjectival (relative) clause. Examples
65. ചിരിച്ച
66. ചിരിക്കുന്ന
67. ചിരിച്ചത്
68. ചിരിച്ചതു്
69. ചിരിക്കുന്നത്

The above two forms act as nominal forms. Hence they have all nominal inflections too
70. ചിരിച്ചതിൽ
71. ചിരിക്കുന്നതിൽ
72. ചിരിക്കുന്നതിന്
73. ചിരിച്ചതിന്
74. ചിരിച്ചതിന്റെ
75. ചിരിക്കുന്നതിന്റെ
76. ചിരിച്ചതുകൊണ്ട്
77. ചിരിക്കുന്നതുകൊണ്ട്
78. ചിരിച്ചതിനോട്
79. ചിരിക്കുന്നതിനോട്
80. ചിരിക്കുന്നതിലേയ്ക്ക്

Now, a few voice forms for the verb ചിരിക്കുക
81. ചിരിക്കപ്പെടുക
82. ചിരിപ്പിക്കുക

These voice forms are again just verbs, so they can go through all the inflections of ചിരിക്കുക listed above. I am not writing them here, since it would mostly repeat what is already listed. ചിരിക്കപ്പെടുക has all the inflections of the verb പെടുക. You can see them listed in my test case file, though.

A noun can be derived from the verb ചിരിക്കുക too. That is
83. ചിരിക്കൽ

Since it is a noun, all nominal inflections apply.
84. ചിരിക്കലേ
85. ചിരിക്കലിനോട്
86. ചിരിക്കലിൽ
87. ചിരിക്കലിന്റെ
88. ചിരിക്കലിനെക്കൊണ്ട്
89. ചിരിക്കലിലേയ്ക്ക്
90. ചിരിക്കലിന്

My test file has 164 entries, including the ones I skipped here. As of today, the morphology analyser can parse 74% of the items. You can check the test results here:

A native Malayalam speaker may point out the variation of this word, ചിരിയ്ക്കുക, with യ് before ക്കുക. My intention is to support that variation as well; obviously that word will also have all the inflected forms listed above.

Now that I have written this list, I think a rough English translation of each item would be nice, but it is too tedious for me.

by Santhosh Thottingal at July 15, 2018 12:11 PM

July 03, 2018

Santhosh Thottingal

How to type Malayalam using Keyman 10 and Mozhi

This is a quick tutorial on installing Mozhi input method in Windows 10.

Mozhi is a transliteration-based keyboard for Malayalam: you can type "malayaalam" to get മലയാളം, for example. We will use Keyman as the input tool. Keyman is an open-source input mechanism now developed by SIL. It supports a lot of languages, and Mozhi Malayalam is one of them.

Step 1: Download Keyman desktop with Mozhi Malayalam keyboard

Go to the Keyman Desktop download page. There you will see the following options to download; select the first one as shown below. Download the installer to your computer. It is a file of about 20 MB.

Keyman 10 Desktop download page.

Step 2: Installation

Double click the downloaded file to start installation. The installer will be like this:

Keyman 10 Desktop installer

Click on the Install Keyman Desktop button. You will see the below screen.

Keyman 10 Desktop welcome page.


Press the “Start Keyman” button. The installation will run and the keyboard will start.

Step 3: Choose Mozhi input method

You will see a small icon at the bottom of your screen, near where the time is displayed.

Click on that to choose Mozhi.

Keyboard selection

Once you choose Mozhi, you can type in Manglish anywhere and you will see Malayalam. To learn typing, click on “Keyboard Usage” as shown above.

Step 4: Start typing in Malayalam

You can directly type Malayalam in any application without copy-paste. Just start typing, like in English. Make sure to use a good Malayalam font; you can get one from the SMC fonts collection.

Using Mozhi in LibreOffice. Notice the font used is Manjari. What I typed is “ippOL enikk malayaalam ezhuthaanaRiyaam”.


by Santhosh Thottingal at July 03, 2018 02:41 PM

July 01, 2018

Santhosh Thottingal

Kindle supports custom fonts

I was pleasantly surprised to see that Amazon Kindle now supports installing custom fonts. This is a big step towards supporting non-Latin content on their devices. I can now read Malayalam ebooks on my Kindle with my favorite fonts.

Content rendered in the Manjari font. Note that I installed the Bold, Regular and Thin variants so that Kindle can pick the right one.

This feature was introduced in the Kindle software version released in June 2018. Once updated to that version, all you need to do is connect the device to your computer using the USB cable, copy your fonts to the fonts folder there, and remove the USB cable. You will then see the fonts listed in the font selector.

Kindle added Malayalam rendering support back in 2016, but the default font provided was one of the worst Malayalam fonts: it had wrong glyphs for certain conjuncts and only a minimal glyph set.

I tried some of the SMC Malayalam fonts in the new version of Kindle. Screenshots are given below.

Custom fonts selection screen. These fonts were copied to the device

Select a font other than the default one

Content in Rachana.

Make sure to check the version: the version released in June 2018 is the latest one, and it supports custom fonts.

by Santhosh Thottingal at July 01, 2018 04:15 AM

May 03, 2018

Rajeesh K Nambiar

Adventures in upgrading to Fedora 27/28 using ‘dnf system-upgrade’

[This post was drafted on the day Fedora 27 was released, about half a year ago, but was not published. The issue bit me again with Fedora 28, so I am documenting it for reference next time.]

UPDATE: The issue occurred in Fedora 28 because I had exclude=grub2-tools in /etc/dnf/dnf.conf, which is why the error “nothing provides grub2-tools” was coming up. Removing that previously added and since forgotten line fixes the issue with updating grub2 packages.

With fedup, and subsequently dnf, improving the upgrade experience of Fedora for power users, the last few system upgrades have been smooth, quiet, even unnoticeable. That actually speaks volumes about the maturity and user-friendliness achieved by these tools.

Upgrading from Fedora 25 to 26 was similarly uneventful and smooth (by the way: I have installed and used every version of Fedora since its inception, and the default wallpaper of Fedora 26 was the most elegant of them all!).

With that, on the release day I set out to upgrade the main workstation from Fedora 26 to 27 using dnf system-upgrade as documented. Before downloading the packages, dnf warned that the upgrade could not be done because of package dependency issues with grub2-efi-modules and grub2-tools.

Things go wrong!

I simply removed both the offending packages and their dependencies (assuming they were probably installed for the grub2-breeze-theme dependency; but grub2-tools actually provides grub2-mkconfig) and proceeded with dnf upgrade --refresh and dnf system-upgrade download --refresh --releasever=27. If you are attempting this, don’t remove the grub2 packages yet, but read on!

Once the download and check completed, running dnf system-upgrade reboot should reboot the system into the upgrade target and perform the actual upgrade.

Except, I was greeted with the EFI MOK (Machine Owner Key) screen on reboot. Now that the grub2 bootloader was broken, thanks to the removal of grub2-efi-modules and the other related packages, a recovery had to be attempted.


It is important to have a (possibly UEFI-enabled) live media you can boot from. Boot into the live media and try to reinstall grub. Once booted in, mount the root filesystem under /mnt/sysimage, and the EFI boot partition at /mnt/sysimage/boot/efi. Then chroot /mnt/sysimage and try to reinstall the grub2-efi-x64 and shim packages. If there’s no network connectivity, don’t despair: nmcli is at your rescue; connect to wifi using nmcli device wifi connect <ssid> password <wifi_password>. Generate the boot configuration using grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg, followed by the actual install: grub2-install --target=x86_64-efi /dev/sdX (the --target option ensures a correct host installation even if the live media was booted via legacy BIOS). You may now reboot and proceed with the upgrade.

But this again failed at the upgrade stage because of the grub package clash that dnf had warned about earlier.


Once booted into the old installation, take a backup of the /boot/ directory, remove the conflicting grub-related packages, and copy back the backed-up /boot/ directory contents, especially /boot/efi/EFI/fedora/grubx64.efi. Now rebooting (using dnf system-upgrade reboot) had the grub contents intact, and the upgrade worked smoothly.

For more details on the package conflict issue, follow this bug.

by Rajeesh at May 03, 2018 07:16 AM

March 25, 2018

Balasankar C

FOSSAsia 2018 - Singapore


So I attended my first international FOSS conference: FOSSAsia 2018 at the Lifelong Learning Institute, Singapore. I presented a talk titled “Omnibus - Serve your dish on all the tables” (slides, video) about Chef Omnibus, a tool I use on a daily basis for my job at GitLab.

The conference was four days long, and my main aim was to network with as many people as I could. Well, I planned to attend sessions too, but unlike earlier times when I attended everything, these days I am more focussed on certain topics and technologies and tend to attend sessions on those (for example, devops is an area I focus on; blockchain isn’t).

One additional task I had was to staff the Debian booth at the exhibition from time to time. It was mainly handled by Abhijith (who is a DM). I also met two other Debian Developers there: Andrew Lee (alee) and Héctor Orón Martínez (zumbi).

I also met some other wonderful people at FOSSAsia, like Chris Aniszczyk of CNCF, Dr Graham Williams of Microsoft, Frank Karlitschek of NextCloud, Jean-Baptiste Kempf and Remi Denis-Courmont of VideoLan, Stephanie Taylor of Google, Philip Paeps (trouble) of FreeBSD, Harish Pillai of RedHat, Anthony, Christopher Travers, Vasudha Mathur of KDE, Adarsh S of CloudCV (who is from MEC College, which is quite familiar to me), Tarun Kumar of Melix, Roy Peter of Go-Jek (with whom I am familiar, thanks to the Ruby conferences I attended), Dias Lonappan of Serv, and many more. I also met some whom I previously knew only digitally, like Sana Khan, who was (yet another :D) Debian contributor from COEP. And I met some friends like Hari, Cherry, Harish and Jackson.

My talk went OK without too much stuttering, and I am kinda satisfied with it. The only thing I forgot was to mention during the talk that I had stickers (well, I later placed them on the sticker table and they disappeared within minutes, so that was OK ;)).

PS: Well, I had to cut down quite a lot of my explanation and drop my demo due to limited time. This caused me to miss many important topics like omnibus-ctl or the cookbooks that we use at GitLab. But a few participants came up and met me after the talk with questions about Omnibus: its similarity to Flatpak, its relevance in the times of Docker, etc., which was good.

Some photos are here:

Abhijith in Debian Booth

Abhijith with VLC folks

Andrew's talk

With Anthony and Harish: two born-and-brought-up-in-SG Malayalees

With Chris Aniszczyk

At Debian Booth

With Frank Karlitschek

With Graham Williams

MOS Burgers, our breakfast place

Premas Cuisine, the Kerala taste

The joy of seeing Malayalam

With Sana

Well, Tamil, ftw

Zumbi's talk

March 25, 2018 05:00 AM

February 09, 2018

Rajeesh K Nambiar

Sundar — a new traditional orthography ornamental font for Malayalam

There is a dearth of good Unicode fonts for the Malayalam script. Most publishing houses and desktop publishing agencies still rely on outdated ASCII-era fonts. This not only causes issues when typesetting with present-day technologies; it also makes the ‘document’ or ‘data’ created using these fonts and tools absolutely useless, because the underlying ‘document/data’ is still encoded as Latin, not Malayalam.

Rachana Institute of Typography has designed and published a new traditional-orthography ornamental Unicode font for the Malayalam script, for use in headings, captions and titles. It is named after Sundar, who was a relentless advocate of open fonts, open standards and open publishing. He dreamed of making several good-quality Malayalam fonts, particularly those created by Narayana Bhattathiri with his unique calligraphic and typographic signature, freely and openly available to users. The font is licensed under the OFL.

The font follows the traditional orthography of Malayalam, rather than the unpleasing reformed orthography which was introduced solely due to the technical limitations of typewriters in the ’70s. Such restrictions do not apply to computers and present-day technology, so it is possible to render the classic beauty of the Malayalam script using Unicode and OpenType technologies.

‘Sundar’ is designed by K.H. Hussain, known for his work on the Rachana and Meera fonts which come pre-installed with most Linux distributions, and Narayana Bhattathiri, known for his beautiful calligraphy and lettering in the Malayalam script. Graphic engineers of STM Docs did the vectoring and glyph creation. Yours truly took care of the OpenType feature programming. The font can be freely downloaded from

The source code of ‘Sundar’, licensed under the OFL, is available at

by Rajeesh at February 09, 2018 07:53 AM

January 17, 2018

Balasankar C

Introduction to Git workshop at CUSAT


It has been long since I wrote anything here. In the last year I attended some events, like FOSSMeet, DeccanRubyConf and GitLab’s summit, and didn’t write anything about them. The truth is, I forgot I used to write about all these and never got the motivation to do it.

Anyway, last week I conducted a workshop on Git basics for the students of CUSAT. My real plan, as always, was to do a bit of FOSS evangelism too. Since the timespan of the workshop was limited (10:00 to 13:00), I decided to keep everything to the bare basics.

I started with an introduction to what a VCS is and how it became necessary. As a prerequisite, I talked about FOSS, the concept of collaborative development, the open source development model, etc. It wasn’t easy, as my audience was not only CS/IT students but also those from other departments like Photonics and Physics. I am not sure if I was able to help them understand the premise clearly. I then went on to talk about what Git does and how it helps developers across the world.

IIRC, this was the first talk/workshop I did without a slide show. I was too lazy and busy to create one; I just had one page saying “Git Workshop” and my contact details. So guess what? I used a whiteboard! I went over the basic concepts like repositories, commits, the staging area, etc., and started the hands-on session. In short, I talked about the following:

  1. Initializing a repository
  2. Adding files to it
  3. Adding files to the staging area
  4. Committing
  5. Viewing commit logs
  6. Viewing what a specific commit did
  7. Viewing a file’s contents at a specific commit
  8. Creating a GitLab account (Well, use all opportunity to talk about your employer. :P)
  9. Creating a project in GitLab
  10. Adding it as a remote repository to your local one
  11. Pushing your changes to remote repository

I wanted to talk about clone, fork, branch and MRs, but time didn’t permit. We wound up the session with Athul and Kiran talking about how they need the students to join the FOSSClub of CUSAT and help organize similar workshops, and how it can help them as well. I too did a bit of a “motivational talk” about how community activities can help them get a job, based on my personal experience.

Here are a few photos, courtesy of Athul and Kiran:

January 17, 2018 06:00 AM

September 07, 2016

Balasankar C

SMC/IndicProject Activities- ToDo List


So, M.Tech is coming to an end and I should probably start searching for a job soon. Still, it seems I will have a bit of free time from mid-September. I have some plans about the areas I should contribute to in SMC/Indic Project. As of now, the bucket list is as follows:

  1. Properly tag versions of fonts in SMC GitLab repo - I had taken over the package fonts-smc from Vasudev, but haven’t done any update on it yet. The main reason was fontforge being old in Debian. Also, I was waiting for some kind of official release of the new versions by SMC. Since the new versions are already available on the SMC Fonts page, I assume I can go ahead with my plans. So, as a first step, I have to tag the versions of the fonts in the corresponding GitLab repos. Need to discuss whether to include the TTF files in the repo or not.
  2. Restructure LibIndic modules - Those who were following my GSoC posts will know that I made some structural changes to the modules I contributed in LibIndic. (Those who don’t can check this mail I sent to the list). I plan to do this for all the modules in the framework, and to co-ordinate with Jerin to get REST APIs up.
  3. GNOME Localization - GNOME localization has been dead for almost two years now. Ashik has shown interest in re-initiating it, and I plan to help with that. I first have to get my committer access back.
  4. Documentation - Improve documentation about SMC and IndicProject projects. This will be a troublesome and time-consuming task, but I would still like our tools to have proper documentation.
  5. High Priority Projects - Create a static page about the high priority projects so that people can know where and how to contribute.
  6. Die Wiki, Die - Initiate porting Wiki to a static site using Git and Jekyll (or any similar tool). Tech people should be able to use git properly.

Knowing me better than anyone else does, I understand there is every chance of this becoming a “never-implemented plan” (അതായത് ആരംഭശൂരത്വം, i.e. all enthusiasm at the start :D), but I still intend to work through this in an easy-first order.

September 07, 2016 04:47 AM

August 29, 2016


GSoC — Final Report!

So, it is finally over. Today is the last date for submission of the GSoC project. This entire ride was very informative as well as a great experience. I thank the Indic Project organisation for accepting my GSoC project, and my mentors Navaneeth K N and Jishnu Mohan for helping me out fully throughout this project.

The project kicked off with the aim of incorporating the native libvarnam shared library by writing JNI wrappers. Unfortunately, that method came to a stall when we were unable to import the libraries correctly, due to the lack of sufficient official documentation. So my mentor suggested an alternative approach: making use of the Varnam REST API. This has been successfully incorporated for 13 languages, with the app requiring an internet connection. Along with it, the suggestions which come up are the ones returned by Varnam, in priority order. I will keep contributing to Indic Project to make the library method work. Apart from that, see the useful links below:

  • this and this is related to adding a new keyboard with “qwerty” layout.
  • this is adding a new SubType value and a method to identify TransliterationEngine enabled keyboards.
  • this is adding the Varnam class and setting the TransliterationEngine.
  • this and this deals with applying the transliteration by Varnam and returning it back to the keyboard.
  • this is the patch to resolve the issue, program crashes on switching keyboards.
  • this makes sure that after each key press, the displayed word is refreshed and the transliteration of the entire word is shown.
  • this makes sure that on pressing deletion, the new word is displayed.
  • this creates a template such that more keyboards can be added easily.
  • this makes sure that the suggestions appearing are directly from the Varnam engine and not from the inbuilt library.
  • The lists of the commits can be seen here which includes the addition of layouts for different keyboards and nit fixes.

Add Varnam support into Indic Keyboard

The project as a whole is almost complete. The only thing left to do is to incorporate the libvarnam library into the APK, so that we can call it instead of the Varnam class given here. The ongoing work for that can be seen below:


varnamc -s ml -t "Adutha ThavaNa kaaNaam" # See you next time

by Vishnu H Nair at August 29, 2016 08:18 AM

August 23, 2016

Anwar N

GSoC 2016 IBus-Braille-Enhancement Project - Summary

   First of all, my thanks to Indic Project and Swathanthra Malayalam Computing (SMC) for accepting this project. Hats off to my mentors Nalin Sathyan and Samuel Thibault. The project was awesome, and I believe I did my best on it despite having no prior experience.

Project Blog :

Now let me outline what we have done during this period.

Braille-Input-Tool (The on-line version)
  Just like Google transliteration or Google Input Tools online. This is required because it is completely operating-system independent, and it is a modern method which never forces the user to install an additional plugin or a specific browser. The user might use this from temporary places like an internet cafe. It is written using jQuery and HTML, and works well on GNU/Linux, Microsoft Windows, Android, etc.

See All Commits :
Test with following link :

IBus-Braille enhancements
See All Commits :

1 IBus-Braille integrated with Liblouis : The Liblouis software suite provides an open-source braille translator, back-translator and formatter for a large number of languages and braille codes. Maintaining and shipping separate braille maps (located at /share/ibus-sharada-braille/braille) with ibus-braille was therefore a bad idea, so we fully adapted IBus-Braille to use Liblouis. The conversion is done one entire word at a time instead of letter by letter, i.e. the conversion happens after writing direct braille Unicode and pressing space.
Commit 1 :
Commit 2 :
Commit 3 :

See the picture of the IBus-Braille preferences given below

2 8-dot braille enabled : Yes, there are languages having more than 64 characters, which can’t be handled with the 64 combinations of 6 dots. Music notations like “Abreu” and LAMBDA (Linear Access to Mathematics for Braille Device and Audio Synthesis) use the 8-dot braille system, and Unicode supports 8-dot braille.
Commit 1 :

See the picture of the ISB preferences key/shortcut page dot settings

3 Dot-4 issue solved : In IBus-Braille, when typing in Bharati braille for languages such as Malayalam, Hindi, etc., we had to use 13-4-13 to get the letter ക്ക (kka). But according to the braille standard, one should press 4-13-13 to get ക്ക, and this made beginners do extra learning before they could start typing. Through this project we solved this issue, and a conventional-braille-mode switch is provided in the preferences to switch between the two behaviours.

Commit :

4 Facility to write direct braille Unicode added : Now one can use IBus-Braille to type braille dot notation directly with the key combination. The output may be sent to a braille embosser, an impact printer that renders text in braille characters as tactile braille cells.

Commit :

5 Three to six, for people with one hand : A three-key implementation which uses the delay between key presses. For example, 13 followed by 13 with a delay less than the delay factor (e.g. 0.2 s) will give X; if more, the output will be KK. If one wants to type a letter whose combination uses only dots 4, 5 and 6, they have to press the "t" key first. The key and the conversion delay can be adjusted from the preferences.

Commit :
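The delay-based combination described above can be sketched like this. This is a hedged illustration, not the actual IBus-Braille code: the function name, the set-based dot representation and the 0.2 s default are assumptions for the sketch.

```python
# Sketch of the one-hand "three to six" mode: the physical keys produce
# dots 1-3; a second press within the delay window supplies dots 4-6.
DELAY = 0.2  # seconds; configurable in the preferences of the real tool

def combine_presses(first_dots, second_dots, gap):
    """Combine two presses of the dot 1-3 keys into braille cells.

    first_dots, second_dots: sets of dots from {1, 2, 3}
    gap: seconds elapsed between the two presses
    """
    if gap < DELAY:
        # quick succession: the second press is shifted to dots 4-6,
        # giving one 6-dot cell (e.g. 13 + 13 -> dots 1346, the letter X)
        return [first_dots | {d + 3 for d in second_dots}]
    # slow: two independent cells (e.g. 13 + 13 -> K, K)
    return [first_dots, second_dots]
```

So pressing the dots-13 chord twice within the delay yields the single cell {1, 3, 4, 6} (X), while a longer gap yields two separate {1, 3} cells (KK), exactly as described above.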

6 Arabic language added
Commit :

7 Many bugs solved
Commit :
others are implied

Project Discourse :
IBus-Sharada-Braille (GSoC 2014) :

Plugins for firefox and chrome
    This plugin, once installed, will work with every text entry on web pages; no need for copy-paste. The extensions are written in JavaScript.
See All Commits :

Modifications yet desirable are as follows:

1 Announce extra information through the screen reader : When the user expands an abbreviation, or a contraction having more than 2 letters is substituted, the screen reader does not announce it. We have to write an Orca (screen reader) plugin for IBus-Braille.

2 A UI for Creating and Editing Liblouis Tables

3 Add support for more Indic languages and mathematical operators via Liblouis

Braille-input-tool (online version)
                       Liblouis integration
Conventional Braille, Three Dot mode and Table Type selection 
Chrome Extension

Direct braille unicode typing
 Eight dot braille enabled

by Anonymous ( at August 23, 2016 04:39 AM

August 22, 2016

Sreenadh T C

It’s a wrap!

“To be successful, the first thing to do is to fall in love with your work — Sister Mary Lauretta”

Well, the Google Summer of Code 2016 is reaching its final week as I get ready to submit my work. It has been one of the best three-four months of serious effort and commitment. To be frank, this has been one of those works to which I was fully motivated and gave my 100%.

Well, at first the results of training weren’t that promising, and I was actually let down. But then my mentor and I had a series of discussions on submitting, during which she suggested I retrain the model excluding the data sets (audio files) of the speakers which produced the most errors. After completing the batch test, I noticed that four of the data sets had the worst accuracy, shockingly below 20%. This was dragging the overall accuracy down.

So, I decided to delete those four data sets and retrain the model. It was not that big a deal, so I thought it was not going to be a drastic change from the current model. But the result put me into a state of shock for about 2-3 seconds. It said:

TOTAL Words: 12708 Correct: 12375 Errors: 520
TOTAL Percent correct = 97.38% Error = 4.09% Accuracy = 95.91%
TOTAL Insertions: 187 Deletions: 36 Substitutions: 297
SENTENCE ERROR: 9.1% (365/3993) WORD ERROR RATE: 4.1% (519/12708)
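As a sanity check, the percentages in the scorer output above follow directly from the raw counts. A small sketch using the numbers from the report (note that insertions count towards the error rate but not against "percent correct", which is why the two figures do not sum to exactly 100):

```python
# Reproduce the scoring summary above from the raw word counts.
total_words = 12708
correct = 12375
insertions, deletions, substitutions = 187, 36, 297

errors = insertions + deletions + substitutions   # 520
percent_correct = 100 * correct / total_words     # ~97.38
error_rate = 100 * errors / total_words           # ~4.09
accuracy = 100 - error_rate                       # ~95.91

print(f"Percent correct = {percent_correct:.2f}% "
      f"Error = {error_rate:.2f}% Accuracy = {accuracy:.2f}%")
```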

Now, this looks juicy and near perfect. But the thing is, the sentences are tested as they were trained. So, if we change the structure of the sentences we ultimately give it to recognize, it will still have issues putting out the correct hypothesis. Nevertheless, it was far better than the previous model.

So I guess I will settle with this for now, as the aim of the GSoC project was to start the project and show proof that this can be done; I will keep training better models in the near future.

Google Summer of Code 2016 — Submission

  1. Since the whole project was carried out under my personal GitHub repository, I will link the commits in it here : Commits
  2. Project Repository : ml-am-lm-cmusphinx
  3. On top of that, we (the organization and I) had a series of discussions regarding the project over here: Discourse IndicProject

Well, I have been documenting my way through the project over at Medium, starting from the month of May. The blogs can be read here.

What can be done in near future?

Well, this model is still in its early stage and is not yet error-free, let alone ready to be applied in applications.

The data set is still buggy and has to be improved with better, cleaner audio data and a more tuned language model.

Speech recognition development is rather slow and is obviously community based. All of this is possible with collaborative work towards achieving a user-acceptable level of practical accuracy, rather than quoting a statistical, theoretical accuracy.

All necessary steps and procedures have been documented in the README sections of the repository.

puts "thank you everyone!"

by Sreenadh T C at August 22, 2016 07:01 AM

August 21, 2016

Arushi Dogra

GSoC Final Report

It is almost the end of the GSoC internship. From zero knowledge of Android to writing a proposal, the proposal getting selected, and finally three months working on the project: it was a great experience for me! I have learned a lot, and I am really thankful to Jishnu Mohan for mentoring me throughout.

Contributions include:

All the tasks mentioned in the proposal were discussed and worked upon.

I started with making the designs of the layouts. The task was to make Santali Olchiki and Soni layouts for the keyboard. I looked at the code of the other layouts to get a basic understanding of how phonetic and InScript layouts work. A snapshot of one view of the Santali keyboard:

Screen Shot 2016-08-21 at 6.53.03 PM

Language support feature
While configuring languages, the user is warned about locales that might not be supported by the phone.

Screen Shot 2016-08-21 at 6.33.25 PM

Adding theme feature
A feature was added to the setup to enable the user to select the keyboard theme.

Screen Shot 2016-08-21 at 6.49.21 PM

Merging AOSP code
After finishing everything mentioned in the proposal, Jishnu gave me the job of merging the AOSP source code into the keyboard, as the current keyboard doesn’t have the changes that were released along with the Android M code drop, because of which the target SDK is not 23. There are a few errors yet to be resolved, and I am working on that 😀

Overall, it was a wonderful journey, and I will always want to be a contributor to this organisation, as it introduced me to the world of open source and opened up a whole new area to work on and learn more about.
Link to the discourse topic :

Thank You!  😀

by arushidogra at August 21, 2016 01:29 PM

August 17, 2016

Balasankar C

GSoC Final Report


It is finally time to wind up the GSoC work in which I have been buried for the past three months. First of all, let me thank Santhosh, Hrishi and Vasudev for their help and support. I have implemented, or at least proved, the concepts that I mentioned in my initial proposal: a spell checker that can handle inflections in the root word, generate suggestions in the same inflected form, and differentiate between spelling mistakes and intended modifications. My major contributions were to

  1. Improve LibIndic’s Stemmer module. - My contributions
  2. Improve LibIndic’s Spell checker module - My contributions
  3. Implement relatively better project structure for the modules I used - My contributions on indicngram

1. Lemmatizer/Stemmer


My initial work was on improving the existing stemmer available as part of LibIndic. The existing implementation was rule based and capable of handling only a single level of inflection. The main problems of this stemmer were

  1. General incompleteness of rules - plurals (പശുക്കൾ), numerals (പതിനാലാം) and verbs (കാണാം) are missing.
  2. Inability to handle multiple levels of inflection - (പശുക്കളോട്)
  3. Unnecessary stemming of root words that look like inflected words - (ആപത്ത് -> ആപം, following the rule of എറണാകുളത്ത് -> എറണാകുളം)

The above-mentioned issues were fixed. The remaining category is verbs, which need more detailed analysis.

I decided to maintain the rule-based approach for the lemmatizer (actually, what we are designing is halfway between a stemmer and a lemmatizer; since it is more inclined towards a lemmatizer, I am going to call it that), mainly because implementing any ML or AI technique requires sufficient training data, without which the efficiency would be very poor. It felt better to gain higher efficiency with the available rules than to try ML techniques with no guarantee (known-devil-is-better logic).

The basic logic behind the multi-level inflection handling lemmatizer is iterative suffix stripping. At each iteration, a suffix is identified in the word and transformed to something else based on predefined rules. When no more suffixes are found that match the rule set, we assume all the levels of inflection have been handled.

To prevent root words that look like inflected words (hereafter called ‘exceptional words’) from being stemmed unnecessarily, we obviously have to use a root-word corpus. I used the Datuk dataset, made openly available by Kailash, as the root-word corpus. A corpus comparison is performed before the iterative suffix stripping starts, so as to handle root words without any inflection. Thus, the word ആപത്ത് gets handled even before the iteration begins. However, what if the input word is an inflected form of an exceptional word, like ആപത്തിലേക്ക്? This makes it necessary to repeat the corpus comparison after each iteration.

Lemmatizer Flowchart

At each iteration, suffix stripping happens from left to right. The initial suffix has the 2nd character as its starting point and the last character as its end point. At each inner iteration, the starting point moves rightwards, making the suffix shorter and shorter. Whenever a suffix is obtained that has a transformation rule defined in the rule set, it is replaced with the corresponding transformation. This continues until the suffix becomes empty.

Multi-level inflection is handled on the logic that each match in the rule set raises the hope that one more inflection is present. So, before each iteration, a flag is set to False; whenever a rule-set match occurs during that iteration, it is set to True. If the flag is True at the end of an iteration, the loop repeats; otherwise, we assume all inflections have been handled.

Since this lemmatizer is also used along with a spell checker, we need a history of the inflections identified, so that the lemmatization process can be reversed. For this purpose, I tagged the rules unambiguously. Each time an inflection is identified, that is, when the extracted suffix finds a match in the rule set, the associated tag is pushed to a list in addition to applying the transformation. As the result, the stem is returned to the user along with this list of tags. The list of tags can be used to reverse the lemmatization, for which I wrote an inflector function.
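The iterative suffix stripping with tagging described above can be sketched roughly as follows. This is a minimal illustration, not LibIndic's actual code: the rule set, tags and root-word list are toy stand-ins for the real rule tables and the Datuk corpus.

```python
# Toy sketch of iterative suffix stripping with rule tags.
# RULES maps a suffix to (replacement, tag); both entries are illustrative.
RULES = {
    "ോട്": ("", "SOC"),   # sociative case marker (hypothetical rule)
    "ക്കള": ("", "PL"),    # plural marker (hypothetical rule)
}
ROOT_WORDS = {"പശു", "ആപത്ത്"}  # stand-in for the root-word corpus

def lemmatize(word):
    """Strip suffixes iteratively, recording a tag for each rule applied."""
    tags = []
    # corpus check before (and, via the loop condition, after) each iteration
    while word not in ROOT_WORDS:
        matched = False  # flag: did any rule fire during this iteration?
        # try suffixes starting from the 2nd character, moving rightwards
        for start in range(1, len(word)):
            suffix = word[start:]
            if suffix in RULES:
                replacement, tag = RULES[suffix]
                word = word[:start] + replacement
                tags.append(tag)
                matched = True
                break
        if not matched:  # no rule fired: all inflections handled
            break
    return word, tags
```

With these toy rules, lemmatize("പശുക്കളോട്") strips the sociative marker first and then the plural marker, returning the stem പശു together with the tag list ["SOC", "PL"], which an inflector can replay in reverse.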

A demo screencast of the lemmatizer is given below.

So, compared with the existing stemmer algorithm in LibIndic, the one I implemented as part of GSoC shows considerable improvement.

Future work

  1. Add more rules to increase grammatical coverage.
  2. Add more grammatical details - Handling Samvruthokaram etc.
  3. Use this to generate sufficient training data that can be used for a self-learning system implementing ML or AI techniques.

2. Spell Checker


The second phase of my GSoC work involved making the existing spell checker module better. The problems I could identify in the existing spell checker were

  1. It could not handle inflections in an intelligent way.
  2. It used a corpus that needed inflections in them for optimal working.
  3. It used only levenshtein distance for finding out suggestions.

As part of GSoC, I incorporated the lemmatizer developed in phase one into the spell checker, which handles the inflection part. Three metrics were used to find suggestion words: Soundex similarity, Levenshtein distance and Jaccard index. The inflector module developed along with the lemmatizer was used to generate suggestions in the same inflected form as the original word.

There were some general assumptions and facts which I inferred and collected while working on the spell checker. They are

  1. Malayalam is a phonetic language, where a word is written just as it is pronounced. This is the opposite of English, where letters have different pronunciations in different words; an example is the English letter “a”, which is pronounced differently in “apple” and “ate”.
  2. Spelling mistakes in Malayalam, hence, are also phonetic. The mistakes occur by a character with similar pronunciation, usually from the same varga. For example, അദ്ധ്യാപകൻ may be written mistakenly as അദ്യാപകൻ, but not as അച്യാപകൻ.
  3. A spelling mistake does not simply mean a word that is not present in the dictionary. The user has to be considered intelligent and should be trusted not to make mistakes; a word not present in the dictionary can also be an intentional modification. A "mistake" is something which is not in the dictionary AND which is very similar to a valid word. If a word is not found in the dictionary and no similar words are found, it has to be considered an intentional change the user introduced, and hence should be deemed correct. This often solves the issue of foreign words being deemed incorrect.
  4. Spelling mistakes in inflected words usually happen in the lemma of the word, not the suffix. This is because the most commonly used suffixes are pronounced distinctly, so mistakes have a smaller chance of being present there.

Spell checker architecture

The first phase, obviously, is a corpus comparison to check whether the input word is a valid word. If it is not, suggestions are generated. For this, a range of words has to be selected. From the logic of Malayalam having phonetic spelling mistakes, the words selected are the ones starting with the characters that are the linguistic successor and predecessor of the first character of the input word. That is, for the input word ബാരതം, which has ബ as its first character, the words selected will be the ones starting with ഫ and ഭ. Out of these, the top N (defaulting to 5) words most similar to the input word have to be found.
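The neighbour-character selection described above can be sketched as follows. The alphabet fragment and function name here are purely illustrative (only the 'pa' varga is shown); the real implementation works over the full Malayalam alphabet.

```python
# Toy sketch: candidate suggestion words start with the linguistic
# predecessor or successor of the input word's first character.
VARGA = ["പ", "ഫ", "ബ", "ഭ", "മ"]  # illustrative fragment of the alphabet

def neighbour_initials(first_char):
    """Return the characters whose dictionary sections should be scanned."""
    i = VARGA.index(first_char)
    return [VARGA[j] for j in (i - 1, i + 1) if 0 <= j < len(VARGA)]
```

For the document's example, neighbour_initials("ബ") yields ഫ and ഭ, matching the ബാരതം case above.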

Three metrics were used for finding the similarity between two words. For Malayalam, a phonetic language, Soundex similarity was given top priority. To handle words that are similar but not phonetically similar, because of a difference in a single character that defines phonetic similarity, Levenshtein distance was also used; it gives the distance between two words, i.e. the number of operations needed to transform one word into the other. To handle the remaining words, the Jaccard index was also used. The priority was assigned as soundex > levenshtein > jaccard. A weight was assigned to each possible suggestion based on the values of these three metrics, using the following logic:

If soundex == 1, weight = 100
Elseif levenshtein <= 2, weight = 75 + (1.5 * jaccard)
Elseif levenshtein < 5, weight = 65 + (1.5 * jaccard)
Else, weight = 0
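The weighting scheme above translates directly into code. A minimal sketch, assuming the three metric values have already been computed by separate Soundex, Levenshtein and Jaccard routines (not shown here):

```python
def suggestion_weight(soundex_same, levenshtein, jaccard):
    """Weight a candidate suggestion, mirroring the scheme above.

    soundex_same: True when the Soundex codes of the two words match
    levenshtein:  edit distance between input word and candidate
    jaccard:      Jaccard index (0..1) between the two words
    """
    if soundex_same:
        return 100
    if levenshtein <= 2:
        return 75 + 1.5 * jaccard
    if levenshtein < 5:
        return 65 + 1.5 * jaccard
    return 0
```

A candidate with a matching Soundex code scores 100 outright; otherwise the edit-distance band sets the base weight and the Jaccard index breaks ties within the band.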

To differentiate between spelling “mistakes” and intended modifications, the logic used was that if a word does not have N suggestions with weight > 50, it is most probably an intended word and not a spelling mistake. So, such words were deemed correct.

A demo screencast of the spell checker is given below.

3. Package structure

The existing modules of LibIndic had an inconsistent package structure that gave no visibility to the project. Also, the package names were too general and didn’t convey the fact that they were meant for Indic languages. So, I suggested and implemented the following changes:

  1. Package names (of the ones I used) were changed to the libindic-<module> pattern. Examples would be libindic-stemmer, libindic-ngram and libindic-spellchecker. So, users will easily understand that the package is part of the LibIndic framework, and thus meant for Indic text.
  2. Namespace packages (PEP 420) were used, so that import statements of LibIndic modules are of the form from libindic.<module> import <language>. This increases the visibility of the ‘libindic’ project considerably.

August 17, 2016 04:47 AM

August 16, 2016

Anwar N

IBus-Braille Enhancement - 3

 A hard week passed!

1 Conventional braille mode enabled : Through this we solved the dot-4 issue, and now one can type braille without any extra knowledge.

commit 1 :

2 Handle configure-parser exceptions : A corrupted ISB configuration file could prevent it from starting; I solved this with proper exception handling.

commit 2 :

3 Liblouis integration : I think our dream is about to come true! But we are still struggling with vowel substitution in the middle of words.
commit 3 :
commit 4 :
commit 5 :

by Anonymous ( at August 16, 2016 08:35 PM