The following table shows the character map used for the conversion, i.e., which Latin letters you need to type to get the output you want. The column headed 'User map' is the one you need. The 'Internal Map' column lists the codes the program uses internally, and is given here mainly to ease debugging. This is discussed in more detail below.
| Bengali Codepoint | User map | Internal Map | Comments |
| অ, অ-কার | a | a | |
| আ, া | aa, A | A | |
| ই, ি | i | i | |
| ঈ, ী | ii, ee, I | I | use e_e for এএ |
| উ, ু | u | u | |
| ঊ, ূ | uu, U | U | |
| ঋ, ৃ | RRi, Ri | R | |
| এ, ে | e | e | |
| ঐ, ৈ | ai | E | use a_i for অই |
| ও, ো | o | o | |
| ঔ, ৌ | au | O | use a_u for অউ |
| Bengali Codepoint | User map | Internal Map | Comments |
| ক | k | k | |
| খ | kh | kh | |
| গ | g | g | |
| ঘ | gh | gh | |
| ঙ | G, GN | G | |
| চ | ch | c | |
| ছ | chh | ch | |
| জ | j | j | |
| ঝ | jh | jh | |
| ঞ | J, JN | J | |
| ট | T | T | |
| ঠ | Th | Th | |
| ড | D | D | |
| ঢ | Dh | Dh | |
| ণ | N | N | |
| ত | t | t | |
| ৎ | t. | q | |
| থ | th | th | |
| দ | d | d | |
| ধ | dh | dh | |
| ন | n | n | |
| প | p | p | |
| ফ | f, ph | ph | |
| ব | b, v | b | |
| ভ | bh | bh | |
| ম | m | m | |
| য | y | y | |
| র | r | r | |
| ল | l | l | |
| শ | sh | sh | |
| ষ | Sh, S | S | |
| স | s | s | |
| হ | h | h | |
| ড় | .D | X | |
| ঢ় | .Dh | Z | |
| য় | Y, .y | Y | |
| ০-৯ | 0-9 | 0-9 | |
| Bengali Codepoint | User map | Internal Map | Comments |
| ং | .n, M | M | |
| ঁ | .N, C | C | |
| ঃ | :, H | H | |
| । | \| | \| | daanri |
| (nothing) | _ | | used to disambiguate strings like au, ee, etc. |
| zero-width non-joiner | # | | needed for khando-ta and hasanta in the middle of a word |
The keymap is modeled on, but not exactly equivalent to, the default Bengali input map that ships with Yudit. Its main feature, which may initially throw some people off, is that it needs an explicit `a' to indicate the vowel sign corresponding to অ (a). This is close to how most people would normally write Bengali in Latin letters, except that the trailing `a' is rarely pronounced, so it is often omitted in writing as well (although that is grammatically incorrect).
Hopefully, this will not be a big impediment. On the upside, it eliminates the need for anything to indicate a `hasanta'. The one exception is in the middle of a word, when successive consonants have to be displayed separately without forming a conjunct: use the `#' sign there to indicate a ZERO WIDTH NON-JOINER (U+200C). যুক্তাক্ষরs (conjuncts) and vowel signs need no special treatment.
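To make the role of the explicit `a' and of `#' concrete, here is a minimal sketch for a handful of letters whose codes are a single character in the table above. It is not the code used on this page: the names `render`, `CONSONANTS` and `VOWELS` are made up for illustration, and emitting a hasanta before the non-joiner is an assumption about how `#' behaves. What happens to a consonant at the very end of a word is deliberately left out.

```javascript
const HASANTA = "\u09CD"; // BENGALI SIGN VIRAMA
const ZWNJ = "\u200C";    // ZERO WIDTH NON-JOINER

// Illustrative subset only: a few single-letter codes from the table above.
const CONSONANTS = new Map([["k", "ক"], ["t", "ত"], ["r", "র"], ["m", "ম"]]);
// Each vowel maps to [independent form, vowel sign]; the sign for `a` is empty.
const VOWELS = new Map([
  ["a", ["অ", ""]],
  ["A", ["আ", "া"]],
  ["i", ["ই", "ি"]],
]);

function render(text) {
  let out = "";
  let pendingConsonant = false; // true while the last consonant has no vowel yet
  for (const ch of text) {
    if (CONSONANTS.has(ch)) {
      // Two consonants in a row: join them with a hasanta so they form a conjunct.
      if (pendingConsonant) out += HASANTA;
      out += CONSONANTS.get(ch);
      pendingConsonant = true;
    } else if (VOWELS.has(ch)) {
      const [independent, sign] = VOWELS.get(ch);
      out += pendingConsonant ? sign : independent;
      pendingConsonant = false;
    } else if (ch === "#") {
      // Assumption: `#` closes the consonant with a visible hasanta, then
      // blocks joining with whatever follows.
      if (pendingConsonant) out += HASANTA;
      out += ZWNJ;
      pendingConsonant = false;
    } else {
      out += ch; // anything else passes through unchanged
      pendingConsonant = false;
    }
  }
  return out;
}

render("kata"); // ক followed by ত, no conjunct
render("kta");  // ক + hasanta + ত, which a font shapes as one conjunct
render("k#ta"); // ক + hasanta + ZWNJ + ত, displayed as two separate letters
```

Because every consonant that is not followed by a vowel is joined to the next one automatically, a hasanta never has to be typed; only the rarer "keep these apart" case needs the extra `#'.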
To keep things simple, the processing is done in two stages. The first stage converts the given input, which uses a more intuitive and flexible format, into an internal format by simple string replacements. The `simplicity' comes from the fact that the internal format uses only a one- or two-letter code for each Bengali codepoint in the Unicode chart (and none for the more esoteric and rarely used points like ৠ) [Note: this is without the trailing vowel, if any].
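The first stage can be sketched as a table-driven replacement. The pairs below are read off the 'User map' and 'Internal Map' columns of the table above; the names `USER_TO_INTERNAL`, `escapeRegExp` and `toInternal` are hypothetical, and the single-pass, longest-match regular expression is a substitution chosen here so that the replacements cannot interfere with one another. The actual code may simply chain plain string replacements.

```javascript
// User map -> internal map, taken from the table above. Entries whose user
// and internal codes are identical (k, kh, T, sh, ...) need no replacement.
const USER_TO_INTERNAL = {
  aa: "A", ii: "I", ee: "I", uu: "U", RRi: "R", Ri: "R", ai: "E", au: "O",
  GN: "G", chh: "ch", ch: "c", JN: "J", "t.": "q", f: "ph", v: "b", Sh: "S",
  ".Dh": "Z", ".D": "X", ".y": "Y", ".n": "M", ".N": "C", ":": "H",
  _: "", // the separator only disambiguates the input and produces nothing
};

// Escape characters that are special in a regular expression (here: ".").
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Longest keys first, so that "chh" is tried before "ch" and ".Dh" before ".D".
const PATTERN = new RegExp(
  Object.keys(USER_TO_INTERNAL)
    .sort((x, y) => y.length - x.length)
    .map(escapeRegExp)
    .join("|"),
  "g"
);

function toInternal(input) {
  return input.replace(PATTERN, (match) => USER_TO_INTERNAL[match]);
}

toInternal("baa.nlaa"); // "bAMlA"
toInternal("ai");       // "E"  (the single vowel ঐ)
toInternal("a_i");      // "ai" (অ followed by ই)
```

Doing everything in one pass matters because the internal codes reuse user-map spellings: `chh` becomes `ch`, which a later, separate `ch` replacement would wrongly turn into `c`.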
The code is very simple JavaScript, and can easily be modified to work with other keymaps. The code is licensed under the GPL, so you are free to modify it as you please, as long as you adhere to the GPL if and when you redistribute it.
This page is essentially some JavaScript code that processes Latin (English) characters entered in a textarea field and transforms them into other characters which, when interpreted as UTF-8, represent Bengali characters. This is done according to a particular algorithm, explained briefly in the section above. Try playing around with it (enter some text that you think should make sense as Bengali, and click on the `Process' button). If you don't understand what's going on, go away, this page is not for you (... unless you see lots of boxes, in which case read on).
You need
If you don't have a Bengali font, you could download my Jamrul font, or get one from elsewhere.
This is obviously a very Bengali-specific tool, but it could easily be extended to other languages. I wrote it as a support tool for my Bengali Document Archive Project. Much has changed since then, and Bengali input support has improved considerably, making this less important, but still useful on occasion.