Quick Tips

Suggested Usage: Maintain a plain ASCII file in a text editor (e.g., Notepad, Yudit), copy and paste it into the text area at the top of the page, process it, check the output, and, if you want, copy and paste the result back into the text editor.

Character Map

The following table shows the character map used for the conversion, i.e., which Latin letters you need to enter to get the output you want. The column headed 'User map' is what you need. The 'Internal Map' column lists the codes used internally by the code, and is given here mainly to ease debugging. This is discussed in more detail below.

Vowels
Bengali Codepoint User map Internal Map Comments
অ, অ-কার a a
আ, া aa, A A
ই, ি i i
ঈ, ী ii, ee, I I use e_e for এএ
উ, ু u u
ঊ, ূ uu, U U
ঋ, ৃ RRi, Ri R
এ, ে e e
ঐ, ৈ ai E use a_i for অই
ও, ো o o
ঔ, ৌ au O use a_u for অউ
Consonants
Bengali Codepoint User map Internal Map Comments
k k
kh kh
g g
gh gh
G, GN G
ch c
chh ch
j j
jh jh
J, JN J
T T
Th Th
D D
Dh Dh
N N
t t
t. q
th th
d d
dh dh
n n
p p
f, ph ph
b, v b
bh bh
m m
y y
r r
l l
sh sh
Sh, S S
s s
h h
.D X
.Dh Z
Y, .y Y
০-৯ 0-9 0-9
Miscellaneous
Bengali Codepoint User map Internal Map Comments
.n, M M
.N, C C
:, H H
| | daanri
_ nothing used to disambiguate strings like au, ee, etc.
# zero-width non-joiner needed for khando-ta and hasanta in the middle of a word

The keymap is modeled on, but not exactly equivalent to, the default Bengali input map that ships with Yudit. The main feature of this map, which may initially throw some people off, is that it needs an explicit `a' to indicate the Vowel Sign corresponding to অ (a). This is close to what most people would normally use to write Bengali using Latin letters, except that the trailing `a' is rarely pronounced, so it is often omitted when writing as well (although that is grammatically incorrect).

Hopefully, this will not be a big impediment. On the upside, this eliminates the need for anything to indicate a `hasanta'. (One is sometimes required in the middle of a word, when successive consonants must be displayed separately without forming a conjunct; use the `#' sign there to indicate a ZERO WIDTH NON-JOINER (U+200C).) যুক্তাক্ষরs (conjuncts) and vowel signs need no special treatment.
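As a small illustration of the point about `hasanta' and `#', the sketch below builds the codepoint sequences by hand. The codepoints themselves are from the Unicode charts (ZERO WIDTH NON-JOINER is U+200C); the input spellings in the comments, and the assumption that `#' is emitted right after the hasanta, are only my reading of the description above, and `codepoints' is a hypothetical helper.

```javascript
// Codepoints from the Unicode Bengali block and General Punctuation.
const KA = "\u0995";      // ক
const TA = "\u09A4";      // ত
const HASANTA = "\u09CD"; // BENGALI SIGN VIRAMA (hasanta)
const ZWNJ = "\u200C";    // ZERO WIDTH NON-JOINER

// Assumed input "kt": two consonants with no vowel between them.
// A renderer normally shapes this sequence as a conjunct.
const conjunct = KA + HASANTA + TA;

// Assumed input "k#t": a ZWNJ after the hasanta asks the renderer
// to show the hasanta explicitly and keep the two letters separate.
const separate = KA + HASANTA + ZWNJ + TA;

// Assumed input "kata": each consonant carries an explicit `a',
// so no hasanta (and no vowel sign) is emitted at all.
const withInherentVowel = KA + TA;

// Hypothetical helper: list the codepoints of a string as U+XXXX.
function codepoints(s) {
  return Array.from(s, ch =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

console.log(codepoints(conjunct));          // U+0995 U+09CD U+09A4
console.log(codepoints(separate));          // U+0995 U+09CD U+200C U+09A4
console.log(codepoints(withInherentVowel)); // U+0995 U+09A4
```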

How the Code works

To keep things simple, the processing is done in two stages. The first stage converts the given input into an internal format. The `simplicity' comes from the fact that this format uses only a one- or two-letter code for each Bengali codepoint in the Unicode chart (and none for the more esoteric and rarely used points like ৠ) [Note: this is without the trailing vowel, if any]. In this first stage, the more intuitive and flexible user format is transformed into the internal format by simple string replacements.
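The two stages might be sketched roughly as follows. This is not the page's actual code: the function names (`toInternal', `toBengali', `transliterate') are made up for illustration, only a handful of rows from the character map are included, the consonant glyphs are my assumption of the usual letters for those codes, and the second stage (internal codes to codepoints, with a hasanta after any consonant not followed by a vowel) is my guess at what the explicit-`a' rule implies.

```javascript
const HASANTA = "\u09CD";

// Stage 1 table (subset): user spelling -> internal code. "_" maps to nothing.
const STAGE1 = { aa: "A", ii: "I", ee: "I", uu: "U", ai: "E", au: "O",
                 chh: "ch", ch: "c", f: "ph", v: "b", _: "" };

// Internal vowel code -> [independent vowel, vowel sign].
const VOWELS = {
  a: ["অ", ""],       A: ["আ", "\u09BE"], i: ["ই", "\u09BF"],
  I: ["ঈ", "\u09C0"], u: ["উ", "\u09C1"], U: ["ঊ", "\u09C2"],
  e: ["এ", "\u09C7"], E: ["ঐ", "\u09C8"], o: ["ও", "\u09CB"],
  O: ["ঔ", "\u09CC"],
};

// Internal consonant code -> letter (assumed glyphs, small subset).
const CONSONANTS = { k: "ক", kh: "খ", g: "গ", c: "চ", ch: "ছ", t: "ত",
                     n: "ন", b: "ব", m: "ম", r: "র", l: "ল" };

// Stage 1: plain string replacement, longest spelling first so that
// "chh" wins over "ch". None of these keys contain regex metacharacters.
function toInternal(input) {
  const keys = Object.keys(STAGE1).sort((a, b) => b.length - a.length);
  const pattern = new RegExp(keys.join("|"), "g");
  return input.replace(pattern, match => STAGE1[match]);
}

// Stage 2 (assumed): walk the internal codes and emit codepoints.
function toBengali(internal) {
  let out = "";
  let afterConsonant = false;
  let i = 0;
  while (i < internal.length) {
    // Prefer a two-letter consonant code such as "kh" over a one-letter code.
    const two = internal.slice(i, i + 2);
    const code = two in CONSONANTS ? two : internal[i];
    i += code.length;
    if (code in CONSONANTS) {
      if (afterConsonant) out += HASANTA; // no vowel in between
      out += CONSONANTS[code];
      afterConsonant = true;
    } else if (code in VOWELS) {
      const [independent, sign] = VOWELS[code];
      out += afterConsonant ? sign : independent;
      afterConsonant = false;
    } else {
      if (afterConsonant) out += HASANTA; // assumption: bare consonant
      out += code;                        // pass anything else through
      afterConsonant = false;
    }
  }
  return afterConsonant ? out + HASANTA : out;
}

function transliterate(input) {
  return toBengali(toInternal(input));
}

console.log(toInternal("chhabi"), transliterate("chhabi"));
console.log(transliterate("aami"), transliterate("rakta"), transliterate("a_i"));
```

Under these assumptions, `aami' becomes the internal string `Ami' and then আমি, `rakta' gets a hasanta between the k and the t, and `a_i' yields অই rather than ঐ because the `_' keeps `ai' from being replaced in the first stage.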

Changes in the Keymap

The code is very simple JavaScript, and can easily be modified to work with other keymaps. The code is GPL-ed, so you are free to modify it as you please, as long as you adhere to the GPL if and when you redistribute it.
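One way such a modification could look, if the keymap is kept as plain data: the sketch below builds the replacement step from a keymap object, so a variant keymap is just another object. `makeReplacer' and both map objects are hypothetical names, and the extra `oi' and `ou' spellings are an invented example of a change, not part of the keymap described above.

```javascript
// Build a replacer from any keymap object (user spelling -> internal code).
function makeReplacer(keymap) {
  // Longest spellings first, so "chh" is tried before "ch".
  const keys = Object.keys(keymap).sort((a, b) => b.length - a.length);
  // Escape regex metacharacters, since keys such as ".n" contain a dot.
  const escaped = keys.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const pattern = new RegExp(escaped.join("|"), "g");
  return input => input.replace(pattern, match => keymap[match]);
}

// A few rows taken from the character map above.
const defaultMap = { aa: "A", ii: "I", ee: "I", ai: "E", au: "O",
                     chh: "ch", ch: "c", ".n": "M" };

// Hypothetical variant: also accept "oi" and "ou" as spellings for E and O.
const variantMap = { ...defaultMap, oi: "E", ou: "O" };

const defaultReplace = makeReplacer(defaultMap);
const variantReplace = makeReplacer(variantMap);

console.log(defaultReplace("baa.nlaa")); // bAMlA
console.log(defaultReplace("boi"));      // boi (unchanged)
console.log(variantReplace("boi"));      // bE
```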

What Is This Thing Anyway?

This page is essentially some JavaScript code that processes Latin (English) characters entered in a textarea field and transforms them into other characters which, when interpreted as UTF-8, represent Bengali characters. This is done according to a particular algorithm, explained briefly in the section above. Try playing around with it (enter some text that you think should make sense as Bengali, and click on the `Process' button). If you don't understand what's going on, go away, this page is not for you (... unless you see lots of boxes, in which case read on).
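To make the UTF-8 remark concrete, here is a small sketch (not part of the page's code) that encodes one Bengali letter and decodes it again, using the standard TextEncoder and TextDecoder objects available in current browsers and in Node.js. Every codepoint in the Bengali block (U+0980 to U+09FF) takes three bytes in UTF-8.

```javascript
const letter = "\u0995"; // ক, BENGALI LETTER KA

// Encode the string to its UTF-8 bytes.
const bytes = new TextEncoder().encode(letter);
const hex = Array.from(bytes, b => b.toString(16).padStart(2, "0")).join(" ");

// Decoding the same bytes as UTF-8 gives the letter back.
const roundTrip = new TextDecoder("utf-8").decode(bytes);

console.log(bytes.length);         // 3
console.log(hex);                  // e0 a6 95
console.log(roundTrip === letter); // true
```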

What Do I Need to Get it to Work?

You need

If you don't have a Bengali font, you could download my Jamrul font, or get one from elsewhere.

What Use Will This Be?

This is obviously a very Bengali-specific tool, but it could easily be extended to other languages. I wrote it as a support tool for my Bengali Document Archive Project. Much has changed since then, and Bengali input support has improved considerably, making this less important, but still useful on occasion.