fi'i mi'e mublin.

Devanāgarī transliteration

This Python module (view, download) implements the Devanāgarī part of ISO 15919, an international standard for the transliteration of Indic scripts to the Latin alphabet. Devanāgarī is the main script used to write languages like Sanskrit, Hindi, Marathi, and Nepali.

The transliteration scheme is described in Tony Stone's Transliteration of Indic scripts: How to use ISO 15919 and Thomas T. Pedersen's Transliteration of Hindi, Marathi & Nepali (pdf). The Unicode character code chart for the Devanāgarī script (pdf) is available on the Unicode homepage.
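The core idea of transliterating Devanāgarī can be illustrated with a minimal sketch. This is not the module linked above: the function name `transliterate`, the table names, and the deliberately tiny character subset are assumptions made for illustration. It shows how each consonant carries an inherent "a" that is replaced by a dependent vowel sign or suppressed by a virāma.

```python
# Minimal sketch (not the module linked above): a small subset of the
# ISO 15919 mapping for Devanagari. All names here are hypothetical.
CONSONANTS = {
    "क": "k", "ग": "g", "त": "t", "द": "d", "न": "n",
    "म": "m", "र": "r", "ल": "l", "स": "s", "ह": "h",
}
INDEPENDENT_VOWELS = {
    "अ": "a", "आ": "ā", "इ": "i", "ई": "ī", "उ": "u", "ऊ": "ū",
    "ए": "ē", "ऐ": "ai", "ओ": "ō", "औ": "au",
}
# Dependent vowel signs, written as escapes because they are combining marks.
VOWEL_SIGNS = {
    "\u093e": "ā", "\u093f": "i", "\u0940": "ī", "\u0941": "u",
    "\u0942": "ū", "\u0947": "ē", "\u0948": "ai", "\u094b": "ō",
    "\u094c": "au",
}
VIRAMA = "\u094d"  # suppresses the inherent vowel


def transliterate(text):
    out = []
    pending_a = False  # True while the last consonant still owes its inherent "a"
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in VOWEL_SIGNS:
            # The vowel sign replaces the inherent "a".
            out.append(VOWEL_SIGNS[ch])
            pending_a = False
        elif ch == VIRAMA:
            # The virama removes the inherent "a" without adding anything.
            pending_a = False
        else:
            if pending_a:
                out.append("a")
                pending_a = False
            # Unknown characters (spaces, punctuation) pass through unchanged.
            out.append(INDEPENDENT_VOWELS.get(ch, ch))
    if pending_a:
        out.append("a")
    return "".join(out)
```

With this sketch, `transliterate("हिन्दी")` gives "hindī" and `transliterate("नमस्ते")` gives "namastē". A complete implementation would also need the remaining consonants, anusvāra, visarga, nuqta forms, and digits.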

An ISO 15919 transliteration of all Hindi words distributed with GNU Aspell is available in HTML (865K) and plain text (459K).

Cyrillic transliteration

There is also a Python module (view, download) implementing ISO 9:1995 (GOST 7.79 System A), the standard for the transliteration of Cyrillic characters to the Latin alphabet. It should handle characters for the languages Abkhaz, Altay, Belarusian, Bulgarian, Buryat, Chuvash, Karachay-Balkar, Macedonian, Moldavian, Mongolian, Russian, Rusyn, Serbian, Udmurt, Ukrainian, and all Caucasian languages using páločka.
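Because ISO 9:1995 maps each Cyrillic character to exactly one Latin character (with diacritics where needed), a translation table is enough for a basic version. The following is a minimal sketch covering only the modern Russian alphabet; the name `to_iso9` and the table layout are assumptions for illustration and do not reflect the module linked above.

```python
# Minimal sketch (not the module linked above): ISO 9:1995 for the
# Russian alphabet only. All names here are hypothetical.
_LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "\u0451": "ë",  # ё
    "ж": "ž", "з": "z", "и": "i",
    "\u0439": "j",  # й
    "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "h",
    "ц": "c", "ч": "č", "ш": "š", "щ": "ŝ",
    "ъ": "\u02ba",  # hard sign -> modifier letter double prime
    "ы": "y",
    "ь": "\u02b9",  # soft sign -> modifier letter prime
    "э": "è", "ю": "û", "я": "â",
}

# Derive the uppercase pairs from the lowercase ones; the two prime
# characters have no case and map to themselves.
_TABLE = str.maketrans(
    {**_LOWER, **{k.upper(): v.upper() for k, v in _LOWER.items()}}
)


def to_iso9(text):
    """Transliterate Russian Cyrillic text; other characters pass through."""
    return text.translate(_TABLE)
```

For example, `to_iso9("Москва")` gives "Moskva". Supporting the other languages listed above would mean extending the table with their additional letters.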

An ISO 9 transliteration of all Russian words distributed with GNU Aspell is available in HTML (8.9M) and plain text (5.3M).

Lojban Etymology

This is an attempt to reconstruct the natural language words used in generating the Lojban gismu, in the six source languages Arabic, Chinese, English, Hindi, Russian, and Spanish. Each reconstructed source-language etymology is available in HTML and plain text and is accompanied by a README providing further information. The current and preceding versions of each etymology are also available as gzip'd tar archives.

Chinese [HTML] [TXT] [README] archives: [1.0] [1.1] [1.2] [1.3] ← current
Spanish [HTML] [TXT] [README] archives: [1.0] [1.1] [1.2] [1.3] [1.4] [1.5] [1.6] ← current
English [HTML] [TXT] [README] archives: [1.0] [1.1] [1.2] [1.3] [1.4] ← current
Russian [HTML] [TXT] [README] archives: [1.0] [1.1] [1.2] [1.3] [1.4] [1.5] [1.6] [1.7] [1.8] [1.9] ← current
Hindi [HTML] [TXT] [README] archives: [1.0] ← current

Lojban cmavo tables

Here is a colour-coded, interactive table of cmavo, the structure words of the language Lojban. Try hovering over and clicking on any cmavo or selma'o in the table. There is also an earlier version, which is more colourful but slightly less useful.

If the page does not fit on your screen, try using the zoom facility of your browser (the tables are floating). You'll need JavaScript for the “clickability”. Sorry, there has been no cross-browser testing; the page was developed with Firefox.

Some cmavo are omitted from the table to make it more compact: all cmavo in Y and BY2 (that is, all those containing the letter ‘y’), as well as those cmavo in UI with diphthongs other than “ai, au, ei, oi.”
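The omission rule can be expressed as a small filter. This is a sketch under assumptions, not an excerpt from the generating script: the function name `is_omitted` is hypothetical, selma'o are assumed to be given as strings such as "UI1" or "BY2", and a diphthong is taken to be any two adjacent vowels not separated by an apostrophe.

```python
import re

# Diphthongs that UI cmavo may contain and still appear in the table.
KEPT_DIPHTHONGS = {"ai", "au", "ei", "oi"}


def is_omitted(cmavo, selmaho):
    """Return True if the cmavo would be left out of the compact table.

    Hypothetical helper: `selmaho` is assumed to be a label such as
    "UI1", "BY2" or "COI".
    """
    word = cmavo.strip(".")
    # Everything in Y and BY2 contains the letter "y".
    if "y" in word:
        return True
    if selmaho.startswith("UI"):
        # Collect every pair of adjacent vowels (overlapping pairs included).
        pairs = re.findall(r"(?=([aeiou]{2}))", word)
        return any(pair not in KEPT_DIPHTHONGS for pair in pairs)
    return False
```

Under these assumptions, `is_omitted("ui", "UI1")` and `is_omitted("by", "BY2")` are true, while `is_omitted("ai", "UI1")` and `is_omitted("a'o", "UI1")` are false.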

The cmavo are arranged alphabetically in rows and columns. Cmavo belonging to the same selma'o have the same colour. If you click on a selma'o below the table, all cmavo of that selma'o are highlighted. If you click on a cmavo, other cmavo of the same selma'o are highlighted. If you hover over a cmavo, an English gloss appears in a tooltip.

The cmavo table was generated by a Python script (view, download). The earlier version of the cmavo table was generated by another Python script (view, download).

Alice in alternate orthography

I converted “la alis. cizra je cinri zukte vi le selmacygu'e” (the Lojban translation of “Alice In Wonderland”) to a non-standard orthography. The result is available in plain text format encoded in UTF-8. The goal of this alternate orthography is to provide a visually lightweight representation of Lojban text, making it more readable and more pleasing to the eye. It differs from standard orthography in two ways:

Apostrophe. The orthography completely eliminates the apostrophe ('), which is replaced by a diaeresis only where this is required to distinguish a vowel pair from a diphthong. The diaeresis is placed on the letters ‘i’ and ‘u’, which are the only ones used to form diphthongs.

More precisely,

a diaeresis is applied to the second vowel of a two-syllable vowel pair if the pair would otherwise constitute a falling diphthong (“aï, eï, oï, aü”), and to the first vowel if the pair would otherwise constitute a rising diphthong (“ïa, ïe, ïi, ïo, ïu, ïy, üa, üe, üi, üo, üu, üy”); the diaeresis is not used in any other case.
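The apostrophe rule can be sketched in a few lines of Python. This is an illustration rather than the conversion actually used for the text: the function name `to_alternate` is hypothetical, the input is assumed to be lowercase with ASCII apostrophes, and each apostrophe is judged against the original neighbouring vowels, which is one possible reading for chains of more than two vowels.

```python
VOWELS = "aeiouy"
FALLING = {"ai", "ei", "oi", "au"}  # pairs that would read as falling diphthongs
DIAERESIS = {"i": "ï", "u": "ü"}


def to_alternate(text):
    """Drop apostrophes between vowels, adding a diaeresis where needed."""
    marked = set()   # indices of vowels that receive a diaeresis
    dropped = set()  # indices of apostrophes to remove
    for i in range(1, len(text) - 1):
        if text[i] != "'":
            continue
        first, second = text[i - 1], text[i + 1]
        if first not in VOWELS or second not in VOWELS:
            continue
        dropped.add(i)
        if first + second in FALLING:
            # Would otherwise be a falling diphthong: mark the second vowel.
            marked.add(i + 1)
        elif first in "iu":
            # Would otherwise be a rising diphthong: mark the first vowel.
            marked.add(i - 1)
    return "".join(
        DIAERESIS[ch] if i in marked else ch
        for i, ch in enumerate(text)
        if i not in dropped
    )
```

For example, `to_alternate("selmacygu'e")` gives "selmacygüe", `to_alternate("ba'i")` gives "baï", and `to_alternate("la'o")` gives "lao", since "ao" cannot be mistaken for a diphthong.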

Period. An alternate period ‘·’ (middle dot, U+00B7) is consistently applied before the sentence link “.i” and its compounds, except at the beginning of paragraphs. This is done as a visual aid to the reader, to mark the beginning of sentences. This alternate period is also applied in conjunction with “la'o gy · … · gy”. Otherwise, the period (which is optional according to the Reference Grammar) is omitted, as in large parts of the original version.


My bookmarks are available online. This is a collection of Lojban links with tags, which may become useful as it grows.

The gismu picture list on the Lojban wikipedia is an alphabetic list of picturable gismu, with images depicting their first place (or another place where marked by a cmavo of selma'o SE).