README
Version 1.3, April 2008

This document describes the reconstructed English etymology of Lojban, which is available in the original plain text format and in HTML format, generated by a conversion script.

  1. Copying conditions
  2. Format of the HTML version
  3. Format of the plain text version
  4. Etymological sources

Copying conditions

The “English etymology of Lojban” in plain text and in HTML format as well as the conversion script were prepared by mublin in March 2008. The content of these three files is hereby placed irrevocably in the public domain.

The official gismu list, prepared by The Logical Language Group, Inc., is in the public domain.


Format of the HTML version

In the generated HTML version, each gismu is given in bold, followed by the English keyword and the Lojbanised source word on one line.

On the next line, the reconstructed English source word is given.

If present, a comment starts on a new line in smaller font.
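
As an illustration of the layout described above, the following Python sketch renders one entry. It is not the conversion script itself: the function name render_entry and the choice of the <p>, <b>, <br> and <small> elements are assumptions, since this README does not specify the actual markup.

```python
from html import escape


def render_entry(gismu, keyword, lojbanised, source_word, comment=None):
    """Render one entry as HTML (hypothetical markup, see the note above)."""
    # First line: gismu in bold, then the keyword and the Lojbanised source word.
    lines = [f"<b>{escape(gismu)}</b> {escape(keyword)} {escape(lojbanised)}"]
    # Second line: the reconstructed English source word.
    lines.append(escape(source_word))
    # Optional third line: the comment in a smaller font.
    if comment:
        lines.append(f"<small>{escape(comment)}</small>")
    return "<p>" + "<br>\n".join(lines) + "</p>"


# Placeholder values, not taken from the etymology file.
print(render_entry("abcde", "keyword", "lojban", "source", "a comment"))
```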


Format of the plain text version

The plain text file is encoded in UTF-8 with UNIX style line breaks. Each gismu has one line with TAB-separated fields, in the following format:

  1. gismu
  2. English keyword
  3. English source word in Lojbanised form
  4. English source word
  5. comment (optional)
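
A minimal Python sketch of reading one such line is given below. The names Entry and parse_line are made up for this example, and the sample values are placeholders rather than data from the file.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    gismu: str
    keyword: str
    lojbanised: str
    source_word: str
    comment: Optional[str]


def parse_line(line: str) -> Entry:
    """Split one TAB-separated line into its four or five fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}")
    # The fifth field (comment) is optional; treat an empty one as absent.
    comment = fields[4] if len(fields) == 5 and fields[4] else None
    return Entry(fields[0], fields[1], fields[2], fields[3], comment)


# Placeholder line, not copied from the etymology file.
print(parse_line("abcde\tkeyword\tlojban\tsource\n"))
```

When reading the whole file, opening it with encoding="utf-8" matches the encoding stated above.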

A number of gismu have two or three Lojbanised forms; each of these forms gets its own line, so such a gismu appears on more than one line. These gismu are: bargu, bilga, blabi, bumru, burna, carce, censa, cinla, cortu, cpedu, dertu, donri, fendi, gerku, jersi, jmaji, jubme, jufra, kamni, krasi, mamta, marce, pinta, spofu, sucta, tansi, taxfu, viska, voksa, vraga, xrula, and zargu.
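
Because a gismu can occupy more than one line, a reader of the file may want to collect the Lojbanised forms per gismu. The Python sketch below shows one way to do this; the function name group_forms is hypothetical and the sample lines are placeholders, not real entries.

```python
from collections import defaultdict


def group_forms(lines):
    """Map each gismu to the list of its Lojbanised forms, in file order."""
    forms = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # Field 1 is the gismu, field 3 the Lojbanised source word.
        forms[fields[0]].append(fields[2])
    return dict(forms)


# Placeholder lines: "abcde" has two forms, "fghij" has one.
sample = [
    "abcde\tkeyword\tform1\tsource\n",
    "abcde\tkeyword\tform2\tsource\n",
    "fghij\tother\tform3\tword\n",
]
print(group_forms(sample))
```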

The etymology does not include the cultural gismu, the gismu “broda, brode, brodi, brodo, brodu”, which were constructed from “bridi”, or any other gismu that were not generated from the six source languages.

The following conventions are used inside the comment field:

  FIXIT ...
    needs review for the given reason

  FIXIT correct transcription “...”
    the source word does not exactly match the Lojbanised form; the correct Lojbanisation for the source word is specified
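
The two comment conventions can be told apart mechanically, as in the following Python sketch. The function name classify_comment and the returned labels are invented for this example; the pattern accepts both curly and straight quotation marks, which is an assumption about how the comments are typed in the file.

```python
import re

# Matches: FIXIT correct transcription “...” (curly or straight quotes).
_CORRECT = re.compile(r'^FIXIT correct transcription\s+[“"](.+)[”"]\s*$')


def classify_comment(comment):
    """Return a (kind, detail) pair for the comment field of one entry."""
    if not comment:
        return ("none", None)
    match = _CORRECT.match(comment)
    if match:
        # detail is the correct Lojbanisation given in the comment
        return ("correct_transcription", match.group(1))
    if comment.startswith("FIXIT"):
        # detail is the stated reason for review
        return ("fixit", comment[len("FIXIT"):].strip())
    return ("plain", comment)


# Placeholder comments, not taken from the etymology file.
print(classify_comment("FIXIT correct transcription “abc”"))
print(classify_comment("FIXIT some reason"))
```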

Etymological sources

The most important etymological source for Lojban is the list of gismu with Lojbanised source words and scores. The format of this file is described in detail in this message to the Lojban mailing list and the file etysample.txt on the Lojban server. Additional information can be found at the Lojban Etymology wiki page, on the Lojban file server and in this directory on the Lojban server.

The gismu generation process is described in more detail in “What is Lojban?”, ch. 4, sec. 17, and in the “Reference Grammar”, ch. 4, sec. 14.

The gismu “mleca” (less) is listed as “ckamu” in the original etymology file; it was changed in 1990 according to the etymology file itself. Similarly, the gismu “donri” (daytime) is listed as “dinri”; it was changed in 1993, as reported in the minutes of the LLG. Both gismu are listed in the newer form here.

The following gismu are missing from the gismu etymology file: “gocti” (yocto), “gotro” (yotta), “zepti” (zepto), “zetro” (zetta), “slovo” (Slavic), and “vukro” (Ukrainian); the latter two were added in 1993, as reported in the minutes of the LLG. The gismu “mexco” (Mexican) was changed to “mexno” (see this message to the Lojban list). None of these gismu were generated from the six source languages, so this does not affect the English etymology.

The correspondence between Lojban gismu and TLI Loglan, which is also of etymological interest, is described in detail in the file oldlog.txt on the Lojban server.