
TECH: missing part of morphology paper



When I posted the morphology paper, section 13 (examples of lujvo making)
was unwritten.  The Synopsis version was clearly unacceptable and I had
nothing ready as a substitute.  Here is that section of the paper.
Commentators, please note.

=====cut here=====

13.  lujvo-making Examples

This section contains examples of making and scoring lujvo.  First, we
will start with the tanru "gerku zdani" ("dog house") and construct a
lujvo meaning "doghouse", that is, a house where a dog lives.  We will
use a brute-force application of the algorithm in Section 12, using every
possible rafsi.

The rafsi for "gerku" are "-ger-", "-ge'u-", "-gerk-", and "-gerku".  The
rafsi for "zdani" are "-zda-", "-zdan-", and "-zdani".  Step 1 of the
algorithm directs us to use "-ge'u-" and "-gerk-" as possible rafsi for
"gerku"; Step 2 directs us to use "-zda-" and "-zdani" as possible rafsi
for "zdani".  The four possible forms of the lujvo are then: "ge'u-zda",
"ge'u-zdani", "gerk-zda", and "gerk-zdani".  We must then insert appropriate
hyphens in each case.
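
The enumeration above is simply every pairing of one rafsi per tanru
component.  As a minimal sketch (the function name "candidate_forms" is
hypothetical, and the rafsi lists are just the ones selected in the text
above), it could be written as:

```python
from itertools import product

# Rafsi selected in the text for each tanru component.
gerku_rafsi = ["ge'u", "gerk"]   # candidates for the non-final position
zdani_rafsi = ["zda", "zdani"]   # candidates for the final position


def candidate_forms(*rafsi_lists):
    """Return every combination of one rafsi per component, joined by '-'."""
    return ["-".join(combo) for combo in product(*rafsi_lists)]


forms = candidate_forms(gerku_rafsi, zdani_rafsi)
for form in forms:
    print(form)
```

This only builds the unhyphenated candidates; it does not decide which
rafsi are eligible in which position.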

The first form, "ge'u-zda", needs no hyphen, because even though the first
rafsi is CVV, the second one is CCV, so there is a consonant cluster in the
first five letters.  So "ge'uzda" is this form of the lujvo.

The second form, "ge'u-zdani", however, requires an "r"-hyphen; otherwise,
the "ge'u-" part would fall off as a cmavo.  So this form of the lujvo is
"ge'urzdani".

The last two forms require "y"-hyphens, as all 4-letter rafsi do, and so
are "gerkyzda" and "gerkyzdani" respectively.
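
The two hyphen rules used in this example can be sketched in code.  This
is an illustration under stated assumptions, not the full algorithm of
Section 12: it covers only the "y"-hyphen after a 4-letter rafsi and the
"r"-hyphen after an initial CVV rafsi, and it ignores everything else
(for instance, "y"-hyphens between consonants that cannot stand together).
The names "shape" and "hyphenate" are hypothetical.

```python
VOWELS = set("aeiou")


def shape(rafsi):
    """Map a rafsi to its consonant/vowel pattern, e.g. "ge'u" -> "CVV".

    The apostrophe is dropped, so CV'V and CVV rafsi both give "CVV".
    """
    return "".join("V" if ch in VOWELS else "C" for ch in rafsi if ch != "'")


def hyphenate(rafsi_list):
    """Join rafsi, inserting only the two kinds of hyphen described above."""
    parts = []
    last = len(rafsi_list) - 1
    for i, rafsi in enumerate(rafsi_list):
        parts.append(rafsi)
        if i == last:
            break
        pattern = shape(rafsi)
        if len(pattern) == 4:
            # 4-letter rafsi are always followed by a "y"-hyphen.
            parts.append("y")
        elif i == 0 and pattern == "CVV":
            # An initial CVV rafsi needs an "r"-hyphen, except in a
            # two-part lujvo whose second rafsi is CCV.
            exempt = len(rafsi_list) == 2 and shape(rafsi_list[1]) == "CCV"
            if not exempt:
                parts.append("r")
    return "".join(parts)


for candidate in (["ge'u", "zda"], ["ge'u", "zdani"],
                  ["gerk", "zda"], ["gerk", "zdani"]):
    print("-".join(candidate), "->", hyphenate(candidate))
```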

The scoring algorithm is heavily weighted in favor of short lujvo, so we
might expect that "ge'uzda" would win.  Its L score is 7, its A score is 1,
its H score is 0, its R score is 13, and its V score is 3, for a final score
of 26133.  The other forms have scores of 22994, 24492, and 22453
respectively.  Consequently, this lujvo would probably appear in the
dictionary in the form "ge'uzda".

For the next example, we will use the tanru "bloti klesi" ("boat class"),
presumably referring to the category (rowboat, motorboat, cruise liner)
into which a boat falls.  We will omit the long rafsi from the process,
since lujvo containing long rafsi are almost never preferred by the scoring
algorithm.

The rafsi for "bloti" are "-lot-", "-blo-", and "-lo'i-"; for "klesi" they
are "-kle-" and "-lei-".  Both these gismu are among the handful which have both
CVV-form and CCV-form rafsi, so there is an unusual number of possibilities
available for a two-part tanru:

	lotkle		blokle		lo'ikle
	lotlei		blolei		lo'irlei

Only "lo'irlei" requires hyphenation (to avoid confusion with the cmavo
sequence "lo'i lei").  All six forms are valid versions of the lujvo, as are
the six further forms using long rafsi; however, the scoring algorithm
produces the following results:

	lotkle 26622	blokle 26642	lo'ikle 26113
	lotlei 25633	blolei 25653	lo'irlei 25044

So the form "blokle" is preferred, but only by a tiny margin over "lotkle";
the next three forms are only slightly worse, and only "lo'irlei" suffers
because of its hyphen.
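
To make the comparison concrete, the scores in the table can be ranked
mechanically.  The sketch below does not compute any scores; it only
copies the six values given above and sorts them, on the understanding
(from the text) that a higher score is preferred.

```python
# Scores copied from the table in the text; higher is better.
scores = {
    "lotkle": 26622, "blokle": 26642, "lo'ikle": 26113,
    "lotlei": 25633, "blolei": 25653, "lo'irlei": 25044,
}

# Sort from most preferred to least preferred.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

best, runner_up = ranked[0], ranked[1]
margin = best[1] - runner_up[1]

for form, score in ranked:
    print(f"{form:10} {score}")
print(f"{best[0]} beats {runner_up[0]} by {margin} points")
```

The margin between the top two forms is 20 points, while the gap between
the lowest-scoring unhyphenated form and "lo'irlei" is 589.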

Our third example will result in forming both a lujvo and a name from the
tanru "logji bangu girzu", or "logical-language group" in English.
The available rafsi are "-loj-" and "-logj-" for "logji"; "-ban-", "-bau-",
and "-bang-" for "bangu"; and "-gri-" and "-girzu" for "girzu", plus (for
name purposes only) "-gir-" and "-girz-".
The resulting 12 lujvo possibilities are:

	loj-ban-gri	loj-bau-gri	loj-bang-gri
	logj-ban-gri	logj-bau-gri	logj-bang-gri
	loj-ban-girzu	loj-bau-girzu	loj-bang-girzu
	logj-ban-girzu	logj-bau-girzu	logj-bang-girzu

and the 12 name possibilities are:

	loj-ban-gir.	loj-bau-gir.	loj-bang-gir.
	logj-ban-gir.	logj-bau-gir.	logj-bang-gir.
	loj-ban-girz.	loj-bau-girz.	loj-bang-girz.
	logj-ban-girz.	logj-bau-girz.	logj-bang-girz.

After hyphenation, we have:

	lojbangri	lojbaugri	lojbangygri
	logjybangri	logjybaugri	logjybangygri
	lojbangirzu	lojbaugirzu	lojbangygirzu
	logjybangirzu	logjybaugirzu	logjybangygirzu

	lojbangir.	lojbaugir.	lojbangygir.
	logjybangir.	logjybaugir.	logjybangygir.
	lojbangirz.	lojbaugirz.	lojbangygirz.
	logjybangirz.	logjybaugirz.	logjybangygirz.

The only fully reduced lujvo forms are "lojbangri" and "lojbaugri", of which
the latter has a slightly higher score: 23704, versus 23673 for "lojbangri".
However, for the name of the organization, we chose to make sure the
name of the language was embedded in it, and to use the clearer long-form
rafsi for "girzu", producing "lojbangirz."

Finally, here is a four-part lujvo with a cmavo in it, due to James Cooke
Brown:  "nakni ke cinse ctuca" or "male (sexual teacher)".  The "ke" cmavo
ensures the interpretation "teacher of sexuality who is male", rather than
"teacher of male sexuality".  Here are the possible forms of the lujvo, both
before and after hyphenation:

	nak-kem-cin-ctu		nakykemcinctu
	nak-kem-cin-ctuca	nakykemcinctuca
	nak-kem-cins-ctu	nakykemcinsyctu
	nak-kem-cins-ctuca	nakykemcinsyctuca
	nakn-kem-cin-ctu	naknykemcinctu
	nakn-kem-cin-ctuca	naknykemcinctuca
	nakn-kem-cins-ctu	naknykemcinsyctu
	nakn-kem-cins-ctuca	naknykemcinsyctuca

Of these forms, "nakykemcinctu" is the shortest and is preferred by the
scoring algorithm.  On the whole, however, it might be better to just
make a lujvo for "cinse ctuca" (which would be "cinctu") since the sex
of the teacher is rarely important.  If there were a reason to specify
"male", then the simpler tanru "nakni cinctu" ("male sexual-teacher")
would be appropriate.  This tanru is actually shorter than the four-part
lujvo, since the "ke" required for grouping need not be expressed.
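
The eight forms of the four-part lujvo listed earlier can be reproduced
with a short sketch.  It assumes only two reasons for a "y"-hyphen: a
4-letter rafsi in non-final position, and a doubled consonant at a joint
(as in "nak-kem").  The doubled-consonant check is a simplified stand-in
for whatever consonant-pair test Section 12 specifies, and no "r"-hyphen
logic is included because none of these rafsi is CVV.  The name
"join_with_y" is hypothetical.

```python
from itertools import product

VOWELS = set("aeiou")


def is_consonant(ch):
    """True for a letter that is not one of the five vowels."""
    return ch.isalpha() and ch not in VOWELS


def join_with_y(rafsi_list):
    """Join rafsi, adding "y" after 4-letter rafsi and between doubled consonants."""
    out = rafsi_list[0]
    for prev, nxt in zip(rafsi_list, rafsi_list[1:]):
        four_letter = len(prev.replace("'", "")) == 4
        doubled = is_consonant(prev[-1]) and prev[-1] == nxt[0]
        if four_letter or doubled:
            out += "y"
        out += nxt
    return out


# Rafsi used in the text for "nakni", "ke", "cinse", and "ctuca".
components = [["nak", "nakn"], ["kem"], ["cin", "cins"], ["ctu", "ctuca"]]

forms = [join_with_y(list(combo)) for combo in product(*components)]
shortest = min(forms, key=len)

for form in forms:
    print(form)
print("shortest:", shortest)
```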

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.