lujvo and rafsi, response to Mark

Markl wrote:
>I also intended for _some_ examples to suggest a secondary point, which
>I was content to make only implicitly.  I'll make it explicitly now.
>Seems to me that rafsi serve, or can serve, two purposes.  Rafsi can be
>used to convert a commonly used or otherwise valuable tanru into a form
>which enjoys a standardized definition & a standardized place structure;
>that is, into a lujvo.  But rafsi can also be used to convert a commonly
>used or otherwise valuable tanru into a form which is short, but not
>standardized; that is, into a nonce-lujvo.  Accordingly, some of my
>examples were of ideas that I wanted to express in Lojban using fewer
>than the four syllables minimally required for tanru.

I sympathize with this point, though I have found more recently (perhaps
because I do less Lojban writing/speaking) that I am content to make
nonce-lujvo solely by the unreduced form method, except for a few key
rafsi in key positions that have started to become "morphological
suffixes" like final position "-mau".  It is only when I find that I
am using a concept often that I bother to look up the rafsi and/or verify
against tosmabru form, etc. to get an optimal short form.

And this habit I have developed seems perfectly in accordance with Zipf,
that frequency of use ties to shortness of word-form.  Nonce-lujvo are
practically by definition the most infrequent words in a language.

>> > hour-long (cacra, hour, no short rafsi),
>> Croatian uses the adjective "jednosatni" (4 syllables), which
>> translates directly to lojban as {pavcacra} (3), or, using the x2
>> default, just {cacra} (2). I am satisfied.
>The x2 default?  I thought all gismu defaulted to x1.  I've missed
>something here, which may mean that my "hour-long" example is already
>covered by an economical Lojban expression.

By this, Goran meant that the x1 of cacra is something that is x2 hours
long.  The default of x2 in normal contexts is singular (1), so that
lo cacra defaults to mean something that is one hour long.

>> I have never yet been in a situation where I would have to explicate
>> the material of a can, and if I ever am I would gladly use tanru.
>Yes, I'm sure tanru would suffice to describe the material of a can.
>But how would you succinctly refer to the idiomatic "tin cans," that is,
>to all cans which are not aluminum beverage cans?

Give me a sentence for context and I would know better what I would do
(after all, there are other kinds of cans besides beverage cans and "tin
cans", such as "paint cans" and "garbage cans").

So my best guess would be cidjylante as a starting point, assuming that
I know what you intend to cover by the term.  But then "tin" is
specifically mentioned as having a metaphorical meaning in tanru -
though many if not most Lojbanists reject those kinds of metaphorical
usages for animals/plants/metals and many body parts.

>So the answer to your question is that I will want economical
>expressions for everything that has a great deal to do with my
>livelihood, & that others will want economical expressions for
>everything intimately involved with theirs.  What resources does Lojban
>have to offer for the construction of such narrowly differentiated
>economical expressions?

Answer - it doesn't.  Expressions used frequently by only a small
portion of the community constitute jargon, and there is a tradeoff
between having unambiguity and specificity that forbids having a lot of
short-word space reserved for undetermined words of low overall
frequency of use in the language across the whole population.

Because of very low ambiguity, Lojban will by necessity tend to have
longer phoneme strings in order to cover the same semantic space.
Either that, or we have to increase density (and homonymy isn't
acceptable), which reduces communications redundancy when someone
misspeaks or mishears.  People who have tried to make preliminary
estimates feel that Lojban's redundancy is already below that of most
natural languages (easily seen if you realize that nearly all word forms
of the shapes CVVCCV CCVCVV CCVCCV CVCCCV CVCCVV (among 6-letter lujvo)
have a potential meaning, so that one-phoneme errors cause significant
difficulties in error correction).

>Incidentally, English has monosyllabic words for the tidal rise in water
>level (flow), the tidal fall in water level (ebb), the highest high tide
>(spring) & the lowest high tide (neap or neaps).  Two of these terms
>(spring & flow) are ambiguous, in that they also have other meanings.
>One could argue that "flow" has a single meaning, of which "tidal flow" &
>"river flow" are merely types, & then argue that the word "current"
>represents river flow so well that, in a coastal context, the "tidal"
>portion of "tidal flow" can simply be elided.

Except that in a coastal context, I would tend to interpret that phrase
(even *with* the "tidal" remaining) as referring to "rip tides" and the
like that make shore areas dangerous for swimmers.

Most people do not think of tides in terms of currents, though perhaps
the sailors who made the original metaphor "flow" did so.  This shows one
of the problems in too hastily making a very short lujvo for a "local"
context.  If that lujvo can have only one meaning, the assignment of the
lujvo to the narrow use concept means that it cannot be used for any
more broadly understandable tanru semantics based on the words.

One result of this has plagued the TLI Loglan community.  One of their
"games" has been to take some narrow field that a person is interested
in, and then devise a whole bunch of lujvo that fit that narrow field.
This started shortly after Brown printed his first books, with articles
in the first issues of The Loglanist dominated by such things as lists
of color words (additive/subtractive and whatever), computer terminology
up the wazoo, etc.  More recently, an issue of Lognet had a whole bunch
of "short" words related to stereo equipment (CD, cassette, rewind, fast
forward, etc.).  The problem was that almost every tanru proposed would
suggest whole bunches of meanings unrelated to stereo equipment in any
context other than a discussion of such things, and it was obvious that
the person writing the article hadn't for a moment considered non-stereo
interpretations of the tanru.

Some other examples - I give only the keywords and syllable counts
(assuming optional disyllables are spoken as 2 syllables) for the
underlying tanru; in some cases Lojban has gismu that TLI Loglan does
not, words that would solve the problem more easily.  You'll notice
that the shortest syllable counts are generally the worst tanru:

2 man-do (the classic example:  to "man" a ship - as if women never do
this, or as if you can't think of 50 million more likely interpretations
for that tanru)

3 scratch-record (phonograph record)
3 scratch-record-machine (phonograph)
4 ribbon-record (tape recording)
4 ribbon-record-machine (tape recorder)
3 record-ribbon (recording tape)
3 sound-ribbon (audio tape)
5 record-tape-container (cassette)
5 sound-tape-container (audio cassette)
3 light-record (CD)
4 video-ribbon (videotape)
6 video-ribbon-container (videocassette)
4 video-record (video recording)
4 video-record-machine (VCR)
2 record-use (to play a recording on a machine)
3 fast-record-use (fast forward)
4 see-fast-record-use (scans on fast forward)
4 reverse-record-use (rewind)
4 see-reverse-record-use (scans on rewind)
3 record-pause-make (pauses a recording)
3 record-offer (ejects - of a machine)
4 record-offer-make (agentive ejecting)

5 blood-lose-sick-person (hemophiliac)

3 possible-fiction (a superset of SF and fantasy, as if other kinds of
fiction are inherently "impossible", while fiction involving magic IS
possible)

4 science-possible-fiction (SF)
4 magic-possible-fiction (fantasy)
5 science-magic-possible-fiction (science fantasy)
3 people-story (folk tale)
3 magic-people-story (fairy tale)
3 religion-people-story (myth)
3 history-people-story (legend)
3 crime-discover-attempt (detective, one who identifies perpetrators of crime)
3 sad-end-story (tragedy)
3 happy-end-story (comedy, in the old sense)
3 funny-story (comedy, current sense)

4 four-direction-ball (hypersphere)

Some words for sign language - most not too bad, but think creatively
about other contexts and they become a little less obvious:

3 hand-sign (gesture)
4 hand-sign-form (hand shape, I presume a jargon term for signers)
4 hand-sign-language (sign language)
3 hand-letter (letter from manual alphabet)
4 hand-letter-write (finger spell)
4 direction-do-word (directional verb including
                   pronominal references in its motion)
4 class-hand-sign (classifier signifying a member of an object class)

3 sex-attack (rape)
2 process-trouble (harass/pester)
3 beautiful-write (do calligraphy)
3 do-exist (functionally exists with effect x2; x1 is virtual)

>English has an oceangoing history.  Other languages reflect greater
>intimacy with arid desert.  I wanted to show that my rafsi critique was
>inclusive of both environmental extremes, & of the cultures concerned
>with them.  So I gave both "high tide" & "salt pan" as multicultural
>examples of lexical needs unmet, or met only laboriously, by Lojban.

And these (especially "salt pan") are excellent examples of why we need
to be very careful in making short lujvo.  It is likely that someone far
from the ocean won't understand a metaphor based on tides, and someone
far from a desert won't understand a metaphor related to the desert.
But Lojban, in order to be culturally neutral, cannot favor either
community by letting them make short lujvo that the other group will not
understand, or even worse, might use with a totally unrelated meaning.

>la dn cusku di'e
>> A succinct tanru is as useful as a compacted lujvo, in fact I would say
>> the tanru would be clearer in many circumstances.
>If so, that's only because so many rafsi are dissimilar to their gismu

No, it is because rafsi use inherently loses some redundancy in phonemic
recognition.  If syllable counts were the critical measurement factor,
then no one would use an unreduced lujvo over the equivalent tanru
(you've lost one phoneme), and no one would use any disyllable rafsi
over the expanded form (you've lost 2 phonemes).

But no one has in speech to my knowledge chosen fukpyvla over fu'ivla
for clarity (in writing it takes a whole extra character, so it may be
more understandable %^); and indeed nor has anyone ever had trouble with
learning "-vla" even though it is the same type of "backwards" rafsi as

>> I don't think that having compounds which are, say, 2 syllables instead
>> of 4 is that significant to a language.
>Then why do so many two-syllable compound words exist in various tongues?

Because a lot of tongues have one-syllable primitive roots.

However, I will note that Lojban DOES have a lot of two-syllable
compound words.

Lessee, there are 66 CVV monosyllable rafsi in use, 210 CCV rafsi in
use, and 916 CVC rafsi, which statistically will not need hyphenation in
179/289 (~60%) of all CVCCVV lujvo and a similar percentage of CVCCCV
lujvo.

So there are 44100 CCVCCV lujvo, 13860 CVVCCV bisyllable lujvo, around
36640 CVCCVV bisyllable lujvo, around 115000 CVCCCV bisyllable lujvo,
and arguably 4356 CVVrCVV lujvo, which can usually be pronounced as
bisyllables.  That is well over 200000 bisyllable lujvo, far in excess
of the average person's working vocabulary in any language, and likely
larger than the entire lexicon of most languages.
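
The arithmetic behind these two-syllable totals can be rechecked with a
short sketch.  It takes the rafsi counts quoted above (66 CVV, 210 CCV,
916 CVC) on trust, and it assumes the 179/289 no-hyphen fraction applies
uniformly to every CVC-initial form, which is only an approximation.  The
function and variable names are illustrative, not part of any Lojban
tool.

```python
# Rafsi counts as quoted in the post (taken on trust, not re-derived).
CVV, CCV, CVC = 66, 210, 916

# Assumption: the 179/289 fraction quoted in the post is applied uniformly
# to every CVC-initial form as the share that needs no hyphen.
NO_HYPHEN = 179 / 289


def two_rafsi_counts():
    """Return estimated counts of two-rafsi lujvo, keyed by letter form."""
    return {
        "CCVCCV": CCV * CCV,                      # exact product
        "CVVCCV": CVV * CCV,                      # exact product
        "CVCCVV": round(CVC * CVV * NO_HYPHEN),   # estimate
        "CVCCCV": round(CVC * CCV * NO_HYPHEN),   # estimate
        "CVVrCVV": CVV * CVV,                     # exact product
    }


counts = two_rafsi_counts()
for form, n in counts.items():
    print(f"{form:<8}{n:>8}")
print(f"{'total':<8}{sum(counts.values()):>8}")
```

The three exact products do not depend on the hyphenation estimate.  The
two CVC-initial estimates come out slightly above the rounded "around"
figures, since 179/289 is nearer 62% than 60%, but the total stays over
200000 either way.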

The fact that most of these have not been used, and indeed perhaps most
are unlikely to be useful by any stretch of the imagination, does NOT
detract from the fact that they "exist" and therefore have meaning.  And
I dare predict that the fraction of them that will have useful meaning
covers a higher percentage of the N most common concepts than the
percentage of 2 syllable compounds doing so in most other languages, and
the total number with useful meaning will probably be far higher as
well.  (Chinese may beat us out because such a high percentage of its
wordstock consists of 2-syllable compounds.)

If you go to three syllables, I think the number of Lojban lujvo
probably FAR exceeds even the total wordstock of English, though I
haven't calculated it out.  (But there are 914,760 of each of the
following forms:  CCVCVVCVV CVVrCCVCVV CVVrCVVCCV alone, 2.9 million
probably over 5 million CVCCVCCCV that need no hyphen.  That's over 25
million words to start with.)
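
For the three-syllable figures, here is a minimal sketch of the same kind
of multiplication, using only the 66 CVV and 210 CCV counts quoted
earlier.  The 914,760 figure is the product of one CCV and two CVV
rafsi.  The "2.9 million" figure is consistent with two CCV rafsi and one
CVV, but the post does not name those forms, so that reading is an
assumption.  The "over 5 million" and "over 25 million" figures are not
reproduced here.  The helper name count_forms is illustrative only.

```python
from math import prod

# Monosyllabic rafsi counts as quoted in the post (taken on trust).
RAFSI_COUNTS = {"CVV": 66, "CCV": 210}


def count_forms(*shapes):
    """Number of rafsi combinations for a sequence of rafsi shapes."""
    return prod(RAFSI_COUNTS[shape] for shape in shapes)


one_ccv = count_forms("CCV", "CVV", "CVV")    # e.g. CCVCVVCVV
two_ccv = count_forms("CCV", "CCV", "CVV")    # assumed reading of "2.9 million"
three_ccv = count_forms("CCV", "CCV", "CCV")  # CCVCCVCCV, not named in the post

print(one_ccv, two_ccv, three_ccv)
```

The order of the shapes does not change the product, so each of the three
forms listed with one CCV rafsi gets the same count.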

Any volunteers to start on THAT dictionary???