Re: LLG Member Distribution - Preliminary Announcement of Lojban Baseline
- To: Logical Language Group <lojbab@access.digex.net>
- Subject: Re: LLG Member Distribution - Preliminary Announcement of Lojban Baseline
- From: "Jorge J. Llambias" <jorge@intermedia.com.ar>
- Date: Sun, 12 Jan 1997 22:04:35 -0300
- CC: n.nicholas@linguistics.unimelb.edu.au, colin@KINDNESS.DEMON.CO.UK, cowan@ccil.org, dews@WORLDNET.ATT.NET, jbhodges@nrv.NET, kstein@magenta.com, lojbab@access2.digex.net, pycyn@aol.com, sru@IC.NET, tommy@intercon.com
- References: <199701120350.WAA13317@access5.digex.net>
> My intention in including lujvo was to give examples of the lujvo making
> process in great number, and secondly, to maximize coverage of semantic space
> given our relatively limited gismu list.
I think lujvo may have some reason for being in the English-Lojban part of
the dict, but I wouldn't miss them in the Lojban-English. But if they
do show up in this part, then you must give the place structures. It has
to be clear that a Lojban word is not just its first argument.
> >only those lujvo be included which
> >have been used more than once in Lojban text[...]
> But my archives contain
> some writings (especially yours) multiple times, and it is impractical to
> weed them out for an automatic counting program. So your words in most
> cases will make the grade by any standard we can abide by.
I don't think there is any need to weed them out. Just take only lujvo
that have been used more than ten times, or maybe a higher number. Adding
lujvo to the dictionary just because someone used it once or twice doesn't
really make much sense to me.
[...]
> - creating the text archive
> to be processed would be a massive manual job. The bulk of Lojban text now
> is to be found in the collected archives of Lojban List, thanks to Jorge's and
> Goran's (et al.) extended conversation of late 1995, which was generating
> a screenful or two of text every day.
Why not just use the whole thing? English words only very rarely look like
lujvo, so they wouldn't interfere much, and those that do can be easily
separated out. Just decide how many lujvo it would be practical to include
(say 50) and then choose the 50 most frequently used.
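As a rough illustration of the counting I have in mind, here is a minimal Python sketch. Everything in it is an assumption made for the example: the names shape, looks_like_lujvo and top_lujvo are hypothetical, and the test for what counts as a lujvo is only a crude consonant/vowel shape check (rafsi-like pieces ending in a vowel). It ignores permissible consonant clusters, valid diphthongs and whether the rafsi are actually assigned, so some fu'ivla and a few English words will slip through and would still need to be separated out by hand.

```python
import re
from collections import Counter

VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")  # Lojban has no h, q or w

# Crude shape test (an assumption, not the official morphology):
# one or more non-final rafsi-like pieces followed by a final piece
# that ends in a vowel. C = consonant, V = vowel.
LUJVO_SHAPE = re.compile(
    r"(?:CVCy?|CCV|CVVC?|CV'VC?|CVCCy|CCVCy)+"
    r"(?:CCV|CVV|CV'V|CVCCV|CCVCV)"
)

def shape(word):
    """Map a word to its C/V shape, or None if it has a non-Lojban letter."""
    out = []
    for ch in word:
        if ch in VOWELS:
            out.append("V")
        elif ch in CONSONANTS:
            out.append("C")
        elif ch in "y'":
            out.append(ch)
        else:
            return None
    return "".join(out)

def looks_like_lujvo(word):
    """True if the word passes the rough shape test above."""
    s = shape(word)
    return s is not None and LUJVO_SHAPE.fullmatch(s) is not None

def top_lujvo(text, top_n=50, min_count=1):
    """Return up to top_n (word, count) pairs, most frequent first,
    keeping only candidates seen at least min_count times."""
    words = (w.strip("'") for w in re.findall(r"[a-z']+", text.lower()))
    counts = Counter(w for w in words if looks_like_lujvo(w))
    return [(w, c) for w, c in counts.most_common(top_n) if c >= min_count]
```

Running top_lujvo over the concatenated list archive with top_n=50 would give the candidate list; raising min_count applies a "used more than N times" cutoff instead. Text that appears in the archive several times will inflate the counts, but it should not change much which words end up near the top.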
> The bottom line is that the bulk of my work in preparing the raw dictionary
> master will be to get the lujvo entries in order, and to finish the last
> 5% of the gismu English entries. After that I will worry about lujvo, and
> time constraints will be the determining factor.
If you cut the number of lujvo now you might save yourself a lot of work.
I would much rather have 50 good quality lujvo entries than 1000 poor ones.
> I agree with you that we need other people to review lujvo. Jorge did some,
> but he was concentrating on new ones that you had never analyzed.
I would be willing to try again, but not on the whole list. Most lujvo
were obviously over-specific to some context, and it seemed like a waste of
time to work out the place structure for a word that probably no one would
ever want to use again. Someone with the right computing tool could make
a list of lujvo in order of frequency of use, and we could work down that
list, so that the most common words would get their due attention.
Perhaps you could make a request in Lojban List that one of them computer
types produce the ordered list from the whole list archive. It shouldn't
be too hard to do for someone with the right tools.
Jorge