Re: Most obscure vocabulary award.



Paul Blay <ask_me_or_get_spam_trapped@xxxxxxxxxxxxxxxxxxx> dixit:
><jwb@xxxxxxxxxxxxxxxxxx> wrote ...
>> Paul Blay <ask_me_or_get_spam_trapped@xxxxxxxxxxxxxxxxxxx> dixit:
>>>I wasn't going to because it's pretty obscure,
>>>although as far as I know it's 'real'..
>> Go on. Be a devil. Get a buzz out of knowing that when some poor soul
>> wants to look it up at least one JE will have it.
>OK you've got it.

>But I'm starting to feel some sympathy for muchan's views on
>'core vocabulary'. You have the 20,000 "common words," a
>standard dictionary has around 60,000 and (depending on how
>you count them) Edict has well over 100,000.

Yes, the basis for counting varies. Many dictionary blurbs will
say "200,000 words", but there'll only be 30,000 headwords; the
other 170,000 are subordinates of the:

べんきょう - 勉強 ....
勉強家 ...
勉強時間 ...
勉強机 ...
勉強部屋 ...

variety. EDICT's 100,000 probably reflects more like 30-40,000
headwords.
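
To make the counting difference concrete, here is a minimal sketch in Python. It assumes a crude rule that is purely illustrative: an entry counts as a subordinate if some shorter entry in the same list is a prefix of it. The function name `split_headwords` is hypothetical, and no dictionary publisher is claimed to count this way.

```python
def split_headwords(entries):
    """Partition entries into headwords and subordinates.

    Assumption (illustrative only): an entry is a subordinate when a
    shorter entry in the same list is a proper prefix of it.
    """
    known = set(entries)
    headwords, subordinates = [], []
    for word in entries:
        # Check every proper prefix of the word against the known entries.
        if any(word[:i] in known for i in range(1, len(word))):
            subordinates.append(word)
        else:
            headwords.append(word)
    return headwords, subordinates


# The five entries from the example above.
sample = ["勉強", "勉強家", "勉強時間", "勉強机", "勉強部屋"]
heads, subs = split_headwords(sample)
print(len(sample), "entries:", len(heads), "headword(s),", len(subs), "subordinate(s)")
```

On this sample a blurb could claim five "words" while a headword count gives one. A real count would need the dictionary's own grouping rather than a prefix test, which misfires on unrelated words that happen to share leading characters.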

>As such a marker for "obscure, technical and / or archaic"
>might be 'nice to have' but I'm not volunteering to check 114,000+
>records to classify them.

JMdict has a much finer marking, with frequency rankings for a lot of
words from the "wordfreq" file. I haven't attempted to include them
in the EDICT file, apart from using the higher rankings for part
of the (P) tags.
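
For anyone who wants a rough "common words" subset without classifying records by hand, the (P) tag can be used as a filter. The sketch below assumes the usual EDICT line layout of `KANJI [KANA] /gloss/gloss/.../` with (P) appearing as its own slash-delimited field. The sample lines and the names `parse_line` and `EDICT_LINE` are made up for illustration, not taken from the file.

```python
import re

# Assumed layout: WORD [READING] /field/field/.../ (the reading is optional).
EDICT_LINE = re.compile(r"^(\S+)\s+(?:\[([^\]]+)\]\s+)?/(.*)/\s*$")


def parse_line(line):
    """Parse one EDICT-style line; return None if it does not match."""
    match = EDICT_LINE.match(line)
    if match is None:
        return None
    word, reading, body = match.groups()
    fields = body.split("/")
    return {
        "word": word,
        "reading": reading,
        # Everything except the (P) marker is kept as a gloss.
        "glosses": [f for f in fields if f != "(P)"],
        "priority": "(P)" in fields,
    }


# Illustrative lines only; the glosses are not copied from EDICT.
lines = [
    "勉強 [べんきょう] /(n,vs) study/diligence/(P)/",
    "勉強家 [べんきょうか] /(n) diligent student/",
]
entries = [e for e in map(parse_line, lines) if e is not None]
common = [e["word"] for e in entries if e["priority"]]
print(common)
```

This only separates tagged from untagged entries. It says nothing about how obscure an untagged word is, which is the finer distinction the JMdict frequency rankings are meant to carry.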

--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology,
Monash University, VIC 3800, Australia
ジム・ブリーン@モナシュ大学