Looking for English word list similar to lemma.num

From: TTK Ciar (ttk_at_remove_this_and_all_after_org.ciar.org.designdevices.com)
Date: 03/28/05


Date: 28 Mar 2005 09:30:09 GMT


Hello!

  I am developing software to analyze conversational English texts,
and I need a list of words annotated with the classes to which each
word might belong (e.g., verb, adverb, adjective).

  I have lemma.num (from ftp://ftp.itri.bton.ac.uk/bnc/lemma.num),
but it only has 6318 words, and only lists one class for words that
might take multiple classes in different contexts (e.g., "planning"
is listed as a noun, but it can also be a verb). It is far too
incomplete to be useful for my input data ("looking", for instance,
doesn't even show up in the list).

  In format, lemma.num is very close to what I need. It is easily
parsed by software, but I wish the class field were replaced by a
comma-delimited list of possible classes (or similar -- even a list
where words are listed multiple times with different classes would
be useful to me).
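
  To make the two acceptable formats concrete, here is a rough Python
sketch that turns a list with repeated words into one with a
comma-delimited class field. It assumes each line has four
whitespace-separated fields (rank, frequency, word, class); that layout,
the function names, and the sample lines are illustrative assumptions,
not real entries from lemma.num.

```python
from collections import defaultdict


def merge_classes(lines):
    """Collect every class seen for each word, in order of first appearance.

    Assumes four whitespace-separated fields per line:
    rank, frequency, word, class (a hypothetical layout).
    """
    classes = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        _rank, _freq, word, word_class = fields
        if word_class not in classes[word]:
            classes[word].append(word_class)
    return dict(classes)


def format_entry(word, word_classes):
    """Render one word with a comma-delimited class list."""
    return f"{word} {','.join(word_classes)}"


# Invented sample data: the ranks and frequencies are placeholders.
sample = [
    "1 100 planning n",
    "2 50 planning v",
    "3 70 looking v",
]

merged = merge_classes(sample)
for word, word_classes in merged.items():
    print(format_entry(word, word_classes))
```

  Either input style (one line per word/class pair, or an already merged
list) can then be loaded into the same word-to-classes mapping.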

  Does anyone know where I can get a more complete list? 30,000
words would be wonderful.

Thanks,
-- TTK


