Re: Universal grammar
- From: haberg@xxxxxxxxxx (Hans Aberg)
- Date: Mon, 23 Oct 2006 17:41:23 GMT
In article <Lp6dnZx63KTAdqHYnZ2dnUVZ_vudnZ2d@xxxxxxxxxxx>, Tak To
<takto@xxxxxxxxxxxxxx> wrote:
I am playing around with the same ideas, but strictly in the
domain of mathematics. Let me drop off some inputs:
Traditional metamathematics typically (i.e., most, but not all)
treat object math theories as sequences of strings. But
mathematicians do not agree on notation, even though they generally
agree on the notions, i.e., the semantics.
One might attempt to pin down this semantics using tree structures
similar to ASTs (abstract syntax tree - readers not knowing this
stuff might use the Wikipedia). Take the expression A => B, which
may be given prefix (Lukasiewicz) notation "A B =>" or RPN "=> A B";
but the AST is the same:
=>
/ \
A B
Sorry, I am not following you at all.
I don't know what your "object math theories" are. (I know of a
branch of computing science that tries to lay a mathematical
foundation of "objects" as in "object oriented programming". Most
of the work in that field in based of lambda calculus and has
nothing in common with what you have described.)
If you are talking about mathematical logic that deals with"propositions", "theorems", "truth values", etc, then I don't
know of any problemm related to notations.
The object mathematics is what a metamathematical theory studies. A common
theory is Hilbert's first order theory. See for example the book by
Mendelson, "Introduction to Mathematical Logic", or Kleene, "Introduction
to metamathematics".
When developing a metamathematical theory, notational problems do not
occur, because one just skips over them! :-) For example, Shoenfield,
"Mathematical Logic", defines his formal system in Lukasiewicz style
prefix notation, thus writing "A B =>", but as this becomes unreadable to
humans, he later just says let "A => B" be a different notation for "A B
=>". This way of glossing over details is typical for human written
mathematics. You can't do that when working with a computer. Also,
the practice is very confusing to non-mathematicians, which rather want to
define a common notation.
Take the classical example:
Time flies like an arrow, fruit flies like a banana.
Both parts are ambiguous, here. It is not difficult to write parser that
keeps track of all possibilities. But this way, all the possible parses of
the first part is s_1, ..., s_k, and the second part is t_1, ..., t_l, get
combined the expansion of the possibilities (s_1, ..., s_k) x (t_1, ...,
t_l), or k*l items.
By contrast, suppose the GLR grammar merges at the "," in the example
above. Then one is forced to write an action, or semantics description, of
the first s_1, ..., s_k possibilities. The parser then starts over with
one single parsing branch to get the t_1, ..., t_l possibilities. When
these merge, one gets a semantic description of (s_1, ..., s_k) x (t_1,
...., t_l), and the parsing is much quicker, as it does not have to carry
all those branching possibilities along after the merges. Natural
languages seem to have this kind of local ambiguities.
But this is hardly a worst case for GLR!?
I think you probably think of the capability of LR blowing up
exponentially in compilation. The same applies to NFA to DFA conversions
used in lexer generators such as Flex and Lex. These cases do not seem to
occur in naturally occurring grammars. These are compilation time problems
only, though. If it becomes a problem, one might revert to a
runtime compilation on the fly technique, caching compiled states. Also,
the problem relates to LR(1), not LALR(1) that Bison and Yacc use. But
LALR(1) isn't good for error recovery and displaying lookahead tokens, so
LR(1) is probably better to work with.
So I think this answers your questions. Yes, one is forced to develop
logical models for the natural language semantics.
I am still not sure what you plan to do with the output of the
parser. For example, do you plan to feed it into an inference
engine so that you can answer questions about the input sentence?
I guess, the simplest thing would just be something that can check for a
correct grammar. But your question is like "what are you going to do with
the output once you have decrypted this encrypted text" - that is another
story, perhaps for somebody else to deal with. :-)
--
Hans Aberg
.
- Follow-Ups:
- Re: Universal grammar
- From: LEE Sau Dan
- Re: Universal grammar
- From: Tak To
- Re: Universal grammar
- References:
- Universal grammar
- From: Hans Aberg
- Re: Universal grammar
- From: Peter T. Daniels
- Re: Universal grammar
- From: Hans Aberg
- Re: Universal grammar
- From: Peter T. Daniels
- Re: Universal grammar
- From: Hans Aberg
- Re: Universal grammar
- From: groups
- Re: Universal grammar
- From: Hans Aberg
- Re: Universal grammar
- From: groups
- Re: Universal grammar
- From: Tak To
- Re: Universal grammar
- From: Rob Freeman
- Re: Universal grammar
- From: Tak To
- Re: Universal grammar
- From: Hans Aberg
- Re: Universal grammar
- From: Tak To
- Universal grammar
- Prev by Date: Re: Pseudo-assimilation, Latin to Spanish
- Next by Date: Re: Pseudo-assimilation, Latin to Spanish
- Previous by thread: Re: Universal grammar
- Next by thread: Re: Universal grammar
- Index(es):
Relevant Pages
|