Re: OT: SQL



On May 4, 12:37 am, Marshall <marshall.spi...@xxxxxxxxx> wrote:
Charlie-Boo wrote:
On May 3, 7:01�pm, Marshall <marshall.spi...@xxxxxxxxx> wrote:
On May 3, 2:24 pm, Charlie-Boo <shymath...@xxxxxxxxx> wrote:

On May 3, 12:26 pm, Marshall <marshall.spi...@xxxxxxxxx> wrote:

Codd noticed that databases contain relations. �The rest of what he
said was a totally failed attempt to formalize and automate database
query processing. �SQL is an example of that failure, by virtue of it
being only a programming language rather than an automatic query
processor.

"Totally failed" is too strong. Partly failed, partly succeeded.
Still a definite step forward if you compare it to what came
before.

Is there a procedure for asking "List all employees who earn more than
their managers." (classic problem)

Um, yeah, that's pretty easy actually.

-- first, the data definition:
create table Employees
(
� EmployeeId int primary key,
� Salary Money
);

create table Management
(
� ManagerId int foreign key references Employees(EmployeeId),
� EmployeeId int foreign key references Employees(EmployeeId),
� primary key(ManagerId, EmployeeId)
);

-- then the actual query

-- list the employee id of all employees who have a
-- larger salary than their manager.

SELECT e.EmployeeId
FROM Employees e, Employees m, Management mgt
WHERE
� mgt.ManagerId = m.EmployeeId AND
� mgt.EmployeeId = e.EmployeeId AND
� e.Salary > m.Salary;

Notice that this solution specifies nothing whatsoever
about how the result is to be obtained. No execution
plan is mandated. This is a straightforward, unadorned
declarative specification of what information is desired.

No.  It says go through the Manager.Employee hierarchy and get the
Manager and Employee Salaries.

The relations in the FROM clause and the predicates in
the WHERE clause are unordered. There is no "go though"
anywhere. Again, the query is a logical construct; it does not specify
any physical data path or any physical schema; it does not depend
on the presence or absence of any index. It doesn't specify any
access path or depend on any physical data structure whatsoever.
It depends only on the absolute minimum set of things it necessarily
must depend on, which is the logical schema.

An alternate way would be to go through the Employee file, get the
Employee Salary and Manager, using the Manager ID get the Manager
Salary.

"File" is a filesystem concept; it is not a data management construct.

It is a way to run the query that the Codd et. al. model does not
detect, but the system that I describe in the ARXIV paper does.

You're not by chance an MV advocate, are you? They use that
word "file" a lot.

Some systems are smart enough to check for both possibilities, but are
hardwired to that specific 2nd possibility.  "Go through an index"
where index is Field => Key of Record is hardwired.  But in general
you can have any logical expression and any hierarchy of any
conditions.

You could have a file of employees who make more than their manager
and just go through that.  Or a file of employees who don't make more
than their managers and you go though all employees and don't list
those in that file.

All of these possible files are expressible as Predicate Calculus wffs
and will be seen as possibilities by the algorithm detailed in my
ARXIV paper.  How would your SQL query see them?  It wouldn't.

The query *shouldn't* see them. The things you describe are all
physical performance tweaks. You don't want to pollute your
logical schema with physical issues, such as the indexes or
materialized views you describe. The schema should be independent
of all of those consideration, and be a purely logical construct.
This enables swapping physical constructs independent of queries;
it increases modularity.

It is independent - the Predicate Calculus query is the same
regardless of what files exist. In your case, it is "independent"
because the "model" (a misnomer, as you do not represent existing file
structures) doesn't represent or use it. In my case, the query
procerssore sees that possibility and if that file structure exists it
uses it (automatically.)

Even now I am being sloppy because a file is not a primitive object.
"File = Relation" is not true. Only pitifully simple files, those
that your model supports, are single relations. In general, a file
supports one or more computer programs that access it. That is the
true model, very general, and completely independent of how you store
your files.

Again, higher end DBMS products can do what you describe:
use materialized views in query planning.

You are hardwired for 1 or 2 special cases only.

My query is not hardwired at all.

Does the system utilize files other than the main flat file and
indexes to it based on single fields within that file? That is way
limited, and doesn't succeed in modeling what databases contain in
general.

There could be other indexes e.g. Salary.Employee and after we get the
Employee we go through Salary and if Salary < Employee's get the
Employees with that Salary and if any is his Manager then he passes.

There are lots of possibilities of how to go through possible
databases and get that data, and there are lots of file structures
that might exist besides the "relation" Employee and maybe an index of
Manager.Employee.

The logical schema can be backed by whatever physical data
structures. Usually row stores are used, but there are companies
and academics advocating column stores these days. Hierarchical
stores have mostly been discredited because of the problems
with query bias.

If I have a system that uses hierarchical files - and almost all do if
programmed without constraining yourself to your "model" - and they
work, then you have no case. They are there for a reason: they are a
powerful way to store data. Codd said they're wrong only because he
was never able to include them in his "model".

The query can be expressed as standard Predicate Calculus

The relational algebra is expressively equivalent to predicate
calculus. This result is due to Codd.

No, IIRC, he proved that certain query languages can be expressed by
his Relational Calculus and never used actual Predicate Calculus.
(The Relational Algebra is way different than the predicate
Calculus.) The Relational Calculus uses "tuple variables", similar to
variables in programming languages, and other constructs that are not
really needed.

Standard Predicate Calculus works (see my ARVIX paper). The fact
that they never saw that is incredible. They have no brains at
all!!! It is a simple, obvious solution to modeling files in general
that they missed and still continue to miss. They even talk about
"Logic" but never used FOL.

but signify
input as e.g. I, J, K . . . and output as x, y, z, . . . and the
single relation (regardless of how many database files you have and
what their content or keys are)

The Universal Relation idea? Ullman advocated for that for many
years, but it ultimately failed. It never seemed even remotely like
a good idea to me.

No, the idea of using the Predicate Calculus but give significance to
unquantified variables as being "input" and "output". It is sper
powerful and useful for representing concepts in many branches of
Computer Science. In the ARXIV paper I use it to do Program
Synthesis, Database Query Processing and prove theorems of the Theory
of Computation.

is EMP(Emp#,Manager#,Salary):

(existsMGR)(existsEMPSAL)EMP(X,MGR,EMPSAL) ^ (existsMGRSAL)
(existsMGRMGR)EMP(MGR,MGRMGR,MGRSAL)^LT(MGRSAL,EMPSAL)

(A higher level interface can walk the user through this and produce
the Predicate Calculus wff from that.)

Inhttp://arxiv.org/html/cs/0003071I show how this Predicate Calculus
query can be reduced to Axioms that may be supplied as database
files.  Each time the axioms are all database files, you have a
different solution.

Codd talked about using Logic but all that time (up to the present) he
and the rest did not realize that standard Predicate Calculus suffices
to express queries.

Again, Codd proved relational algebra expressively equivalent to
the predicate calculus in, like, 1970 or so. There are other
equivalent theories as well, like the tuple calculus.

All of the SQL syntax and semantics is not
needed.

The syntax sucks ass, no question. The semantics are
wrong in a whole bunch of different places, too, especially
in the use of NULL and three-valued-logic. (3VL sucks ass.)

Codd never talked about listing prime numbers and Manna/Waldinger
never talked about listing Employees.  But they are all cases of
reducing a Predicate Calculus wff down to primitve programs that
consist of programming language constructs or database accesses
contained within that programming language.

If you said that in comp.database.theory, you'd get "of course"
as a reply.

They say that "a file is a relation" but that is 2 steps away from the
truth.

I don't know who these "they" are. I don't know anyone in the
database field outside of the fringe MV crowd that even uses
the word "file".

Researchers created all sorts of query languages, but you can stay
within standard Mathematics syntax and use an axiomatic framework to
unite Program Synthesis, Database Query Processing and even Theory of
Computation proofs using the same Rules of Inference but a handful of
axioms particular to the Theory of Computation.

Sure. I've been working on that for years. So have lots of other
people.

Who has done anything along those lines? No system uses that. They
use "flat files and indexes".

C-B

Combining these different problems, seeing the connections, and coming
up with a general solution to all of them is exactly what we want to
do.

Agreed.

Marshall- Hide quoted text -

- Show quoted text -- Hide quoted text -

- Show quoted text -

.



Relevant Pages

  • Re: OT: SQL
    ... query processing. ... FROM Employees e, Employees m, Management mgt ... Manager and Employee Salaries. ... The scheme used does not model database files in general, ...
    (sci.logic)
  • Re: OT: SQL
    ... said was a totally failed attempt to formalize and automate database ... query processing. ... create table Employees ... Manager and Employee Salaries. ...
    (sci.logic)
  • Re: OT: SQL
    ... said was a totally failed attempt to formalize and automate database ... query processing. ... create table Employees ... Manager and Employee Salaries. ...
    (sci.logic)
  • Re: Sorting, filtering, wrapping
    ... The query will be more efficient, though, ... In a database, scanning all the records in a table to find the ones that ... and is used as the record source for an Employees form. ... >>> (filter by query vs. filter by code) in a situation like this? ...
    (microsoft.public.access.formscoding)
  • Re: DBMS and lisp, etc.
    ... Naively implemented with SQL, again for 10 ... (1 query for the initial orders, 1 query for each order for its ... soon as you upgrade to the SQL database. ... (eq (order-customer orderA) ...
    (comp.lang.lisp)