Re: OT: SQL



On May 3, 7:01 pm, Marshall <marshall.spi...@xxxxxxxxx> wrote:
On May 3, 2:24 pm, Charlie-Boo <shymath...@xxxxxxxxx> wrote:

On May 3, 12:26 pm, Marshall <marshall.spi...@xxxxxxxxx> wrote:

Codd noticed that databases contain relations.  The rest of what he
said was a totally failed attempt to formalize and automate database
query processing.  SQL is an example of that failure, by virtue of it
being only a programming language rather than an automatic query
processor.

"Totally failed" is too strong. Partly failed, partly succeeded.
Still a definite step forward if you compare it to what came
before.

Is there a procedure for asking "List all employees who earn more than
their managers." (classic problem)

Um, yeah, that's pretty easy actually.

-- first, the data definition:
create table Employees
(
  EmployeeId int primary key,
  Salary Money
);

create table Management
(
  ManagerId int foreign key references Employees(EmployeeId),
  EmployeeId int foreign key references Employees(EmployeeId),
  primary key(ManagerId, EmployeeId)
);

-- then the actual query

-- list the employee id of all employees who have a
-- larger salary than their manager.

SELECT e.EmployeeId
FROM Employees e, Employees m, Management mgt
WHERE
  mgt.ManagerId = m.EmployeeId AND
  mgt.EmployeeId = e.EmployeeId AND
  e.Salary > m.Salary;

Notice that this solution specifies nothing whatsoever
about how the result is to be obtained. No execution
plan is mandated. This is a straightforward, unadorned
declarative specification of what information is desired.

No. It says go through the Manager.Employee hierarchy and get the
Manager and Employee Salaries.

An alternate way would be to go through the Employee file, get the
Employee Salary and Manager, using the Manager ID get the Manager
Salary.

Some systems are smart enough to check for both possibilities, but are
hardwired to that specific 2nd possibility. "Go through an index"
where index is Field => Key of Record is hardwired. But in general
you can have any logical expression and any hierarchy of any
conditions.

You could have a file of employees who make more than their manager
and just go through that. Or a file of employees who don't make more
than their managers and you go though all employees and don't list
those in that file.

All of these possible files are expressible as Predicate Calculus wffs
and will be seen as possibilities by the algorithm detailed in my
ARXIV paper. How would your SQL query see them? It wouldn't. You
are hardwired for 1 or 2 special cases only.

There could be other indexes e.g. Salary.Employee and after we get the
Employee we go through Salary and if Salary < Employee's get the
Employees with that Salary and if any is his Manager then he passes.

There are lots of possibilities of how to go through possible
databases and get that data, and there are lots of file structures
that might exist besides the "relation" Employee and maybe an index of
Manager.Employee.

The query can be expressed as standard Predicate Calculus but signify
input as e.g. I, J, K . . . and output as x, y, z, . . . and the
single relation (regardless of how many database files you have and
what their content or keys are) is EMP(Emp#,Manager#,Salary):

(existsMGR)(existsEMPSAL)EMP(X,MGR,EMPSAL) ^ (existsMGRSAL)
(existsMGRMGR)EMP(MGR,MGRMGR,MGRSAL)^LT(MGRSAL,EMPSAL)

(A higher level interface can walk the user through this and produce
the Predicate Calculus wff from that.)

In http://arxiv.org/html/cs/0003071 I show how this Predicate Calculus
query can be reduced to Axioms that may be supplied as database
files. Each time the axioms are all database files, you have a
different solution.

Codd talked about using Logic but all that time (up to the present) he
and the rest did not realize that standard Predicate Calculus suffices
to express queries. All of the SQL syntax and semantics is not
needed. It took me no more than a minute to realize that when I first
looked at the problem years ago. The closest Codd came was the
Relational Calculus, but it still had Tuple Variables and other
unnecessary additions to Logic, and no way to process it. Everybody
uses SQL.

Database query processing is no different than Program Synthesis, and
the Program Synthesis solution applies whether the relations are
infinite mathematical ones like the set of prime numbers, or finite
database ones like the employees. But researchers treat them as
separate problems.

Codd never talked about listing prime numbers and Manna/Waldinger
never talked about listing Employees. But they are all cases of
reducing a Predicate Calculus wff down to primitve programs that
consist of programming language constructs or database accesses
contained within that programming language.

They say that "a file is a relation" but that is 2 steps away from the
truth. A file allows a set of Predicate Calculus queries that can be
solved directly, and each Predicate Calculus wff is composed of
references to relations.

Researchers created all sorts of query languages, but you can stay
within standard Mathematics syntax and use an axiomatic framework to
unite Program Synthesis, Database Query Processing and even Theory of
Computation proofs using the same Rules of Inference but a handful of
axioms particular to the Theory of Computation.

And of course Theory of Computation researchers are a 3rd group that
does not see the connection and works independent of the other 2
groups.

Combining these different problems, seeing the connections, and coming
up with a general solution to all of them is exactly what we want to
do. But researchers have yet to realize even the first step of using
Predicate Calculus for the wff.

It is a damn lie to say that they solved the general problem of
Database Query Processing. As I have described above, there is a
tremendous amount that can be done - it is all described in my ARXIV
paper with many detailed examples.

C-B

(If there are any database geeks out there, note also
that my solution makes no use of NULL; it stays firmly
in 2VL.)

 and it figures out different ways
to do it (then one is chosen somehow)?

Just so. The query planner will consider a number of
different query plans and choose the optimum, or
the heuristically-best if the search space for best
is too large.

In higher-end DBMS products, the query planner
will consider different execution types: merge join
vs. nested loops vs. whatever. Even in something
less sophisticated like MySQL, which only supports
nested loop join, the query planner will take into
account the existence of different indexes, and choose
an order to traverse the tables on that basis. And
in fact, the choice may well change based on what
indexes exist, which can change *without requiring
any change to the above query.* The query has an
associated logical semantics, but does not specify
an operational semantics. This is called
"physical independence," and it's one of the reasons
I say SQL is a partial success.

If you want to say SQL has problems, I won't disagree.
In fact I'll probably enthusiastically join in. But it's
not a total failure; it gets a number of things right.

Marshall

.



Relevant Pages

  • Re: OT: SQL
    ... query processing. ... FROM Employees e, Employees m, Management mgt ... Manager and Employee Salaries. ... The scheme used does not model database files in general, ...
    (sci.logic)
  • Re: OT: SQL
    ... said was a totally failed attempt to formalize and automate database ... query processing. ... create table Employees ... Manager and Employee Salaries. ...
    (sci.logic)
  • Re: OT: SQL
    ... said was a totally failed attempt to formalize and automate database ... query processing. ... FROM Employees e, Employees m, Management mgt ... his Relational Calculus and never used actual Predicate Calculus. ...
    (sci.logic)
  • Re: Sorting, filtering, wrapping
    ... The query will be more efficient, though, ... In a database, scanning all the records in a table to find the ones that ... and is used as the record source for an Employees form. ... >>> (filter by query vs. filter by code) in a situation like this? ...
    (microsoft.public.access.formscoding)
  • Re: How to make it so users see only certain records in a database
    ... The database stores information about the performance of the ... information about each employee entered by each manager. ... I have a table called Employees with the ... A combo box in the form uses this table as its record source. ...
    (microsoft.public.access.forms)