Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

In first order predicate logic with an n-ary relation, attributes are columns and tuples are rows aka a table and what Codd used to justify it.

The typical normalization let's say in a star schema is viewable as a least fixed point. FO+LFP=P

It is similar to what Codd called adjacency lists, which luckily died in the 80s, because they conflicted with real adjacency lists. recursive CTEs add transitive closure, which when added to FO gets you to L or NL, I forget which .

Still it doesn't matter, the relational part of the relational model is attributes or column names...it is equivalent.



> the relational part of the relational model is attributes or column names

The relational part is defined, most importantly, by the set. That was key to Codd's model, and the source of his primary criticism of SQL.

But his ideas went out of fashion long ago. Postgres, pre-1995, was the last time we saw a relational database anyone heard of. Sure, there have been some esoteric attempts more recently to revive the concept, but they never went anywhere. For all intents and purposes the relational model is dead. We live in a tablational world now.

> it is equivalent.

It is not, though. In fact, I posit that lists/multisets are easier to reason about, at least where one doesn't have a strong mathematical understanding (i.e. the layman), and that is why SQL "won". A list, if carefully constrained, can represent a set — which is maybe what you are struggling to suggest — but that does not make it a set.


Can you provide any specific situation where the following does not hold?

Especially if I add the constraint that rows have to be unique?

Relations=Tables Rows=Tuples Columns=Attributes

I think we may be from different schools of set theory, where I am from the more abstract direction, where set membership, inclusion and equality are separate concepts and don't need to be decided on to define a relation.

Not that my view is better just different.

I just happened to come through the {{foo},{foo}}=={{foo}} school.


> Especially if I add the constraint that rows have to be unique?; Relations=Tables

This is kind of like saying that a dynamically-typed programming language is statically-typed because you can add tests to enforce the same constraints as what static typing enables.

There is a lot of truth in that statement, but you completely fail to capture the differences in calling said language statically-typed. And, sure, for a silly comment on HN, who cares, right? But on the other hand, what are you gaining by obscuring the differences? It is not like said programming language is somehow better because you call it statically-typed. It is still the same language no matter what you call it.

SQL does not follow the relational model, as it was proposed by Codd. And that's okay. Arguably it is better because of it. Why not call it what it is?

> Not that my view is better just different.

Of course you can view things however your little heart desires. You can even see what I call a chicken as being a relational database if that is what floats your boat. But we are not talking about your view, we are talking about Codd's view. Every single comment that preceded this was clearly about Codd's take, to the point that you even linked to some of his work for us to engross in his thoughts. And we know very well how Codd felt about tables. He wrote about everything he thought was wrong with SQL at length.


No, you are confusing convention and simplifications for convenience for fundamental properties..

What is generally considered Standard set theory today is typically ZF(C), which can be viewed and was intentionally developed as what is not called a "material set theory" or sometimes a "membership based set theory"

In ZF(C), sets are characterized only by the binary “∈” operation and propositional equality of sets.

It is actually the less common schools like the Elementary Theory of the Category of Sets (ETCS), where individual elements have no identity and must be unique.

There are well documented disagreements between Zermelo and Frege vs Cantor and Lawvere, was the former was for membership and the later were against it.

With ∈ or "is a member of" both of the following hold:

     Given a set A={1,2,3,4}; 3 ∈ A


     Given a set A={1,1,2,2,3,3,4,4}; 3 ∈ A

There is value in defining things like multisets/bags and accessing concepts like multiplicity, but you could use a function you defined for a multiset and discard that information and it would fit the axioms of ZF.

In computer science and databases, this is less than just a trivial factoid depending how deep you get, it will directly relate to how you can trade space for time or reduce communication costs to save on time.

And yes even historically there were very popular network protocols where they would append the same element to a set and shift the complexity from insert time to read time.

To be clear I am not asserting any claim of better outside of local context to doing so.

You are absolutely correct that no production quality RDBMs are based on Codd's rules. That absolutely doesn't change the fact of what the term 'relational model' means, nor does it remove the reality that they follow enough of it so that where it breaks, even from a set centric perspective, is directly related to that origin. Especially in parallel, distributed systems that are the norm now.

And the fact that things moved, doesn't change the reality that the 'relation' in the 'relational model' is as described in previous posts.


No. You are getting confused by semantics. It is understood that definitions are just made up on the spot and anyone can define anything however they like. That was never in question.

However, we are specifically talking about Codd. His definition for the relational model does not match anything we use today. Tablational databases are something quite different.

So, sure, you can call what I call dynamically-typed languages statically-typed if you want. You're not wrong. Definitions cannot be wrong, fundamentally. But it is different, and since we're not talking about your definition and never were...


From [0] which is Codd justifying it explicitly.

> The term relation is used here in its accepted mathematical sense. Given sets S_1, S_2, ..., S_n (not necessarily distinct), R is a relation on these n sets if it is a set of ntuples each of which has its first element from S_1, its second element from S_z , and so on.’ We shall refer to S_i as the jth domain of R.

Note the "(not necessarily distinct)" part and how he uses a table to describe it on the same page.

From what is called the "Alice book" Foundations of Databases - Abiteboul, Hull, Vianu[1] the seminal textbook in database theory.

> In the 1970s, Codd’s relational model revolutionized the field. In this model, humans view the data as organized in relations (tables), and more “declarative” languages are provided for data access.

Unfortunately I can't find chapter 3, where the relational model is described in detail online.

But for more concrete SQL as defined in ISO/IEC 9075-2:1999 (E)[2]....calls them "tables"

Outside of didactic simplifications or domain specific conventions I am not finding anything.

I could be convinced otherwise...but as every real world practical RDBMS has to impose order of often multple tables to scale, through logs, cache etc...and as what everyone agrees is SQL has to support various transaction isolation levels which ensures that on a concrete implementation detail, that rows aren't unique, but that the special case of a membership function, expressly being an 'element' is supported in them all. Plus you only get DISTINCT with a candidate key, and by default SQL is using bags/multisets anyway...there is just to much undefined behavior with indistinct rows and so we avoid them but they are allowed.

In fact, while useful for teaching and day to day use, you are actually violating the axiom of pairing, which is also used to construct the singleton's in modern ZFC.

In fact AC is called the well-ordering axiom by some, expressly because it allows it.

I think you are possibly confusing the absence of ordering from the users perspective?

Either way I have provided actual cites and would appreciate the same, because I would rather learn where I am wrong than prove myself right.

[0] https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf [1] http://webdam.inria.fr/Alice/pdfs/Chapter-1.pdf [2] https://web.cecs.pdx.edu/~len/sql1999.pdf


> Either way I have provided actual cites and would appreciate the same

You did, but I am not sure why. Best to come to a discussion to discuss it yourself. If you have to rely on third-parties to write for you, why bother participating at all?

But, for the sake of trying to demonstrate good faith, I'll break my rule of not providing citations this once and give you this:

   Codd, E.F. (1988) Fatal Flaws in SQL:
   "Relations in the relational model and in mathematics do not have duplicate rows."
In contrast, SQL does have duplicate rows, as Codd defines that. Hence why we call what SQL has "tables" instead of "relations", even though at the end of the day they serve the same purpose. SQL is unquestionably inspired by the relational model, but it, itself, is not relational as Codd saw it.

If you want to still call it as such, go nuts! You can call it a "flying monkey" for all anyone cares. But, if you do call SQL relational, I am interested in what you would call a database that does adhere strictly to Codd's model since also calling it relational wouldn't paint an accurate picture. There are clear differences, as you even told earlier.


That 1988 paper was a modification of the previous design,as an attempt to address criticisms, but thank you for the cite.

It is important to realize that relations are abstract data structures that we treat like sets, not objects we constructed from sets.

If you want to do the set specific approach, reviewing any books on basic set theory and their intentional and extensional set definitions with show your argument is confusing the map for the territory.

The set operations applied to relations produce relations or relations of relations.

Same in material set theory, excluding membership, the only set operation that isn't predicated, this operating on the set, and not opening it up to act on its members.

While the platonic ideal may seem to make it seem different, urelements/atoms/individuals are not what those operations are acting on.

That is what made ZF(C) different.

Remember that Codd was targeting the hierarchical model, records etc...

While the other camp was targeting the network model, specifically using lists.

From the platonic view, uniqueness is a trivial, syntax property, you make that choice, and it has costs.

The typical example is I am a person who lives in (a member) the US, and the US is a member of the UN. Yet I am not a member of the UN.

Similar challenges are why Codd added Recursive CTEs to capture some of the expressiveness of the network model, which OOP is related to.

The problems Codd was targeting in your cite were mostly addressed post SQL99 with the addition of structural data options, specifically challenges with foreign keys.

The intersection of a column and a row being a field, which maps to cobol record concepts is why tables and relations are equal.

If you look at the 1970 paper I provided above, you can see where his concrete implementation using arrays calls out it isn't equal and requires distinct entries.

Note the link to the author's page allowing you to download the Alice book in its entirety works now.

I encourage you to get it.

The concept of a relation as a data structure is far more abstract than it seems.

Yes Codd did attempt to become more prescriptive later, but there is a reason no one exactly implemented it, just like platonic ideals it doesn't really work for real world needs, nor is it required for ACID transactions.

But the relation is still a binary operations that is equivalent to a table.


> That 1988 paper was a modification of the previous design,as an attempt to address criticisms

That paper is the criticism (of SQL). Is the contention here simply about not understanding that we're talking about what Codd thought, not what others thought/think? It seems abundantly clear what Codd thought. His language doesn't offer a whole lot of ambiguity as far as I am concerned.

> The concept of a relation as a data structure is far more abstract than it seems.

If you want to interpret that way, sure. You can imagine a relation any way you want. But that's not what we're talking about.

> From the platonic view, uniqueness is a trivial, syntax property, you make that choice

Absolutely. Akin to choosing to define your program constraints in tests instead of using a type system. The end result is effectively the same, but one offers certain guarantees out of the box, while the other relies on you to get things right. They, while effectively the same, are not identical. There is good reason why we have static and dynamic typed classifications, just as we have relational and tablational. There is not some kind of magic in there, it is merely a communication device so that others can figure out what you are trying to say.

> but there is a reason no one exactly implemented it

Right, because the relational model, as Codd saw it, isn't really all that good, at least not in real-world scenarios. Just as we've brought up many times before. There is good reason why SQL "won". Why embarrass SQL by calling it relational?

But, if you wish to — and fair enough if you do, I would still love to know what you would call a database that is built in the eye of Codd? It cannot also be relational if SQL is deemed as such. As you point out, there are clear differences. Calling them both relational would be confusing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: