Because I've never thought about this, I'll ask the dumb question...
A shopping cart has many items. An item belongs to a shopping cart. In a relational database, without foreign keys, how do you associate the shopping cart with the items?
The suggestion is to not use foreign key constraints in the database.
The suggestion is not to eschew columns which might be JOINed on in a query.
I.e. "Don't enforce FK relationships with a constraint in the DB. Make sure the values in both tables which may be joined are consistent by using application code to enforce this."
> Don't enforce FK relationships with a constraint in the DB. Make sure the values in both tables which may be joined are consistent by using application code to enforce this.
For those reading at home who aren't good with databases. This is exactly the same statement as:
"Don't enforce valid inputs on the backend. The javascript front-end will enforce it for us".
Always validate your inputs at the appropriate layer. Your backend can never trust input from a web front-end (even if there is "only one web front-end"). Likewise, your database should never trust input from its clients (even if there is "only one client"). Ignore this rule at your own peril.
Are your engineers omnipotent beings who never fail, never make mistakes and understand the system entirely (even all the parts they don't know about)? Is your QA process perfect? Your servers never crash in weird states?
A foreign key is a hard-constraint that must be enforced.
There's no stopping you from storing item IDs in a theoretical shopping cart table without FK constraints though. It's just that the onus is on your application to provide the guarantee that the item exists.
This becomes a little clearer if your item IDs are not simply auto-incrementing IDs (which have their own challenges in a large distributed system), but instead are SKUs, or things which have actual meaning.
If you have a need for a ID that is unique but otherwise meaningless in a distributed system, feeding a concatenated node ID, timestamp (to ms), and looping transaction counter (mod a sufficiently large number) into a secure hash like SHA-256 should take care of it.
Sure, that satisfies the uniqueness aspect if that's all you're looking for in an ID. But it doesn't satisfy the predictability aspect.
You can't know what the ID will be ahead-of writing the item to the database, nor store the item in a shopping cart without retrieving the item via a different piece of information (which is pretty likely to be SKU in this case)
When there aren't any connections to known unique keys (like SKUs for shopping), PK predictability isn't really possible (or necessary). The above formula is just an alternative to using auto-incrementing integers when in a distributed environment.
In a non-relational model, you'd generally copy the items/products and their quantity into the shopping cart.
That's a huge change from the relational model, and takes much more space than a normalized approach, but it also comes with some advantages. For example, it makes sure that the price of the items in the cart remain constant, even when you change product prices. That way, you don't have to inform the user "oh, your cart just got 8% more expensive" before checkout.
Agreed. Invoices, for example, are a representation of history so they need the exact data that was used at the time. They can have a reference to a product id but at any time it's possible for the data associated with that product to be completely changed from the meaning at the time of the invoice. This happens with, e.g., UPCs, which can be reused. On the other hand, if you have assemblies of products then you probably want those items to update themselves automatically.
It's confusion over the terminology. There are two definitions of "foreign key" in wide use:
(1) When one table's key occurs as a value in another table, in that second table that column is sometimes called a foreign key. For example, table A has id, table B has A_id, and people refer to A_id as "a foreign key".
(2) When you define this relationship explicitly in the database, so that the database can enforce it (and/or to document it), that's called a "foreign key constraint" but also sometimes just a "foreign key".
This article just means they don't use #2. It doesn't mean they don't use #1.
The phrasing was probably clearer in the context of the discussion they were having at the time because the comment above it linked some docs for the software in question (qh-ost), and under the "Limitations" section, those docs said, "Foreign key constraints are not supported." So to them it was clear they were talking about database constraints.
I agree, whether or not a given data model has foreign keys is a math question with a correct answer (up to isomorphism of the data model); if it does, where in a system they should or should not be enforced is an entirely separate, implementation (vs semantic) level issue.
you are talking about relation, in this context fk is an abstract concept
linked article describes fk as an dbms construct - a schema constraint
relations and conceptual foreign keys may exist without dmbs constraints, in such scenario dmbs does not guard validity of conceptual "foreign keys" and treats them as usual data
The association would still exist implicitly in the code that uses the database. The columns could stay the same, and the code could still use the item's hypothetical cart_id column to get the items in a cart.
You lose the explicitness and safety of referential integrity in the database, but it's a trade off that can enable other things, like sharding, as mentioned by the GitHub employee.
You do not have to have a foreign key defined to do a join across tables. Foreign keys just enforce that all the values in the given column of a table are present in another given column on a specified table (usually a second table). They prevent you from entering data that does not match up.
A FK declaration uses named field(s) in your table. You can omit the FK but the associative fields remain. An FK is a constraint on what values can be in those fields.
An item should never be deleted from a catalog. That would break a lot more than carts.
If you need to track the inventory status/availability of an item, then you store that in the database. The specific implementation (field, list of current items, etc) depends on your needs.
A shopping cart has many items. An item belongs to a shopping cart. In a relational database, without foreign keys, how do you associate the shopping cart with the items?