Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The main criticism seems to be that the FK relationship makes migrating the referenced table difficult. But why not remove the FK with ALTER TABLE before the migration, migrate, and add the FK back again (which will catch any missing primary keys), preferably inside a transaction?


I think the emphasis there is the word "online".

The method you propose might work, but not necessarily well. Catching missing primary keys at the end is a problem because then you have a bunch of data integrity issues to sort out, preventing you from re-applying the constraint, in a live system where apps are relying on the database to do integrity checks, meaning that there's a decent chance that the running system is creating integrity issues faster than you can sort them out. Possibly orders of magnitude faster.

Doing everything inside a transaction may invite concurrency issues around locking, maybe even deadlocks. Which is again a problem in a live system, because you might be causing services to fail.

I'm not sold on FKs being something you should never do, ever, but I will say that I used to work at a place that pushed their database servers very hard on real-time tasks, and also tended to avoid foreign keys, and my biggest complaint was not data integrity, it was just that there was extra effort that needed to go into documenting the logical foreign keys (the keys still exist, they're just not enforced by a database constraint), since you couldn't just read them from the database schema.

I'd like to say that this can be something where you default to having FKs, and remove them as you hit performance problems, except that the poster is right: Developers have a habit of leaning on database constraints instead of doing their own sanity checks, and they simply don't know all the places they're doing it - they're not necessarily even aware they're doing it when they're doing it - meaning that removing FK's after the fact doesn't get you to "constraints are enforced by the apps", it gets you to "constraints aren't enforced". I think you probably want to try and anticipate ahead of time if you'll have a problem, so that everyone can know from the get-go whether they'll be working without a safety net.


Is it even possible to realistically do FK checks in app code? Nobody uses serializable transactions and with other isolation levels I think it's possible to check for FK, it's OK now, insert new row, but another transaction simultaneously deletes that row and both transactions happily commit. It's called phantom read. Probably it's very rare event, but it'll happen.


At the place I worked that skipped FK checks, they used append-only database schemata. If you limit yourself to CR operations, a lot of things get easier than they are with CRUD - ensuring (a somewhat weak form of) database consistency without transactions, yeah, but also things like audit trails and slurping new data into the BI system.


I'm not sure "you can't add checks because they might fail!" is sound reasoning...


It's not "you can't add checks because they might fail"; it's "you can't temporarily remove checks in a live system where the database's users are used to leaning on them, because when you try to re-apply them you just might find yourself in a world of fail".


>preferably inside a transaction?

MySQL does not support transactional DDL, unfortunately.


So then don't do it inside a transaction. It still wouldn't be any less safe than their current approach of just not using FKs at all


Adding a foreign key to a massive table can take a long time, since all existing row data must be validated against the constraint. With logical replication, long operations like this can be quite disruptive, even with a multi-threaded logical replication scheme.


That’s no longer true since MySQL8 https://dev.mysql.com/doc/refman/8.0/en/atomic-ddl.html


Sadly I don't think this is actually true. From that page:

DDL statements, atomic or otherwise, implicitly end any transaction that is active in the current session, as if you had done a COMMIT before executing the statement. This means that DDL statements cannot be performed within another transaction, within transaction control statements such as START TRANSACTION ... COMMIT, or combined with other statements within the same transaction.

Whereas with real transactional DDL, the DDL can be done within large transactions, multiple ALTER's could be rolled back, no changes are visible outside the transaction...


MySQL 8 was extremely new at the time this comment was posted. It's likely that mysql limitations were a big part of the thinking. The mention of pt-online-schema-change at the bottom makes me think so.


MySQL 8 DDL is atomic at the statement level, but cannot be combined with other SQL DDL or DML statements in transactions yet.


I didn't quite understand the migration issue TFA talks about, but at least with PG, there's more than enough clever ways to make schema changes incrementally.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: