
Derek's approach is one that can be carried across many old versions of PG. Today, you can achieve a lot more with generated columns, (materialized) views, partitioning, and other fun features added in recent versions.

In any case, we used this approach at our company, dealing with billions of rows of data, and it allowed us to scale way past our "weight class".



Thanks for this. People are always skeptical of this approach, not because they tried it and failed (or even thought about it for 5 minutes) but because they read some blog post somewhere.

That's not to say this solves everything, though. There are cases where it does not work (as someone correctly commented, giving an example where they needed to run linear algebra over the data).


That's what generated columns [0] are for ;) There is very little you can't do on the database side these days. And if all of that fails, stored procedures and triggers can solve the rest. We [1] as a rule don't allow any data to be post-processed once it leaves the database, and we run a database of 100TB+ with trillions of rows with absolutely no problems.

[0] https://www.postgresql.org/docs/current/ddl-generated-column... [1] https://www.citusdata.com/customers/pex
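For anyone who hasn't used generated columns, here is a minimal sketch of the idea. It is not the setup described above: it uses Python's built-in sqlite3 (SQLite 3.31 or newer) as a stand-in so it runs without a server, and the table and column names (order_lines, quantity, unit_price_cents, total_cents) are made up for illustration. The PostgreSQL DDL is shown only as a comment.

```python
import sqlite3

# Roughly equivalent PostgreSQL column definition (not executed here):
#   total_cents bigint GENERATED ALWAYS AS (quantity * unit_price_cents) STORED


def build_demo():
    """Create an in-memory table whose total is computed by the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE order_lines (
            id               INTEGER PRIMARY KEY,
            quantity         INTEGER NOT NULL,
            unit_price_cents INTEGER NOT NULL,
            -- Derived by the database; clients never write this column.
            total_cents      INTEGER GENERATED ALWAYS AS
                                 (quantity * unit_price_cents) STORED
        )
        """
    )
    # Only the base columns are inserted; the generated column is filled in.
    conn.executemany(
        "INSERT INTO order_lines (quantity, unit_price_cents) VALUES (?, ?)",
        [(2, 500), (3, 1999)],
    )
    return conn


def totals(conn):
    """Read the derived values back, ordered by primary key."""
    rows = conn.execute("SELECT total_cents FROM order_lines ORDER BY id")
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo()
    print(totals(conn))
    # Changing a base column recomputes the generated column automatically.
    conn.execute("UPDATE order_lines SET quantity = 4 WHERE id = 1")
    print(totals(conn))
```

The point is that the derived value lives next to the data and stays consistent on every write, so nothing has to be recomputed after the rows leave the database.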



