I always tell people there are two clear areas where DynamoDB has some major benefits:
- Very high scale applications that can be tough for an RDBMS to handle
- Serverless applications (e.g. w/ AWS Lambda), because DynamoDB's connection model (and other factors) is a better fit for that style of compute.
Then, for about 80% of OLTP applications, you can choose either DynamoDB or RDBMS, and it really comes down to which tradeoffs you prefer.
DynamoDB will give you consistent, predictable performance basically forever, and there's no long-term maintenance drag from tuning your database as your usage grows. The downside, as others have mentioned, is more planning upfront and some loss of flexibility.
One difference with DynamoDB is that there's no query planner, so you can have a pretty good sense of how many items you'll hit and how big that read is.
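To make that concrete, here's a rough sketch of how you might estimate the read cost of an access pattern ahead of time. It assumes the documented metering rules (reads are counted in 4 KB blocks, eventually consistent reads cost half, and a Query sums the item sizes before rounding up). The function names and item sizes are made up for illustration, and billing at least one block for an empty result is an assumption.

```python
import math

BLOCK_BYTES = 4 * 1024  # reads are metered in 4 KB blocks


def get_item_read_units(item_size_bytes, strongly_consistent=False):
    """Read units for a single GetItem: the item is rounded up to 4 KB on its own."""
    # Assumption: at least one block is billed per request.
    blocks = max(1, math.ceil(item_size_bytes / BLOCK_BYTES))
    return blocks * (1.0 if strongly_consistent else 0.5)


def query_read_units(item_sizes_bytes, strongly_consistent=False):
    """Read units for one Query page: item sizes are summed, then rounded up to 4 KB."""
    blocks = max(1, math.ceil(sum(item_sizes_bytes) / BLOCK_BYTES))
    return blocks * (1.0 if strongly_consistent else 0.5)


# Hypothetical access pattern: 25 items of ~400 bytes under one partition key.
order_items = [400] * 25

as_one_query = query_read_units(order_items)
as_separate_gets = sum(get_item_read_units(size) for size in order_items)
print(as_one_query, as_separate_gets)
```

Because you know which items a request touches, you can do this arithmetic at design time instead of discovering it from a query plan later.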
Question for you -- what are the performance implications of "re-keying" my records during the transform? Should I try to keep the same key for my transformed record as from my original record so that they align with the same partitions, or are they likely going to get sent to different partitions across different brokers anyway?
Good question! You can't re-key your records, but this is done by design. We want the materialized partitions to reside on the same brokers as their sources so there is no network overhead for performing the transform, and it occurs on all nodes. So by choosing this design we traded off more CPU for lower latency and less network bandwidth.
Wow! Congrats to you, Jay and Frank. I've been a fan of your work on both Seed.run & Serverless Stack for a while. Best of luck, and I'm excited to see Seed grow :)
I highly recommend this book and Swyx's other work. He's thorough while also distilling down to the most important bits.
Great book for people that are just starting their coding career but also for those with a good bit of experience. If you're feeling 'stuck', this is a great guide to understand how to advance your career.
'Learn in public' is probably the #1 piece of advice I'd give to anyone in tech. It helps your writing, it builds your network, and it grinds down your ego (because people will certainly let you know when you're wrong). Swyx has been a huge proponent of this, and this whole book is a great kick in the pants to get started.
I'd be shocked if this book doesn't make you >>>10x the amount you spend on it (even counting a healthy hourly rate for reading it).
Awesome! Glad to hear that there's a section on that. Quick question: I'm thinking of leveraging Elasticsearch for its full-text search capabilities. Is the work to get sorting on various different attributes heavy from a dev perspective, and are there any advantages to doing it through Dynamo rather than querying with Elasticsearch?
I recommend On-Demand pricing 'until it hurts'[0], but that's because a ton of people I talk to are spending <$50/month on DynamoDB. At that point, it really doesn't make sense to spend hours of time optimizing your DynamoDB bill.
If you are at the point where you are spending thousands of dollars a month on DynamoDB, then it does make sense to review your usage, fine-tune your capacity, set up auto-scaling, buy reserved capacity, etc. But don't waste your time doing that to save $14 a month. There are better things to do.
But it's really nice to have a database where you can set up pay-per-use, don't have to think about exhausting your resources, and have an option to back out into a cheaper billing mode if it does get expensive.
[0] - Hat tip to Jared Short for this advice & phrase
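If you want to check whether it "hurts" yet, a back-of-the-envelope comparison like the one below is usually enough. This is only a sketch: the prices are placeholder values passed in as parameters (not current AWS pricing, so check the pricing page for your region), the traffic numbers are hypothetical, and the provisioned figure assumes flat capacity with no auto-scaling or reserved capacity.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def on_demand_monthly_cost(reads, writes, price_per_million_reads, price_per_million_writes):
    """Pay-per-request cost for a month of traffic."""
    return (reads / 1_000_000) * price_per_million_reads + (
        writes / 1_000_000
    ) * price_per_million_writes


def provisioned_monthly_cost(rcu, wcu, price_per_rcu_hour, price_per_wcu_hour):
    """Cost of holding a fixed amount of provisioned capacity for a month."""
    return (rcu * price_per_rcu_hour + wcu * price_per_wcu_hour) * HOURS_PER_MONTH


# Hypothetical small app: 10M reads and 2M writes per month.
on_demand = on_demand_monthly_cost(
    reads=10_000_000,
    writes=2_000_000,
    price_per_million_reads=0.25,   # placeholder price
    price_per_million_writes=1.25,  # placeholder price
)

# Hypothetical provisioned sizing; real sizing has to cover peak traffic, not the average.
provisioned = provisioned_monthly_cost(
    rcu=5,
    wcu=5,
    price_per_rcu_hour=0.00013,  # placeholder price
    price_per_wcu_hour=0.00065,  # placeholder price
)

print(f"on-demand: ${on_demand:.2f}/month")
print(f"provisioned: ${provisioned:.2f}/month")
print(f"potential savings: ${on_demand - provisioned:.2f}/month")
```

With numbers in this range the savings are a couple of dollars a month, which is the point: the comparison only gets interesting once the bill is orders of magnitude larger.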
Daniel, I'm a big fan of yours but disagree with this take :).
It's definitely a database. The modeling principles are different, and you won't get some of the niceties you get with an RDBMS, but it still allows for flexible querying and more.
S3 and DDB are incredibly similar. Their fundamental operators are the same: key-value get/put and ordered list, and their consistency is roughly the same.
What differentiates DDB and S3 the most is cost and performance.
They're both highly-durable primitive data structures in the cloud, with a few extra features attached.
All the examples are specific to DynamoDB and use DynamoDB features.
That said, the principles apply pretty well to other popular NoSQL databases, especially MongoDB and Cassandra. There will be some slight differences -- MongoDB allows better nesting and querying on nested objects -- but it's broadly the same. If you want to model NoSQL for scale, you need to use these general patterns.
If you want to check it out but find out it doesn't work for you, just let me know. I've got a 100% money-back guarantee with no questions asked if you don't like it.