Hacker News | abd12's comments

Awesome! Hit me up if you have any questions :)


I can understand the sentiment and don't fault you for it.

That said, I think you can definitely handle complex, relational patterns in DynamoDB pretty easily. It will take some work to learn new modeling patterns, but it's absolutely doable.
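For example, one common single-table pattern (a sketch with hypothetical entity names, not code from the book) colocates a parent and its children under one partition key, with a sort key prefix distinguishing the entity types:

```python
# One-to-many without a JOIN: a customer and its orders share a partition
# key, and the sort key prefix tells the entity types apart.

def customer_item(customer_id, name):
    return {"PK": f"CUSTOMER#{customer_id}", "SK": "PROFILE", "Name": name}

def order_item(customer_id, order_id, total):
    return {"PK": f"CUSTOMER#{customer_id}", "SK": f"ORDER#{order_id}", "Total": total}

# A single Query on PK = "CUSTOMER#123" returns the profile plus all of
# that customer's orders in one request -- the "join" happens at write
# time by colocating related items.
```

The trade-off is that you design the table around your access patterns up front rather than normalizing first and querying later.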


At what cost, though? RDBMS do provide a better querying interface with fewer pitfalls that invariably one encounters trying to fit a square peg in a round hole.

That said, I've seen people use DynamoDB as a timeseries database (modelling multi-dimensional data with z-indices on top of the two-dimensional partition-key and range-key pair), so it is definitely possible to be clever and practical at the same time.
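For the curious: a z-index (Morton code) just interleaves the bits of each dimension into one scalar, so a single range key preserves locality in both dimensions. A minimal sketch:

```python
def z_index(x, y, bits=16):
    """Interleave the bits of x and y into a single Morton code.

    x occupies the even bit positions, y the odd ones, so values that
    are close in (x, y) space tend to be close in z-order -- which is
    what makes it usable as a DynamoDB range key.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z
```

Range queries over a z-ordered key need some care (a rectangle in (x, y) maps to several z-ranges), but the basic encoding is this simple.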

Disclaimer: ex-AWS.


Thanks for your support! I'm grateful for the fix you suggested as well :)


I'd much rather pay for reads & writes directly rather than guessing at how my CPU and RAM will translate to the reads and writes that I need.

RDBMS capacity planning basically goes:

1. How much traffic will I get?

2. How much RAM & CPU will I need to handle the traffic from (1)?

With DynamoDB, you can skip the second question.
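Concretely, with on-demand billing the table definition carries no capacity numbers at all. A sketch of the two billing modes (the parameter names match DynamoDB's CreateTable API; the table and key names are hypothetical):

```python
def table_params(name, on_demand=True):
    """Build CreateTable parameters for either DynamoDB billing mode."""
    params = {
        "TableName": name,
        "KeySchema": [{"AttributeName": "PK", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "PK", "AttributeType": "S"}],
    }
    if on_demand:
        # Pay per read/write request -- no capacity planning step.
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        # Provisioned mode: you still declare read/write capacity up front,
        # but in units of requests, not RAM/CPU.
        params["BillingMode"] = "PROVISIONED"
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        }
    return params
```

Even in provisioned mode, note that the units you plan in are reads and writes per second, which map directly to question (1) above.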


In a nutshell:

- It was designed for super high scale use cases (think Amazon.com retail on Cyber Monday). It has decent adoption there. Competes mostly with Cassandra or other similar tools.

- With the introduction of AWS Lambda, it got more adoption in the 'serverless' ecosystem because of how well its connection model, provisioning model, and billing model work with Lambda. An RDBMS doesn't work as well here.

A lot of people find 'problems' with it because they try to use it like a relational database, which it most certainly isn't. You have to model differently and think about it differently. The book helps here :).


Thank you for the kind words! :) Glad you're liking it.


My contention is that it's much easier to have an access pattern that won't scale in a relational database than in DynamoDB. DynamoDB basically removes all the things that can prevent you from scaling (JOINs, large aggregations, unbounded queries, fuzzy-search).

This is underrated, but it's really helpful. So many times w/ a relational database, I've had to tweak queries or access patterns over time as response times degrade. DynamoDB basically doesn't have that unless you really screw something up.
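To illustrate: every DynamoDB Query is pinned to a single partition and can be capped with a Limit, so the work stays bounded no matter how big the table gets. A sketch of the request parameters (the expression syntax is DynamoDB's; the table and key names are hypothetical):

```python
def recent_orders_query(customer_id, limit=25):
    """Build Query parameters for one customer's most recent orders.

    The KeyConditionExpression must name exactly one partition, and the
    Limit caps the items read -- there is no way to express a JOIN, a
    table-wide aggregation, or an unbounded query here, which is exactly
    why the access pattern can't quietly degrade as data grows.
    """
    return {
        "TableName": "app-table",  # hypothetical
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"CUSTOMER#{customer_id}"},
            ":prefix": {"S": "ORDER#"},
        },
        "ScanIndexForward": False,  # newest first, assuming SK sorts by time
        "Limit": limit,
    }
```

If an access pattern can't be expressed this way, you find out at design time rather than in production when the table is large.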


So what is the cost of doing a bit of query tuning and de-norming every now and then compared to the development costs imposed by DynamoDB?


It depends!

For me, I like that 98% of DynamoDB work is frontloaded. I spend the time building the model but once it's done -- set it and forget it.

With RDBMS, it's like there's a hidden 5% tax that's lurking at all times. You have to spend time tuning querying, reshaping data, changing patterns, etc. It can add up to significant drag over time.

Different teams might think the costs are different for their application, or they may be fine with one pattern over the other. Fine with me! I just know which one I choose now :)


Fair enough! I think that's a reasonable position.

IMO, there are two times you should absolutely default to DynamoDB:

- Very high scale workloads, due to its scaling characteristics

- Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

You can use DynamoDB for almost all OLTP workloads, but outside of those two categories, I won't fault you for choosing an RDBMS.

Agree that DynamoDB isn't _blazing_ fast. It's more that it's extremely consistent. You're going to get ~10 millisecond response times when you have 1GB of data or when you have 10 TB of data, and that's pretty attractive.


Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

If you can use Aurora Serverless, the Data API makes sense for lambda.

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...


True! I'm not a huge fan of Aurora Serverless and the Data API. The scaling for Aurora Serverless is slow enough that it's not really serverless, IMO. And the Data API adds a good bit of latency and has a non-standard request & response format, so it's hard to use with existing libraries. But it's definitely an option for those that want Lambda + RDBMS.

The RDS Proxy is _hopefully_ a better option in this regard but still early.


Differing opinion - I think RDS Proxy is the wrong approach. Adding an additional fixed cost service to enable lambda seems like an indicator of a bad architecture. In this case the better approach would likely be to just use a Fargate container which would have a similar cost and fewer moving parts.

By the time you pay a fixed cost for the proxy on top of what you already pay for the RDS server, it'd be a far simpler architecture with fewer moving parts to just run a Fargate container (or better yet, AWS would offer a Google Cloud Run competitor)

The Data API, while still rough around the edges, at least keeps the solution more "serverless-y". Over time it should get easier to work with as tooling improves. At the very least, it won't be more difficult to work with than DynamoDB was initially with its different paradigm.

For services that truly require consistently low latency, lambda shouldn't be used anyway, so the added latency of the data api shouldn't be a big deal IMO.

For those reasons, I view the RDS Proxy as an ugly stopgap that enables poor architecture, whereas the Data API actually enables something new, and potentially better. So I'd much rather AWS double down on it and quickly add some improvements.


I agree completely. We have APIs that are both used by our website and our external customers (we sell our API for our customers to integrate with their websites and mobile apps) and for batch loads for internal use.

We deploy our APIs to Fargate for low, predictable latency for our customers and to Lambda [1] which handles scaling up like crazy and scaling down to 0 for internal use but where latency isn’t a concern.

Our pipeline deploys to both.

[1] As far as being “locked into lambda”, that’s not a concern. With API Gateway “proxy integration” you just add three or four lines of code to your Node/Express, C#/WebAPI, Python/Flask code and you can deploy your code as is to lambda. It’s just a separate entry point.

https://github.com/awslabs/aws-serverless-express

https://aws.amazon.com/blogs/developer/deploy-an-existing-as...


Yup, that's exactly how I recommend clients to write lambdas for API purposes... Such a great balance of getting per request pricing while retaining all existing tooling for building APIs


For an internal API with one or two endpoints, I'll do things the native Lambda way. Your standard frameworks are heavy when all you need to do is respond to one or two events and you can do your own routing, use APIGW and API Key for authorization, etc.

There is also a threshold between “A group of developers will be developing this API and type safety would be nice so let’s use C#” and “I can write and debug this entire 20-50 line thing in the web console in Python, configure it using the GUI, and export a SAM CloudFormation template for our deployment pipeline.”
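A minimal "native" handler of that sort (routes and payloads hypothetical) needs nothing beyond the standard library, since you route on the API Gateway proxy event yourself:

```python
import json

def handler(event, context=None):
    """Tiny hand-rolled router over an API Gateway proxy event."""
    route = (event.get("httpMethod"), event.get("path"))
    if route == ("GET", "/health"):
        body = {"status": "ok"}
    elif route == ("POST", "/jobs"):
        body = {"queued": True}
    else:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(body)}
```

For one or two endpoints this is the whole "framework"; past that threshold, the Express/Flask proxy-integration approach above starts to pay for itself.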


> By the time you pay a fixed cost for the proxy on top of what you already pay for the RDS server, it'd be a far simpler architecture with less moving parts to just run a Fargate container

A lot of people want to use lambda (or serverless) even so. So AWS is just accommodating their wishes.


We can't use Aurora Serverless even in our non-Prod environments because we have workflows that involve importing and exporting data to and from S3. But really, our Aurora servers in those environments are so small that most of our costs are storage.


Not to mention the same also applies for load. You get about 10ms at 10, 1000 or 1000000 requests per second, again irrespective of how much data you have.


There's a third case: if you want a free ride, the AWS free tier for DynamoDB is quite generous, enough to run a decent dynamic website.


Especially combined with the always free tier of lambda....


> Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

This is only true for AWS. Azure functions share resources and don't have this issue.

The speed is actually quite sad. It's 5-10x slower than my other databases at p95, and I can't throw money at the problem on the write side. Reads I can use DAX for, but then there goes consistency.


Good point! I would usually not recommend using a database from a different cloud provider, simply because of the hassles around permissions, connections, etc.

I've never found the speed an issue, but YMMV. To me, the best thing is that you won't see speed degradation as you scale. With a relational database, your joins will get slower and slower as the size of your database grows. With DynamoDB, it's basically the same at 1GB as it is at 10TB.


Email me, and I'm happy to discuss :). alex@alexdebrie.com


Fair enough! IMO, it's worth it :). You could spend a bunch of time cobbling together free resources, and you'd still only get about 30% of what's in the book. How much is your time worth as a software engineer?

That said, a few notes:

1. I added a coupon code ('HACKERNEWS') to knock $20 off Basic, $30 off Plus, and $50 off Premium.

2. If you're from a country where PPP makes this pretty expensive, hit me up. I'm happy to help.

3. If you're facing income challenges due to COVID-19, hit me up, I'm happy to help.

4. If this is unaffordable for any reason, hit me up, I'm happy to help. :)


Your book does an excellent job explaining the single-table design pattern of DynamoDB. This pattern literally saves you money. So at a certain point you will earn back the $79 from a lower AWS bill (plus your applications will be much faster!)


Thanks, Matthew! Appreciate it, and I agree with you :)

