I don’t understand how aurora achieves the speed it does with a log based approa...

wgjordan · on Feb 26, 2019

> I don’t understand how aurora achieves the speed it does with a log based approach. Can someone please clarify?

Aurora splits out 'database' nodes (the server instances you provision and pay for) from 'storage' nodes (a 'multi-tenant scale-out storage service' that automatically performs massively-parallel disk I/O in the background). Instead of MySQL writing various data to tablespaces, redo log, double-write buffer, and binary log, Aurora sends only the redo-log over the network to the storage service (in parallel to 6 nodes/3 AZs for durability).

No need for extra tablespace, double-write buffer, binary-log writes, or extra storage-layer mirroring, since durability is guaranteed as soon as a quorum of storage nodes receives the redo-log. The reduced write amplification results in 7.7x fewer network IOs per transaction at the 'database' layer for Aurora (vs standard MySQL running on EBS networked storage, in the benchmark described in the paper), and 46x fewer disk IOs at the 'storage' layer [1].

[1] https://www.allthingsdistributed.com/files/p1041-verbitski.p...

gigatexal · on Feb 26, 2019

That’s an impressive savings in io

dmlittle · on Feb 26, 2019

Last time we benchmarked Aurora (~2 years ago) the write speed of Aurora is pretty slow compared to RDS (Postgres RDS was able to achieve 3x write throughput)

PetahNZ · on Feb 26, 2019

Anecdotally after migrating from RDS MySQL to serverless Aurora there was a noticeable slowdown of our dashboard and reporting tools. Our typical workloads (ecommerce transactions) are slightly slower on average, but the peaks seem down.

sciurus · on Feb 26, 2019

Serverless Aurora likely has different performance characteristics than normal Aurora.

danielecook · on Feb 26, 2019

I never did a thorough benchmark but I was working with a poorly indexed DB (actually...no indexes) that had millions of records and despite the lack of indices the database still queried quickly.

dmlittle · on Feb 27, 2019

Read speeds are pretty good but write speeds not so much. For our particular use case 99.99% of queries ran would have been inserts with reports only generated once per month.

preetamjinka · on Feb 26, 2019

What was the concurrency like in those benchmarks?

dmlittle · on Feb 27, 2019

I don't remember as this was 2 years ago. We were only concerned with write speeds as that was the majority of queries we would be performing. Read speeds were pretty good.