Hi, Mark from ArangoDB here. We used the default configuration for every database in the benchmark, under the assumption that the defaults are chosen reasonably. If you have any specific suggestions on how to configure Postgres for this environment, please let us know.
I think the first step would be to use the automatic configuration generation tool [1]. You give it a few inputs (like the amount of server memory and the number of connections) and get a config back.
Thanks a lot for the suggestion. We have used http://pgtune.leopard.in.ua and I have appended the resulting config.
The result is that the default config is already a good fit for our benchmark: there is no visible difference between the old and the new config when running it. We will publish an update to the blog post and show the numbers using the tuned config.
Best, Frank
DB version: 10
OS: Linux
Type: "Mixed type of Applications"
RAM: 122 GB
Connections: 25
Storage: SSD
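For these inputs pgtune generates overrides roughly like the following (illustrative values only, not the literal config we ran with; the ratios follow pgtune's usual rules of thumb):

  # Sketch of pgtune-style overrides for PostgreSQL 10, mixed workload,
  # 122 GB RAM, 25 connections, SSD -- values are approximate.
  max_connections = 25
  shared_buffers = 30GB               # about 1/4 of RAM
  effective_cache_size = 90GB         # about 3/4 of RAM; a planner hint, not an allocation
  maintenance_work_mem = 2GB
  checkpoint_completion_target = 0.9
  wal_buffers = 16MB
  default_statistics_target = 100
  random_page_cost = 1.1              # SSD: random reads are nearly as cheap as sequential
  effective_io_concurrency = 200      # SSD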
That is a terrible approach. No database platform is tuned by default.
You may have to run a utility or edit a config file, but a new system must be configured to fully utilize the available hardware and to suit the expected workload.
A benchmark run on default config is at best misleading - it smells like marketing spam. You actually dampen interest in a forum like this with such an approach.
Disclaimer: I'm part of the ArangoDB team. As written in the post, the whole benchmark is open source; the idea is that you can run it on your own. Pull requests are also welcome. If you think it's marketing spam, take the chance and improve the configuration. We will publish an update of the post.
Not really true any more. Many of the newer databases (e.g. Mongo) and database-like systems (e.g. Elasticsearch) try to give a good default experience.
Since I don't know about current versions of MongoDB and am genuinely interested: how does Mongo know how much memory to use, for example?
Typically with PostgreSQL you have to expand the defaults when running on modern server-class hardware, so that the engine uses enough RAM. Simply letting it use all available RAM can be problematic, since the DB engine and the OS have to fight over cache, etc.
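If you want to see how conservative the stock settings are, you can ask a running instance directly (parameter names as of PostgreSQL 10):

  -- shared_buffers is what the engine itself allocates; the stock default is only 128MB.
  SHOW shared_buffers;
  -- effective_cache_size is just a hint to the planner about how much the OS page cache
  -- will hold; it allocates nothing, which is how Postgres shares RAM with the OS.
  SHOW effective_cache_size;
  -- Or inspect several settings (current value and built-in default) at once:
  SELECT name, setting, unit, boot_val
    FROM pg_settings
   WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size');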
Java-based systems like Elastic might still need JVM tuning, especially with a distributed platform like ES, which might run on a large number of small machines, a smaller number of large machines, or in containers.
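For ES that usually comes down to pinning the heap explicitly, e.g. in config/jvm.options; the common advice is to set min and max to the same value, keep the heap at or below about half of the node's RAM, and stay under roughly 31-32 GB so compressed object pointers still apply. A sketch for a hypothetical 64 GB node:

  # config/jvm.options -- illustrative heap settings for a 64 GB Elasticsearch node
  -Xms26g
  -Xmx26g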
Postgres is configured extremely conservatively by default. While it may not be practical to spend time tweaking every database, at least running pgtune for Postgres would be reasonable.
I think it's definitely reasonable to tune important production databases.
But for many people, databases have become commodity applications that they install and then forget. So in reality not all installations will be properly tuned.
Lies, damn lies, and benchmarks. Using the default configuration invalidates the result at best; at worst it's knowingly misleading.
Why? Because nobody who cares about performance runs their database with the default configuration. Plus, some databases, like PostgreSQL, are well known to have very conservative default settings. So benchmarking with the defaults like that is knowingly trying to make your product look better than it is. You didn't know? Then what are you doing benchmarking databases?!?