
Hi, Mark from ArangoDB here. We used the default configuration for every database in the benchmark, on the assumption that the defaults are chosen reasonably. If you have any specific suggestions on how to configure Postgres for this environment, please let us know.


I think the first step would be to use an automatic configuration generation tool [1]. You give a few inputs (like the amount of server memory and the number of connections) and get a config back.

[1] https://www.pgconfig.org/#/tuning


Thanks a lot for the suggestion. We used http://pgtune.leopard.in.ua and I have appended the resulting config.

The result is that the default config is already very good for our benchmark. There is no visible difference between the old and new config when running the benchmark. We will publish an update to the blog post and show the numbers using the tuned config.

Best, Frank

DB Version: 10
OS: Linux
Type: "Mixed type of applications"
RAM: 122GB
Connections: 25
Storage: SSD

=>

    max_connections = 25
    shared_buffers = 31232MB
    effective_cache_size = 93696MB
    work_mem = 639631kB
    maintenance_work_mem = 2GB
    min_wal_size = 1GB
    max_wal_size = 2GB
    checkpoint_completion_target = 0.9
    wal_buffers = 16MB
    default_statistics_target = 100
    random_page_cost = 1.1
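
For anyone following along, a minimal sketch of one way to apply such values without hand-editing postgresql.conf (ALTER SYSTEM is available since PostgreSQL 9.4; the data directory path below is an assumption):

    # Write the tuned values to postgresql.auto.conf via ALTER SYSTEM:
    psql -U postgres <<'SQL'
    ALTER SYSTEM SET shared_buffers = '31232MB';        -- needs a restart
    ALTER SYSTEM SET effective_cache_size = '93696MB';  -- reload suffices
    ALTER SYSTEM SET work_mem = '639631kB';
    ALTER SYSTEM SET random_page_cost = 1.1;
    SQL
    # shared_buffers only takes effect after a restart:
    pg_ctl restart -D /var/lib/postgresql/10/main  # assumed data directory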


That is a terrible approach. No database platform is tuned by default.

You may have to run a utility or edit a config file, but a new system must be configured to fully utilize the available hardware and to suit the expected workload.

A benchmark run on the default config is at best misleading; it smells like marketing spam. You actually dampen interest in a forum like this with such an approach.


Disclaimer: I'm part of the ArangoDB team. As written in the post, the whole benchmark is open source. The idea is that you can run it on your own. Also, pull requests are welcome. If you think it's marketing spam, take the chance and improve the configuration. We will publish an update to the post.


One simple suggestion for one platform: https://github.com/weinberger/nosql-tests/issues/22

The more common pgtune CLI is not up to date for PostgreSQL 10+ at this time.

Also there are some old, open issues that indicate the benchmarks have problems:

- https://github.com/weinberger/nosql-tests/issues/16
- https://github.com/weinberger/nosql-tests/issues/13


We just published an update to the benchmark including PGTune; please find it here: https://news.ycombinator.com/item?id=16473117


> No database platform is tuned by default.

Not really true any more. Many of the newer databases (e.g. Mongo) and database-like systems (e.g. Elasticsearch) try to give a good default experience.


Since I don't know about current versions of MongoDB, and am genuinely interested: how does Mongo know how much memory to use, for example?

Typically with PostgreSQL, you have to expand the defaults when running on modern server-class hardware so the engine uses enough RAM. Simply letting it use all available RAM can be problematic, since the DB engine and the OS then fight for cache, etc.
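
For instance, a quick way to check whether anyone has expanded the memory knobs beyond the shipped defaults (a sketch; pg_settings exposes the shipped value as boot_val):

    # Compare the shipped default (boot_val) with what's in effect (setting):
    psql -c "SELECT name, boot_val, setting, unit FROM pg_settings
             WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem');"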

Java-based systems like Elastic might still need JVM tuning, especially with a distributed platform like ES, which might be run on a large number of small machines, a smaller number of large machines, or in containers.


You are exactly right. You must adjust the Java heap size on Elasticsearch.
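
A minimal sketch of what that looks like, with the heap size and file path as assumptions (the usual guidance: set -Xms and -Xmx equal, no more than ~50% of RAM, and below ~32GB so compressed object pointers stay enabled):

    # Pin the Elasticsearch JVM heap in jvm.options (package installs
    # read /etc/elasticsearch/jvm.options), then restart the service:
    cat >> /etc/elasticsearch/jvm.options <<'EOF'
    -Xms16g
    -Xmx16g
    EOF
    sudo systemctl restart elasticsearch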


Postgres is configured extremely conservatively by default. While it isn't reasonable to spend time tweaking each database, at least running pgtune would be reasonable for Postgres.
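
To give a sense of how conservative: the stock values are fixed regardless of host RAM (a sketch; the numbers in the comments are the PostgreSQL 10 defaults):

    psql -c "SHOW shared_buffers;"  # 128MB out of the box
    psql -c "SHOW work_mem;"        # 4MB out of the box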


I think it's definitely reasonable to tune important production databases.

But for many people, databases have become commodity applications that they install and then forget. So in reality, not all installations will be properly tuned.


Who does this?!

I know it happens, because I often inherit them. But seriously, knock it off.

Errrrr. Better yet, don't: job stability.

Nothing to see here. Move along.


Which is why many apps are slow... People search for a new toy, rather than learn what they already have.


> But for many people databases have become commodity applications that they install and then forget.

It seems unlikely to me that this group of people would reach benchmark posts.


People using them that way probably don't care about performance very much, though.


Lies, damn lies, and benchmarks. Using the default configuration invalidates the result at best; at worst, it's knowingly misleading.

Why? Because nobody who cares about performance runs their database with the default configuration. Plus, some databases, like PostgreSQL, are well known to ship with very conservative settings. So benchmarking with the defaults like that is knowingly trying to make your product look better than it is. You didn't know? Then what are you doing benchmarking databases?!?

Seriously, this is shady as hell.


But which database's default configuration is reasonably tuned for running a benchmark?

