
Apparently they have several benchmarks where they claim that decompression is faster than memcpy (!).

However, this is only the case because several of their Intel x86_64 benchmarks report memcpy performance of 5-10 GB/s, whereas even a basic dual-channel DDR3 setup has ~20 GB/s of memory bandwidth, and a modern quad-channel DDR4 system can reach 76.8 GB/s. There is no reason for a properly implemented memcpy to be substantially slower than memory bandwidth (AVX can issue two 256-bit reads and one 256-bit write per cycle, i.e., copying 32 bytes per cycle = a 128 GB/s memcpy at 4 GHz).

Am I missing something, or is this another case of "implausible claims = they screwed up the benchmark = they are incompetent/malicious"?



The absolute numbers don't seem far-fetched. An AVX-optimized memcpy on my high-end machine (DDR4) has a throughput of 30 GB/s.

As long as they are using the same memcpy routine in both the decompression case and the 'only memcpy' case, that seems reasonable. Obviously, the quicker memcpy becomes, the faster the decompression has to become to maintain the same performance ratios, but things like faster clock speeds or multi-threading can make that issue moot.



