I have doubts about the "We’re actually spending some time on every row" claim. Storing data in a columnstore usually comes with metadata saved at the page level, such as min, max and count values for the data in that page. These values are used to filter the data processed for a query, i.e., to skip entire pages. If you run a 'count' query 10 times, it's very unlikely the DB will count rows 10 times; it will rely on the page's existing metadata whenever it is available (i.e., already computed). The tests described in the post are misleading IMHO.

EDIT: This comes on top of the fact that DBs can store query results too. Moreover, the post does not say whether clustered or filtered indexes were implemented on the columns in question, nor does it explain how partitioning was done. All of this has a big impact on execution time.
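To make the page-metadata point above concrete, here is a minimal sketch in Python, assuming an invented per-segment metadata layout (min, max and row count per page); a plain count can then be answered from the stored counts without reading a single row:

    # Hypothetical per-segment metadata of the kind described above.
    # Names and layout are invented for illustration; this is not any
    # specific engine's on-disk format.
    from dataclasses import dataclass

    @dataclass
    class SegmentMeta:
        min_value: int   # smallest column value within this segment
        max_value: int   # largest column value within this segment
        row_count: int   # number of rows stored in this segment

    segments = [
        SegmentMeta(min_value=10, max_value=55, row_count=1_048_576),
        SegmentMeta(min_value=56, max_value=99, row_count=1_048_576),
    ]

    # A bare COUNT(*) only needs the stored counts; no row is ever read.
    total_rows = sum(s.row_count for s in segments)
    print(total_rows)  # 2097152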



The query in the example is not a plain “count” but a count + group by query. While it is possible to precompute group by results for every possible column per page, we don’t do that, and this query does touch every row. The article touches on how this is possible: operations on encoded data, AVX2, and not using a hash table for the intermediate result of the group by. And we certainly don’t fake it by storing query results.

We guarantee that the result is legit.
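To illustrate the “no hash table for the intermediate result” point, here is a hedged scalar sketch with made-up data: when the group by column is dictionary encoded, the aggregation state can be a dense array indexed by dictionary code. (The real path runs the equivalent inner loop over encoded data with AVX2, which plain Python cannot show.)

    # Sketch: group-by count over a dictionary-encoded column without a hash table.
    # Illustrative only; data and names are made up.
    dictionary = ["AAPL", "GOOG", "MSFT"]   # distinct symbols in the segment
    codes = [0, 2, 1, 0, 0, 2, 1, 1, 0]     # one small integer code per row

    # Codes are dense small integers, so the intermediate result can be a
    # plain array indexed by code: no hashing, no collision handling.
    counts = [0] * len(dictionary)
    for code in codes:                      # still touches every row
        counts[code] += 1

    top = sorted(zip(dictionary, counts), key=lambda kv: kv[1], reverse=True)
    print(top[:10])   # [('AAPL', 4), ('MSFT', 3), ('GOOG', 2)]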


What does the columnar format look like? In particular, is the group by column compressed with RLE? That’s essentially a pre-computed group-by + count, which would make this kind of query very, very fast :)


This is an astute observation. Here is the answer from the author of the feature: “We do have special handling for RLE for both filters and the group by column, but in both cases there would be a per-row component (updating the selection vector for a filter on RLE, updating aggregates for a group by on RLE), and so far I have seen RLE encoding usually turn out slower than dictionary encoding for both of these cases.”
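For readers following along, a minimal sketch (made-up runs, not the engine's actual layout) of the per-run aggregation both comments are describing; note the author's point that in their measurements this path has still tended to be slower than the dictionary one:

    # Why RLE on the group-by column resembles a pre-computed group-by + count:
    # each run carries (value, length), so COUNT per group needs one addition
    # per run rather than one per row. Illustrative only.
    runs = [("AAPL", 4), ("MSFT", 2), ("AAPL", 3), ("GOOG", 5)]  # (symbol, run length)

    counts = {}
    for symbol, length in runs:
        counts[symbol] = counts.get(symbol, 0) + length

    print(counts)   # {'AAPL': 7, 'MSFT': 2, 'GOOG': 5}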


There is a clustered columnstore index on the table. MemSQL doesn’t support “filtered” indexes (indexes that let you specify a where clause that can be matched in a query with an equal or stricter where clause). Since the query needs a full scan those don’t help anyway.

We partition data on stock symbol but this doesn’t make a difference for this query either. It would if we had filters.


I wouldn't be so sure. If data is partitioned on stock symbol, then a grouping into subsets is already happening at the partition level, which may considerably ease the job of aggregation (it depends on the number of symbols, and it seems you have about 1500-1600). Since there is a clustered index, values are saved "in order". A page could then easily have a min value = max value situation, in which case the count saved in the metadata is valid as-is (no need to scan all the values in the page again). Should the index use a dictionary, one would only have to count the number of items in the corresponding entry without scanning the rows; such information is stored in the index, not in the rows.

I have no doubt about your intentions. It's just that I think we could do better to support the claim. If your where clause contained something like (shares mod 3) = 0, then I would be pretty sure all rows were scanned, because such information is not aggregated at the page level. If possible, I would also check the execution plan for any incongruent values.
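A small sketch of the two scenarios in this comment, with invented metadata fields: a min = max page whose stored count can stand in for a scan, versus a (shares mod 3) = 0 predicate that page metadata cannot answer:

    # Illustrative only; field names are invented.
    segment_meta = {"min": "AAPL", "max": "AAPL", "rows": 500_000}

    if segment_meta["min"] == segment_meta["max"]:
        # Every row in this segment holds the same symbol, so the stored row
        # count is already that symbol's count for the segment -- no rescan.
        count_for_aapl = segment_meta["rows"]

    # Nothing in (min, max, rows) says how many rows satisfy (shares % 3) == 0,
    # so that predicate would force the engine to decode and test every value.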


That’s right: partitioning on stock symbol makes it possible to push down LIMIT 10 to each partition; however, in this case, with 1500 stock symbols, it doesn’t buy us much. It’s actually possible to compute a full group by (without a limit) on every partition and merge the results at the end. Merging 1500 groups is computationally trivial.

Yes, shares mod 3 or another such predicate would make it impossible to run this query in O(metadata). It would of course burn more instructions, so we would need a bigger cluster to hit a trillion rows a second, as well as a more elaborate explanation in the blog post of why this matters.
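As a rough sketch of the per-partition plan described above (made-up data and names, not the actual execution operators): each partition computes its own full group by, and the small per-partition results are merged at the end, with the LIMIT applied after the merge.

    # Illustrative only.
    from collections import Counter

    # Each partition produces its own (symbol -> count) result in parallel.
    partition_results = [
        Counter({"AAPL": 120, "MSFT": 80}),
        Counter({"AAPL": 95, "GOOG": 60}),
    ]

    # Merging ~1500 groups per partition is trivial next to scanning the rows.
    merged = Counter()
    for partial in partition_results:
        merged.update(partial)

    print(merged.most_common(10))   # the LIMIT 10 applied after the merge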



