You might not be aware that this is one of the seminal papers in distributed systems, if not the most influential one.
Lamport clocks, described briefly in the review, are an answer to the general problem of establishing an order between events that happen on different machines that respects causality. They’re still practical to this day, over 40 years after this paper was published.
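For the curious, the clock rule itself fits in a few lines. Here is a hedged sketch (names are mine, not the paper's): each process ticks its counter on every local event, attaches the counter to outgoing messages, and on receive jumps past the larger of the two clocks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Minimal Lamport clock sketch (illustrative names, not from the paper).
struct LamportClock {
    uint64_t time = 0;

    // A purely local event: just tick.
    uint64_t local_event() { return ++time; }

    // Sending a message: tick and attach the resulting timestamp.
    uint64_t send() { return ++time; }

    // Receiving a message stamped `t`: jump past both clocks.
    uint64_t receive(uint64_t t) {
        time = std::max(time, t) + 1;
        return time;
    }
};
```

The guarantee is one-directional: if event a causally precedes b, then timestamp(a) < timestamp(b), but equal or smaller timestamps don't imply anything about causality.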
There are lots of in-depth descriptions of the paper and its contributions elsewhere. I’d totally recommend finding them - it’s a great paper and not a hard read at all.
If you find there is just too much unfamiliar technical language, it’s a good idea to pause and look up a definition. You might need to follow that chain several steps, but that will help you get to an understanding of new terms.
Companies usually provide the space and will sponsor pizza and drinks, at least in SF. That’s about all the cost beyond the organizers’ time, which they volunteer.
I went to some Papers We Love meetups here in NYC a few years ago and learned a lot not only from the presenters but from other attendees ranging from database experts to sound engineers. Some very nice, kind people!
The authors compare their work against hash tables because hash tables usually represent an upper bound on the performance of point lookups; we don't know how to do much better in general.
So if your data structure supports range queries _and_ point lookups, you should measure against hash tables to understand how far off the ideal you are. If it's not far, and your data structure is strictly more general, that's compelling.
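As a toy illustration of that methodology (std::map standing in for "a structure that also supports range queries", std::unordered_map as the point-lookup baseline; the names and data are mine, not from any paper):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>

// Sketch: build both structures over the same data, then compare what each
// can do. A real evaluation would time millions of point lookups on both
// and report how close the more general structure gets to the hash table.
struct Baselines {
    std::unordered_map<std::string, int> hash;  // point lookups only: the baseline
    std::map<std::string, int> ordered;         // point lookups *and* range queries

    void insert(const std::string& k, int v) { hash[k] = v; ordered[k] = v; }

    int point_lookup_hash(const std::string& k) const { return hash.at(k); }
    int point_lookup_ordered(const std::string& k) const { return ordered.at(k); }

    // A query only the ordered structure supports: first key >= k.
    std::string first_at_least(const std::string& k) const {
        auto it = ordered.lower_bound(k);
        return it == ordered.end() ? "" : it->first;
    }
};
```

If the ordered structure's point lookups come close to the hash table's while also answering `first_at_least`-style queries, that's the compelling case the comment describes.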
Yeah, Masstree has settled into the standard set of comparison baselines for most research since it was published - and not as a strawman "this system was crap, so let's pretend we've done good work by beating it!" but as a real challenge to do better than.
I don't know of one, but it's such a natural idea that I'd guess it's been studied. There are standard implementations of LRU caches that use e.g. a hash map and a linked list to get both fast lookup and ordering, but for real performance I think you'd want to try and minimise the number of data structures to avoid having competing cache behaviours.
Cool, looking forward. Please put Judy in as well :)
Seriously, the 17-year-old Judy is still pretty good, despite its lack of cool vector load / compare / etc. instructions for quickly traversing tree nodes that are too small for a full radix search (ART does "simultaneous search", i.e. compares against all stored keys in a single instruction, while Judy afaik runs a linear search).
It would be pretty cool if someone vectorized that in Judy and replaced null-terminated strings with a binary-safe representation. Unfortunately, all implementations are old and very hard to read.
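The "simultaneous search" trick is small enough to show. A sketch of the Node16 case from the ART paper, assuming x86 with SSE2 and a GCC/Clang builtin: one compare instruction checks all 16 one-byte partial keys at once, where a linear scan (as in Judy) would loop.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Find `needle` among up to 16 one-byte keys in a single compare.
// Returns the matching slot index, or -1 if absent.
inline int simd_find(const uint8_t keys[16], int count, uint8_t needle) {
    __m128i cmp = _mm_cmpeq_epi8(_mm_set1_epi8((char)needle),
                                 _mm_loadu_si128((const __m128i*)keys));
    int mask = _mm_movemask_epi8(cmp) & ((1 << count) - 1);  // mask off unused slots
    return mask ? __builtin_ctz(mask) : -1;
}
```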
My super high-level partial understanding of Judy is the following:
Start with ART, which is pretty simple. Then do the obvious improvements. First, we want fast access to "element number N in sorted order", i.e. you also store the number of descendants. Next, you do key compression: storing the portion of the key that can be reconstructed from the tree traversal is silly. Next: an 8-bit tag for signaling one out of 4 node types? Are you, like, filthy rich? More node types it is. Next: spending 64 bits for a node pointer? Are you crazy? Put the type tag into the parent's pointer (earlier resolution of the branch!), and use a custom allocator to get smaller pointers (you allocate big segments and your node pointers are offsets).
You end up with an unintelligible monstrosity. Give it a cute name, forget about SIMD because it is 2001, and you end up with something like Judy.
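One trick from that list is easy to show in isolation: because nodes are at least 8-byte aligned, the low 3 bits of a node pointer are always zero, so the parent can store the child's type there and resolve the branch on node type before the child's cache line is even touched. A hedged sketch (illustrative; Judy's real scheme also uses offset-based pointers from its custom allocator):

```cpp
#include <cassert>
#include <cstdint>

enum NodeType : uintptr_t { kLeaf = 1, kNode16 = 2, kNode256 = 3 };  // fits in 3 bits

inline uintptr_t tag(void* node, NodeType t) {
    // Alignment guarantees the low bits are free for the tag.
    assert((reinterpret_cast<uintptr_t>(node) & 7u) == 0);
    return reinterpret_cast<uintptr_t>(node) | t;
}

inline NodeType type_of(uintptr_t p) { return NodeType(p & 7u); }

inline void* untag(uintptr_t p) { return reinterpret_cast<void*>(p & ~uintptr_t(7)); }
```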