Hacker News
Exploring batch caching of trees (julik.nl)
53 points by thunderbong on April 8, 2024 | hide | past | favorite | 9 comments


I like this as another path to recover the sub-tree caches where available.

Cache invalidation in arbitrary UIs is hard to get right in most cases.

SwiftUI manages state in a similar way, with automatic hashing and view-specific data-dependent updates. The annotated (@State, @Binding, @Environment, @Observable) data is the slice of interest, with other data expressly ignored. SwiftUI also gives the programmer some control, via explicitly changing the view id to trigger a refresh.

Caching schemes are measured in the context of the kinds of use they get - for views, the runtime update model. The OO general-purpose procedural model of listening for changes proves both too heavyweight and too chaotic in its generality. It's faster to use flattened view-models with these cacheable slices, but these don't handle any real backend cross-component logic or batch updates, so you end up with threading driving UI. Often, logging updates shows a scary amount of duplicate renders.

SwiftUI has already had a couple of direction changes, and most developers consider it unfinished. Take that as a rough measure of complexity for the domain before getting too entangled :)


Rails' "russian doll" caching also integrates the path to the template and (if I remember well) a digest of the template's code, which allows for better cache keying in-context. I have been asked a few times about cache key generation for this purpose but that's not what the article focuses on - generating great cache keys is a separate form of art.


> [The OO model is] too chaotic in its generality.

Expand on this, please. Why "chaotic"? Even a "general purpose" model can establish some sort of iteration over the composed structure, no?


You could use a Merkle hash tree where each Merkle node closes over the template inputs (e.g. comment.id, user.is_admin?) and optionally a hash of the template code (the Unison language makes hashing code very easy, since code in Unison is content-addressed: https://www.unison-lang.org/docs/the-big-idea/).

The Merkle hierarchy is the same as the UI hierarchy, and by using a unified cache key system (just send the node id) you can optimize batch fetching of cached views anywhere in the hierarchy, even batching across requests if you have some QoS to prevent very deep trees in one request from stalling simple requests.
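A minimal sketch of that scheme (the `MerkleNode` and `collect_keys` names are made up for illustration): each node's key closes over its own template inputs plus its children's keys, so any change below bubbles up, and the collected keys can go out in one batched cache read:

```ruby
require "digest"

# Hypothetical Merkle node mirroring the UI hierarchy: the key covers this
# node's template inputs and, recursively, every descendant's key.
MerkleNode = Struct.new(:inputs, :children) do
  def cache_key
    child_keys = children.map(&:cache_key)
    Digest::SHA1.hexdigest(inputs.sort.inspect + child_keys.join)
  end
end

# Walk the tree collecting keys so the whole hierarchy can be fetched in
# one batched cache round-trip (e.g. a read_multi-style call).
def collect_keys(node, acc = [])
  acc << node.cache_key
  node.children.each { |child| collect_keys(child, acc) }
  acc
end

leaf = MerkleNode.new({ "comment.id" => 42 }, [])
root = MerkleNode.new({ "user.is_admin?" => false }, [leaf])
keys = collect_keys(root)
# cache.read_multi(*keys) would then recover cached sub-trees anywhere
# in the hierarchy with a single call.
```

Because a child's key is folded into its parent's, a cache hit on any node guarantees the entire sub-tree below it is current.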

Before adding caching to a UI rendering system, it's worth considering where the "slow" comes from. Is it the data dependencies that feed the UI, or actually building the UI render result? It's possible caching at another layer might be more effective - if building the template inputs is uncached but takes 15ms, building the UI from scratch takes 5ms, and getting the UI from cache based on its inputs takes 3ms, the fancy UI cache system doesn't seem worth it - more effort needs to go to optimizing the data access layer. Using too fine-grained a cache for the UI layer can actively harm performance. At best UI building is a little bit of string math, and that can be much faster than a network call in most programming languages. In JavaScript at least, rendering all of the example templates in the blog post would take microseconds.
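Making that arithmetic explicit (using the same illustrative 15/5/3 ms figures - the inputs still have to be built just to compute the cache key):

```ruby
# Back-of-the-envelope model of where the time goes, per request.
# All figures are the hypothetical ones from the comment above.
inputs_ms = 15.0 # building the template inputs (uncached data access)
render_ms = 5.0  # building the UI from scratch, given the inputs
fetch_ms  = 3.0  # fetching the rendered UI from cache, keyed on the inputs

uncached = inputs_ms + render_ms # 20ms: full render
cached   = inputs_ms + fetch_ms  # 18ms: inputs still needed for the key
saving   = uncached - cached     # only 2ms saved by the UI-layer cache
```

The UI cache shaves 2ms off a 20ms request, while the data layer still eats 15ms of it - which is the comment's point about optimizing the right layer.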


What motivated this article was generating "rich" protobufs for model graphs about 3 levels deep; we render long lists at $work. Since there were some opportunities for caching, I was wondering whether this could be optimised in some way. Interestingly enough, the "use the branch of templates as part of the cache key" approach made things harder for us, not easier, as we were rendering very similar proto messages over and over.


OP here, happy to answer any additional questions.


This sounds pretty much like what DataLoader does in GraphQL.


Likely. I haven't used GraphQL in anger yet, but it would be a very reasonable use case.


Now frame this in type theory: zippers? The derivative of a data type is its one-holed contexts?

Or is it not amenable to that perspective? All I can find is from a different viewpoint: "The dynamic trees problem, first posed by Sleator and Tarjan, is to maintain a forest of trees subject to the insertion and deletion of edges, also known as links and cuts. Dynamic trees are used as a building block in a multitude of applications, including maximum flows, dynamic connectivity and minimum spanning trees, and minimum cuts, making them a fruitful line of work with a rich history."



