monstrado's comments

I think the point of ACP being an open protocol is so that other editors (e.g. VSCode, Neovim) can implement it as a receiver and integration with ClaudeCode/GeminiCLI/... would just work.


I leveraged FoundationDB and RecordLayer to build a transactional catalog system for all our data services at a previous company, and it was honestly just an amazing piece of software. Adding gRPC into the mix for the serving layer felt so natural since schemas / records are defined using Protobuf with RecordLayer.
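
RecordLayer itself is a Java library and does far more (indexes, schema evolution, query planning), but the core pattern it builds on is simple enough to sketch. A toy illustration with the FDB Python bindings, where catalog_pb2 / Dataset are hypothetical protobuf-generated names:

    import fdb
    # catalog_pb2 is a hypothetical module generated by protoc from our schemas.
    from catalog_pb2 import Dataset

    fdb.api_version(630)
    db = fdb.open()

    # Keep every catalog record under one ordered subspace so it can be range-scanned.
    datasets = fdb.Subspace(('catalog', 'datasets'))

    @fdb.transactional
    def put_dataset(tr, record):
        # Key derived from the record's primary-key field, value is the raw protobuf.
        tr[datasets.pack((record.id,))] = record.SerializeToString()

    @fdb.transactional
    def get_dataset(tr, dataset_id):
        raw = tr[datasets.pack((dataset_id,))]
        return Dataset.FromString(bytes(raw)) if raw.present() else None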

The only real downside is that the on-ramp for running FoundationDB at scale is quite a bit steeper than for a traditional distributed database.


Sounds cool. Any write-up on this? How did you approach the design? What was the motivation to use FoundationDB? How much did you/your team need to learn while doing it?


No write-up, but the main reason was reusing the existing database we were comfortable deploying at the time. We were already using FDB as an online aggregation / mutation store for ad-hoc time-series analytics, albeit with a custom layer that we wrote (not RecordLayer).

When RecordLayer launched, I tested it out by building a catalog system that we could evolve and add new services to from a single repository of protobuf schemas.


Thanks. What are the typical use cases for FDB? What can it do that, say, Cassandra can't?


can you do a concise +/- on FDB? I’ve always thought it was a fantastic architecture but never tried it. tia


Curious, when you started on your project, how did you discover and decide on using FoundationDB?


One reason would be if you're already fluent in ClickHouse's SQL dialect. Although they maintain great standard-SQL compatibility, they also have a great many special functions/aggregates/etc. that are ClickHouse-specific.

Other reasons include their wide range of input formats and special table functions (e.g. querying a URL directly).
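
For example (a quick sketch using the official clickhouse_connect Python client; the URL and columns are made up):

    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    # Query a remote CSV directly with the url() table function; ClickHouse infers
    # the schema from the named format (CSVWithNames here).
    rows = client.query(
        "SELECT count(), avg(price) "
        "FROM url('https://example.com/listings.csv', 'CSVWithNames')"
    ).result_rows
    print(rows)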


I built an online / mutable time-series database using FDB a few years back at a previous company. Not only was it rock solid, but it scaled linearly pretty effortlessly. It truly is one of the more novel pieces of modern technology out there, and I wish there were more layers built on top of it.


As an engineer who admires the work done by DuckDB, I'm disappointed that the co-founder of its evolution is spreading FUD about competitors before it's even in the competitive conversation.

> Stability. It OOMS, your CTO mentioned that last week.

I ran ClickHouse clusters for years with zero stability issues (even as a beginner at the time) at an extremely high-volume video game studio with real-time needs. Using online materialized views, I was able to maintain rollups of vital KPIs at millisecond latency while sustaining multi-thousand QPS. Stability was never a concern of ours, and quite frankly, we were kind of blown away.
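
The pattern is roughly the following (simplified sketch; table and column names are invented, not the actual schema we ran):

    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    # Raw events land in a MergeTree table; the materialized view maintains a
    # per-minute rollup at insert time, so dashboards only read the small table.
    client.command("""
        CREATE MATERIALIZED VIEW kpi_per_minute
        ENGINE = SummingMergeTree
        ORDER BY (event_type, minute)
        AS
        SELECT
            event_type,
            toStartOfMinute(event_time) AS minute,
            count() AS events,
            sum(revenue) AS revenue
        FROM game_events
        GROUP BY event_type, minute
    """)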

> Scale. The distributed plan is broken and I'm not sure Clickhouse even has shuffle.

First, I hate the word "broken" with zero explanation of what you mean by it. Based on your language, I'm assuming you're just suggesting the distributed plans aren't as efficient as they could be, a limitation the engineers are not shy to admit.

> SQL. It is very non-standard.

I would argue the language is more a superset than "non-standard". Almost everything just worked for us, and I often found queries I could simplify significantly thanks to the "non-standard" extras they've added. For example: did you know they have a built-in aggregate function for computing retention?!
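
E.g., something along these lines (sketch with invented table/columns) replaces the usual self-join gymnastics:

    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    # retention() returns an array of 0/1 flags per user: did the first condition
    # hold, and if so, did each subsequent condition also hold?
    result = client.query("""
        SELECT
            sum(r[1]) AS day0_users,
            sum(r[2]) AS day1_retained,
            sum(r[3]) AS day7_retained
        FROM (
            SELECT
                user_id,
                retention(
                    event_date = '2024-01-01',
                    event_date = '2024-01-02',
                    event_date = '2024-01-08'
                ) AS r
            FROM events
            GROUP BY user_id
        )
    """)
    print(result.result_rows)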

> Knobs. Lots of knobs that are poorly documented. It's unclear which are mandatory. You have to restart for most.

Yes, there are a lot of knobs. ClickHouse works wonderfully out of the box with the default knobs, but you're free to tinker because that's how flexible the technology is.

You worked at Google for over a decade? You should know. Google is notorious for having a TON of knobs on its internal technology (e.g. BigTable). Just because the knobs are there doesn't mean they must be tuned; it just means the engineers thought ahead. Also, the vast majority of configuration changes I've made never required a restart...I'm not even sure why you pointed this out.

(Disclaimer: I have been using ClickHouse successfully for several years)


Disclaimer: I work at ClickHouse

At a previous company, I wrote a simple TCP server to receive LineProtocol, parse it, and write to ClickHouse. I was absolutely blown away by how fast I could chart data in Grafana [1]. The compression was stellar as well; I was able to store and chart years of historical data. We basically just stopped sending data to Influx and migrated everything over to the ClickHouse backend.
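
It was roughly this shape (a toy sketch, not the original code; it only handles the simplest line-protocol cases and assumes a pre-created `metrics` table):

    import asyncio
    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    def parse_line(line: str):
        # Minimal InfluxDB line protocol: "measurement,tag=v field=1.5 <ts_ns>"
        # (ignores escaping and multiple tags/fields for brevity).
        head, fields, ts = line.strip().split(' ')
        measurement = head.split(',')[0]
        field_name, field_value = fields.split(',')[0].split('=')
        return measurement, field_name, float(field_value), int(ts)

    async def handle(reader, writer):
        rows = []
        while data := await reader.readline():
            try:
                rows.append(parse_line(data.decode()))
            except ValueError:
                continue  # skip malformed lines
        if rows:
            client.insert('metrics', rows,
                          column_names=['measurement', 'field', 'value', 'ts_ns'])
        writer.close()

    async def main():
        server = await asyncio.start_server(handle, '0.0.0.0', 8089)
        async with server:
            await server.serve_forever()

    asyncio.run(main())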

[1] https://grafana.com/grafana/plugins/grafana-clickhouse-datas...


This is exactly what the engineers behind FoundationDB (FDB) wanted when they open-sourced it. For those who don't know, FDB provides a transactional (and distributed) ordered key-value store with a somewhat simple but very powerful API.

Their vision was to solve the hardest parts of building a database, such as transactions, fault-tolerance, high-availability, elastic scaling, etc. This would free users to build higher-level (Layers) APIs [1] / libraries [2] on top.

The beauty of these layers is that you can basically remove doubt about the correctness of data once it leaves the layer. FoundationDB is one of the (if not the) most tested [3] databases out there. I used it for over 4 years in high write / read production environments and never once did we second-guess our decision.
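
To make "layer" concrete: here's a toy counter layer in the Python bindings; it inherits all of FDB's transactional guarantees for free (names invented):

    import struct
    import fdb

    fdb.api_version(630)
    db = fdb.open()

    # A "layer" is just client code that maps a higher-level model onto ordered
    # key-value pairs. Here: named counters under one subspace.
    counters = fdb.Subspace(('counters',))

    @fdb.transactional
    def increment(tr, name, delta=1):
        # Atomic ADD mutation; the param is an 8-byte little-endian integer.
        tr.add(counters.pack((name,)), struct.pack('<q', delta))

    @fdb.transactional
    def read(tr, name):
        raw = tr[counters.pack((name,))]
        return struct.unpack('<q', bytes(raw))[0] if raw.present() else 0

    increment(db, 'page_views')
    print(read(db, 'page_views'))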

I could see this project renamed to simply "fdb-sqlite-layer"

[1] https://github.com/FoundationDB/fdb-document-layer

[2] https://github.com/FoundationDB/fdb-record-layer

[3] https://www.youtube.com/watch?v=OJb8A6h9jQQ


> Their vision was to solve the hardest parts of building a database, such as transactions, fault-tolerance, high-availability, elastic scaling, etc. This would free users to build higher-level (Layers) APIs [1] / libraries [2] on top.

That is a very interesting, simple, and valuable insight that seems to be missing from the wiki page. But also from the wiki page <https://en.wikipedia.org/wiki/FoundationDB>, there's this:

--

The design of FoundationDB results in several limitations:

Long transactions - FoundationDB does not support transactions running over five seconds.

Large transactions - Transaction size cannot exceed 10 MB of total written keys and values.

Large keys and values - Keys cannot exceed 10 kB in size. Values cannot exceed 100 kB in size.

--

Those (unless worked around) would be absolute blockers to several systems I've worked on.


This project (mvSQLite) appears to have found a way around the 5s transaction limit as well as the size limits, so that's really promising. That being said, I believe the new Redwood storage engine in FDB 7.0+ is making inroads in eliminating some of these limitations, and this project should also benefit from that new storage engine (prefix compression is a big one).


A simple workaround is to store “bulk data” in an external system like blob storage and reference that from the DB.


but now transactional guarantees only extend to the id stored in the DB, and not to the external storage.

Therefore, it's possible that the id is invalid (for the external storage) when referenced in the future. I think doing so only adds complexity as the system grows.

It would be better to chunk your blob data to fit the DB, imho. It beats introducing external blob storage in the long run.


> but now transactional guarantees only extend to the id stored in the DB, and not to the external storage.

Depends! If the ID is a cryptographic hash, then as long as the blob is uploaded first, the DB can't be inconsistent with the blob [1].

A Merkle Tree also allows "updates" by chunking the data into some convenient size, say 64 MB, and then building a new tree for each update and sticking that into the database.

[1] With the usual caveats that nobody is manually mucking about with the blob store, that it hasn't "lost" any blobs due to corruption, etc, etc...
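
In code, the ordering looks something like this (the blob_store and db clients here are hypothetical placeholders, just to show the sequencing):

    import hashlib

    def store_document(db, blob_store, payload: bytes, doc_id: str):
        # Content-addressed: the blob-store key IS the hash of the bytes, so the
        # reference in the DB can never point at different content.
        digest = hashlib.sha256(payload).hexdigest()

        # 1. Upload the blob first (idempotent: same bytes -> same key).
        blob_store.put(digest, payload)  # hypothetical blob client

        # 2. Only then commit the small, transactional record that references it.
        db.insert('documents', {'id': doc_id, 'blob_sha256': digest})  # hypothetical DB client

        return digest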


> With the usual caveats that nobody is manually mucking about with the blob store, that it hasn't "lost" any blobs due to corruption, etc, etc...

Yeah, with those caveats. But how do you make sure they apply? If someone does manually muck about with the blob store, or it does lose blobs due to corruption, then your transaction is retroactively "un-atomicized" with no trace thereof in the actual DB.


If corruption happens then any guarantees by the hardware are voided, and the software guarantees (of durability) which are built on the hardware guarantees are equally voided. So corruption -> dead in the water.

But otherwise I agree.


But if you upload to the blob store first, then add the id (hash or not) in your db insert, what happens if the db transaction fails? You now have to work out a way to delete the blob from the external store. Or change your application so that it doesn't matter if it's left on the blob store (except for the money).


How do you get started with FDB? I found it very powerful but couldn't find a good set of instructions on how to set it up and scale it.


Running it locally is as easy as downloading and installing. Scaling FDB is a bit more of a challenge, partly due to their process-per-core design decision, which incidentally helps make FDB as bulletproof as it is.


Where I previously worked runs it in production. It's not hard to scale, but at some point you will need multiple clusters (it maxes out in practice at around 50 instances).

It's basically trouble-free unless you drop below 10% free space on any instance, at which point things go bad.


Not sure if I hit those limits; we were at around 100 nodes and 170-180 processes. The biggest thing we found was tuning the number of recruited proxies and other stateless roles. We were doing up to around 400k TPS once we tuned those.


How bad are we talking? Performance degradation? Or losing transactions and data?


FDB refuses to process writes (degraded write availability) when usable disk space goes below 10%. Nothing will be lost though.


The problem here is when you try to recover the cluster by adding nodes - since FDB manages itself using transactions, this becomes very, very slow and painful if you've allowed multiple nodes to get into this state (which, because of balancing, is kind of how you get there).

Basically, FDB is great as long as you avoid this situation. If you do end up there, woe unto you trying to bring the cluster back online by adding nodes and hoping for rebalancing to fix things. It will, it's just very, very slow. I don't know if that's still true in the current version.


Good to know. And it seems to be tunable with `knob_min_available_space_ratio` [0], as 10% free space on a 4 TB drive would be 400 GB... not exactly hurting for space there.

[0] https://forums.foundationdb.org/t/brand-new-macos-installati...


Curious when the folks at ClickHouse are going to decide to spin their custom C++ rewrite of ZooKeeper (ClickHouse Keeper [1]) out into its own separate project.

[1] https://clickhouse.com/docs/en/operations/clickhouse-keeper/


Funny, the committers were at an Amsterdam meetup last night and we were talking about ClickHouse Keeper. I don't think that's very high up the list of priorities for them right now. The focus is more on ensuring it is bombproof--we're in the "long tail" phase where you stamp out obscure edge cases in increasingly large deployments.

(I'm not a committer but have a lively interest in the topic as my company supports a couple hundred customers running on ClickHouse.)


Is this the project you guys referenced using Apache Arrow for?


Maybe you're thinking of this - the data structure used by datasources for Grafana dashboards:

https://grafana.com/docs/grafana/latest/developers/plugins/d...


I don't think so! I think that's being used in Tempo, but I'm not sure.


We are definitely investigating columnar formats in Tempo to store traces. We expect it to drastically accelerate search as well as open up more complex querying and eventually metrics from distributed tracing data.

However, we are currently primarily targeting Parquet as our columnar format in object storage.

Expect an announcement soon!


The introduction of parallelized Parquet reads coupled with s3Cluster is really awesome. I feel ClickHouse is one step closer to unlocking the ephemeral SQL compute cluster (e.g. Presto, Hive) use case. I could imagine it one day having a read-only Hive Metastore database option for querying companies' existing data...very fast, I might add.
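
Something like this (sketch; cluster name, bucket, and columns are made up) already gets you most of the way there:

    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    # Fan a Parquet scan out across every node in the named cluster; each node
    # reads its share of the matching objects in parallel.
    rows = client.query("""
        SELECT toYYYYMM(event_date) AS month, count()
        FROM s3Cluster(
            'my_cluster',
            'https://my-bucket.s3.amazonaws.com/events/*.parquet',
            'Parquet'
        )
        GROUP BY month
        ORDER BY month
    """).result_rows
    print(rows)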

