monstrado's comments

I think the point of ACP being an open protocol is so that other editors (e.g. VSCode, Neovim) can implement it as a receiver and integration with ClaudeCode/GeminiCLI/... would just work.


I leveraged FoundationDB and RecordLayer to build a transactional catalog system for all our data services at a previous company, and it was honestly just an amazing piece of software. Adding gRPC into the mix for the serving layer felt so natural since schemas / records are defined using Protobuf with RecordLayer.
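
RecordLayer itself is a Java library and does far more (indexes, schema evolution, query planning), but the core pattern it builds on is simple enough to sketch. A toy illustration with the FDB Python bindings, where catalog_pb2 / Dataset are hypothetical protobuf-generated names:

    import fdb
    # catalog_pb2 is a hypothetical module generated by protoc from our schemas.
    from catalog_pb2 import Dataset

    fdb.api_version(630)
    db = fdb.open()

    # Keep every catalog record under one ordered subspace so it can be range-scanned.
    datasets = fdb.Subspace(('catalog', 'datasets'))

    @fdb.transactional
    def put_dataset(tr, record):
        # Key derived from the record's primary-key field, value is the raw protobuf.
        tr[datasets.pack((record.id,))] = record.SerializeToString()

    @fdb.transactional
    def get_dataset(tr, dataset_id):
        raw = tr[datasets.pack((dataset_id,))]
        return Dataset.FromString(bytes(raw)) if raw.present() else None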

The only real downside is that the on-ramp for running FoundationDB at scale is quite a bit steeper than for a traditional distributed database.


Sounds cool. Any write-up on this? How did you approach the design? What was the motivation to use FoundationDB? How much did you/your team need to learn while doing it?


No write-up, but the main reason was reusing the existing database we were comfortable deploying at the time. We were already using FDB as an online aggregation / mutation store for ad-hoc time-series analytics, albeit with a custom layer that we wrote (not RecordLayer).

When RecordLayer launched, I tested it out by building a catalog system that we could evolve and add new services to from a single repository of protobuf schemas.


Thanks. What are the typical use cases for FDB? What can it do that, say, Cassandra can't?


can you do a concise +/- on FDB? I’ve always thought it was a fantastic architecture but never tried it. tia


Curious, when you started on your project, how did you discover and decide on using FoundationDB?


One reason would be if you're already fluent in ClickHouse's SQL dialect. Although they maintain great standard-SQL compatibility, they also have a great many special functions/aggregates/etc. that are ClickHouse-specific.

Other reasons include their wide range of input formats and special table functions (e.g. querying a URL directly).
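
For example (a quick sketch using the official clickhouse_connect Python client; the URL and columns are made up):

    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    # Query a remote CSV directly with the url() table function; ClickHouse infers
    # the schema from the named format (CSVWithNames here).
    rows = client.query(
        "SELECT count(), avg(price) "
        "FROM url('https://example.com/listings.csv', 'CSVWithNames')"
    ).result_rows
    print(rows)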


I built an online / mutable time-series database using FDB a few years back at a previous company. Not only was it rock solid, but it scaled linearly pretty effortlessly. It truly is one of the more novel pieces of modern technology out there, and I wish there were more layers built on top of it.


As an engineer who admires the work done by DuckDB, I'm disappointed that the co-founder of its evolution is spreading FUD about competitors before it's even in the competitive conversation.

> Stability. It OOMS, your CTO mentioned that last week.

I ran ClickHouse clusters for years with zero stability issues (even as a beginner at the time) at an extremely high-volume video game studio with real-time needs. Using online materialized views, I was able to maintain rollups of vital KPIs at millisecond latency while sustaining multi-thousand QPS. Stability was never a concern of ours, and quite frankly, we were kind of blown away.
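
The pattern is roughly the following (simplified sketch; table and column names are invented, not the actual schema we ran):

    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    # Raw events land in a MergeTree table; the materialized view maintains a
    # per-minute rollup at insert time, so dashboards only read the small table.
    client.command("""
        CREATE MATERIALIZED VIEW kpi_per_minute
        ENGINE = SummingMergeTree
        ORDER BY (event_type, minute)
        AS
        SELECT
            event_type,
            toStartOfMinute(event_time) AS minute,
            count() AS events,
            sum(revenue) AS revenue
        FROM game_events
        GROUP BY event_type, minute
    """)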

> Scale. The distributed plan is broken and I'm not sure Clickhouse even has shuffle.

First, I hate the word "broken" with zero explanation of what you mean by it. Based on your language, I'm assuming you're just suggesting the distributed plans aren't as efficient as they could be, a limitation the engineers are not shy to admit.

> SQL. It is very non-standard.

I would argue the language is more a superset than "non-standard". Almost everything just worked for us, and I often found queries I could simplify significantly thanks to the "non-standard" extras they've added. For example: did you know they have a built-in aggregate function for computing retention?!
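
E.g., something along these lines (sketch with invented table/columns) replaces the usual self-join gymnastics:

    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    # retention() returns an array of 0/1 flags per user: did the first condition
    # hold, and if so, did each subsequent condition also hold?
    result = client.query("""
        SELECT
            sum(r[1]) AS day0_users,
            sum(r[2]) AS day1_retained,
            sum(r[3]) AS day7_retained
        FROM (
            SELECT
                user_id,
                retention(
                    event_date = '2024-01-01',
                    event_date = '2024-01-02',
                    event_date = '2024-01-08'
                ) AS r
            FROM events
            GROUP BY user_id
        )
    """)
    print(result.result_rows)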

> Knobs. Lots of knobs that are poorly documented. It's unclear which are mandatory. You have to restart for most.

Yes, there are a lot of knobs. ClickHouse works wonderfully out of the box with the default knobs, but you're free to tinker because that's how flexible the technology is.

You worked at Google for over a decade? You should know. Google is notorious for having a TON of knobs on its internal technology (e.g. BigTable). Just because the knobs are there doesn't mean they must be tuned; it just means the engineers thought ahead. Also, the vast majority of configuration changes I've made never required a restart...I'm not even sure why you pointed this out.

(Disclaimer: I have been using ClickHouse successfully for several years)


Disclaimer: I work at ClickHouse

At a previous company, I wrote a simple TCP server to receive LineProtocol, parse it, and write to ClickHouse. I was absolutely blown away by how fast I could chart data in Grafana [1]. The compression was stellar as well; I was able to store and chart years of historical data. We basically just stopped sending data to Influx and migrated everything over to the ClickHouse backend.
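
It was roughly this shape (a toy sketch, not the original code; it only handles the simplest line-protocol cases and assumes a pre-created `metrics` table):

    import asyncio
    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    def parse_line(line: str):
        # Minimal InfluxDB line protocol: "measurement,tag=v field=1.5 <ts_ns>"
        # (ignores escaping and multiple tags/fields for brevity).
        head, fields, ts = line.strip().split(' ')
        measurement = head.split(',')[0]
        field_name, field_value = fields.split(',')[0].split('=')
        return measurement, field_name, float(field_value), int(ts)

    async def handle(reader, writer):
        rows = []
        while data := await reader.readline():
            try:
                rows.append(parse_line(data.decode()))
            except ValueError:
                continue  # skip malformed lines
        if rows:
            client.insert('metrics', rows,
                          column_names=['measurement', 'field', 'value', 'ts_ns'])
        writer.close()

    async def main():
        server = await asyncio.start_server(handle, '0.0.0.0', 8089)
        async with server:
            await server.serve_forever()

    asyncio.run(main())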

[1] https://grafana.com/grafana/plugins/grafana-clickhouse-datas...


This is exactly what the engineers behind FoundationDB (FDB) wanted when they open-sourced it. For those who don't know, FDB provides a transactional (and distributed) ordered key-value store with a somewhat simple but very powerful API.

Their vision was to solve the hardest parts of building a database, such as transactions, fault-tolerance, high-availability, elastic scaling, etc. This would free users to build higher-level (Layers) APIs [1] / libraries [2] on top.

The beauty of these layers is that you can basically remove doubt about the correctness of data once it leaves the layer. FoundationDB is one of the (if not the) most tested [3] databases out there. I used it for over 4 years in high write / read production environments and never once did we second-guess our decision.
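
To make "layer" concrete: here's a toy counter layer in the Python bindings; it inherits all of FDB's transactional guarantees for free (names invented):

    import struct
    import fdb

    fdb.api_version(630)
    db = fdb.open()

    # A "layer" is just client code that maps a higher-level model onto ordered
    # key-value pairs. Here: named counters under one subspace.
    counters = fdb.Subspace(('counters',))

    @fdb.transactional
    def increment(tr, name, delta=1):
        # Atomic ADD mutation; the param is an 8-byte little-endian integer.
        tr.add(counters.pack((name,)), struct.pack('<q', delta))

    @fdb.transactional
    def read(tr, name):
        raw = tr[counters.pack((name,))]
        return struct.unpack('<q', bytes(raw))[0] if raw.present() else 0

    increment(db, 'page_views')
    print(read(db, 'page_views'))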

I could see this project renamed to simply "fdb-sqlite-layer"

[1] https://github.com/FoundationDB/fdb-document-layer

[2] https://github.com/FoundationDB/fdb-record-layer

[3] https://www.youtube.com/watch?v=OJb8A6h9jQQ


> Their vision was to solve the hardest parts of building a database, such as transactions, fault-tolerance, high-availability, elastic scaling, etc. This would free users to build higher-level (Layers) APIs [1] / libraries [2] on top.

That is a very interesting, simple, and valuable insight that seems to be missing from the wiki page. But also from the wiki page <https://en.wikipedia.org/wiki/FoundationDB>, there's this:

--

The design of FoundationDB results in several limitations:

Long transactions - FoundationDB does not support transactions running over five seconds.

Large transactions - Transaction size cannot exceed 10 MB of total written keys and values.

Large keys and values - Keys cannot exceed 10 kB in size. Values cannot exceed 100 kB in size.

--

Those (unless worked around) would be absolute blockers to several systems I've worked on.


This project (mvSQLite) appears to have found a way around the 5s transaction limit as well as the size limits, so that's really promising. That being said, I believe the new Redwood storage engine in FDB 7.0+ is making inroads in eliminating some of these limitations, and this project should also benefit from that new storage engine (prefix compression is a big one).


A simple workaround is to store “bulk data” in an external system like blob storage and reference that from the DB.


but now transactional guarantees only extend to the id stored in the DB, and not to the external storage.

Therefore, it's possible that the id is invalid (for the external storage) when referenced in the future. I think doing so only adds complexity as the system grows.

It would be better to chunk your blob data to fit the DB, imho. It beats introducing external blob storage in the long run.


> but now transactional guarantees only extend to the id stored in the DB, and not to the external storage.

Depends! If the ID is a cryptographic hash, then as long as the blob is uploaded first, the DB can't be inconsistent with the blob [1].

A Merkle Tree also allows "updates" by chunking the data into some convenient size, say 64 MB, and then building a new tree for each update and sticking that into the database.

[1] With the usual caveats that nobody is manually mucking about with the blob store, that it hasn't "lost" any blobs due to corruption, etc, etc...
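
In code, the ordering looks something like this (the blob_store and db clients here are hypothetical placeholders, just to show the sequencing):

    import hashlib

    def store_document(db, blob_store, payload: bytes, doc_id: str):
        # Content-addressed: the blob-store key IS the hash of the bytes, so the
        # reference in the DB can never point at different content.
        digest = hashlib.sha256(payload).hexdigest()

        # 1. Upload the blob first (idempotent: same bytes -> same key).
        blob_store.put(digest, payload)  # hypothetical blob client

        # 2. Only then commit the small, transactional record that references it.
        db.insert('documents', {'id': doc_id, 'blob_sha256': digest})  # hypothetical DB client

        return digest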


> With the usual caveats that nobody is manually mucking about with the blob store, that it hasn't "lost" any blobs due to corruption, etc, etc...

Yeah, with those caveats. But how do you make sure they apply? If someone does manually muck about with the blob store, or it does lose blobs due to corruption, then your transaction is retroactively "un-atomicized" with no trace thereof in the actual DB.


If corruption happens then any guarantees by the hardware are voided, and the software guarantees (of durability) which are built on the hardware guarantees are equally voided. So corruption -> dead in the water.

But otherwise I agree.


But if you upload to the blob store first, then add the id (hash or not) in your db insert, what happens if the db transaction fails? You now have to work out a way to delete the blob from the external store. Or change your application so that it doesn't matter if it's left on the blob store (except for the money).


How do you get started with FDB? I found it very powerful but couldn't find a good set of instructions on how to set it up and scale it.


Running it locally is as easy as downloading and installing. Scaling FDB is a bit more of a challenge, partly due to their process-per-core design decision, which incidentally helps make FDB as bulletproof as it is.


Where I previously worked runs it in production. It's not hard to scale, but at some point you will need multiple clusters (it maxes out in practice at around 50 instances).

It's basically trouble-free unless you drop below 10% free space on any instance, at which point things go bad.


Not sure if I hit those limits; we were at around 100 nodes and 170-180 processes. The biggest thing we found was tuning the number of recruited proxies and other stateless roles. We were doing up to around 400k TPS once we tuned those.


How bad are we talking? Performance degradation? Or losing transactions and data?


FDB refuses to process writes (degraded write availability) when usable disk space goes below 10%. Nothing will be lost though.


The problem here is when you try to recover the cluster by adding nodes - since FDB manages itself using transactions, this becomes very, very slow and painful if you've allowed multiple nodes to get into this state (which, because of balancing, is kind of how you get there).

Basically, FDB is great as long as you avoid this situation. If you do end up there, woe unto you trying to bring the cluster back online by adding nodes and hoping for rebalancing to fix things. It will, it's just very, very slow. I don't know if that's still true in the current version.


Good to know. And it seems to be tunable with `knob_min_available_space_ratio` [0], as 10% free space on a 4 TB drive would be 400 GB... not exactly hurting for space there.

[0] https://forums.foundationdb.org/t/brand-new-macos-installati...


Curious when the folks at ClickHouse are going to decide to spin their custom C++ rewrite of ZooKeeper (ClickHouse Keeper [1]) out into its own separate project.

[1] https://clickhouse.com/docs/en/operations/clickhouse-keeper/


Funny, the committers were at an Amsterdam meetup last night and we were talking about ClickHouse Keeper. I don't think that's very high up the list of priorities for them right now. The focus is more on ensuring it is bombproof--we're in the "long tail" phase where you stamp out obscure edge cases in increasingly large deployments.

(I'm not a committer but have a lively interest in the topic as my company supports a couple hundred customers running on ClickHouse.)


Is this the project you guys referenced using Apache Arrow for?


Maybe you're thinking of this - the data structure used by datasources for Grafana dashboards:

https://grafana.com/docs/grafana/latest/developers/plugins/d...


I don't think so! I think that's being used in Tempo, but I'm not sure.


We are definitely investigating columnar formats in Tempo to store traces. We expect it to drastically accelerate search as well as open up more complex querying and eventually metrics from distributed tracing data.

However, we are currently primarily targeting Parquet as our columnar format in object storage.

Expect an announcement soon!


The introduction of parallelized Parquet reads coupled with s3Cluster is really awesome. I feel ClickHouse is one step closer to unlocking the ephemeral SQL compute cluster (e.g. Presto, Hive) use case. I could imagine it one day having a read-only Hive Metastore database option for querying companies' existing data...very fast, I might add.
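
Something like this (sketch; cluster name, bucket, and columns are made up) already gets you most of the way there:

    import clickhouse_connect

    client = clickhouse_connect.get_client(host='localhost')

    # Fan a Parquet scan out across every node in the named cluster; each node
    # reads its share of the matching objects in parallel.
    rows = client.query("""
        SELECT toYYYYMM(event_date) AS month, count()
        FROM s3Cluster(
            'my_cluster',
            'https://my-bucket.s3.amazonaws.com/events/*.parquet',
            'Parquet'
        )
        GROUP BY month
        ORDER BY month
    """).result_rows
    print(rows)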

