ZooKeeper is rock solid. Moving off it is a mistake, IMO.
My tinfoil-hat theory is that the whole impetus for KRaft is that Confluent Cloud's multi-tenant clusters have so many partitions that they started to exceed ZK's capacity, so Confluent built KRaft for Confluent.
And yeah, the migration approach is nutso. Also very annoying: the KRaft metadata topics were made super-secret, for... some good reason, I'm sure.
But that entirely removes the ability you had with ZK to react to cluster metadata changes by watching znodes.
I'm not at all a fan, tbh.
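For anyone who hasn't used them: znode watches are one-shot notifications you attach when reading a path, and re-register when they fire. A minimal sketch of watching Kafka's broker registrations (the /brokers/ids path is real; the connection string and the rest are illustrative):

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

import java.util.List;

public class BrokerWatcher {
    public static void main(String[] args) throws Exception {
        // Session with a no-op default watcher; 15s session timeout.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> {});

        Watcher watcher = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                if (event.getType() == Event.EventType.NodeChildrenChanged) {
                    try {
                        // Watches fire only once; re-register by reading again.
                        List<String> brokers = zk.getChildren("/brokers/ids", this);
                        System.out.println("Broker set changed: " + brokers);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }
        };

        System.out.println("Current brokers: " + zk.getChildren("/brokers/ids", watcher));
        Thread.sleep(Long.MAX_VALUE); // stay alive to receive events
    }
}
```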
We’ve been running a 3-node cluster for several years, and a significant minority of the times I’ve been paged have been because ZK got into a bad state that was fixed by a restart (what bad state exactly? Don’t know, don’t care, don’t have two spare weeks to spend figuring it out). Note that we have proper liveness checks on individual instances, so the issue is more subtle than a single instance dying.
Migrated to Kafka 3.3 with KRaft about half a year ago, and we haven’t had a single issue since. It just runs, and we resize the disks from time to time.
That has not been my experience. I've been running several small clusters (3 and 5 nodes) of Confluent-packaged Kafka for the last 3 years, and roughly 20 times ZooKeeper has gotten into a state where a node isn't in the ensemble, and the way to "fix" it is to restart the current leader node. Usually I have to play whack-a-mole, restarting leaders until the missing node rejoins. Sometimes I haven't been able to get the node back in without shutting down the whole cluster and restarting it.
Once it's running it's fine, until updates are done. But this habit of getting into a weird state sure doesn't sit well with me.
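Half the whack-a-mole is working out which node is currently the leader. ZooKeeper's `srvr` four-letter command reports each server's role over the client port (on ZK 3.5+ it must be whitelisted with `4lw.commands.whitelist=srvr` in zoo.cfg). A rough sketch that speaks the protocol directly, with made-up hostnames:

```java
import java.io.OutputStream;
import java.net.Socket;

public class FindLeader {
    public static void main(String[] args) throws Exception {
        String[] hosts = {"zk1", "zk2", "zk3"}; // hypothetical ensemble members
        for (String host : hosts) {
            try (Socket s = new Socket(host, 2181)) {
                // Send the real "srvr" four-letter command; ZK replies with
                // plain text and closes the connection.
                OutputStream out = s.getOutputStream();
                out.write("srvr".getBytes());
                out.flush();
                String reply = new String(s.getInputStream().readAllBytes());
                for (String line : reply.split("\n")) {
                    if (line.startsWith("Mode:")) {
                        // e.g. "Mode: leader" or "Mode: follower"
                        System.out.println(host + " -> " + line.trim());
                    }
                }
            } catch (Exception e) {
                System.out.println(host + " -> unreachable: " + e.getMessage());
            }
        }
    }
}
```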
This thread is an excellent example of the author's point: Kafka is polarizing.
Personally, in my experience with Kafka and Zookeeper at Airbnb back in the day (we also used ZK for general-purpose service discovery), they both were... temperamental. They'd chug along just fine for a bit, seemingly handling outages that e.g. RDS would have thrown a fit over, and then suddenly they'd be cataclysmically down in extremely complicated ways and be very difficult to bring back up. Even just using them required teaching a more complex mental model than most cloud-hosted offerings of similar things, and you ended up in this path dependency trap of "we already invested so much in Kafka, so if you want to send a message, use Kafka" when for like 95+% of use cases something easy like SQS would've been fine and simpler. TBQH I don't think either Kafka or ZK ever quite paid back their operational overhead cost, and personally I wouldn't recommend using either unless you absolutely need to.
> ZooKeeper is rock solid. Moving off it is a mistake, IMO.
I’m agnostic about Kafka but ZooKeeper is problematic for many use cases based on personal experience and I wouldn’t recommend it. It can be “rock solid” and still not very good. I’ve seen ZK replaced with alternatives at a few different organizations now because it didn’t work well in practice, and what it was replaced with worked much better in every case.
ZooKeeper works, sort of, but I wouldn’t call it “good” in some objective sense.
To be fair, a lot of people use ZK wrong, then complain about it.
For example, if you use it as a general-purpose KV store, the way you'd use Redis, you'll have a bad time.
Another common mistake: assuming ZK doesn't need to store much data, people deploy it to a server with a slow disk or network. Big mistake. Every write to ZK has to be broadcast to the ensemble and synced to disk, so a bottleneck in disk or network IOPS will kill your ensemble.
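The usual mitigation is giving the transaction log its own fast device. A sketch of the relevant zoo.cfg settings — the keys are real ZooKeeper config options, the paths and hostnames are made up:

```
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181

# Snapshots can tolerate a shared disk...
dataDir=/var/lib/zookeeper/data
# ...but the txn log is fsynced on every write, so put it on a
# dedicated device (ideally local SSD), not one shared with the OS.
dataLogDir=/var/lib/zookeeper/txnlog

server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
```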
This has also been my experience with unreliable ZKs: the OS, ZK, and sometimes other services were all sharing the same disk, occasionally with software RAID or something layered on top.
I don't think teams who can't run ZK will have much luck running other distributed systems. (Maybe KRaft, if they're Kafka experts.) Most of the alternatives proposed here have been "let someone else run the hard part." (Which isn't a bad choice, but it's not technically a solution.)