1) scala/java ... is that fundamentally difficult?
2) zookeeper is being eliminated as a dependency from kafka
3) durable disk management ... I mean, it's data, and it goes on a disk.
Look, do you want a distributed fault-tolerant system that doesn't run on specialized / expensive hardware? Well, sorry, those systems are hard. I get this a lot for Cassandra.
You either have the stones for it as a technical org to run software like that, or you pay SaaS overhead for it. A Go binary is not going to magically solve this.
EVEN IF you go SaaS, you still need monitoring and a host of other aspects (perf testing, metrics, etc) to keep abreast of your overall system.
And what's with pretending that S3 doesn't have ingress/egress charges? Last I checked those were more or less in line with EBS networking charges and inter-region costs, but I haven't looked in like a year.
And if this basically ties you to AWS, then why not just ... pay for AWS managed Kafka from Confluent?
The big fake sell from this is that it magically makes Kafka easy because it ... uses Go and uses S3. From my experience, those and "disk management" aren't the big headaches with Kafka and Cassandra masterless distributed systems. They are maybe 5% of the headaches or less.
> 1) scala/java ... is that fundamentally difficult?
It's certainly at least more so, since you have a highly configurable VM in between that forces you to learn Java-isms to manage it (you can't just lean on your Unix skills).
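For a concrete sense of those Java-isms: Kafka picks up its JVM settings from environment variables, so tuning it means learning JVM flags, not generic Unix knobs. A sketch (the flag values here are illustrative, not recommendations):

```shell
# Kafka's start scripts read these env vars; the flags themselves
# are JVM-specific knowledge you have to acquire.
export KAFKA_HEAP_OPTS="-Xms6g -Xmx6g"          # fixed-size JVM heap
export KAFKA_JVM_PERFORMANCE_OPTS="-XX:+UseG1GC \
  -XX:MaxGCPauseMillis=20 \
  -XX:InitiatingHeapOccupancyPercent=35"        # G1 garbage-collector tuning
bin/kafka-server-start.sh config/server.properties
```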
> 3) durable disk management ... I mean, it's data, and it goes on a disk.
Most MQs don't store things to disk beyond flushing memory so they can recover from a crash; in most cases the data is cleared as soon as the message is acked or expires.
Look, I'm not saying not to use Kafka, I'm just pointing out the evaluation criteria. There are certainly better options if you just want a MQ, especially if you want to support MQ patterns like fanout.
The reality is that if you're doing <20k TPS on an MQ (most are) and don't need replay/persistence, then ./redis-server will suffice, and operationally it will be much, much easier.
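As a sketch of how little ceremony that takes: a bare Redis list is already a work queue (this assumes a local redis-server is running; the `jobs` key name is arbitrary):

```shell
# Producer: push a job onto a list.
redis-cli LPUSH jobs '{"id": 1, "task": "resize-image"}'

# Consumer: blocking pop from the other end (5s timeout).
# Once popped, the message is gone — no replay, no persistence guarantees.
redis-cli BRPOP jobs 5
```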
But... Go is GC'd as well. Most JVM gripes are about the knobs on the GC, but Go is still a fundamentally garbage-collected language, so you'd have issues with that too.
So... Go was the rewrite? Scylla at least rewrote Cassandra in C++ with some nice close-to-hardware improvements. Rust? OK. C++? OK. Avoid the GC pauses and get thread-per-core and userspace networking to bypass syscall boundaries.
And look, this thing is not going to steal the market share of Kafka. Kafka will continue to get supported, patched, and whenever the next API version of AWS comes out (it needs one), will this get updated for that?
Yeah, Kafka is "enterprisey" because ... it's Java? Well no: Kafka is scalable, flexibly deployable (there's a reason big companies like the JVM), has a company behind it, is tunable, has support options, can be SaaS'd, and has a knowledge base (REEEAAALLLLY important for distributed systems).
All those SQLite/RocksDB projects that slapped a raft protocol on top of them are in the same boat compared to Scylla or Cassandra or Dynamo. Distributed systems are HARD and need a mindshare of really smart experienced people that sustain them over time. Because when Kafka/Cassandra type systems get properly implemented, they are important systems moving / storing / processing a ton of data. I've seen hundred node Cassandra systems, those things aren't supposed to go down, ever. They are million dollar a year (maybe month) systems.
The big administration lifts in them, like moving clouds, upgrading a cluster, or recovering from region losses and intercontinental network outages, are known quantities. Is some ad hoc Go binary rewrite going to have all that, documented, with many people who know how to do it?