Seems like overkill, no? OTel collectors are fairly cheap, so why add expensive Kafka into the mix? If you need to buffer, why not just dump to S3 or a similar data store as temporary storage?
> If you need to buffer, why not just dump to S3 or a similar data store as temporary storage?
At that point it's very easy to sleepwalk into implementing your own database on top of S3, which is very hard to get good semantics out of: it offers essentially no ordering guarantees, and forget about atomicity. For telemetry you might well be OK with fuzzy data, but if you want exact traces every time then Kafka could make sense.
Yeah, and to use S3 efficiently you also need to batch your messages into large blobs of at least tens of MB, which further complicates matters, especially if you don't want to lose those message buffers.
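A minimal sketch of what that batching layer tends to look like (hypothetical names, assumes a boto3-style client; note the in-memory buffer is exactly the data you can lose on a crash):

```python
import io
import time
import uuid

class S3BatchBuffer:
    """Accumulate telemetry records in memory and flush to S3 once the
    batch crosses a size threshold. Illustrative sketch only: anything
    still buffered in memory is lost if the process dies."""

    def __init__(self, s3_client, bucket, min_batch_bytes=32 * 1024 * 1024):
        self.s3 = s3_client
        self.bucket = bucket
        self.min_batch_bytes = min_batch_bytes
        self.buf = io.BytesIO()

    def append(self, record: bytes):
        self.buf.write(record)
        self.buf.write(b"\n")
        if self.buf.tell() >= self.min_batch_bytes:
            self.flush()

    def flush(self):
        if self.buf.tell() == 0:
            return
        # Timestamp + uuid in the key avoids collisions, but S3 still
        # gives no ordering guarantee across concurrent writers.
        key = f"telemetry/{int(time.time())}-{uuid.uuid4()}.ndjson"
        self.s3.put_object(Bucket=self.bucket, Key=key,
                           Body=self.buf.getvalue())
        self.buf = io.BytesIO()
```

Even this toy version has to decide what happens to a half-full buffer on shutdown, which is where the "don't want to lose those message buffers" problem bites.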
This is for when your OTel collector is being overwhelmed. In that case you have a lot of backlogged data that can't be ingested, so you dead-letter-queue it to S3 to free up buffers.
The approach here is to send data to S3 only as a last-ditch resort.
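A rough sketch of that last-ditch path (hypothetical names; the bounded queue stands in for the collector's normal export buffer, and the spill happens only when it is full):

```python
import queue

class OverflowToS3:
    """Try the normal export path first; only when the in-memory buffer
    is full, spill the record to S3 as a dead-letter store. Illustrative
    sketch of the 'last resort' pattern, not a real collector extension."""

    def __init__(self, s3_client, bucket, max_buffered=10_000):
        self.s3 = s3_client
        self.bucket = bucket
        self.q = queue.Queue(maxsize=max_buffered)  # normal export buffer
        self.spilled = 0

    def ingest(self, record: bytes):
        try:
            self.q.put_nowait(record)  # normal path: buffer for export
        except queue.Full:
            # Last-ditch resort: spill to S3 rather than drop outright.
            self.spilled += 1
            self.s3.put_object(Bucket=self.bucket,
                               Key=f"dlq/{self.spilled}",
                               Body=record)
```

The follow-up question below still applies: someone has to build the re-ingestion path that drains `dlq/` back into the pipeline, and that path inherits all of S3's weak ordering semantics.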
If you're OK with losing some data when your collectors are overwhelmed, surely you'd just drop the overflowing data? Why go to all the effort of building a fallback ingestion path if it's not going to be reliable?
It's very hard to see S3 working as a buffer. Any datastore can cover almost any storage use case (buffer/queue/db) when the scale is low, but purpose-built systems like Kafka were designed to do those jobs at scale.