Hacker News | goyalanuj's comments

We stream whatever data we need in real time to the Kafka cluster.

We don't do that for the production database because we don't need its data in real time.


We have fixed it! Thank you!


Thank you Jonathan!

As Jonathan mentioned, we made this decision about 9 months ago. At that time, Kinesis wasn't as mature and offered less flexibility around things like the retention period.

Kafka is very reliable (I saw it handle billions of events a day at LinkedIn) and has a huge open-source community around it. At IFTTT, we always prefer to use and contribute to open source (http://engineering.ifttt.com/oss/2015/07/23/open-source/).


I'm assuming that you run Kafka within AWS. Most of the hardware requirements and suggestions I've seen for Kafka are for non-virtualized environments. If you're able to go into it, could you share some details...

- What is the size of your Kafka cluster?

- What instance types do you use?

- Do you use EBS or ephemeral storage?

- How much do you over-provision to deal with instance loss?

- Any other gotchas/considerations?

Thanks!

