
This level of architecture management on big server CPUs is amazing! I occasionally handle problems like this on a small scale, like minimizing wake time and managing peripheral power on an 8-bit microcontroller, but there the entire scope is digestible once you get into it, and the kernel is custom-designed for the application.

However, in my case, and I expect in yours, requirements engineering is where you can make the greatest improvements. For example, I can save a few cycles and a few microwatts by sequencing my interrupts optimally or moving part of the algorithm into a look-up table. But if I can establish that, say, an LED indicator flash that needs to be 2x as bright but lasts only a couple of milliseconds every second is just as visible as a 500 ms on/off blink cycle, that's roughly a 100x power saving I can't hope to reach with micro-optimizations.
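(For what it's worth, the duty-cycle arithmetic behind that figure is roughly this; the 10 mA drive current below is just an assumed number, and the ratio doesn't depend on it.)

    /* Rough duty-cycle arithmetic behind the "100x" figure above.
     * Assumed numbers: a 2 ms flash per second at twice the drive current,
     * versus a 500 ms on / 500 ms off blink at nominal current. */
    #include <stdio.h>

    int main(void)
    {
        const double i_nominal_ma = 10.0;  /* assumed LED drive current */
        const double period_ms    = 1000.0;

        /* Blink: on for 500 ms of every 1000 ms at nominal current. */
        double blink_avg = i_nominal_ma * (500.0 / period_ms);

        /* Flash: on for 2 ms of every 1000 ms at 2x current. */
        double flash_avg = (2.0 * i_nominal_ma) * (2.0 / period_ms);

        printf("blink average: %.3f mA\n", blink_avg);   /* 5.000 mA */
        printf("flash average: %.3f mA\n", flash_avg);   /* 0.040 mA */
        printf("ratio: %.0fx\n", blink_avg / flash_avg); /* 125x */
        return 0;
    }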

What are your application-level teams doing to reduce the data requirements? General-purpose NUMA fabrics are needed to move data in arbitrary ways between disk, memory, and NICs, but your needs aren't arbitrary: you basically only require a pipeline from disk to memory to the NIC. Do you, for example, keep the first few seconds of all your content cached in memory, because users usually start at the beginning of a stream rather than a few minutes in? Alternatively, if 1000 people all start the same episode of Stranger Things within the same minute, can you add queues at the external endpoints or time-shift them all together so those thousand users only require one disk read?
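(Purely to make the "cache the first few seconds" idea concrete, here is a minimal sketch of what I mean. The layout and sizes are hypothetical, one file per title, and it's not a claim about how Netflix actually does it.)

    /* Pin the first HEAD_BYTES of a title into RAM at startup and serve
     * start-of-stream reads from that copy, falling back to the on-disk
     * file for anything past it. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define HEAD_BYTES (4 * 1024 * 1024)  /* assumed: first few seconds of video */

    struct title_head {
        char   *path;     /* on-disk file for the full title */
        char   *head;     /* pinned copy of the first HEAD_BYTES */
        size_t  head_len;
    };

    /* Load the head of one title into memory. */
    static int pin_head(struct title_head *t, const char *path)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;
        t->path = strdup(path);
        t->head = malloc(HEAD_BYTES);
        ssize_t n = read(fd, t->head, HEAD_BYTES);
        close(fd);
        if (n < 0)
            return -1;
        t->head_len = (size_t)n;
        return 0;
    }

    /* Serve bytes starting at 'off': from RAM while inside the pinned head,
     * from disk otherwise. Returns bytes copied into buf, or -1 on error. */
    static ssize_t serve(struct title_head *t, off_t off, char *buf, size_t len)
    {
        if ((size_t)off < t->head_len) {
            size_t avail = t->head_len - (size_t)off;
            size_t n = len < avail ? len : avail;
            memcpy(buf, t->head + off, n);  /* start of stream: no disk I/O */
            return (ssize_t)n;
        }
        int fd = open(t->path, O_RDONLY);   /* mid-stream: hit the disk */
        if (fd < 0)
            return -1;
        ssize_t n = pread(fd, buf, len, off);
        close(fd);
        return n;
    }

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <title-file>\n", argv[0]);
            return 1;
        }
        struct title_head t;
        if (pin_head(&t, argv[1]) < 0)
            return 1;
        char buf[4096];
        ssize_t n = serve(&t, 0, buf, sizeof buf);  /* a "user pressed play" read */
        printf("served %zd bytes from the pinned head\n", n);
        return 0;
    }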



> Alternatively, if 1000 people all start the same episode of Stranger Things within the same minute

It would be fascinating to hear from Netflix about the details of the usage patterns they see and the particular optimizations they do for them, but I doubt there's much they can do, given the size of the streams, the 'randomness' of what people watch and when they watch it, and the fact that the linked slides say the servers have 18 x 2 TB NVMe drives and 256 GB of RAM per server.

I wouldn't be surprised if the Netflix logo opener exists only once on disk instead of being the first N seconds of every file, though.


In previous talks Netflix has mentioned that, because each box serves so many thousands of people, they basically do zero caching in memory: all of the system memory is needed for buffers en route to users, and they purposely avoid keeping any buffer cache beyond what sendfile() needs.
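For anyone unfamiliar with that path, here is a minimal sketch of the idea. It uses the Linux sendfile(2) and posix_fadvise(2) interfaces purely for illustration; Netflix runs FreeBSD, whose sendfile(2) has a different signature and, I believe, an SF_NOCACHE flag that covers the "don't keep it cached" part in-kernel.

    /* The kernel moves file pages straight to the destination fd, so payload
     * bytes never pass through userspace buffers, and we then advise the
     * kernel to drop those pages so the buffer cache stays no bigger than
     * the in-flight transfers. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/sendfile.h>
    #include <unistd.h>

    /* Stream one chunk of 'fd' to 'out' (a client socket in the real case). */
    static ssize_t send_chunk(int out, int fd, off_t *off, size_t len)
    {
        ssize_t n = sendfile(out, fd, off, len);  /* zero-copy, updates *off */
        if (n > 0)
            /* Hint that these pages won't be re-read. */
            posix_fadvise(fd, *off - n, n, POSIX_FADV_DONTNEED);
        return n;
    }

    int main(int argc, char **argv)
    {
        if (argc < 3) {
            fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
            return 1;
        }
        int fd  = open(argv[1], O_RDONLY);
        int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || out < 0)
            return 1;
        off_t off = 0;
        ssize_t n;
        /* 'out' stands in for the client socket; on Linux >= 2.6.33
         * sendfile() accepts a regular file as the destination, which
         * keeps this demo runnable without any networking. */
        while ((n = send_chunk(out, fd, &off, 1 << 20)) > 0)
            ;
        close(fd);
        close(out);
        return n < 0 ? 1 : 0;
    }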



