Somehow I spent the early part of my career building distributed systems before I ever learned queuing theory, and I learned this lesson the hard way.
Nowadays with DB stuff I tend to get assigned new infra leads who see a DB cluster at 50% CPU utilization and think they can go down two instance sizes without severely impacting latency.
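If you've never plugged numbers into the queuing formulas, it's worth doing once. Here's a toy M/M/1 sketch (single server, exponential service times, and a made-up 5 ms service time, so treat it as the shape of the curve rather than a prediction about any real database):

    # Toy M/M/1 model: mean response time vs. utilization (rho).
    # Single server, exponential service times -- kinder than a real
    # database, but the shape of the curve is the point.
    service_time_ms = 5.0  # hypothetical mean time per query

    for rho in (0.5, 0.7, 0.8, 0.9, 0.95):
        # M/M/1 mean response time: T = S / (1 - rho)
        latency_ms = service_time_ms / (1.0 - rho)
        print(f"utilization {rho:.0%}: mean latency {latency_ms:.0f} ms")

    # utilization 50%: mean latency 10 ms
    # utilization 70%: mean latency 17 ms
    # utilization 80%: mean latency 25 ms
    # utilization 90%: mean latency 50 ms
    # utilization 95%: mean latency 100 ms

Latency scales with 1/(1 - utilization), so the curve is flat around 50% and a wall past 85% or so. Shrinking the hardware under a cluster that's already at 50% pushes you straight onto the steep part, and the tail latencies get there before the averages do.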
For me it was people seeing a machine at 70% utilization and thinking they could squeeze another small service onto the box. The first time, it just didn’t sound right; after that I knew it was a bad idea. By the third time I was willing to throw a veto if they wouldn’t listen to reason.
And the thing is, even if you stuff a low-priority service onto a bunch of boxes, and convince the OS to honor that priority fairly, the fact that the service runs reasonably at all gets baked in as an expectation. Maybe it’s the hedonic treadmill: one kid expects dessert with every meal because they’ve always gotten it, and another knows it’s a special occasion. But anything given is jealously guarded when you have to take it away. Even a “best effort” batch process that is supposed to finish on some interval is missed when it no longer does. And somehow it’s always your fault.
I’m sure the grocery store employees who are assigned as backup tellers constantly get grief for not getting their other tasks done “on time”.
This is, by the way, one of those problems the Cloud solved, because management couldn’t use friction with the procurement and operations departments to pin you into oversubscribing machines. Cloud computing usually leaves the barn door open so that you can requisition another server and ask forgiveness later. And I think at some level developers all know this, so we went along with the dope deal. Because fuck Operations and Accounting for making us the scapegoats for problems they cause.