> nobody of any competency runs stuff on AWS complacently.
I agree with this. My point is simply that we, as an industry, are not a very competent bunch when it comes to risk management ; and that's especially true when compared to TSOs.
That doesn't mean nobody knows what they do in our industry or that shit never hits the fan elsewhere, but I would argue that it's an outlier behaviour, whereas it's the norm in more secure industries.
> As the meme goes "one does not simply just run something on AWS"
The meme has currency for a reason, unfortunately.
---
That being said, my original point was that utilities losing clients after a storm isn't the consequence of bad (or no) risk assessment ; it's the consequence of them setting up acceptable loss thresholds depending on the likelihood of an event happening, and making sure that the network as a whole can respect these SLOs while strictly respecting safety criteria.
Nobody was suggesting that loss of utilities is a result of bad risk management. We are saying that all competent businesses run risk management and for most businesses, the cost of AWS being down is less than the cost of going multi cloud.
This is particularly true when Amazon hand out credits like candy. So you just need to moan to your AWS account manager about the service interruption and you’ll be covered.
I agree with this. My point is simply that we, as an industry, are not a very competent bunch when it comes to risk management ; and that's especially true when compared to TSOs.
That doesn't mean nobody knows what they do in our industry or that shit never hits the fan elsewhere, but I would argue that it's an outlier behaviour, whereas it's the norm in more secure industries.
> As the meme goes "one does not simply just run something on AWS"
The meme has currency for a reason, unfortunately.
---
That being said, my original point was that utilities losing clients after a storm isn't the consequence of bad (or no) risk assessment ; it's the consequence of them setting up acceptable loss thresholds depending on the likelihood of an event happening, and making sure that the network as a whole can respect these SLOs while strictly respecting safety criteria.