I experienced this few years ago. I was hired with a team of consultants (I know...

I experienced this few years ago. I was hired with a team of consultants (I know.. I know..) because the system in question would have data randomly disappear from their databases every few weeks and it would take the operation team a month to notice, so by then the issue was harder to rectify.

The engineering team in question had proposed to rewrite an old Java app with AWS Amolify and replace their Postgres database with Dynodb. The whole thing was then ducktaped with Lambdas and spread across multiple regions. They had not bothered writing a single line of test, never mind having any kind of build and deployment pipeline.. they didn’t even have basic error reporting.

After doing a deeper dive, we discovered that engineers didn’t have a local environment but only had access to staging, however they could easily connect to the production database from local by just updating environmental variables. It turns out that one of them had forgotten to switch it back after debugging something on production from his machine and had forgotten to switch it back.. he had a background cleanup job running in intervals which was wiping data..

It was a complete nightmare, the schema less nature of Dynamo made it harder to understand what is happening and the React UI would crash every 15 minutes due to an infinite loop.

The operation team had learned to use the Chrome console and clear the local storage manually before the window feeezes..