When Monitoring falls in love with Observability
Users have funny and unpredictable ways of using a product. That is a fact, not an assumption. Can we even think of all those wacky patterns upfront? Functional and performance tests are our way of trying to mitigate them, but are we aware of how our system behaves in production? What happens when the product we build finally goes “live”? Do we observe what is happening with it? Are we learning from our observations?
I’ll share a story about how my daily job used to consist of monitoring our system in production and learning from it. Some of the benefits we found along the way are:
– oracles for our performance tests
– learning about system behavior
– observing (potential) errors in the systems
– keeping bugs from reaching users until we fix them
I’ll show you which tools we used, how we used them, and what we could see by observing them all together, because one tool is never enough.