Software used to be simple enough that one could monitor its state to know whether something was wrong and immediately know why it was broken. One could create a set of alerts based on known errors or known metrics and react appropriately as soon as one of those alerts was triggered. But in today’s systems, there are hundreds or thousands of potential causes, and most of them are unknown in advance. No number of dashboards or alerts can solve this fundamental problem. We need to change our approach and be able to ask our systems any question, observing them while they’re running.
Google Cloud Platform offers a basic set of monitoring capabilities baked into the platform. Virtually anything that runs on top of GCP will get basic monitoring through health checks, basic alerting and basic log management.
Workflows answer the simple question, “what are my systems doing?” by providing you with a simple way to gain a deeper understanding of your infrastructure.
There are plenty of existing algorithms for anomaly detection. Each has its own strengths, but they often require a fairly large sample size to detect anomalies accurately.
In our latest release, Unomaly 3.0, we made significant improvements to our search functionality. Finding specific information is now easier because you can add multiple values for the same filter type.
We built a small tool called uno. It's like uniq, but for logs: it helps you filter out the normal and outputs only the things that are new, i.e. anomalies.
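To make the "uniq for logs" idea concrete, here is a minimal Python sketch. It is not uno's actual algorithm, which this post does not describe. It assumes a log line counts as "normal" once its rough pattern has been seen, where the pattern is the line with numbers and hex IDs masked out. The names `normalize` and `novel_lines` and the masking rule are hypothetical, chosen only for illustration.

```python
import re
from typing import Iterable, Iterator

# Assumption: numbers and hex IDs are the variable parts of a log line.
# A real tool would likely learn the variable parts rather than hard-code them.
_VARIABLE = re.compile(r"0x[0-9a-fA-F]+|\d+")


def normalize(line: str) -> str:
    """Reduce a log line to a rough pattern by masking variable tokens."""
    return _VARIABLE.sub("<*>", line.strip())


def novel_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only lines whose pattern has not been seen before.

    Unlike uniq, which compares adjacent lines only, this remembers
    every pattern seen so far in the stream.
    """
    seen = set()
    for line in lines:
        pattern = normalize(line)
        if pattern not in seen:
            seen.add(pattern)
            yield line.rstrip("\n")
```

Fed a stream such as `sys.stdin`, `novel_lines` prints the first occurrence of each pattern and suppresses later lines that differ only in their numbers. That is roughly the behaviour you would want from a filter that hides the routine and surfaces the new.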
So, Unomaly had a hack week! After the last release, we decided to take the whole following week and let everyone work on whatever they wanted! The purpose was to encourage people to get creative and explore ideas ...