Photo by Nick Loggie on Unsplash

I have been researching for a long time to understand how big tech giants monitor the quality of their services and the health of their immense infrastructure. I was quite sure that monotonous pings were not the solution. Then what was it? What different approach do they (Google) take to track the quality of their services and customer satisfaction?

Continuous Monitoring 🔍

It is an automated way of checking the uptime and health of various computing resources. It is a proactive approach that improves the DevOps lifecycle by showing the areas where special care is required. It can be divided into two broad categories


Continuous Delivery pipeline

Helm has become the de facto package manager for applications in Kubernetes. I personally like calling it configuration management for containers.

Spinnaker is an open-source, multi-cloud continuous delivery platform to deploy, upgrade, scale, and roll back applications on VMs (AWS EC2, GCE), container orchestration platforms (K8s, OpenStack), etc.

In this article, we’ll see how we can use Spinnaker to deploy any Helm chart, i.e., deploy a container on Kubernetes.

What is a Helm Chart?

A Helm Chart is a collection of files that describe a related set of Kubernetes resources. Helm makes it easy to install containerized applications such as Nginx, Prometheus, and Grafana.

Below is the command to deploy any chart.

helm…


Are you tired of searching for alerts sent by Alertmanager in Slack, Teams, and email? Do you get overwhelmed by irrelevant alerts during your on-call shift and spend too much time looking for resolved messages?

If you are already using Prometheus, Alertmanager, and Grafana, then let us expand the scope and visualize the alerts in Grafana too!

MTTR (mean time to recovery or mean time to restore) is the average time it takes to recover from a product or system failure. Reference: Atlassian
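The definition above reduces to simple arithmetic: total recovery time divided by the number of incidents. Here is a minimal sketch, assuming each incident is recorded as a (failure time, recovery time) pair. The function name and the sample incidents are hypothetical and are not taken from any monitoring tool.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Return the average (recovery - failure) duration across incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the individual downtimes, starting from a zero-length timedelta.
    total = sum((recovered - failed for failed, recovered in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (failure time, recovery time) pairs.
incidents = [
    (datetime(2021, 1, 4, 10, 0), datetime(2021, 1, 4, 10, 30)),   # 30 minutes down
    (datetime(2021, 1, 9, 22, 15), datetime(2021, 1, 9, 23, 45)),  # 90 minutes down
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 1:00:00
```

Two incidents of 30 and 90 minutes give an MTTR of one hour. Spending less time hunting for alerts shortens each individual downtime and so lowers this average.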

OPEN ALERTS OF ALERTMANAGER

Thanks to CampToCamp for grafana-prometheus-alertmanager-datasource, which uses the Alertmanager API to make it a datasource for Grafana. We can use…


What if I told you that finding a needle in a haystack is very easy? At least for metrics collected by monitoring tools such as Prometheus and OpenTSDB.

Prometheus has become one of the most common open-source monitoring tools, simple yet powerful. It can be used to pull metrics from machines, web servers, databases, load balancers, messaging queues, CI systems, loggers, other monitoring systems, and cloud providers. It has a large ecosystem of exporters.

We’ll see how to make a Grafana dashboard that lets end users easily drill down to their specific metrics by region, product, environment, and application. The exporter used is node-exporter.
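Drilling down like this amounts to narrowing a PromQL selector with label filters. Here is a minimal sketch, assuming the node-exporter metrics carry labels named region, environment, and application. Those label names and the helper `build_selector` are assumptions for illustration; in Grafana the filtering would normally be driven by dashboard variables rather than Python.

```python
def _escape(value):
    """Escape backslashes and double quotes for a PromQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_selector(metric, **labels):
    """Build a PromQL selector with exact-match filters for non-empty labels."""
    matchers = ",".join(
        f'{name}="{_escape(value)}"'
        for name, value in sorted(labels.items())
        if value  # an empty value means "do not filter on this label"
    )
    return f"{metric}{{{matchers}}}" if matchers else metric


# Assumed label names; node_cpu_seconds_total is a node-exporter metric.
query = build_selector(
    "node_cpu_seconds_total",
    region="us-east-1",
    environment="prod",
    application="checkout",
    product="",  # left unselected, so no filter is added
)
print(query)
# node_cpu_seconds_total{application="checkout",environment="prod",region="us-east-1"}
```

Each additional label narrows the result set, which is what makes the needle easy to find: the user picks a region, then an environment, then an application, and the query shrinks to the matching series.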

A Grafana dashboard can act as a single pane of glass for an engineer, manager, or C-level person.

Shubham Choudhary

DevOps & Software engineer, OSS Enthusiast, mission happiness
