If you run workloads on Kubernetes, your operations (Ops) and site reliability engineering (SRE) teams need the right tooling to keep the cluster, and the workloads running in it, highly reliable.
Kube-ops-view offers a unified operational view of multiple Kubernetes clusters. SRE and Ops teams will find it a helpful tool.
Kube-ops-view is a read-only system dashboard. Its features include:
- Switch between several Kubernetes clusters.
- Render nodes and display their overall status (“Ready”).
- Show resource usage and node capacity (CPU, memory).
- Animate pod creation and termination.
- Use screen tokens to display dashboards on TVs.
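To make the "resource usage and node capacity" item concrete, here is a minimal Python sketch of how a dashboard might turn Kubernetes-style quantities such as `1500m` or `16Gi` into utilization percentages. The node data, the helper names, and the limited set of memory suffixes are assumptions for illustration; this is not how Kube-ops-view itself is implemented.

```python
# Hypothetical sample data; not produced by Kube-ops-view itself.
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}  # subset of suffixes


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity ("250m" or "2") to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity ("512Mi", "4Gi") to bytes."""
    for suffix, factor in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def utilization(used: str, capacity: str, parse) -> float:
    """Return usage as a percentage of capacity, rounded to one decimal."""
    return round(100 * parse(used) / parse(capacity), 1)


node = {
    "name": "node-1",
    "ready": True,
    "capacity": {"cpu": "4", "memory": "16Gi"},
    "usage": {"cpu": "1500m", "memory": "4Gi"},
}

cpu_pct = utilization(node["usage"]["cpu"], node["capacity"]["cpu"], parse_cpu)
mem_pct = utilization(node["usage"]["memory"], node["capacity"]["memory"], parse_memory)
print(f'{node["name"]} ready={node["ready"]} cpu={cpu_pct}% memory={mem_pct}%')
```

In a real cluster the capacity and usage figures would come from the Kubernetes API and a metrics source rather than a hard-coded dictionary.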
Cabin is a native mobile dashboard app for Kubernetes. Its UI is built with React Native and runs on iOS and Android smartphones.
It acts as a mobile assistant that offers fine-grained actions for managing Kubernetes resources, and the app is touch-friendly.
- Seamless support for Google Kubernetes Engine (GKE): you can create GKE clusters directly from your phone.
- Early support for Helm charts: on the move, you can browse chart repositories and launch charts with a single click.
- Access pod logs, search resources by label, and initiate rolling updates by altering the image of your deployments, among other things.
Kube-shell is an integrated shell for working with the Kubernetes CLI. Its features include:
- Command auto-completion, auto-suggestions, and in-line documentation.
- You can access the history of executed commands using the up/down arrow keys.
- Awareness of the current kubeconfig context, with simple switching between clusters and namespaces.
Telepresence is an open-source solution that allows you to debug a service locally while maintaining connectivity with its dependencies hosted in a remote Kubernetes cluster and remote cloud resources such as a database.
Telepresence already provides a robust local development environment for Kubernetes-based services. Its live debugging feature is newer, but it is evolving quickly.
Weave Scope is a monitoring and troubleshooting tool for Kubernetes and Docker. It automatically organizes your infrastructure and applications into logical topologies that your SRE and Ops teams can use to understand, monitor, and control your containerized, microservices-based application.
Weave Scope offers drill-down views of nodes and processes, including deployments, services, replica sets, pods, and containers.
You can also filter by CPU and memory consumption, or use search to locate node types, containers, and processes by name, label, or path.
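The filtering and search behavior described above can be illustrated with a small Python sketch. The container records, field names, and helper functions are hypothetical and only show the idea; they are not Weave Scope's data model or API.

```python
# Hypothetical container records; field names are assumptions for illustration.
containers = [
    {"name": "web-1", "labels": {"app": "web"}, "cpu_pct": 72.0, "mem_mb": 310},
    {"name": "web-2", "labels": {"app": "web"}, "cpu_pct": 12.5, "mem_mb": 295},
    {"name": "db-1", "labels": {"app": "db"}, "cpu_pct": 55.0, "mem_mb": 2048},
]


def filter_by_usage(items, min_cpu_pct=0.0, min_mem_mb=0):
    """Keep items whose CPU and memory usage meet both thresholds."""
    return [
        c for c in items
        if c["cpu_pct"] >= min_cpu_pct and c["mem_mb"] >= min_mem_mb
    ]


def search(items, term):
    """Case-insensitive match against the name or any "key=value" label."""
    term = term.lower()
    return [
        c for c in items
        if term in c["name"].lower()
        or any(term in f"{k}={v}".lower() for k, v in c["labels"].items())
    ]


busy = filter_by_usage(containers, min_cpu_pct=50)
web = search(containers, "app=web")
print([c["name"] for c in busy])
print([c["name"] for c in web])
```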
Bloomberg’s engineering team created PowerfulSeal after being inspired by Chaos Monkey.
It injects failures into your Kubernetes clusters by killing targeted pods and nodes, so you can find problems before they cause an outage. It has two modes of operation: interactive and autonomous.
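As a rough illustration of the autonomous idea, the Python sketch below selects random victims among pods that match a namespace and a set of labels. It only prints what it would kill. The pod inventory, the `pick_victims` helper, and the selection policy are assumptions made for this example and do not reflect PowerfulSeal's actual policy format or code.

```python
import random

# Hypothetical inventory; a real tool would read this from the Kubernetes API.
pods = [
    {"name": "web-1", "namespace": "prod", "labels": {"app": "web"}},
    {"name": "web-2", "namespace": "prod", "labels": {"app": "web"}},
    {"name": "db-1", "namespace": "prod", "labels": {"app": "db"}},
    {"name": "web-3", "namespace": "staging", "labels": {"app": "web"}},
]


def pick_victims(pods, namespace, labels, count, rng):
    """Randomly choose up to `count` pods matching the namespace and labels."""
    candidates = [
        p for p in pods
        if p["namespace"] == namespace
        and all(p["labels"].get(k) == v for k, v in labels.items())
    ]
    return rng.sample(candidates, min(count, len(candidates)))


rng = random.Random(42)  # seeded so a chaos run can be reproduced
victims = pick_victims(pods, "prod", {"app": "web"}, count=1, rng=rng)
for pod in victims:
    # Dry run: report the target instead of deleting anything.
    print(f"would kill pod {pod['name']} in namespace {pod['namespace']}")
```

Seeding the random generator is a design choice for the sketch: it makes a failed experiment repeatable while keeping victim selection unpredictable across seeds.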
Google’s Marmot is a workflow execution engine for processing DevOps/SRE workflows. It was developed to handle infrastructure changes but can also be used with Kubernetes.
It is especially well suited to operations that must be completed in steps with specific pacing and that may require state health checks.
For example, when deploying a new version of a service with many instances on Kubernetes, you could use it to perform an incremental, controlled rollout.
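The idea of a paced, health-checked rollout can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not Marmot's workflow format or API: the `staged_rollout` function, the `update` and `healthy` callbacks, and the instance names are all hypothetical.

```python
import time


def staged_rollout(instances, batch_size, update, healthy, pause_seconds=0.0):
    """Update instances in batches and halt at the first unhealthy batch.

    Returns (completed, halted_batch); halted_batch is empty on success.
    """
    completed = []
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        for instance in batch:
            update(instance)
        if not all(healthy(instance) for instance in batch):
            return completed, batch  # stop before touching later batches
        completed.extend(batch)
        time.sleep(pause_seconds)  # pacing between batches
    return completed, []


# Hypothetical fleet of five instances, all starting on version v1.
versions = {f"web-{i}": "v1" for i in range(5)}


def update(instance):
    versions[instance] = "v2"


def healthy(instance):
    return instance != "web-3"  # simulate one instance failing its check


completed, halted = staged_rollout(list(versions), 2, update, healthy)
print("completed:", completed)
print("halted at:", halted)
print("untouched:", [name for name, v in versions.items() if v == "v1"])
```

Because the rollout stops at the first failing batch, the last instance stays on the old version, which limits the blast radius of a bad release.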
Ark helps you manage disaster recovery for your Kubernetes resources and volumes. It offers a straightforward and reliable way to restore Kubernetes resources and Persistent Volumes from a series of checkpoints. Backup files are kept in an object storage service (e.g., Amazon S3). Typical uses include:
- Disaster recovery with a shorter TTR (time to respond).
- Migration of Kubernetes API objects across cloud providers.
- Setting up development and testing environments (+ CI) by replicating the production environment.
Sysdig is a container troubleshooting tool that collects Linux kernel system calls and events. Simply put, Sysdig is a cluster-wide version of strace + tcpdump + htop + iftop + lsof + wireshark.
Sysdig Inspect is a user interface for visualizing Sysdig data. SRE and Ops teams can use Sysdig Inspect to help with container problems and security investigations.
Sysdig Falco is another tool built on the detailed information that Sysdig gathers.
Falco monitors behavior and is designed to detect anomalous activity in your applications. For instance, it can alert you to events such as:
- A shell running inside a container.
- A container running in privileged mode.
- A sensitive host device being mounted inside a container.
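The three example events above can be expressed as simple predicates over an event record. The Python sketch below is only an illustration of rule-based detection: Falco has its own rule language, and the event fields, rule names, and sensitive path prefixes used here are assumptions.

```python
# Assumed path prefixes that count as "sensitive" for this sketch.
SENSITIVE_PREFIXES = ("/dev", "/proc")
SHELLS = {"sh", "bash", "zsh"}

# Each rule is a (name, predicate) pair evaluated against one event.
RULES = [
    ("shell_in_container",
     lambda e: bool(e.get("container")) and e.get("process") in SHELLS),
    ("privileged_container",
     lambda e: e.get("privileged") is True),
    ("sensitive_mount",
     lambda e: any(m.startswith(SENSITIVE_PREFIXES) for m in e.get("mounts", []))),
]


def matched_rules(event):
    """Return the names of all rules that the event triggers."""
    return [name for name, condition in RULES if condition(event)]


# Hypothetical events; a real system would derive these from system calls.
events = [
    {"container": "web-1", "process": "bash"},
    {"container": "db-1", "process": "postgres",
     "privileged": True, "mounts": ["/dev/sda"]},
    {"container": "api-1", "process": "python"},
]

for event in events:
    print(event["container"], matched_rules(event))
```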
The Kubernetes ecosystem is expanding rapidly. Numerous open-source and commercial tools are available to help you be more productive and efficient when operating mission-critical Kubernetes clusters and services.