The Avengers by Alan Silvestri from Avengers: Infinity War (Original Motion Picture Soundtrack) 🎵
I plan for my brain to 100% check out starting tomorrow. I’m too excited about disappearing for a while. Mentally I’ve checked out today (and I had today off). In fact, I’m so checked out I don’t even have the patience to find a stealth reference to a pop star to sneak into this blog paragraph since I’m not allowed to cheat in a bet that I’m definitely about to lose in 4 weeks. Sigh.
That said, I’ve been exploring 3 interesting repos – I have yet to actually run them. I do still have to migrate an old website of mine to another AWS account and off of wp-engine by the end of the year in my “spare time” (*bleep*). Mainly I just really want to break Kubernetes, with someone or alone – but I don’t want any of the boring stuff when I finally get to do this.
I want fun stuff. And by fun stuff I mean absolutely ridiculous, fun stuff. So I went searching…for fun ways to break Kubernetes.
Kubethanos
Link: https://github.com/berkay-dincer/kubethanos
Owner: Berkay-dincer
Don’t confuse this with Thanos, the project for running highly available, long-term Prometheus monitoring on Kubernetes (sweet header image though). No, Kubethanos does exactly what Thanos does in the Marvel Infinity War movie. Brief reminder – he snaps and half the universe disappears.
Kubethanos is my love language, which is to say, it was designed to delete half the pods on your cluster. I’ve thought about it and…if one wants an adrenaline rush, it really depends on which cluster you target. Now some of you are going to read that and say “Molly, you can’t deploy this in production,” and while I hear you…I would like to unpack that statement. First, one <can> restrict Kubethanos to specific namespaces.
--namespaces=!kube-system,foo-bar // A namespace or a set of namespaces to restrict kubethanos to
If we’re looking at that and saying, “Molly, that’s still too scary,” then you probably don’t have enough namespaces in your (very likely multi-tenant) cluster, because multi-tenant is what so many Kubernetes clusters really are. And even in the act of reading this blog, you may realize that thinking about disaster recovery and chaos engineering can help teams prioritize before anyone runs a single script. Asking “what would happen if I ran kubethanos?” before you even run it is a good exercise.
Second, let’s say stage or dev environments are your only option and you want to go that direction. If you’ve been anywhere near a Kubernetes cluster, chances are you’ve seen “a perfectly normal test in stage” somehow impact production for no apparent reason, which means some part of that deployment’s application code (or networking) had a weird co-dependency on another production system. GitOps, by the way, does not solve all the problems around co-dependencies, not if you’ve accidentally got embedded configuration in application code that still points at production systems. My point being – run it in stage and find out.
Kubethanos also lets you exclude specific pods. If you ARE more like me you may read that and say “WEAK,” but it is understandable, which is to say: if Kubethanos truly did randomly delete half of ALL pods, there is a great chance the cluster would cease to function the moment it hit the kube-system namespace. There are some basic services that need to stay alive for Kubernetes to function, and with true randomness I’m not even sure that’s a fair chaos example, or one that’s productive to prioritize against. So I’m grateful for the full list of configurations here. In any case, if you’ve given this a try before, do feel free to let me know what you thought. It was also last updated in 2020, so I’m wondering what I’m going to find from that experience as well. I’m pretty excited about the intent and the mission of the repo regardless.
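If I do work up the nerve, I imagine the first run being scoped and dry. A minimal sketch, assuming kubethanos behaves like its chaos-tool cousins (chaoskube and friends): the --namespaces flag comes straight from the repo’s configuration list above, but --kubeconfig and --dry-run are assumptions on my part, so check the actual flag list before trusting any of this.

# Sketch only: --namespaces is documented above; --kubeconfig and --dry-run
# are assumed flag names, verify them against the repo's configuration list.
kubethanos \
  --kubeconfig ~/.kube/config \
  --namespaces='!kube-system,foo-bar' \
  --dry-run   # log which half would get snapped, delete nothing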
KubeInvaders
Link: https://github.com/lucky-sideburn/KubeInvaders
Owner: Eugenio Marzo
KubeInvaders spawned out of Eugenio’s desire to make chaos engineering more fun. It’s straightforward: “Space Invaders, but the aliens are pods.” It can be installed manually, but better yet, it can be installed via a Helm chart.
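For the Helm route, something roughly like this is what I expect to try. The chart repo URL, release namespace, and the target_namespace value are me reciting the README from memory, so treat them as assumptions and double-check against the repo:

# Assumed chart repo and values; verify against the KubeInvaders README
helm repo add kubeinvaders https://lucky-sideburn.github.io/helm-charts/
helm install kubeinvaders kubeinvaders/kubeinvaders \
  --namespace kubeinvaders --create-namespace \
  --set-string config.target_namespace="foo-bar"   # namespace whose pods become the aliens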
I appreciate, though, that it also has metrics. For example, “Current Replicas State Delay is a metric that shows how much time the cluster takes to come back at the desired state of pods replicas.” You can also shuffle the positions of your k8s pods or nodes and auto-jump between namespaces.
Probably the coolest integration is that it also works with KubeLinter, which analyzes YAML files and Helm charts and checks them against sensible defaults and common security misconfigurations.
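KubeLinter is also worth a spin on its own, long before anything gets deleted. Assuming you’ve installed the binary (there’s a Homebrew formula, or grab a release from the stackrox/kube-linter repo), linting a chart directory or a single manifest looks like this (the paths are placeholders):

# Point it at a Helm chart directory or a single YAML manifest
kube-linter lint ./my-chart/
kube-linter lint deployment.yaml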
Getting customers to own their own DR needs a measured approach – one does not simply jump to “We need a company-wide DR program, a chaos engineering team, and to delete entire clusters or availability zones using Fault Injection Simulator.” There is low-hanging fruit like these tools (and many others), where users can benefit from one-off consulting engagements to make their Helm charts and YAML files better – many people are still learning what some Kubernetes terms even are and do. They may not have PodDisruptionBudgets (PDBs) or PriorityClasses, for example. If you’re looking for a place to even start, I’m thinking play is where to start.
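To make that concrete, here’s a minimal PodDisruptionBudget sketch, assuming a hypothetical app labeled app=foo-bar in a foo-bar namespace; it tells the eviction machinery (node drains, upgrades, and other voluntary disruptions) to keep at least two replicas standing:

# app/namespace names are hypothetical; the PDB API itself is standard policy/v1
kubectl apply -f - <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: foo-bar-pdb
  namespace: foo-bar
spec:
  minAvailable: 2          # keep at least 2 pods up during voluntary disruptions
  selector:
    matchLabels:
      app: foo-bar
EOF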
“In a cluster where not all users are trusted, a malicious user could create Pods at the highest possible priorities, causing other Pods to be evicted/not get scheduled. An administrator can use ResourceQuota to prevent users from creating pods at high priorities.” Thinking on this statement from the Kubernetes documentation: while KubeInvaders lets you kill random pods, imagine scenarios where someone effectively cowbirds your cluster (reminder: birds are terrible, but they are also great examples of avian cyber criminals). KubeInvaders is a great example to help show the value of wanting to…live…as a baby bird pod. It’s really that simple.
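The fix the documentation hints at looks something like this: a quota scoped to a priority class, so no cowbird gets to lay unlimited high-priority eggs in a tenant namespace. A minimal sketch, assuming a hypothetical tenant namespace foo-bar and a PriorityClass named high:

# Namespace and PriorityClass names are hypothetical; the scopeSelector
# syntax is the standard way to quota pods by priority class.
kubectl apply -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: limit-high-priority
  namespace: foo-bar
spec:
  hard:
    pods: "5"              # at most 5 pods may claim the "high" priority here
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["high"]
EOF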
KubeCraftAdmin
Link: https://github.com/erjadi/kubecraftadmin
Owner: Eric Jadi
Why put Minecraft on Kubernetes when you could instead put Minecraft on Kubernetes and then also use Minecraft to admin Kubernetes? (Note: if you put it on the same cluster, though, that’s what we like to call a cyclical dependency, friends.) There are many ways I don’t want to manage clusters, and this is one of them. Actually managing a cluster by using Minecraft to control resources and perform work sounds extremely painful. That said, KubeCraftAdmin sounds like a GREAT WAY to create chaos.
As specified by the author: Pigs are Pods, Cows are ReplicaSets, Chickens are Services, Horses are Deployments. You just have to love when all the pigs and cows spawn in and right there on camera it says “kube-system:deployment:coredns.”
Players have the opportunity to…make good choices…based on their own understanding of what coredns does, what attacking in Minecraft does (delete pods? rolling restart? you decide)…and how many copies of that cow they have (replicas) against how their coredns Deployment is tuned for StrategyType and RollingUpdateStrategy.
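Before anyone swings a sword at that particular cow, it’s worth a quick look at what you’re actually working with. Assuming the stock coredns Deployment in kube-system, replicas and rollout strategy both show up in describe output:

# Check how many replicas coredns has and how its Deployment rolls
kubectl -n kube-system get deployment coredns
kubectl -n kube-system describe deployment coredns | grep -E 'Replicas|StrategyType|RollingUpdateStrategy'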
I can imagine this being a really fun way to find out what truly happens when one breaks <whatever they want>. I can also imagine taking this a step further and adding a multiplayer PVP component with betting and two farms and two clusters…Because why not.
F*ck Around and Find Out – Then Fix It
I hope to see more tools like this – regardless of what happens when I finally play with them (keep in mind this blog was written while researching for myself, and I have yet to find the devil in the details by actually trying them), I am grateful these repository owners took the time to build these ideas out of their own inspiration.
If you make something like this – let me know. I want to see it. We need more play.
Header Image by Clem Onojeghuo from Unsplash