Plainly Explained: Using Robusta & ChatGPT to Improve Alerting and Troubleshooting in Kubernetes
This walkthrough is broken down into fairly explicit instructions, based on my first presentation at the local Kubernetes 757 Meetup group that Ryan Renn and I organize.
First things first, this is targeted at audiences who want to know when something in their Kubernetes cluster is "broken". Broken can mean many things, but let's consider pod statuses like "Error" and "CrashLoopBackOff". This also targets people who might be new to Kubernetes, have some understanding at a conceptual level, and can deploy workloads to clusters in some way. No judgment about how you deploy. The objective here is easier, faster alerting, using commoditized tools to help us reduce mean-time-to-resolution.
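To make "broken" concrete, here's a quick filter that surfaces unhealthy pods. The output below is simulated for illustration (the pod names and statuses are made up); against a real cluster you would pipe `kubectl get pods -A --no-headers` into the grep instead of the here-doc:

```shell
# Simulated `kubectl get pods -A` output; filter out healthy statuses
# to surface the "broken" pods we want to be alerted about
cat <<'EOF' | grep -Ev 'Running|Completed'
default     web-7d4b9c      1/1   Running            0   5m
default     crasher-1a2b    0/1   CrashLoopBackOff   7   12m
default     job-xyz         0/1   Error              0   3m
EOF
```

Robusta automates exactly this kind of watching so you don't have to eyeball pod lists yourself.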
PREREQUISITES:
You must have a Slack workspace in which you have elevated/admin permissions
You must have python3, pip, Helm and google-cloud-sdk installed on your workstation
You must have an OpenAI account
If you do not have the above, go get them and come back. I do not advocate running any of this for the first time in a Live or Production environment. It will probably work, but I recommend testing this out in a Sandbox environment first.
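A quick sanity check that the prerequisite CLIs from the list above are on your PATH (the tool names match the prerequisites; nothing here is installed or modified):

```shell
# Check for each required CLI; prints MISSING for anything
# you still need to install before continuing
for tool in python3 pip3 helm gcloud; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "MISSING: $tool"
  fi
done
```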
1. Create a Kubernetes cluster. Doesn't matter how, really
2. Generate an OpenAI API key and document it for later
3. Install robusta-cli on your workstation
pip3 install -U robusta-cli --no-cache
4. Generate a config with Robusta. Most of the defaults are fine.
robusta gen-config
5. Configure the Slack integration
6. Choose a channel to send alerts to
7. Don't configure MS Teams, because it's MS Teams :)
8. Configure the Robusta UI sink
9. Add your email
10. Add an organization name
11. Configure Prometheus
12. Read and accept the EULA
13. Answer the prompt about sending Exception Reports
14. Connect to your cluster (the example command below assumes a GKE cluster in GCP)
gcloud container clusters get-credentials my-special-cluster --zone us-central1-c --project bright-lighthouse-348293
15. Use Helm to add Robusta
helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
16. Use an editor to modify the generated_values.yaml file. Add chat_gpt_token under globalConfig at the top, and the ChatGPT playbook entries at the bottom, so it looks like this:
globalConfig:
  signing_key: signing_key_value
  account_id: account_id_value
  chat_gpt_token: chat_gpt_token_value
playbookRepos:
  chatgpt_robusta_actions:
    url: "https://github.com/robusta-dev/kubernetes-chatgpt-bot.git"
customPlaybooks:
# Add the 'Ask ChatGPT' button to all Prometheus alerts
- triggers:
  - on_prometheus_alert: {}
  actions:
  - chat_gpt_enricher: {}
17. Install Robusta on your cluster with the generated_values.yaml file
helm install robusta robusta/robusta -f ./generated_values.yaml --set clusterName=my-special-cluster
18. Make sure the Robusta pods are on the cluster
kubectl get pods -A | grep robusta
19. Check the Robusta logs if you want to
robusta logs
20. Deploy a crashing pod to test
kubectl apply -f https://gist.githubusercontent.com/robusta-lab/283609047306dc1f05cf59806ade30b6/raw
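If you'd rather not apply a remote gist, a minimal crashing pod you can write yourself looks roughly like this (the pod name and image are illustrative, not the gist's exact contents):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: crasher
    image: busybox
    # exit non-zero immediately so the pod enters CrashLoopBackOff
    command: ["sh", "-c", "exit 1"]
```

Save it as crash.yaml and run kubectl apply -f crash.yaml instead.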
21. Verify the pod you deployed is crashing
kubectl get pods -A
22. Trigger the alert if you're impatient
robusta playbooks trigger prometheus_alert alert_name=KubePodCrashLooping namespace=default pod_name=example-pod
23. Receive an alert in Slack that a pod is crashing and click the "Ask ChatGPT" button in Slack to get troubleshooting help.
Bonus: If your team is at a different level of experience, fork the repo and modify chat_gpt.py as you require.
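For instance, one low-effort customization is tailoring the prompt to your team's experience level. A minimal sketch of the idea, assuming you wire it into your fork yourself (build_prompt and team_level are hypothetical names for illustration, not the repo's actual API):

```python
# Hypothetical sketch of a prompt builder a forked chat_gpt.py might use;
# the function name and parameters are illustrative, not the repo's API
def build_prompt(alert_name: str, team_level: str = "beginner") -> str:
    """Compose a troubleshooting prompt tuned to the team's experience."""
    return (
        f"You are helping a {team_level} Kubernetes team. "
        f"Explain the Prometheus alert '{alert_name}' in plain language "
        "and list step-by-step kubectl commands to troubleshoot it."
    )

print(build_prompt("KubePodCrashLooping", team_level="intermediate"))
```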
By the time I finished writing this, I realized it's much more involved than I thought, but the instructions are still fairly explicit. I think it's a neat tool, especially for people who don't know how or where to get started with Kubernetes. This solution is not perfect, and ChatGPT will not solve all your problems, but it can make life easier by decreasing the time between the alert and troubleshooting. If you like what you see, go check out https://robusta.dev.
