Aprameyah

AI-Driven Kubernetes for GitOps and Agent Systems

Mode: Online

Workshop Duration: 2 Days, ~4 hours per session

Dates: 11th & 12th Sep 2025

Timings: 2:00 PM – 6:00 PM IST (each day)

Overview:

This intensive workshop is meticulously designed for experienced technology professionals with prior Kubernetes knowledge who want to master advanced Kubernetes use cases.

You will gain practical, hands-on experience deploying, managing, and scaling AI/ML workloads in a GitOps framework, and explore how Kubernetes can serve as a runtime for agentic AI systems.

 

Key Objectives: 

By the end of the workshop, participants will:

✅ Deploy and scale GenAI and ML workloads on Kubernetes
✅ Implement observability and autoscaling with custom Prometheus metrics
✅ Package and deliver workloads using Helm and ArgoCD
✅ Extend Kubernetes with Operators and Custom Resource Definitions (CRDs)
✅ Run Kubernetes as a runtime for agentic AI systems
✅ Apply Kubernetes-native support for GPUs and AI enhancements

Prerequisites:

  • Strong understanding of Kubernetes fundamentals (Pods, Services, Deployments, YAML)
  • Hands-on experience with containerized applications and microservices
  • Familiarity with Git and CLI tools
  • Basic knowledge of AI/ML models and containerized APIs

Course Outline

Day 1 (4 Hours)

Module 1: Deploying GenAI APIs on Kubernetes

Run a containerized LLM inference service on Kubernetes

  • FastAPI-based GenAI app (e.g., summarizer/chatbot)
  • Write YAML specs (Deployments, Services, Ingress)
  • Create a base Helm chart for the app
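The specs above can be sketched as a minimal Deployment plus Service pair. All names, the image reference, and the port are placeholders for illustration; your app's actual values will differ:

```yaml
# Hypothetical manifest for the FastAPI GenAI app (names and image are placeholders)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: genai-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: genai-api
  template:
    metadata:
      labels:
        app: genai-api
    spec:
      containers:
        - name: api
          image: registry.example.com/genai-api:0.1.0  # placeholder image
          ports:
            - containerPort: 8000  # common FastAPI/uvicorn port
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: genai-api
spec:
  selector:
    app: genai-api
  ports:
    - port: 80
      targetPort: 8000
```

The lab then exposes this Service through an Ingress and templates the same resources into the Helm chart.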

 

Hands-on Lab:

  • Deploy FastAPI+Ollama model on KIND with Ingress
  • Test endpoint with curl or Postman

Module 2: Observability + Autoscaling for AI APIs

Monitor and auto-scale AI APIs

  • Prometheus + Grafana stack via Helm
  • Export FastAPI metrics using Prometheus middleware
  • Configure Prometheus Adapter for custom metrics
  • Use HPA with request count or latency
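Once the Prometheus Adapter exposes application metrics through the custom metrics API, the HPA can target them directly. A minimal sketch, assuming a hypothetical `http_requests_per_second` metric and the `genai-api` Deployment name used as a placeholder:

```yaml
# Hypothetical HPA driven by a custom Prometheus metric (metric name is an assumption)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: genai-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: genai-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second  # exposed via Prometheus Adapter
        target:
          type: AverageValue
          averageValue: "50"  # scale up above ~50 req/s per pod
```

The same structure works for a latency metric; only the metric name and target value change.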

   

Hands-on Lab:

  • Visualize LLM metrics
  • Set up autoscaler to respond to load

Module 3: GitOps Deployment with Helm + ArgoCD

Automate GenAI delivery across environments

  • Helm values for staging vs production
  • Use Kustomize for overlays
  • ArgoCD (or FluxCD) for GitOps-based syncing
  • Argo Image Updater for CI/CD triggers
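In ArgoCD, each environment is typically modeled as an `Application` pointing at the same chart with a different values file. A sketch for the staging environment, with the repo URL, paths, and namespaces as placeholder assumptions:

```yaml
# Hypothetical ArgoCD Application for the staging environment (repo/paths are placeholders)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: genai-api-staging
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/genai-gitops.git  # placeholder Git repo
    targetRevision: main
    path: charts/genai-api
    helm:
      valueFiles:
        - values-staging.yaml  # swap for values-prod.yaml in the prod Application
  destination:
    server: https://kubernetes.default.svc
    namespace: genai-staging
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

A second `Application` with `values-prod.yaml` covers production, which is how the lab deploys to two environments from one Git repo.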

   

Hands-on Lab:

  • Deploy GenAI app to two environments via GitOps
  • Trigger new release by updating Docker tag

Day 2 (4 Hours)

Module 4: AI-Augmented Kubernetes with kubectl-ai + MCP

Use LLM-powered agents to *assist* platform users with operational tasks, YAML generation, error resolution, and lifecycle automation.

  • What is AI-Augmented Kubernetes?
  • Introduction to `kubectl-ai`
  • Kubernetes MCP Servers

Hands-on Lab:

  • Using `kubectl-ai`

  • Deploying Kubernetes MCP Server

Module 5: Deploying Agentic AI Systems on Kubernetes

Run agent-based workflows (independent of Argo Workflows)

  • Architecture of agentic systems on K8s
  • LangGraph-based multi-agent app:
    • Nodes = Agents (Docker containers or Jobs)
    • Edges = Communication via queues or volume
    • Optional: Use Redis, RabbitMQ, or Kubernetes Jobs as workflow steps
  • Use Job, CronJob, StatefulSet, and sidecar patterns for agent deployment
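In the "nodes = agents" model above, each agent step can run as a Kubernetes Job that reads from and writes to a shared queue. A minimal sketch for one agent node, with the image, queue address, and names as placeholder assumptions:

```yaml
# Hypothetical Job for a single agent node; the queue URL models the "edge" between agents
apiVersion: batch/v1
kind: Job
metadata:
  name: agent-reviewer
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: reviewer
          image: registry.example.com/agents/reviewer:0.1.0  # placeholder agent image
          env:
            - name: TASK_QUEUE_URL
              value: redis://redis.agents.svc:6379/0  # shared Redis queue (assumed)
```

CronJobs fit scheduled agents, and StatefulSets or sidecars fit long-lived agents that hold state or share a pod with a message consumer.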

Hands-on Lab:

  • Deploy a LangGraph agent workflow (build/test/deploy pipeline)
  • Agents: Code Generator, Reviewer, Image Builder, Scanner (Trivy), Publisher

Module 6: Native Kubernetes AI/ML Features (Showcase + Demos)

Demo + Visuals:

  • How GPU requests work via the device plugin
  • How to schedule a training job using JobSet + Kueue
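With a device plugin (e.g., NVIDIA's) installed on the node, a pod requests a GPU like any other resource. A minimal sketch, with the image as a placeholder:

```yaml
# Hypothetical GPU-consuming pod; requires the NVIDIA device plugin on the node
apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  restartPolicy: Never
  containers:
    - name: train
      image: registry.example.com/trainer:0.1.0  # placeholder training image
      resources:
        limits:
          nvidia.com/gpu: 1  # scheduler places the pod only on a node with a free GPU
```

The demo extends this idea to multi-pod training jobs, where JobSet groups the workers and Kueue queues the job until GPU quota is available.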

Register Now
