Kubernetes Threat Model (16%)

This domain covers how to systematically identify, categorize, and mitigate threats in a Kubernetes environment. Understanding trust boundaries, attack surfaces, common attack vectors, and established threat modeling frameworks is essential for the KCSA exam.

Threat Modeling Methodology

Threat modeling is a structured process for identifying security threats and designing countermeasures. In a Kubernetes context, it involves analyzing the cluster architecture, identifying assets worth protecting, and understanding how attackers could compromise them.

General Threat Modeling Steps

Identify assets — What are you protecting? (cluster state, secrets, application data, customer data)
Define trust boundaries — Where does trust change? (external vs. internal network, namespace boundaries, node boundaries)
Identify threats — What could go wrong? (use STRIDE, MITRE ATT&CK, or attack trees)
Assess risk — How likely and impactful is each threat? (probability x impact)
Define mitigations — What controls reduce or eliminate each threat? (RBAC, network policies, encryption, monitoring)
Validate — Test that mitigations are effective (penetration testing, security audits, CIS benchmarks)
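
As a rough illustration of the "Assess risk" step, the sketch below scores a few threats and sorts them by likelihood times impact. The 1 to 5 scales, the threat names, and the scores are hypothetical and chosen only for the example; they do not come from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain); the scale is an assumption
    impact: int      # 1 (negligible) to 5 (severe); the scale is an assumption

    @property
    def risk(self) -> int:
        # Risk score from the "Assess risk" step: probability x impact
        return self.likelihood * self.impact


def rank_threats(threats):
    """Return threats sorted from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.risk, reverse=True)


# Hypothetical register entries, for illustration only
register = [
    Threat("Stolen ServiceAccount token", likelihood=3, impact=4),
    Threat("etcd exposed without TLS", likelihood=1, impact=5),
    Threat("Pod without resource limits", likelihood=4, impact=2),
]

for threat in rank_threats(register):
    print(f"{threat.risk:>2}  {threat.name}")
```

A ranking like this only orders the work; the mitigation and validation steps still have to be carried out for each entry.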

Exam Tip

The exam tests your ability to identify threats and match them with appropriate mitigations. Focus on understanding which Kubernetes features address which threats rather than memorizing frameworks by rote.

Kubernetes Trust Boundaries

Trust boundaries are points in the system where the level of trust changes. Crossing a trust boundary typically requires authentication or authorization.

graph TB
    subgraph External["External (Untrusted)"]
        INET["Internet / End Users"]
        EXTAPI["External APIs"]
    end
    subgraph ClusterBoundary["Cluster Boundary"]
        subgraph ControlPlane["Control Plane"]
            API["API Server"]
            ETCD["etcd"]
            SCHED["Scheduler"]
            CM["Controller Manager"]
        end
        subgraph NodeBoundary["Worker Node"]
            KL["Kubelet"]
            subgraph NS1["Namespace A"]
                POD1["Pod A"]
            end
            subgraph NS2["Namespace B"]
                POD2["Pod B"]
            end
        end
    end
    INET -->|"Trust Boundary 1"| API
    API -->|"Trust Boundary 2"| ETCD
    API -->|"Trust Boundary 3"| KL
    POD1 -.->|"Trust Boundary 4"| POD2

Key Trust Boundaries in Kubernetes

Trust Boundary	Description	Controls
External to Cluster	Entry point from untrusted networks into the cluster	Ingress controllers, firewalls, WAF, TLS termination
User to API Server	Human or CI/CD systems accessing the Kubernetes API	Authentication (OIDC, certs), RBAC authorization
API Server to etcd	The API server is the only component that should access etcd directly	Mutual TLS, network isolation
Control Plane to Nodes	API server communicating with kubelets	TLS, kubelet authentication, NodeRestriction
Between Namespaces	Logical separation of workloads	NetworkPolicies, RBAC, ResourceQuotas
Pod to Pod	Communication between application components	NetworkPolicies, mTLS (service mesh)
Container to Host	Container process accessing node resources	SecurityContext, seccomp, AppArmor, SELinux
Container to Container	Containers in the same pod share a network namespace and can share volumes	Separate pods for different trust levels

Attack Surfaces

The attack surface is the sum of all points where an attacker could interact with the system. Reducing the attack surface minimizes potential entry points.

Kubernetes Attack Surface Categories

Control Plane Attack Surface:

API server exposed ports and endpoints
etcd data store
Scheduler and controller manager endpoints
Cloud provider metadata API (169.254.169.254)
Dashboard and monitoring UIs

Workload Attack Surface:

Container images (vulnerabilities, malware)
Application code (injection, logic flaws)
Exposed services (Ingress, LoadBalancer, NodePort)
Mounted volumes and secrets
Service-to-service communication

Infrastructure Attack Surface:

Worker node OS and kernel
Container runtime (containerd, CRI-O)
Network infrastructure (CNI plugin, service mesh)
Supply chain (image registries, CI/CD pipelines)
Cloud provider APIs and IAM

Reducing the Attack Surface

Minimize exposed services (use ClusterIP instead of NodePort/LoadBalancer when possible)
Use minimal base images (distroless, scratch)
Remove unnecessary tools from containers (no shell, no curl, no wget)
Disable unused API server features and admission controllers
Block access to the cloud metadata API from pods
Close unused ports on nodes
Disable the Kubernetes Dashboard or restrict access severely
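
One way to act on the first item is to list the Services that are reachable from outside the cluster. The sketch below is a minimal example that assumes Service manifests have already been loaded as plain Python dicts (for instance from `kubectl get services -A -o json`); the function name and the sample Services are hypothetical.

```python
EXTERNALLY_REACHABLE_TYPES = {"NodePort", "LoadBalancer"}


def exposed_services(services):
    """Return (namespace, name, type) for Services reachable from outside the cluster.

    `services` is a list of Service manifests as plain dicts. A missing
    spec.type is treated as ClusterIP, which is the Kubernetes default.
    """
    findings = []
    for svc in services:
        svc_type = svc.get("spec", {}).get("type", "ClusterIP")
        if svc_type in EXTERNALLY_REACHABLE_TYPES:
            meta = svc.get("metadata", {})
            findings.append((meta.get("namespace", "default"), meta.get("name"), svc_type))
    return findings


# Hypothetical Services, for illustration only
example_services = [
    {"metadata": {"name": "web", "namespace": "shop"}, "spec": {"type": "LoadBalancer"}},
    {"metadata": {"name": "api", "namespace": "shop"}, "spec": {"type": "ClusterIP"}},
    {"metadata": {"name": "cache", "namespace": "shop"}, "spec": {}},
    {"metadata": {"name": "debug", "namespace": "shop"}, "spec": {"type": "NodePort"}},
]

print(exposed_services(example_services))
```

The check does not look at Ingress resources, which expose ClusterIP Services as well, so it covers only part of the externally reachable surface.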

Common Attack Vectors

Privilege Escalation

An attacker gains higher privileges than initially granted.

Vector	Description	Mitigation
Overly permissive RBAC	User/SA has more permissions than needed	Least privilege RBAC, audit permissions
`allowPrivilegeEscalation: true`	Container process can gain more privileges than its parent process (for example, via setuid binaries)	Set `allowPrivilegeEscalation: false`
Writable hostPath mounts	Pod writes to host filesystem (e.g., `/etc/cron.d`)	Block hostPath in Pod Security Standards
Privileged containers	Container has full host access	Use `restricted` Pod Security Standard
Access to ServiceAccount tokens	Stolen token used to call API server	Disable auto-mount, use bound tokens
`create pods` RBAC permission	User creates a pod with elevated privileges	Combine RBAC with admission policies
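
The first and last rows of the table can be checked mechanically. The sketch below is a simplified heuristic, not a complete RBAC analyzer: it assumes a Role or ClusterRole manifest loaded as a plain dict, ignores `apiGroups` and `resourceNames`, and flags only wildcards and pod creation. The function name and sample roles are hypothetical.

```python
def risky_rbac_rules(role):
    """Return human-readable findings for a Role or ClusterRole manifest (dict)."""
    findings = []
    for rule in role.get("rules", []):
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        # Wildcards grant far more than most workloads need
        if "*" in verbs or "*" in resources:
            findings.append("wildcard verbs or resources")
        # Pod creation lets the subject request elevated pod settings
        # unless an admission policy blocks them
        if resources & {"pods", "*"} and verbs & {"create", "*"}:
            findings.append("can create pods")
    return findings


# Hypothetical roles, for illustration only
deployer = {
    "kind": "Role",
    "metadata": {"name": "deployer"},
    "rules": [{"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "create"]}],
}
reader = {
    "kind": "Role",
    "metadata": {"name": "config-reader"},
    "rules": [{"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get"]}],
}

print(risky_rbac_rules(deployer))
print(risky_rbac_rules(reader))
```

A finding here is a prompt for review rather than proof of a problem; a deployment controller legitimately needs to create pods, which is why the table pairs that permission with admission policies.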

Container Escape

An attacker breaks out of the container isolation boundary to access the host.

Vector	Description	Mitigation
Kernel vulnerabilities	Exploit kernel bugs from within container	Patch nodes, use seccomp/AppArmor
Privileged containers	Full access to host devices and namespaces	Never use `privileged: true` for workloads
Host PID/IPC namespace	Container shares namespaces with host	Block `hostPID`/`hostIPC`
Docker socket mount	Container accesses container runtime	Never mount `/var/run/docker.sock`
Writable hostPath	Write to host filesystem	Block hostPath volumes
Capability abuse	Excessive Linux capabilities	Drop ALL, add only required ones
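
Several of these vectors are visible directly in the pod specification. The following sketch assumes a Pod manifest loaded as a plain dict and reports the settings from the table that it finds. It is a simplified illustration: it inspects only `spec.containers` (not init or ephemeral containers), and the function name and sample pods are hypothetical.

```python
def container_escape_findings(pod):
    """Return findings for a Pod manifest given as a plain dict."""
    spec = pod.get("spec", {})
    findings = []

    # Host namespace sharing
    if spec.get("hostPID"):
        findings.append("hostPID is enabled")
    if spec.get("hostIPC"):
        findings.append("hostIPC is enabled")

    # hostPath volumes, including container runtime sockets
    for volume in spec.get("volumes", []):
        host_path = volume.get("hostPath")
        if host_path is not None:
            findings.append(f"hostPath volume: {host_path.get('path')}")

    # Per-container security context
    for container in spec.get("containers", []):
        name = container.get("name")
        sc = container.get("securityContext", {})
        if sc.get("privileged"):
            findings.append(f"{name}: privileged container")
        dropped = sc.get("capabilities", {}).get("drop", [])
        if "ALL" not in dropped:
            findings.append(f"{name}: does not drop ALL capabilities")
    return findings


# Hypothetical pods, for illustration only
risky_pod = {
    "spec": {
        "hostPID": True,
        "volumes": [{"name": "sock", "hostPath": {"path": "/var/run/docker.sock"}}],
        "containers": [{"name": "app", "securityContext": {"privileged": True}}],
    }
}
hardened_pod = {
    "spec": {
        "containers": [
            {"name": "app", "securityContext": {"capabilities": {"drop": ["ALL"]}}}
        ]
    }
}

for finding in container_escape_findings(risky_pod):
    print(finding)
```

In a cluster, the same intent is normally enforced at admission time (for example with Pod Security Standards) rather than by scanning manifests after the fact.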

Denial of Service (DoS)

An attacker exhausts cluster resources to disrupt services.

Vector	Description	Mitigation
Resource exhaustion	Pod consumes all CPU/memory on a node	LimitRange, ResourceQuota, resource requests/limits
Fork bombs	Container creates unlimited processes	Set a per-pod PID limit on the kubelet (`podPidsLimit` / `--pod-max-pids`)
Storage exhaustion	Logs or temp files fill up disk	Ephemeral storage limits, log rotation
API server flooding	Excessive API requests	API Priority and Fairness, rate limiting
Network flooding	Overwhelming network bandwidth	NetworkPolicies, rate limiting at ingress
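
For the resource exhaustion row, a quick check is whether every container declares CPU and memory limits. The sketch below assumes a Pod manifest loaded as a plain dict; the function name and sample pod are hypothetical. Note that a LimitRange in the namespace can inject default limits at admission time, so a manifest without limits is not necessarily unlimited at runtime.

```python
def containers_missing_limits(pod, required=("cpu", "memory")):
    """Return {container_name: [missing limit names]} for a Pod manifest (dict)."""
    missing = {}
    for container in pod.get("spec", {}).get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        absent = [name for name in required if name not in limits]
        if absent:
            missing[container.get("name")] = absent
    return missing


# Hypothetical pod, for illustration only
example_pod = {
    "spec": {
        "containers": [
            {"name": "web", "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
            {"name": "sidecar", "resources": {"limits": {"memory": "64Mi"}}},
            {"name": "worker"},
        ]
    }
}

print(containers_missing_limits(example_pod))
```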

Compromised Container

An attacker gains code execution inside a running container.

Vector	Description	Mitigation
Application vulnerability	RCE via code flaw (e.g., Log4Shell)	Image scanning, patching, WAF
Dependency vulnerability	Vulnerable library in container image	Dependency scanning, SBOM
Lateral movement	Compromised container accesses other services	NetworkPolicies, mTLS, least privilege SA
Data exfiltration	Sensitive data stolen from container	Encrypt data, restrict egress, runtime monitoring
Cryptomining	Resources hijacked for mining	Resource limits, runtime anomaly detection

Supply Chain Attacks

An attacker compromises the software supply chain to inject malicious code.

Vector	Description	Mitigation
Malicious base image	Attacker publishes a trojanized image	Use trusted registries, verify image signatures
Compromised dependency	Malicious code in a library	Dependency scanning, lock files, SBOM
Tampered CI/CD pipeline	Attacker modifies build process	Signed commits, pipeline integrity checks
Registry compromise	Attacker pushes malicious image to registry	Image signing (cosign/Sigstore), admission policies
Typosquatting	Malicious image with a name similar to a popular one	Use fully qualified image names with digest
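
The typosquatting mitigation (fully qualified names pinned by digest) can be approximated with a string check. The sketch below is a heuristic, not a full image reference parser: it accepts only `sha256` digests and treats the first path component as a registry host when it contains a dot or a colon or equals `localhost`. The function name and the example registry are hypothetical.

```python
import re

# A sha256 digest is 64 lowercase hex characters
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned_and_qualified(image):
    """Heuristic check: image names a registry host and is pinned by sha256 digest."""
    if not DIGEST_RE.search(image):
        return False
    name = image.split("@", 1)[0]
    first, sep, _rest = name.partition("/")
    # Heuristic commonly used to spot a registry host: the first path
    # component contains "." or ":" or is "localhost".
    return bool(sep) and ("." in first or ":" in first or first == "localhost")


fake_digest = "sha256:" + "a" * 64  # placeholder digest, not a real image

print(is_pinned_and_qualified("nginx:latest"))
print(is_pinned_and_qualified(f"registry.example.com/team/app@{fake_digest}"))
```

A digest pins the content but says nothing about who built it, so this check complements signature verification rather than replacing it.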

STRIDE Threat Model

STRIDE is a threat classification model developed by Microsoft. Each letter represents a category of threat.

Category	Description	Kubernetes Example	Mitigation
Spoofing	Impersonating another identity	Stolen ServiceAccount token used to access API	Strong authentication (OIDC, client certificates), short-lived bound tokens, mTLS
Tampering	Unauthorized modification of data	Modifying ConfigMaps or Secrets	RBAC, audit logging, immutable ConfigMaps
Repudiation	Denying an action occurred	Admin modifies cluster without audit trail	Audit logging, centralized log collection
Information Disclosure	Exposing data to unauthorized parties	Secrets readable by unauthorized pods	RBAC on Secrets, encryption at rest, NetworkPolicies
Denial of Service	Making a service unavailable	Pod consuming all node resources	ResourceQuotas, LimitRanges, PDBs
Elevation of Privilege	Gaining unauthorized higher access	Container escape via privileged mode	Pod Security Standards, SecurityContext
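
A common way to remember STRIDE is to pair each category with the security property it violates. The lookup below sketches that pairing; the pairing is the one usually given alongside STRIDE rather than something stated in the table above, and the function name is hypothetical.

```python
# STRIDE category -> security property it violates
STRIDE_PROPERTIES = {
    "Spoofing": "Authentication",
    "Tampering": "Integrity",
    "Repudiation": "Non-repudiation",
    "Information Disclosure": "Confidentiality",
    "Denial of Service": "Availability",
    "Elevation of Privilege": "Authorization",
}


def violated_property(category):
    """Look up the security property a STRIDE category violates (case-insensitive)."""
    for name, prop in STRIDE_PROPERTIES.items():
        if name.lower() == category.strip().lower():
            return prop
    raise KeyError(f"not a STRIDE category: {category!r}")


for name, prop in STRIDE_PROPERTIES.items():
    print(f"{name}: violates {prop}")
```

Asking "which property does this threat break?" is often a faster route to the right category than matching against example threats.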

MITRE ATT&CK for Kubernetes

The MITRE ATT&CK framework maps real-world adversary techniques to a structured matrix. Microsoft published a Kubernetes-specific threat matrix based on this framework.

Key Tactic Categories

Tactic	Description	Example Techniques
Initial Access	How attackers first enter the cluster	Compromised image, exposed dashboard, vulnerable application, misconfigured kubelet
Execution	Running malicious code	Exec into container, create new container, deploy malicious job/cron
Persistence	Maintaining access after initial compromise	Backdoor container, create a new privileged pod, modify startup scripts, writable hostPath
Privilege Escalation	Gaining higher privileges	Privileged container, access cloud metadata, ServiceAccount token theft, hostPath mount
Defense Evasion	Avoiding detection	Clear container logs, delete Kubernetes events, name pods to resemble system pods (e.g., in kube-system)
Credential Access	Stealing credentials	List Secrets, access ServiceAccount tokens, access cloud metadata
Discovery	Mapping the environment	Access API server, list pods/services/namespaces, query DNS
Lateral Movement	Moving to other workloads	Access other pods via network, exploit ServiceAccount permissions
Collection	Gathering target data	Read sensitive data from volumes, exfiltrate Secrets
Impact	Disrupting or destroying resources	Data destruction, resource hijacking (cryptomining), denial of service
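
To show how tactics can drive detection, the sketch below tags Kubernetes audit events with a tactic from the table. It assumes events shaped like `audit.k8s.io/v1` entries with `verb` and `objectRef.resource` / `objectRef.subresource` fields; the three rules are hypothetical examples and nowhere near a complete detection ruleset.

```python
def tag_tactic(event):
    """Map a Kubernetes audit event (dict) to a tactic name, or None if no rule matches."""
    ref = event.get("objectRef", {})
    verb = event.get("verb")
    resource = ref.get("resource")
    subresource = ref.get("subresource")

    # Exec into a container
    if resource == "pods" and subresource == "exec":
        return "Execution"
    # Reading Secrets
    if resource == "secrets" and verb in {"get", "list", "watch"}:
        return "Credential Access"
    # Enumerating workloads and namespaces
    if resource in {"pods", "services", "namespaces"} and not subresource and verb == "list":
        return "Discovery"
    return None


# Hypothetical audit events, for illustration only
events = [
    {"verb": "create", "objectRef": {"resource": "pods", "subresource": "exec", "namespace": "shop"}},
    {"verb": "list", "objectRef": {"resource": "secrets", "namespace": "shop"}},
    {"verb": "list", "objectRef": {"resource": "namespaces"}},
    {"verb": "get", "objectRef": {"resource": "configmaps", "namespace": "shop"}},
]

for event in events:
    print(tag_tactic(event), event["verb"], event["objectRef"]["resource"])
```

Many legitimate clients list pods or read Secrets, so rules like these need context (who made the call, from where, how often) before they are useful as alerts.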

Microsoft Threat Matrix

Microsoft's "Threat Matrix for Kubernetes" maps real-world attacks to the ATT&CK framework specifically for Kubernetes environments. It is a valuable study resource for understanding how attacks progress through a cluster.

Threat Modeling Example

Consider a web application deployed in Kubernetes with a frontend, backend API, and database:

graph LR
    USER["User"] -->|HTTPS| ING["Ingress"]
    ING --> FE["Frontend Pod"]
    FE -->|HTTP| BE["Backend Pod"]
    BE -->|TCP 5432| DB["Database Pod"]
    BE -->|HTTPS| EXT["External API"]

Threats and Mitigations:

Threat	Category	Mitigation
SQL injection via frontend	Tampering	Input validation, parameterized queries, WAF
Backend accessing database secrets	Information Disclosure	Dedicated SA, RBAC-restricted Secret access
Compromised frontend accessing database	Lateral Movement	NetworkPolicy: frontend can only reach backend
Attacker accessing API server from pod	Privilege Escalation	Disable SA token auto-mount, restrict RBAC
Malicious container image in CI/CD	Supply Chain	Image scanning, signing, admission policy
DDoS on ingress	Denial of Service	Rate limiting, CDN, horizontal pod autoscaling
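
The lateral movement row can be enforced from the database side: only backend pods may connect, and only on TCP 5432. The sketch below builds such a NetworkPolicy as a Python dict and prints it as JSON, which `kubectl apply -f` accepts. The policy name, the `shop` namespace, and the `app` labels are assumptions made for the example, and enforcement requires a CNI plugin that implements NetworkPolicy.

```python
import json


def allow_only_from(name, namespace, target_labels, source_labels, port):
    """Build a NetworkPolicy manifest (dict) that admits ingress to the target
    pods only from pods matching source_labels, on a single TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods this policy applies to
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": source_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical names and labels, for illustration only
policy = allow_only_from(
    name="db-allow-backend",
    namespace="shop",
    target_labels={"app": "database"},
    source_labels={"app": "backend"},
    port=5432,
)
print(json.dumps(policy, indent=2))
```

Because the policy selects the database pods and lists only the backend as a source, a compromised frontend pod is denied even though it sits in the same namespace.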

Practice Questions

An attacker compromises a frontend pod. Using the MITRE ATT&CK framework, describe the possible attack progression through the cluster.

Consider tactics from Initial Access through Impact.

Answer

A typical attack progression using MITRE ATT&CK tactics:

Initial Access: The attacker exploits a vulnerability in the frontend application (e.g., RCE via a deserialization flaw).
Execution: The attacker runs commands inside the compromised container.
Discovery: The attacker queries DNS to discover services and, if a ServiceAccount token is mounted, queries the Kubernetes API to list pods, services, and Secrets.
Credential Access: The attacker reads the mounted ServiceAccount token from /var/run/secrets/kubernetes.io/serviceaccount/token and attempts to read Secrets via the API server.
Lateral Movement: The attacker reaches the backend pod over the network (if no NetworkPolicy blocks it) and uses stolen credentials to access other services.
Privilege Escalation: If RBAC allows, the attacker creates a privileged pod. If the metadata endpoint is reachable from the pod, the attacker queries the cloud metadata API for cloud IAM credentials.
Collection/Impact: The attacker accesses the database through the backend, exfiltrates data, or deploys a cryptominer.

Mitigations: Disable SA token auto-mount, apply NetworkPolicies, enforce the restricted Pod Security Standard, use runtime monitoring (e.g., Falco), and enable audit logging.

What is the difference between STRIDE and MITRE ATT&CK? When would you use each?

Compare the two frameworks and their applicability.

Answer

STRIDE is a threat classification model used during the design phase of threat modeling. It categorizes threats into six types (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege). It helps answer: "What types of threats could affect this system?"

MITRE ATT&CK is a knowledge base of real-world adversary techniques organized by tactics (stages of an attack). It maps specific, observed attack techniques to a progression matrix. It helps answer: "How do real attackers behave, and what techniques do they use?"

When to use each:

Use STRIDE during system design and architecture review to systematically identify threat categories for each component
Use MITRE ATT&CK for operational security — understanding attack patterns, building detection rules, conducting red team exercises, and validating defenses against known techniques
They are complementary: STRIDE helps identify what could go wrong; ATT&CK shows how it happens in practice

A pod specification includes hostPID: true and a privileged container. What threats does this create?

Analyze the security implications using threat modeling concepts.

Answer

This configuration creates severe threats across multiple categories:

Container Escape (Elevation of Privilege): The combination of hostPID: true and privileged: true gives the container full access to the host. The container can see all host processes, access the host filesystem via /proc/1/root, and execute commands on the host. This effectively eliminates container isolation.

Lateral Movement: With host access, the attacker can read kubelet credentials, access other containers on the node, and potentially pivot to other nodes.

Information Disclosure: Access to host processes exposes environment variables of all containers on the node, which may contain secrets.

Denial of Service: With host-level access, the attacker can kill critical node processes (kubelet, container runtime) to disrupt the node.

Mitigations:

Enforce the restricted Pod Security Standard, or at minimum the baseline standard (both block hostPID and privileged containers)
Use admission controllers to prevent these configurations
Never grant create pods permission without corresponding admission policy enforcement

Why is blocking access to the cloud metadata API (169.254.169.254) important in Kubernetes?

Consider what information the metadata API exposes and how it can be exploited.

Answer

The cloud metadata API (available at 169.254.169.254 on AWS, GCP, and Azure) provides instance-level information including:

IAM credentials/tokens — Temporary cloud provider credentials assigned to the node. These can be used to access cloud services (S3, GCS, Azure Blob Storage) and potentially escalate privileges in the cloud environment
Instance metadata — Hostname, region, instance type, network configuration
User data/startup scripts — May contain secrets or configuration data

In a Kubernetes cluster, any pod can reach the metadata API by default because it is accessible from the node's network. A compromised container can:

Query the metadata API to obtain the node's IAM credentials
Use those credentials to access cloud resources outside the cluster
Potentially escalate to admin-level cloud access

Mitigations:

Use NetworkPolicies to block egress to 169.254.169.254
Enable metadata API restrictions (e.g., AWS IMDSv2 with hop limit = 1, GKE Workload Identity)
Use pod-level cloud identity (IAM Roles for Service Accounts on EKS, Workload Identity on GKE)
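
The first mitigation is usually written as an egress rule with an `ipBlock` that allows all IPv4 destinations except the metadata address. The sketch below shows that rule as a Python dict and mimics how the `cidr` and `except` fields combine, using the standard `ipaddress` module. The helper function is hypothetical and only models the rule locally; actual enforcement depends on the CNI plugin, and behavior for link-local addresses can differ between plugins.

```python
import ipaddress

METADATA_IP = "169.254.169.254"

# Egress rule in NetworkPolicy form: allow all IPv4 destinations
# except the cloud metadata address.
egress_rule = {
    "to": [{"ipBlock": {"cidr": "0.0.0.0/0", "except": [f"{METADATA_IP}/32"]}}]
}


def ip_block_allows(ip_block, destination):
    """Evaluate a NetworkPolicy-style ipBlock against a destination IP address."""
    dest = ipaddress.ip_address(destination)
    # The destination must fall inside the allowed CIDR...
    if dest not in ipaddress.ip_network(ip_block["cidr"]):
        return False
    # ...and outside every excluded range
    return all(dest not in ipaddress.ip_network(cidr) for cidr in ip_block.get("except", []))


block = egress_rule["to"][0]["ipBlock"]
print(ip_block_allows(block, METADATA_IP))
print(ip_block_allows(block, "10.0.0.5"))
```

Blocking the address at the network layer is a backstop; pod-level cloud identity removes the reason for workloads to use node credentials at all.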

Classify the following threats using the STRIDE model: (1) A user deletes audit logs, (2) A pod accesses Secrets in another namespace, (3) An attacker uses a stolen kubeconfig.

Assign each threat to the correct STRIDE category and suggest a mitigation.

Answer

A user deletes audit logs — Repudiation. The user destroys evidence of their actions, making it impossible to determine what happened. Mitigation: Ship audit logs to an immutable, external log aggregation system (e.g., SIEM) where users cannot delete them. Use append-only storage.
A pod accesses Secrets in another namespace — Information Disclosure. Sensitive data is exposed to an unauthorized workload. Mitigation: Restrict Secret access via RBAC (do not grant get/list on Secrets across namespaces). Use dedicated ServiceAccounts per workload with namespace-scoped permissions.
An attacker uses a stolen kubeconfig — Spoofing. The attacker impersonates a legitimate user using their stolen credentials. Mitigation: Use short-lived tokens (OIDC with token expiration), enable audit logging to detect unusual access patterns, implement multi-factor authentication, rotate credentials regularly.