Monitoring, Logging and Runtime Security (20%)¶
This domain covers detecting and responding to security threats at runtime. You need to know how to use Falco for behavioral monitoring, configure API server audit logging, enforce immutable containers, perform behavioral analytics, and investigate runtime security incidents. At 20% of the exam weight, this is one of the most critical domains.
Key Concepts¶
Falco¶
Falco is an open-source runtime security tool originally created by Sysdig and now a CNCF project. It detects unexpected application behavior and threats at runtime by monitoring Linux system calls using kernel modules or eBPF.
Falco Architecture¶
Falco operates by:
- Capturing system calls from the kernel (via kernel module or eBPF probe)
- Evaluating syscalls against a set of rules
- Alerting when a rule condition is matched
Falco Rules¶
Falco rules define what behavior to detect. Each rule has a condition (written in Falco's filter syntax), an output template, and a priority level.
```yaml
# Example: Detect shell execution in a container
- rule: Terminal shell in container
  desc: >
    A shell was used as the entrypoint/exec point into a container
    with an attached terminal.
  condition: >
    spawned_process and container
    and shell_procs and proc.tty != 0
    and container_entrypoint
    and not user_expected_terminal_shell_in_container_conditions
  output: >
    A shell was spawned in a container with an attached terminal
    (evt_type=%evt.type user=%user.name user_uid=%user.uid
    user_loginuid=%user.loginuid process=%proc.name
    proc_exepath=%proc.exepath parent=%proc.pname
    command=%proc.cmdline terminal=%proc.tty exe_flags=%evt.arg.flags
    %container.info)
  priority: NOTICE
  tags: [container, shell, mitre_execution]
```
Key Falco Rule Elements¶
| Element | Description |
|---|---|
| rule | Name of the rule |
| desc | Human-readable description |
| condition | Filter expression that triggers the rule |
| output | Template for the alert message |
| priority | Severity: EMERGENCY, ALERT, CRITICAL, ERROR, WARNING, NOTICE, INFORMATIONAL, DEBUG |
| tags | Labels for categorization |
| enabled | Whether the rule is active (default: true) |
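The `enabled` field is also the standard way to silence a noisy default rule without deleting it: redefine the rule by name in the local rules file with `enabled: false`. A sketch (behavior is version-dependent; newer Falco releases also support an explicit `override:` block for this):

```yaml
# /etc/falco/falco_rules.local.yaml
# Disable a default rule by redefining it by name with enabled: false.
# The local file is loaded after falco_rules.yaml, so this takes precedence.
- rule: Terminal shell in container
  enabled: false
```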
Common Falco Condition Macros¶
| Macro | Meaning |
|---|---|
| spawned_process | A new process was created |
| container | The event occurred inside a container |
| shell_procs | The process is a shell (bash, sh, zsh, etc.) |
| sensitive_files | Access to sensitive files like /etc/shadow, /etc/passwd |
| open_write | A file was opened for writing |
| open_read | A file was opened for reading |
| outbound | An outbound network connection was made |
| inbound | An inbound network connection was received |
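Beyond the built-in macros, you can define your own macros and lists in the local rules file and reuse them across rules. A sketch using Falco's standard `list` and `macro` syntax (the names `recon_binaries` and `recon_process` are illustrative, not built-ins):

```yaml
# /etc/falco/falco_rules.local.yaml
# A reusable list of recon tools and a macro built on it (names are illustrative)
- list: recon_binaries
  items: [nmap, nc, ncat, tcpdump]

- macro: recon_process
  condition: proc.name in (recon_binaries)

- rule: Recon tool launched in container
  desc: Detect common network reconnaissance tools inside containers
  condition: spawned_process and container and recon_process
  output: Recon tool launched (command=%proc.cmdline container=%container.name)
  priority: WARNING
  tags: [process, container]
```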
Writing Custom Falco Rules¶
```yaml
# /etc/falco/falco_rules.local.yaml

# Detect reading of sensitive files
- rule: Read sensitive file in container
  desc: Detect reading of sensitive files inside containers
  condition: >
    open_read and container
    and (fd.name startswith /etc/shadow or
         fd.name startswith /etc/passwd or
         fd.name startswith /root/.ssh)
  output: >
    Sensitive file read in container
    (file=%fd.name user=%user.name command=%proc.cmdline
    container_id=%container.id container_name=%container.name
    image=%container.image.repository)
  priority: WARNING
  tags: [filesystem, container]

# Detect package management in a running container
- rule: Package management in container
  desc: Detect package management tools running in a container
  condition: >
    spawned_process and container
    and proc.name in (apt, apt-get, yum, dnf, apk, pip, npm)  output: >
    Package management detected in container
    (command=%proc.cmdline container=%container.name
    image=%container.image.repository)
  priority: ERROR
  tags: [process, container, mitre_persistence]

# Detect writing to /etc directory
- rule: Write below etc in container
  desc: Detect writing to /etc directory inside containers
  condition: >
    open_write and container
    and fd.name startswith /etc
    and not proc.name in (systemd, dockerd)
  output: >
    File written below /etc in container
    (file=%fd.name user=%user.name command=%proc.cmdline
    container=%container.name image=%container.image.repository)
  priority: ERROR
  tags: [filesystem, container, mitre_persistence]
```
Managing Falco¶
```shell
# Check Falco status
sudo systemctl status falco

# Start/stop/restart Falco
sudo systemctl start falco
sudo systemctl stop falco
sudo systemctl restart falco

# View Falco logs
sudo journalctl -u falco
sudo grep falco /var/log/syslog

# Test Falco with a rule violation
kubectl exec -it <pod-name> -- bash
# Inside the pod:
cat /etc/shadow

# Check Falco output for the alert
sudo tail -f /var/log/syslog | grep falco

# Validate Falco rules syntax
falco --validate /etc/falco/falco_rules.local.yaml

# Run Falco with a custom rules file
falco -r /etc/falco/falco_rules.local.yaml
```
Falco Configuration¶
```yaml
# /etc/falco/falco.yaml (key settings)

# Rule files to load (order matters - later files override earlier ones)
rules_file:
  - /etc/falco/falco_rules.yaml
  - /etc/falco/falco_rules.local.yaml

# Output channels
stdout_output:
  enabled: true

file_output:
  enabled: true
  filename: /var/log/falco/events.log

syslog_output:
  enabled: true

# Log level
log_level: info

# Priority filter: only output events at this level or above
priority: debug
```
Exam Tip
Falco documentation is accessible during the exam. Focus on understanding how to modify existing rules and write simple custom rules in /etc/falco/falco_rules.local.yaml. Know the priority levels and the most common macros (spawned_process, container, open_write, open_read). Always restart Falco after changing rules.
Common Pitfall
Custom rules should go in /etc/falco/falco_rules.local.yaml, not in the main /etc/falco/falco_rules.yaml. The main file is overwritten during Falco upgrades. The local file is loaded after the main file and can override rules.
Audit Logging¶
Kubernetes API server audit logging records a chronological set of activities affecting the cluster. Audit logs are essential for security investigations and compliance.
Audit Policy¶
The audit policy defines which events are recorded and how much data each record includes. Every rule specifies one of four audit levels: None (log nothing), Metadata (log request metadata only), Request (metadata plus the request body), or RequestResponse (metadata plus both request and response bodies). The first rule that matches a request determines its level, so order rules from most specific to least specific.
```yaml
# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Do not log requests to the following endpoints
  - level: None
    nonResourceURLs:
      - /healthz*
      - /version
      - /readyz*
      - /livez*
  # Do not log watch requests by the system
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - group: ""
        resources: ["endpoints", "services", "services/status"]
  # Log Secret access at Metadata level (do not log request/response body)
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
      - group: authentication.k8s.io
        resources: ["tokenreviews"]
  # Log pod creation and deletion at RequestResponse level
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods"]
    verbs: ["create", "delete"]
  # Log RBAC changes at RequestResponse level
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["clusterroles", "clusterrolebindings", "roles", "rolebindings"]
  # Log everything else at Request level
  - level: Request
    omitStages:
      - RequestReceived
```
Audit Backends¶
| Backend | Description |
|---|---|
| Log backend | Writes events to a file on disk |
| Webhook backend | Sends events to an external HTTP API |
```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - command:
    - kube-apiserver
    # Log backend
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    # Webhook backend (optional, can be used alongside the log backend)
    - --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml
    - --audit-webhook-batch-max-wait=5s
```
Analyzing Audit Logs¶
```shell
# View recent audit events
sudo tail -100 /var/log/kubernetes/audit/audit.log | jq .

# Find all events for a specific user
sudo cat /var/log/kubernetes/audit/audit.log | \
  jq 'select(.user.username == "system:serviceaccount:default:compromised-sa")'

# Find all Secret access events
sudo cat /var/log/kubernetes/audit/audit.log | \
  jq 'select(.objectRef.resource == "secrets")'

# Find all failed requests (including authorization failures)
sudo cat /var/log/kubernetes/audit/audit.log | \
  jq 'select(.responseStatus.code >= 400)'

# Find all create/delete operations on pods
sudo cat /var/log/kubernetes/audit/audit.log | \
  jq 'select(.objectRef.resource == "pods" and (.verb == "create" or .verb == "delete"))'
```
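If jq happens to be unavailable on the node, plain grep and sort can still give a quick first pass. For example, tallying the verbs seen in the log (a rough sketch that assumes one compact JSON event per line, as the log backend writes them):

```shell
# Count how often each verb appears in the audit log (one JSON event per line)
sudo grep -o '"verb":"[a-z]*"' /var/log/kubernetes/audit/audit.log \
  | sort | uniq -c | sort -rn
```

A spike in `delete` or `create` counts for an otherwise quiet ServiceAccount is a cheap anomaly signal before you drill in with jq.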
Audit Event Structure¶
```json
{
  "apiVersion": "audit.k8s.io/v1",
  "kind": "Event",
  "level": "RequestResponse",
  "auditID": "12345-67890",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/pods",
  "verb": "create",
  "user": {
    "username": "admin",
    "groups": ["system:masters"]
  },
  "objectRef": {
    "resource": "pods",
    "namespace": "default",
    "name": "test-pod",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "code": 201
  },
  "requestReceivedTimestamp": "2024-01-15T10:30:00.000000Z",
  "stageTimestamp": "2024-01-15T10:30:00.500000Z"
}
```
Immutable Containers¶
Immutable containers cannot modify their filesystem at runtime. This prevents attackers from installing tools, modifying configuration, or persisting malware.
Enforcing Immutability with readOnlyRootFilesystem¶
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: immutable-app
spec:
  containers:
  - name: app
    image: nginx:1.25
    securityContext:
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 1000
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
    # Mount writable volumes only where needed
    volumeMounts:
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /var/cache/nginx
    - name: run
      mountPath: /var/run
  volumes:
  - name: tmp
    emptyDir: {}
  - name: cache
    emptyDir: {}
  - name: run
    emptyDir: {}
```
Exam Tip
When enabling readOnlyRootFilesystem, many applications require writable directories for temporary files, caches, or PID files. Use emptyDir volumes for these paths. Common paths that need to be writable: /tmp, /var/run, /var/cache, /var/log.
Common Pitfall
Setting readOnlyRootFilesystem: true without providing writable volumes for directories your application needs will cause the application to crash. Test thoroughly and check application logs if a pod enters CrashLoopBackOff after enabling this setting.
Making Existing Deployments Immutable¶
```shell
# Patch a deployment to use readOnlyRootFilesystem.
# Note: the JSON Patch "add" op fails if securityContext does not already
# exist on the container; in that case add the whole securityContext object instead.
kubectl patch deployment myapp -n production --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/containers/0/securityContext/readOnlyRootFilesystem",
    "value": true
  }
]'

# Add emptyDir volumes for writable directories
kubectl edit deployment myapp -n production
# Add volumes and volumeMounts as needed
```
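Since JSON patches are fiddly (separate ops for the securityContext, volumes, and volumeMounts), a strategic merge patch applied from a file is often easier. A sketch, assuming a container named `myapp` and a hypothetical patch filename (adjust both to your workload):

```yaml
# readonly-patch.yaml (illustrative filename)
# Apply with: kubectl patch deployment myapp -n production --patch-file readonly-patch.yaml
spec:
  template:
    spec:
      containers:
      - name: myapp              # must match the container name in the Deployment
        securityContext:
          readOnlyRootFilesystem: true
        volumeMounts:
        - name: tmp
          mountPath: /tmp
      volumes:
      - name: tmp
        emptyDir: {}
```

Strategic merge is the default patch type for Deployments, so the container entry is merged by `name` rather than replaced wholesale.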
Behavioral Analytics¶
Behavioral analytics involves establishing baselines of normal activity and detecting deviations that may indicate a security breach.
Indicators of Compromise in Kubernetes¶
| Indicator | Description | Detection Method |
|---|---|---|
| Unexpected process execution | Shells, package managers, or tools running in production containers | Falco rules |
| Unusual network connections | Outbound connections to unknown IPs, unusual ports | NetworkPolicies, Falco |
| File system modifications | Writing to sensitive directories in immutable containers | readOnlyRootFilesystem, Falco |
| Privilege escalation attempts | Attempts to gain root access or additional capabilities | Falco, audit logs |
| Abnormal API access patterns | Unusual ServiceAccount activity, excessive API calls | Audit logs |
| Cryptomining indicators | High CPU usage, connections to mining pools | Resource monitoring, Falco |
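As one concrete example of turning an indicator row into detection, a minimal Falco rule for the cryptomining case might key on well-known miner process names. This is an illustrative sketch, not an exhaustive or official rule:

```yaml
# /etc/falco/falco_rules.local.yaml (illustrative)
- rule: Possible cryptominer launched in container
  desc: Detect well-known miner binaries starting inside a container
  condition: >
    spawned_process and container
    and proc.name in (xmrig, minerd)
  output: >
    Possible cryptominer started
    (command=%proc.cmdline container=%container.name
    image=%container.image.repository)
  priority: CRITICAL
  tags: [process, container, mitre_execution]
```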
Detecting Anomalous Behavior¶
```shell
# Check for pods running with excessive privileges
kubectl get pods -A -o json | jq '.items[] | select(
  .spec.containers[].securityContext.privileged == true
) | .metadata.namespace + "/" + .metadata.name'

# Check for pods with host network access
kubectl get pods -A -o json | jq '.items[] | select(
  .spec.hostNetwork == true
) | .metadata.namespace + "/" + .metadata.name'

# Look for unusual processes in containers
kubectl exec <pod> -- ps aux

# Check for unexpected network connections
kubectl exec <pod> -- netstat -tlnp 2>/dev/null || \
  kubectl exec <pod> -- ss -tlnp

# Review container logs for suspicious activity
kubectl logs <pod> --tail=100
kubectl logs <pod> --previous   # logs from the previous container instance
```
Investigating Container Runtime Issues¶
When a security incident is detected, systematic investigation is required.
Investigation Workflow¶
```shell
# 1. Identify the suspicious pod
kubectl get pods -A -o wide | grep <suspicious-indicator>

# 2. Get pod details
kubectl describe pod <pod-name> -n <namespace>

# 3. Check container processes
kubectl exec <pod-name> -n <namespace> -- ps aux

# 4. Check network connections
kubectl exec <pod-name> -n <namespace> -- netstat -tlnp

# 5. Check filesystem changes (if not read-only)
kubectl exec <pod-name> -n <namespace> -- find / -mmin -60 -type f 2>/dev/null

# 6. Check environment variables for leaked secrets
kubectl exec <pod-name> -n <namespace> -- env

# 7. Review pod logs
kubectl logs <pod-name> -n <namespace> --tail=200
kubectl logs <pod-name> -n <namespace> --previous

# 8. Check events related to the pod
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name>

# 9. Review audit logs for actions by the pod's ServiceAccount
SA_NAME=$(kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.serviceAccountName}')
sudo cat /var/log/kubernetes/audit/audit.log | \
  jq "select(.user.username == \"system:serviceaccount:<namespace>:${SA_NAME}\")"

# 10. Check Falco alerts
sudo grep falco /var/log/syslog | grep <container-id>
```
Containment Actions¶
```shell
# Isolate the pod by removing all network access
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-compromised-pod
  namespace: <namespace>
spec:
  podSelector:
    matchLabels:
      <label-key>: <label-value>
  policyTypes:
  - Ingress
  - Egress
EOF

# Scale down the affected deployment
kubectl scale deployment <deployment-name> -n <namespace> --replicas=0

# Delete the compromised pod (if not managed by a controller)
kubectl delete pod <pod-name> -n <namespace>

# Revoke the pod's ServiceAccount permissions
kubectl delete rolebinding <binding-name> -n <namespace>
kubectl delete clusterrolebinding <binding-name>
```
Practice Exercises¶
Exercise 1: Write a Falco Rule
Write a custom Falco rule that detects when any process inside a container attempts to read files under /etc/kubernetes/. The rule should have ERROR priority and include the container name, image, and command in the output.
Solution
Add the following to /etc/falco/falco_rules.local.yaml:
```yaml
- rule: Read Kubernetes config in container
  desc: >
    Detect attempts to read Kubernetes configuration files
    from within a container
  condition: >
    open_read and container
    and fd.name startswith /etc/kubernetes/
  output: >
    Kubernetes config file read in container
    (file=%fd.name user=%user.name command=%proc.cmdline
    container=%container.name image=%container.image.repository
    pid=%proc.pid)
  priority: ERROR
  tags: [filesystem, container, k8s_config]
```

```shell
# Validate the rule syntax
sudo falco --validate /etc/falco/falco_rules.local.yaml

# Restart Falco to load the new rule
sudo systemctl restart falco

# Test: exec into a pod and try to read a Kubernetes config file
kubectl exec -it <pod> -- cat /etc/kubernetes/admin.conf

# Check Falco output
sudo tail -f /var/log/syslog | grep falco
```
Exercise 2: Create an Audit Policy
Create an audit policy that:
- Does not log requests to health check endpoints (/healthz, /readyz, /livez)
- Logs all Secret operations at Metadata level
- Logs namespace creation and deletion at RequestResponse level
- Logs everything else at Request level
Configure the API server to use this policy.
Solution
```yaml
# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Do not log health check endpoints
  - level: None
    nonResourceURLs:
      - /healthz*
      - /readyz*
      - /livez*
  # Log Secret operations at Metadata level
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Log namespace create/delete at RequestResponse level
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["namespaces"]
    verbs: ["create", "delete"]
  # Log everything else at Request level
  - level: Request
    omitStages:
      - RequestReceived
```
Update the API server manifest:
```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - command:
    - kube-apiserver
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    volumeMounts:
    - name: audit-policy
      mountPath: /etc/kubernetes/audit-policy.yaml
      readOnly: true
    - name: audit-log
      mountPath: /var/log/kubernetes/audit
  volumes:
  - name: audit-policy
    hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
  - name: audit-log
    hostPath:
      path: /var/log/kubernetes/audit
      type: DirectoryOrCreate
```
Exercise 3: Make a Deployment Immutable
A Deployment named api-server in the production namespace runs an nginx container. Make it immutable by:
- Setting readOnlyRootFilesystem: true
- Adding writable volumes for nginx's required paths
- Ensuring the container runs as non-root
Solution
Edit the deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 101
        runAsGroup: 101
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: nginx
        image: nginx:1.25
        securityContext:
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /var/cache/nginx
        - name: run
          mountPath: /var/run
      volumes:
      - name: tmp
        emptyDir: {}
      - name: cache
        emptyDir: {}
      - name: run
        emptyDir: {}
```
Exercise 4: Investigate a Security Incident
A Falco alert indicates that a shell was spawned inside the pod web-frontend in the production namespace. Investigate the incident and take containment actions.
Solution
```shell
# 1. Get pod details
kubectl describe pod web-frontend -n production

# 2. Check what processes are running
kubectl exec web-frontend -n production -- ps aux

# 3. Check for unexpected network connections
kubectl exec web-frontend -n production -- netstat -tlnp 2>/dev/null

# 4. Look for recently modified files
kubectl exec web-frontend -n production -- find / -mmin -30 -type f 2>/dev/null

# 5. Check container logs
kubectl logs web-frontend -n production --tail=200

# 6. Check the ServiceAccount permissions
SA=$(kubectl get pod web-frontend -n production -o jsonpath='{.spec.serviceAccountName}')
kubectl auth can-i --list --as=system:serviceaccount:production:${SA}

# 7. Review Falco alerts for this container
CONTAINER_ID=$(kubectl get pod web-frontend -n production -o jsonpath='{.status.containerStatuses[0].containerID}' | sed 's|containerd://||')
sudo grep falco /var/log/syslog | grep ${CONTAINER_ID}

# 8. Review audit logs
sudo cat /var/log/kubernetes/audit/audit.log | \
  jq "select(.user.username == \"system:serviceaccount:production:${SA}\")" | \
  jq '{verb, resource: .objectRef.resource, name: .objectRef.name}'

# 9. Containment: isolate the pod
kubectl label pod web-frontend -n production quarantine=true
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine-web-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes:
  - Ingress
  - Egress
EOF

# 10. Delete the compromised pod (the Deployment will recreate a clean replica)
kubectl delete pod web-frontend -n production
```
Exercise 5: Configure Falco Output Channels
Configure Falco to:
- Output alerts to a file at /var/log/falco/alerts.log
- Output alerts to syslog
- Disable stdout output
- Set the minimum priority to WARNING
Solution
Edit /etc/falco/falco.yaml:
```yaml
# Set minimum priority (lowercase, matching falco.yaml conventions)
priority: warning

# Disable stdout
stdout_output:
  enabled: false

# Enable file output
file_output:
  enabled: true
  keep_alive: false
  filename: /var/log/falco/alerts.log

# Enable syslog output
syslog_output:
  enabled: true
```

```shell
# Create the log directory
sudo mkdir -p /var/log/falco

# Restart Falco to apply changes
sudo systemctl restart falco

# Verify Falco is running with the new config
sudo systemctl status falco

# Trigger an alert and verify the output
kubectl exec -it <pod> -- bash -c "cat /etc/shadow"

# Check the file output
sudo tail /var/log/falco/alerts.log

# Check the syslog output
sudo tail /var/log/syslog | grep falco
```