
Kubernetes Security Best Practices: Top 10 Mistakes and How to Fix Them

Hien Nguyen · 9 min

Kubernetes · Cloud Security · DevSecOps


Kubernetes gives teams speed and scale, but most real-world breaches don’t start with “zero-days” — they start with misconfigurations.
Below are the top mistakes I repeatedly see in cloud security reviews, plus practical fixes you can apply immediately.

If you want a fast “audit-style” check of your cluster posture, start with RBAC, network policies, and workload security context.


1) Over-permissive RBAC (cluster-admin everywhere)

What happens: teams grant broad roles to “make things work”, and privilege spreads.
Why it matters: one compromised pod/service account can become a cluster takeover.

Fix:

  • Avoid cluster-admin except for break-glass.
  • Prefer namespace-scoped roles.
  • Use “least privilege” RBAC with clear separation of duties.

Checklist:

  • No default service accounts bound to powerful roles
  • Dedicated service accounts per workload
  • Review ClusterRoleBinding regularly
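
As a sketch, a namespace-scoped role with a dedicated service account might look like this (the namespace, names, and permissions are illustrative — adjust to what your workload actually needs):

```yaml
# Dedicated service account for the workload (names are illustrative)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: web-app
  namespace: web
---
# Namespace-scoped Role: read-only access to pods and their logs
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: web
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
# Bind the Role to the dedicated service account, not to "default"
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: web-app-pod-reader
  namespace: web
subjects:
  - kind: ServiceAccount
    name: web-app
    namespace: web
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Binding to a dedicated service account (rather than the namespace's default one) keeps the blast radius of a compromised pod limited to exactly these verbs.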

2) No NetworkPolicies (flat network inside the cluster)

What happens: any pod can talk to any pod.
Why it matters: lateral movement becomes trivial after one foothold.

Fix:

  • Apply default deny policies per namespace, then allow only necessary flows.
  • Restrict access to sensitive services (databases, internal admin APIs).

Start pattern:

  • Default deny ingress + egress
  • Allow DNS
  • Allow app-to-app only on required ports
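
The start pattern above can be sketched as two policies: a default deny, then a DNS allowance (namespace and labels are illustrative; the kube-system selector assumes a cluster where CoreDNS runs there):

```yaml
# Default deny: no ingress or egress for any pod in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: web
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Allow DNS lookups (CoreDNS/kube-dns listens on port 53 UDP and TCP)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: web
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

From here, add one narrow policy per required app-to-app flow; anything you forget to allow simply stops working, which surfaces the real traffic map quickly.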

3) Containers running as root / privileged workloads

What happens: workloads run as root, sometimes with privileged: true.
Why it matters: increases blast radius and container escape risk.

Fix:

  • Enforce a baseline securityContext:
    • runAsNonRoot: true
    • allowPrivilegeEscalation: false
    • drop Linux capabilities by default
    • read-only root filesystem where possible
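
Applied to a Deployment's pod template, that baseline might look like this (image, UID, and names are illustrative — the UID must exist or be tolerated by your image):

```yaml
# Deployment excerpt: baseline securityContext (values are illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001          # any non-zero UID the image supports
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: registry.example.com/web-app:1.4.2
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
```

Pair this with Pod Security Admission (the "restricted" profile) at the namespace level so the baseline is enforced, not just documented.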

4) Default / exposed Kubernetes API access patterns

What happens: the API server is reachable more broadly than needed.
Why it matters: brute-force attempts, leaked kubeconfigs, and mis-issued tokens become critical exposure paths.

Fix:

  • Restrict API endpoint exposure (private endpoint where possible).
  • Enforce MFA/SSO on kubectl access.
  • Use short-lived auth (OIDC) instead of long-lived static credentials.
  • Audit kubeconfig distribution and rotate regularly.
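
On self-managed clusters, short-lived OIDC auth is wired up via kube-apiserver flags along these lines (issuer URL, client ID, and claims are placeholders; managed platforms expose equivalent settings through their own IAM/OIDC integrations):

```yaml
# kube-apiserver static pod manifest excerpt (values are placeholders)
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --oidc-issuer-url=https://sso.example.com/realms/platform
        - --oidc-client-id=kubernetes
        - --oidc-username-claim=email
        - --oidc-groups-claim=groups
```

With this in place, kubectl users authenticate through your SSO (with its MFA policy) and receive short-lived tokens, instead of carrying long-lived client certificates in kubeconfigs.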

5) Secrets mismanagement (plain text, git leaks, wide access)

What happens: secrets land in Git, environment variables, or shared namespaces.
Why it matters: secrets are often the fastest path to cloud compromise.

Fix:

  • Use a dedicated secret manager (cloud KMS + secret manager) or sealed secrets.
  • Limit access by namespace + RBAC.
  • Rotate secrets and remove stale credentials.
  • Scan repos/CI logs for secret leakage (and fail builds when found).
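
One common pattern is syncing from a cloud secret manager into the cluster. A sketch using the External Secrets Operator (this assumes the operator is installed and a SecretStore is configured separately; names and keys are illustrative — sealed-secrets is a reasonable alternative):

```yaml
# ExternalSecret: pulls a value from a cloud secret manager into a
# Kubernetes Secret, so nothing sensitive lives in Git (names illustrative)
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: web
spec:
  refreshInterval: 1h            # periodic re-sync so rotations propagate
  secretStoreRef:
    name: cloud-secret-manager   # SecretStore configured separately
    kind: SecretStore
  target:
    name: db-credentials         # resulting Kubernetes Secret
  data:
    - secretKey: password
      remoteRef:
        key: prod/web/db-password
```

The manifest itself contains no secret material, so it can be reviewed and committed like any other resource; access to the synced Secret is then controlled by namespace RBAC.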

6) Unpinned images / relying on the latest tag

What happens: images are pulled by tag, not digest.
Why it matters: supply chain risk + non-reproducible deployments.

Fix:

  • Pin images by digest.
  • Maintain a hardened base image strategy.
  • Add image signing/verification (Sigstore/cosign) when mature.
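
Pinning by digest is a one-line change in the pod template (the digest below is a placeholder — resolve the real one from your registry or build pipeline):

```yaml
# Pod template excerpt: image pinned by immutable digest, not a mutable tag
spec:
  containers:
    - name: app
      # Placeholder digest: resolve the real value from your registry
      image: registry.example.com/web-app@sha256:4f53cda18c2baa0c0354bb5f9a3ecbe5ed12ab4d8e11ba873c2f11161202b945
```

Unlike a tag, a digest can never be silently repointed at different content, so the deployment is reproducible and tamper-evident; in practice, have CI resolve and inject the digest so humans never copy it by hand.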

7) No runtime visibility (you don’t know what’s happening)

What happens: clusters have logs/metrics, but not actionable security signals.
Why it matters: detection and incident response become guesswork.

Fix:

  • Centralize audit logs + workload logs + cloud logs.
  • Monitor:
    • exec into pods
    • abnormal outbound connections
    • unusual process starts
    • Kubernetes audit events (create privileged pods, token requests, etc.)
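
A minimal audit policy covering those signals might look like this (on self-managed clusters it is enabled via the kube-apiserver --audit-policy-file flag; managed platforms expose audit logging through their own settings):

```yaml
# Minimal Kubernetes audit policy: capture high-signal events, drop the rest
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Record who execs into or attaches to pods, with full request details
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods/exec", "pods/attach"]
  # Record pod/secret/token activity at metadata level
  # (catches privileged pod creation and token requests)
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods", "secrets", "serviceaccounts/token"]
  # Drop everything else to keep log volume manageable
  - level: None
```

Ship these events to your central log platform and alert on the patterns listed above, rather than grepping after an incident.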

8) Insecure ingress / weak TLS posture

What happens: outdated TLS config, missing HSTS, wildcard exposure, unprotected admin endpoints.
Why it matters: common entry point for web exploits and credential theft.

Fix:

  • Enforce TLS 1.2+; prefer modern ciphers.
  • Use managed cert rotation (ACME).
  • Protect admin endpoints with auth + IP allowlists.
  • Rate-limit and WAF where appropriate.
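
As a sketch, an Ingress with TLS and automated ACME issuance (this assumes ingress-nginx and cert-manager are installed with a ClusterIssuer named letsencrypt-prod; hostnames and service names are illustrative):

```yaml
# Ingress with TLS, auto-issued certificate, and forced HTTPS redirect
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app
  namespace: web
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts: ["app.example.com"]
      secretName: web-app-tls   # cert-manager writes the certificate here
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-app
                port:
                  number: 80
```

TLS version and cipher policy are set on the ingress controller itself (or the cloud load balancer), so check that layer too — the Ingress resource alone does not enforce TLS 1.2+.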

9) Poor namespace boundaries (multi-tenant confusion)

What happens: “namespaces are environments”, but sensitive workloads share the same cluster with weak controls.
Why it matters: one weak team/app can impact others.

Fix:

  • Separate high-risk workloads and production data by cluster or strong boundaries.
  • Apply:
    • ResourceQuotas
    • LimitRanges
    • NetworkPolicies
    • strict RBAC per namespace
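
The quota and limit pieces can be sketched per namespace like this (all values are illustrative — size them to the team's actual footprint):

```yaml
# Per-namespace guardrails: cap total consumption and set sane defaults
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: web
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: defaults
  namespace: web
spec:
  limits:
    - type: Container
      default:              # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
      defaultRequest:       # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
```

Together these stop one tenant from starving the others, and the LimitRange means workloads that forget requests/limits still land inside the quota accounting.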

10) Patch & upgrade debt (cluster and node versions lagging)

What happens: control plane, nodes, and container runtimes drift behind.
Why it matters: known vulns + outdated configs accumulate.

Fix:

  • Set upgrade SLOs (e.g., patch monthly, minor versions quarterly).
  • Bake AMIs/images with updates.
  • Track CVEs impacting kubelet/container runtime.

A simple “minimum baseline” you can enforce this week

  1. Default-deny NetworkPolicies per namespace
  2. Workload securityContext baseline (non-root, no priv-esc, drop caps)
  3. RBAC review + remove broad bindings
  4. Centralize audit logs and alert on privileged pod creation
  5. Pin images and scan in CI

Want a quick cluster posture review?

For a structured Kubernetes Security Review (RBAC, secrets, network policies, workload hardening), see the service page. For case studies and other services, see the main site.
