A Complete Guide to Kubernetes Network Policies for Production Security
By default, Kubernetes allows unrestricted communication between all pods in a cluster. While this simplifies development, it creates significant security risks in production environments. Network Policies provide granular control over pod-to-pod and pod-to-external traffic, implementing the principle of least privilege at the network layer. This guide covers everything you need to implement effective network policies.
Understanding the Security Problem
Without network policies, a compromised pod can communicate freely with every other pod in your cluster. If an attacker exploits a vulnerability in your public-facing web application, they can immediately begin probing your internal services, databases, and other sensitive workloads. Lateral movement becomes trivial.
Network policies act as firewall rules specific to Kubernetes. They define which pods can communicate with each other and which external endpoints pods can reach. By implementing deny-by-default policies with explicit allow rules, you contain the blast radius of any security incident.
Prerequisites for Network Policies
Network policies require a CNI plugin that supports them. Popular options include Calico, Cilium, and Weave Net. The default kubenet plugin does not support network policies, so NetworkPolicy resources are accepted by the API server but silently unenforced. Before implementing policies, verify your CNI supports enforcement by checking your cluster documentation or applying a test policy and confirming it actually blocks traffic.
Cilium offers extended capabilities beyond standard Kubernetes network policies, including Layer 7 filtering, DNS-aware policies, and better observability. For production clusters handling sensitive workloads, consider Cilium for its advanced security features and performance characteristics.
Implementing Deny-by-Default
The most important first step is establishing a deny-by-default posture. Create a network policy in each namespace that denies all ingress and egress traffic to pods without explicit allow rules. This foundational policy ensures new workloads are isolated until you deliberately configure their network access.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
After applying this policy, every pod in the namespace loses all ingress and egress connectivity, including DNS resolution. You must then create specific policies to allow each required communication path.
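For example, a minimal egress policy can restore DNS resolution for all pods in the namespace. This is a sketch: the `k8s-app: kube-dns` label matches most default cluster DNS deployments, and the `kubernetes.io/metadata.name` namespace label is set automatically on Kubernetes 1.21+; verify both against your cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns  # the cluster DNS pods
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

Note that placing the namespaceSelector and podSelector in the same `to` element means both must match (AND); listing them as separate elements would match either (OR).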
Designing Allow Rules
Network policies use label selectors to identify source and destination pods. Design your pod labels with network segmentation in mind. Common patterns include labels for application tier (frontend, backend, database), environment (production, staging), and team ownership.
Start by mapping your application communication flows. Identify which pods need to communicate with which services. Document external dependencies like databases, APIs, and DNS servers. This mapping becomes the blueprint for your allow policies.
For a typical three-tier application, you might create policies allowing frontend pods to reach backend pods on specific ports, backend pods to access database pods, and all pods to reach the DNS service in kube-system. Each policy should be as specific as possible, limiting both source pods and destination ports.
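As an illustration of the frontend-to-backend rule, the policy below selects backend pods as the destination and frontend pods as the permitted source. The `tier` labels and port 8080 are hypothetical; substitute the labels and ports your deployments actually use.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      tier: backend         # destination: backend pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          tier: frontend    # source: frontend pods in the same namespace
    ports:
    - protocol: TCP
      port: 8080            # assumed backend service port
```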
Handling Egress Traffic
Egress policies control outbound traffic from pods. Many teams focus solely on ingress and neglect egress controls, leaving a significant gap. Compromised pods can exfiltrate data or establish command-and-control channels if egress is unrestricted.
At minimum, create egress policies that allow DNS resolution (typically UDP and TCP port 53 to the cluster DNS pods in kube-system) and specific external dependencies. For workloads that should never reach the internet, deny all egress except internal cluster communication. Use CIDR blocks to allow access to specific external IP ranges when necessary.
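A sketch of a CIDR-scoped egress rule, permitting the backend tier to reach only one external address range: the `tier: backend` label is hypothetical, 203.0.113.0/24 is a placeholder documentation range, and port 5432 assumes an external PostgreSQL instance.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-external-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      tier: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 203.0.113.0/24  # placeholder external range; replace with your provider's
    ports:
    - protocol: TCP
      port: 5432              # assumed external database port
```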
Namespace Isolation
Network policies can reference other namespaces using namespaceSelector. This enables patterns like allowing monitoring tools in an observability namespace to scrape metrics from pods across the cluster while maintaining isolation between application namespaces.
Label your namespaces consistently to support these cross-namespace policies. A namespace label like “environment: production” allows policies that permit traffic only from other production namespaces while blocking development and staging environments.
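For instance, the following sketch admits scrape traffic from a monitoring stack in an observability namespace into application pods. The namespace name `observability` and metrics port 9090 are assumptions; the `kubernetes.io/metadata.name` label is set automatically on Kubernetes 1.21+.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring-scrape
  namespace: production
spec:
  podSelector: {}             # all pods exposing metrics in this namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: observability  # assumed namespace name
    ports:
    - protocol: TCP
      port: 9090              # assumed metrics port
```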
Testing and Validation
Testing network policies requires methodical verification. Deploy test pods and attempt both connections that should succeed and connections that should be blocked. Tools like netcat or curl can verify connectivity. Several open-source tools provide network policy simulation and visualization, helping you understand policy behavior before applying changes to production.
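One lightweight approach, sketched below, is a throwaway test pod that carries the labels of the source you want to simulate, which you then exec into and probe from. The image, labels, and service names are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: netpol-test
  namespace: production
  labels:
    tier: frontend            # labels of the source pod you are simulating
spec:
  containers:
  - name: probe
    image: busybox:1.36
    command: ["sleep", "3600"]  # keep the pod alive for manual probing
# Probe expected-allowed and expected-denied paths, for example:
#   kubectl exec -n production netpol-test -- nc -zv backend-svc 8080
#   kubectl exec -n production netpol-test -- nc -zv database-svc 5432
```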
Consider using policy-as-code tools like Kyverno or OPA Gatekeeper to enforce network policy requirements. These admission controllers can reject deployments that lack network policies or verify policies meet security standards.
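As a sketch of the generate pattern, a Kyverno ClusterPolicy can stamp a default-deny NetworkPolicy into every namespace as it is created. This is adapted from Kyverno's documented add-networkpolicy sample; verify the rule syntax against your Kyverno version before relying on it.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-deny
spec:
  rules:
  - name: default-deny
    match:
      any:
      - resources:
          kinds:
          - Namespace
    generate:
      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      name: default-deny-all
      namespace: "{{request.object.metadata.name}}"  # the new namespace
      synchronize: true      # re-create the policy if it is deleted
      data:
        spec:
          podSelector: {}
          policyTypes:
          - Ingress
          - Egress
```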
Monitoring and Troubleshooting
Debugging network policy issues can be challenging. Cilium provides Hubble for flow visibility, showing exactly which connections are allowed or denied. For other CNI plugins, check CNI-specific logs and metrics. Enable flow logging if available to capture blocked connection attempts.
Maintain documentation of your network policy design. As your cluster grows, the reasoning behind specific rules becomes crucial knowledge. Include diagrams showing expected communication patterns and update them as architecture evolves.
Network policies are a critical security layer that takes time to implement correctly. Start with a single namespace, learn the patterns, and gradually expand coverage. The investment in network segmentation pays dividends in reduced attack surface and improved incident containment.