Bringing Order to Chaos: Securing Containers

Containers are like Jumanji. You feel as if you are in a game where every move reveals a series of surprises, any of which may suddenly end in a security disaster. This is because security is, as always, treated as an afterthought. According to some practitioners, securing containers is like securing a traditional server. This approach is erroneous. In reality, it is chaos.

Why is It Chaos?

Container infrastructure consists of multiple intermingled layers such as container, POD, node, host, cluster, namespace, orchestrator, control plane and so on. That leads us to byzantine machinations and complexity. As an inevitable result of this complex structure, you encounter various intrinsic security problems that must be considered layer by layer. In this structure, a single change applied anywhere in the container environment may suddenly or accidentally expose your entire cluster to attacks, leak secrets or jeopardize confidential data.

At this point, traditional security tools cannot keep pace with containers in terms of automation, deployment and environment, so they are largely ineffective. All security approaches and perspectives must be reconsidered and reevaluated with regard to container technology. Given these circumstances, just when you think you are starting to understand it, you are about to enter chaos. Redde Caesari quae sunt Caesaris (render unto Caesar what is Caesar’s):

One time I tried to explain Kubernetes to someone. Then we both didn’t understand it.
SwiftOnSecurity

Before securing containers, you need to thoroughly understand their underlying building blocks and layers and the relations between them. I will not cover those building blocks in this article, however. You should already be familiar with terms such as POD, container, namespace, node, control plane, master, worker, swarm, orchestrator, Kubernetes, Docker, runtime, ingress, egress, overlay network and so on. Otherwise, the rest of this article will not be a good use of your time.

How to Navigate/Proceed

The road to understanding containers runs through orchestrators. In this guide, Kubernetes will be our orchestrator and Docker will be our container runtime. If you start to understand Kubernetes, that will be your first step on the moon. Although this article may seem as assertive as its title, it is an attempt to secure the container environment as far as possible. It mainly presents a guide to secure installation and configuration, security best practices and hardening of containers, covering both Kubernetes and Docker security. Linux host security is of course also important here, but it is mostly out of this post’s scope. The 111 items here, which I wrote after painstaking research and exhausting work, should be applied or used in a controlled manner, according to your needs and business. The outline is as follows:

Infrastructure and Environment Security
Authentication, Authorization, Roles and Permissions
Network and Network Security Management
Docker and Runtime Security
Security Assessment, Pentesting and Auditing of Containers
Endless Opinions on Container Security
Useful Container Security Resources
Enterprise Container Security Tools

Infrastructure and Environment Security

1.) Your prod and non-prod (preprod, test, dev) containers and clusters should not be placed on the same nodes. Even if preprod is equivalent to prod, it should not share nodes with it, either. It is critical to physically segregate the production environment.

2.) Every node must be equivalent to a bare metal or VM server.

3.) If you run on a managed Kubernetes platform (OpenShift, Pivotal, Platform9, Rancher, etc.), use its built-in security add-ons and features.

4.) Use TLS/SSL for all outbound-inbound communications in the whole Kubernetes platform.

5.) Apply security patches for Kubernetes and/or the underlying platform as soon as possible; do not delay them even for a minute.

6.) Given the state of current perimeter security tools, you should not expose your Kubernetes clusters or nodes directly to the internet. In other words, you should not place them in a DMZ. When they are required to face the internet, place additional API gateways or proxies in front of the Kubernetes infrastructure.

7.) Layers of microservices consisting of multiple containers and PODs should run in separate nodes.

8.) After determining your application structure consisting of PODs in a microservice or a node:

The components of the application should be in separate layers and nodes in the form of WEB, APP and DB, in accordance with the layered architecture model.
WEB and APP should not be put on the same node or, where possible, even in the same cluster.
WEB layer should never have direct access to databases.
A similar approach as above should be applied for layers that perform the same task.

9.) The ‘const’ alphabet in rand.go, where Kubernetes declares its random string generator, must be expanded to include all alphanumeric characters, and generated values must be at least 8 characters long. Although the random token structure has been updated on some platforms such as OpenShift, it needs to be checked and strengthened for your environment.
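To make the effect of item 9 concrete, here is a minimal sketch of how alphabet size and length translate into token entropy. It does not reproduce the Kubernetes code; the names ALPHABET, entropy_bits and random_token are hypothetical and chosen only for illustration.

```python
import math
import secrets
import string

# Full alphanumeric alphabet (62 symbols), as item 9 asks for.
ALPHABET = string.ascii_letters + string.digits


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy, in bits, of a uniformly random string of the given length."""
    return length * math.log2(alphabet_size)


def random_token(length: int = 8) -> str:
    """Build a token from a cryptographically secure random source."""
    if length < 8:
        raise ValueError("item 9 asks for at least 8 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(random_token())
    print(entropy_bits(len(ALPHABET), 8))
```

Every extra character in the alphabet or in the token length increases the search space an attacker has to cover, which is why both dimensions are worth checking.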

10.) Automatic mounting (automounting) of service account tokens each time in PODs should be prevented with the automountServiceAccountToken: false parameter.
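As a rough illustration of item 10, the sketch below checks a POD manifest, represented as a plain Python dictionary, for the automountServiceAccountToken field. The function name and the sample manifest (names, image) are hypothetical. It only inspects the POD spec; the same field can also be set on the ServiceAccount object, which this sketch does not look at.

```python
def automount_disabled(pod_manifest: dict) -> bool:
    """Return True only if the POD spec explicitly sets
    automountServiceAccountToken to false."""
    spec = pod_manifest.get("spec", {})
    return spec.get("automountServiceAccountToken") is False


# Hypothetical manifest, shaped like the YAML you would pass to kubectl.
EXAMPLE_POD = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-app"},
    "spec": {
        "automountServiceAccountToken": False,
        "containers": [
            {"name": "app", "image": "registry.example.com/app:1.0"}
        ],
    },
}

if __name__ == "__main__":
    print(automount_disabled(EXAMPLE_POD))
```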

11.) Data, logs, caches or critical information should not be stored inside your application containers. Separate file servers, storage systems or databases should be used for this purpose.

12.) Databases (Oracle, MSSQL, MySQL, MongoDB, PostgreSQL, etc.) should never be placed in the container structure.

13.) Applications that handle ‘on-the-fly’ data retention, sorting and queuing, such as RabbitMQ, Redis and NoSQL-like stores, should run in Kubernetes clusters separate from your application clusters.

14.) Your disaster recovery Kubernetes environment must have a configuration identical to that of your currently active Kubernetes environment.

15.) ‘Immutable’ and ‘append-only’ flags should be set for critical container files, logs and events.

16.) The nodev, nosuid and noexec options should be set on every mount.
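One way to audit item 16 is to parse the mount table and report which required options are missing. The sketch below assumes input in the /proc/mounts format (device, mount point, file system type, options, and two numeric fields); the function name and the sample entries are hypothetical.

```python
REQUIRED_OPTIONS = {"nodev", "nosuid", "noexec"}


def missing_mount_options(mounts_text: str) -> dict:
    """Map each mount point to the required options it lacks.

    Expects lines in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    findings = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = set(fields[3].split(","))
        missing = REQUIRED_OPTIONS - options
        if missing:
            findings[mount_point] = sorted(missing)
    return findings


# Hypothetical sample; on a real host you would read /proc/mounts instead.
SAMPLE_MOUNTS = """\
tmpfs /tmp tmpfs rw,nosuid,nodev,noexec 0 0
/dev/sdb1 /data ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    print(missing_mount_options(SAMPLE_MOUNTS))
```

Whether all three options are appropriate for a given mount depends on what runs from it, so treat the output as a review list rather than a verdict.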

17.) File systems should be mounted as “read-only”.

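18.) There should be an effective, stable and robust pod security policy for your Kubernetes clusters. Note that the PodSecurityPolicy API was deprecated in Kubernetes 1.21 and removed in 1.25; on newer clusters, Pod Security Admission or a policy engine fills this role.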

19.) Container logs, Docker logs and host logs should be collected as a whole. None of them should be missed.

20.) SELinux or AppArmor must be enabled for your whole container environment.

Authentication, Authorization, Roles and Permissions

21.) Do not let your developers manage your whole Kubernetes environment independently. You should set out some logical layers such as network, system, security. Then, you should divide the management of the whole environment according to those layers, in terms of segregation of duties.

22.) Permissions and roles must be granted to developers in the most restrictive manner, so that they cannot interfere with the network and system settings of namespaces or clusters.

23.) Active Directory authentication or another centralized enterprise identity mechanism should be enabled for Kubernetes or its underlying control plane platform, where applicable.

24.) The “kubeadmin” user should be removed or disabled.

25.) Developers should not be given cluster-admin permission.

26.) Any self-provisioning feature should be disabled. Developers should contact your system administrator to request/start a project.

27.) Developers should not have cluster console access.

28.) Roles should be given exclusively to “application-specific” service accounts. For that purpose, a unique serviceAccountName pertaining to the application is defined in the POD configuration where it is installed.

29.) A role should not be assigned collectively to all service accounts within a namespace.

30.) The role of listing secrets should not be given to any account.

31.) Developers and users other than system administrators should not be authorized to create PODs.

32.) You should enable RBAC, and risky RBAC permissions should not be granted. For example:

resources: ["*"] verbs: ["create"]
resources: ["*"] verbs: ["list"]
resources: ["*"] verbs: ["get"]
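The risky patterns in item 32 can be spotted mechanically. The following is a minimal sketch that walks the rules of a Role or ClusterRole, represented as a Python dictionary, and flags wildcard resources combined with the verbs listed above. Treating the wildcard verb "*" as risky too is an assumption of this sketch, and the function and sample names are hypothetical.

```python
# Verbs from item 32, plus the wildcard verb (an assumption of this sketch).
RISKY_VERBS = {"create", "list", "get", "*"}


def risky_rules(role: dict) -> list:
    """Return the rules that pair wildcard resources with a risky verb."""
    flagged = []
    for rule in role.get("rules", []):
        has_wildcard = "*" in rule.get("resources", [])
        verbs = set(rule.get("verbs", []))
        if has_wildcard and verbs & RISKY_VERBS:
            flagged.append(rule)
    return flagged


# Hypothetical role mixing one risky rule with one narrowly scoped rule.
EXAMPLE_ROLE = {
    "kind": "Role",
    "metadata": {"name": "example-role", "namespace": "example"},
    "rules": [
        {"apiGroups": [""], "resources": ["*"], "verbs": ["list"]},
        {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get"]},
    ],
}

if __name__ == "__main__":
    for rule in risky_rules(EXAMPLE_ROLE):
        print(rule)
```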

33.) API calls inside Kubernetes and calls/requests to kube-apiserver:

Should NOT be anonymous. Anonymous requests should be forbidden and all requests must be authenticated and authorized.
Should be accessible only by machines in a cluster and machines that need to administer the cluster.
Should not be used with the --insecure-bind-address option to open up the plain-text port on both non-localhost and localhost.
Should be used with secure ports for both non-localhost and localhost network interfaces.

34.) Permissive RBAC permissions (permissive binding) should never be given. The following binding allows ALL service accounts to act as cluster administrators:

kubectl create clusterrolebinding permissive-binding \
  --clusterrole=cluster-admin \
  --user=admin \
  --user=kubelet \
  --group=system:serviceaccounts
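A binding like the one above can be detected by inspecting existing ClusterRoleBinding objects. The sketch below works on a binding represented as a Python dictionary and reports whether cluster-admin is bound to a group that covers many identities. The function names are hypothetical, and the set of groups treated as "broad" is an assumption of this sketch rather than an official list.

```python
def _is_broad_group(name: str) -> bool:
    """Groups that cover many identities at once (assumed list)."""
    return (
        name in ("system:authenticated", "system:unauthenticated")
        or name == "system:serviceaccounts"
        or name.startswith("system:serviceaccounts:")
    )


def is_permissive_binding(binding: dict) -> bool:
    """True if the cluster-admin ClusterRole is bound to a broad group."""
    role_ref = binding.get("roleRef", {})
    if role_ref.get("kind") != "ClusterRole":
        return False
    if role_ref.get("name") != "cluster-admin":
        return False
    return any(
        subject.get("kind") == "Group" and _is_broad_group(subject.get("name", ""))
        for subject in binding.get("subjects", [])
    )


# Dictionary form of the binding created by the kubectl command above.
PERMISSIVE_BINDING = {
    "kind": "ClusterRoleBinding",
    "metadata": {"name": "permissive-binding"},
    "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
    "subjects": [
        {"kind": "User", "name": "admin"},
        {"kind": "User", "name": "kubelet"},
        {"kind": "Group", "name": "system:serviceaccounts"},
    ],
}

if __name__ == "__main__":
    print(is_permissive_binding(PERMISSIVE_BINDING))
```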

35.) RoleBindings and ClusterRoleBindings should be granted only to specific users. Such roles should not be assigned to users and groups that are used everywhere in the cluster or environment.

36.) Roles and RoleBindings should be used instead of ClusterRoles and ClusterRoleBindings. Roles should be scoped only to the relevant target namespace.

37.) Risky authorization such as “Authorization to authorize” or “authorization to assign the role” should not be granted.

38.) The POD exec right (kubectl exec) should only be assigned to system administrators and not given to any other user.

39.) etcd should not be open to anonymous service calls/requests.

40.) There should be no privileged PODs and no PODs that run with a privileged account.
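Item 40 maps onto the securityContext fields of a POD manifest. As a minimal sketch, the function below flags containers that set privileged: true or that do not enforce runAsNonRoot, with the container-level setting taking precedence over the POD-level one. The function name and the sample manifest are hypothetical.

```python
def privileged_findings(pod_manifest: dict) -> list:
    """List containers that are privileged or not forced to run as non-root."""
    findings = []
    spec = pod_manifest.get("spec", {})
    pod_ctx = spec.get("securityContext", {})
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("privileged") is True:
            findings.append(name + ": privileged")
        # A container-level value overrides the POD-level value.
        run_as_non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot"))
        if run_as_non_root is not True:
            findings.append(name + ": runAsNonRoot not enforced")
    return findings


# Hypothetical manifest with one compliant and one non-compliant container.
EXAMPLE_POD = {
    "kind": "Pod",
    "spec": {
        "securityContext": {"runAsNonRoot": True},
        "containers": [
            {"name": "app", "securityContext": {"privileged": False}},
            {"name": "debug", "securityContext": {"privileged": True,
                                                  "runAsNonRoot": False}},
        ],
    },
}

if __name__ == "__main__":
    for finding in privileged_findings(EXAMPLE_POD):
        print(finding)
```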

41.) Validating Admission Webhook permission must be granted only to trusted system users. For example, you can use Portieris. It is a Kubernetes admission controller for the enforcement of image security policies. You can create image security policies for each Kubernetes namespace, or at the cluster level, and enforce different rules for different images.

42.) You should disable anonymous access to the kubelet by starting the kubelet with the --anonymous-auth=false flag.

Network and Network Security Management

43.) If possible, the master node and worker nodes should not be in the same IP segment. Access between them should be controlled, either via an overlay network or via another network access control tool placed between the master node and the worker nodes.

44.) PODs or containers must be assigned non-routable IP addresses in the network, and they should only be exposed to the network through a Service (NodePort, LoadBalancer) or Ingress, NATed to the NodePort.

45.) The environment’s network and IP management should take place outside of containers/PODs, that is, containers should not run ifconfig and route.

46.) An overlay network or a network segmentation system should be used for applying communication and network access policies between containers, clusters and nodes. For example, the recipes for Kubernetes network policies from this GitHub commit by Ahmet Alp Balkan will be a good starting point without using an external tool. You can benefit from the articles below:

47.) Your application’s containers should never be operated or customized as a firewall, proxy, switch or router. For this purpose, other specific systems/tools should be used for networking and service mesh tasks.
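For the network policies mentioned in item 46, a common starting point is a default-deny policy per namespace, on top of which explicit allow rules are added. The sketch below builds such a NetworkPolicy manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The function name, the policy name and the namespace are hypothetical, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicy objects.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that selects every POD in the namespace and
    allows no ingress or egress traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector matches all PODs in the namespace.
            "podSelector": {},
            # Listing both types with no rules denies both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    print(json.dumps(default_deny_policy("example"), indent=2))
```

Denying egress as well also blocks DNS lookups from the PODs, so an explicit allow rule for DNS is usually the first one you will need.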

48.) In addition to user-based authorization, access control for projects should be enforced per cluster and/or per node, based on their IP addresses.

49.) Developer teams should not be given any node-based access, but only cluster-based IP restrictions should be applied.

50.) The read-only port of the kubelet should be disabled.
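Items 42 and 50 both come down to kubelet settings, so they can be checked together. The sketch below inspects a kubelet command line for --anonymous-auth=false and --read-only-port=0. The function name is hypothetical, and the sketch only looks at command-line flags written in the --name=value form; it will report a finding when the same settings are supplied through a kubelet configuration file instead.

```python
def kubelet_flag_findings(args: list) -> list:
    """Report kubelet flags that leave anonymous access or the
    read-only port enabled."""
    flags = {}
    for arg in args:
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            flags[name] = value
    findings = []
    if flags.get("anonymous-auth") != "false":
        findings.append("anonymous-auth is not disabled (item 42)")
    if flags.get("read-only-port") != "0":
        findings.append("read-only port is not disabled (item 50)")
    return findings


if __name__ == "__main__":
    # Hypothetical command line for illustration.
    print(kubelet_flag_findings(["kubelet", "--anonymous-auth=false"]))
```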

Docker and Runtime Security

51.) Every POD should be created and run with a new service account specific to that POD.

52.) Container images and processes should not be run as root.

53.) Container images should be kept at the minimum base without including OS package managers, shells and so on. You should consider using minimal images such as distroless images.

54.) Any debugging tools should be removed from containers, especially for the production environment.

55.) Secrets should NOT be reachable at the host level, stored in volume mounts or on disk, passed through the host’s environment variables, or used inside Dockerfiles. There should be an independent secret management system specific to this job. Also, secrets should be rotated frequently. For this purpose, you can use Vault for secrets management, encryption as a service, and privileged access management.

56.) Docker UNIX socket should not be exposed from host to container.

57.) Docker REST API should not be accessible within the container.

58.) A separate partition dedicated to containers should be created on the Linux host.

59.) Since hostPath type PersistentVolumes do not support read-only access mode, only trusted users should be granted permission to create PersistentVolume objects.

60.) Writeable hostPath directory volumes allow containers to write to the filesystem in ways that let them traverse the host filesystem outside the pathPrefix. To prevent this, readOnly: true must be used on all allowedHostPaths to effectively limit access to the specified pathPrefix.

61.) There should be separate offline container registries serving independently for PROD, TEST and DEV environments.

62.) There should be no SSH daemon in container images.

63.) The --insecure-registry=[] option should not be used. Only images from trusted repositories should be used.

64.) By enabling trust pinning in the Docker configuration, also known as Docker Content Trust (DCT), only repositories signed with a root key defined by your security team should be pulled and run.

65.) The root key used for image verification and signing should be kept offline.

66.) All image signing keys must be changed periodically, subject to rotation.

67.) For containers, the log level must be at least INFO by default (--log-level=info).

68.) SetUID/setGID binaries should be removed from container images.

69.) Container images must be scanned for vulnerabilities regularly. For this purpose, a container-specific vulnerability scanning system should be used. Also, the same library may be used across multiple images which may have different bases. All of these need to be updated and patched by multiple parties.

70.) Container image versions and hash (digest) information should be pinned in the FROM instruction.

71.) Containers’ Linux capabilities and kernel calls should be limited (limiting container breakout).

72.) Docker should never be run with the --privileged parameter.

73.) Intended or unintended, excessive usage of CPU, memory and storage resources by containers must be limited. To prevent excessive resource consumption and the fork bomb attacks originating from legitimate-looking processes, the maximum number of processes in Docker should be limited with PID cgroup, and the --kernel-memory flag should be used for maximum memory.

74.) All of Docker’s Linux capabilities should be reviewed and unnecessary ones should be dropped. Your Docker’s capabilities may be chown, dac_override, fowner, kill, setpcap, net_bind_service, net_raw, sys_chroot, mknod, setfcap and audit_write.

75.) By setting the --icc=false option on the Docker daemon, inter-container communication should be kept to a minimum. Containers should be explicitly connected with the --link option when necessary.

76.) Rather than publishing container ports directly to the host, a communication port should be defined with the --expose=port option.

77.) Docker run must be run with the --security-opt=no-new-privileges flag. Alternatively, the no-new-privileges: true option can be set in the Docker daemon configuration.
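Several of the docker run recommendations above (items 71 to 74 and 77) can be checked before a container is ever started. Here is a minimal sketch that inspects a docker run argument list for --privileged, no-new-privileges, --pids-limit and --cap-drop. The function name is hypothetical and the parsing is deliberately simple, so it should be read as an illustration rather than a complete linter.

```python
def docker_run_findings(argv: list) -> list:
    """Report hardening gaps in a `docker run` argument list."""
    findings = []
    if "--privileged" in argv or "--privileged=true" in argv:
        findings.append("--privileged is set (item 72)")
    # Covers both `--security-opt=no-new-privileges` and the two-argument form.
    if "no-new-privileges" not in " ".join(argv):
        findings.append("no-new-privileges is missing (item 77)")
    if not any(arg.startswith("--pids-limit") for arg in argv):
        findings.append("no PID limit is set (item 73)")
    if not any(arg.startswith("--cap-drop") for arg in argv):
        findings.append("no capabilities are dropped (items 71 and 74)")
    return findings


if __name__ == "__main__":
    # Hypothetical invocation for illustration.
    command = ["docker", "run", "--privileged", "example/image:1.0"]
    for finding in docker_run_findings(command):
        print(finding)
```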

78.) Dockerfiles should always create a dedicated user with the useradd command and switch to it with the USER directive.

79.) Access to the Docker user or Docker group must be restricted.

80.) TLS authentication must always be performed for the Docker command-line client.

81.) There must be TLS authentication and verification for Docker APIs.

82.) When copying files from the host to the Docker image, COPY should be used instead of ADD.

83.) Tokens and keys should be kept out of Dockerfiles.
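The Dockerfile rules in items 70, 78, 82 and 83 lend themselves to a simple automated check in a build pipeline. The sketch below scans Dockerfile text line by line. The function name and the keyword pattern used to guess at secrets are hypothetical, line continuations are not handled, and multi-stage builds that reference an earlier stage in FROM will be reported as unpinned.

```python
import re

# Heuristic for secret-looking variable names (an assumption of this sketch).
SECRET_HINT = re.compile(r"(password|secret|token|api_?key)\w*\s*=", re.IGNORECASE)


def dockerfile_findings(text: str) -> list:
    """Report Dockerfile lines that conflict with items 70, 78, 82 and 83."""
    findings = []
    final_user = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        instruction = parts[0].upper()
        rest = parts[1] if len(parts) > 1 else ""
        if instruction == "FROM" and "@sha256:" not in rest:
            findings.append("FROM is not pinned to a digest (item 70)")
        elif instruction == "ADD":
            findings.append("ADD used instead of COPY (item 82)")
        elif instruction in ("ENV", "ARG") and SECRET_HINT.search(rest):
            findings.append("possible secret in " + instruction + " (item 83)")
        elif instruction == "USER":
            final_user = rest.strip().split(":")[0]
    if final_user in (None, "root", "0"):
        findings.append("image runs as root; add useradd and USER (item 78)")
    return findings


# Hypothetical Dockerfile that violates all four items.
BAD_DOCKERFILE = """\
FROM ubuntu:20.04
ADD app.tar.gz /opt/app
ENV DB_PASSWORD=changeme
"""

if __name__ == "__main__":
    for finding in dockerfile_findings(BAD_DOCKERFILE):
        print(finding)
```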

84.) Docker’s legacy registry operations and Userland Proxy should be disabled.
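Some of the daemon-level settings in this section (items 75, 77 and 84) can be set in the Docker daemon configuration file, where they are easy to verify. The sketch below parses the contents of a daemon.json and compares the icc, userland-proxy and no-new-privileges keys against the recommended values. The function name and the sample configuration are hypothetical; a missing key is reported because the daemon defaults for these three settings are the less restrictive ones.

```python
import json

# Recommended values for the daemon.json keys covered by items 75, 77 and 84.
EXPECTED = {
    "icc": False,
    "userland-proxy": False,
    "no-new-privileges": True,
}


def daemon_findings(daemon_json: str) -> list:
    """Compare a daemon.json document against the recommended values."""
    config = json.loads(daemon_json)
    findings = []
    for key, wanted in EXPECTED.items():
        if config.get(key) != wanted:
            findings.append(key + " should be " + json.dumps(wanted))
    return findings


# Hypothetical configuration with one setting still missing.
SAMPLE_DAEMON_JSON = '{"icc": false, "userland-proxy": false}'

if __name__ == "__main__":
    print(daemon_findings(SAMPLE_DAEMON_JSON))
```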

85.) Swarm mode is optional and should only be activated when necessary.

86.) Only the minimum necessary number of manager nodes should be created in a swarm.

87.) Swarm services must be bound to a specific host interface.

88.) The secrets of the Swarm cluster must be managed through Docker’s secret management commands or an external secret management system.

89.) In terms of Docker Swarm services, secrets such as a password, SSH private key, SSL certificate, or another piece of data should not be transmitted over a network or stored unencrypted in a Dockerfile or in your application’s source code.

90.) Docker swarm manager must be run in auto-lock mode.

91.) Swarm manager “auto-lock” key should be rotated periodically.

92.) The following officially supported best practices for Dockerfile should be applied:

Security Assessment, Pentesting and Auditing of Containers

93.) Critical Docker files, directories and sockets should be audited by activating ‘auditd’, and the following audit rules should be applied:

-w /usr/bin/docker -p wa
-w /var/lib/docker -p wa
-w /etc/docker -p wa
-w /lib/systemd/system/docker.service -p wa
-w /lib/systemd/system/docker.socket -p wa
-w /etc/default/docker -p wa
-w /etc/docker/daemon.json -p wa
-w /usr/bin/docker-containerd -p wa
-w /usr/bin/docker-runc -p wa

94.) “Docker Bench for Security” tool should be periodically run and missing/incorrect configurations should be fixed. It checks for dozens of common best-practices around deploying Docker containers in production.

95.) To implement CIS Docker Benchmark, you can use InSpec Profile. It implements the CIS Docker 1.13.0 Benchmark in an automated way to provide security best-practice tests around Docker daemon and containers in a production environment.

96.) To implement CIS Benchmark for Kubernetes, you can use Kube-Bench. It automates checking whether each node in your Kubernetes cluster is configured according to security best practices.

97.) During pentesting of a container environment, you will need a Burp-like proxy. For this purpose, Kubetap may help you. It is an intercepting proxy for Kubernetes Services. It enables an operator to intercept all incoming HTTP traffic for a given Kubernetes Service.

98.) To identify vulnerabilities in running containers, images, hosts and repositories, you can use Runtime Threat Mapper of Deepfence community edition.

99.) With Kube-Scan which is a free Risk Assessment Tool for Kubernetes, you can instantly get the security posture of your Kubernetes clusters.

100.) Use Clair for the static analysis of vulnerabilities in application containers (currently including appc and docker).

101.) During your penetration testing of containers, you can use Kube-hunter for finding security issues in your Kubernetes clusters. It is designed to increase awareness and visibility of the security controls in Kubernetes environments. Also you can use Shopify’s kubeaudit tool to audit Kubernetes clusters for various different security concerns: ‘run the container as a non-root user’, ‘use a read-only root filesystem’, ‘drop scary capabilities’, ‘don’t add new ones’, ‘don’t run privileged’ and so on.

102.) For image vulnerability scanning, you can use Trivy as a comprehensive vulnerability scanner for container images. It detects vulnerabilities of OS packages (Alpine, RHEL, CentOS, etc.) and application dependencies (Bundler, Composer, npm, yarn etc.).

103.) To detect anomalous activity in your applications, you can use Falco as a behavioral activity monitor. It is the de facto Kubernetes threat detection engine. Falco audits a system at the kernel level. Falco then enriches this data with other input streams such as container runtime metrics, and Kubernetes metrics. Falco lets you continuously monitor and detect container, application, host, and network activity—all in one place—from one source of data, with one set of rules.

104.) You can use Cilium to provide API-aware networking and security at the kernel layer. Cilium transparently secures network connectivity and loadbalancing between application workloads such as application containers or processes. It operates at Layer 3/4 to provide traditional networking and security services as well as Layer 7 to protect and secure use of modern application protocols such as HTTP, gRPC and Kafka.

105.) You can use Dagda for static analysis of container security. It performs static analysis of known vulnerabilities, trojans, viruses, malware and other malicious threats in Docker images/containers, and it monitors the Docker daemon and running Docker containers to detect anomalous activities.

106.) To make security risk analysis for Kubernetes resources, you can use kubesec.

107.) On a default installation, there will be approximately 43 RoleBindings/ClusterRoleBindings, 51 Roles/ClusterRoles and 39 subjects. Mapping the number of possible connections between RBAC objects (the subjects, bindings and roles) can be burdensome, and spotting the privileged subjects in the cluster can turn into a nightmare. So, you will need tools and methods for auditing Kubernetes RBAC policies. For this purpose, this useful article from NCC Group and the kubernetes-rbac-audit tool from CyberArk may help you audit RBAC.

108.) For pentesting risky permissions in Kubernetes, KubiScan from CyberArk can be used for scanning.

109.) During pentesting of containers, you may need to collect information about a container environment and list potential security issues. For this purpose, ConMachi container scanner can help you by enumerating security configuration from within the target container.

110.) During pentesting of containers, you may need to enumerate and pull images from the Docker registry. By using the go-pillage-registries project, you can take a Docker registry and pillage the manifest and configuration for each image in its catalogue.

111.) To learn Kubernetes or container pentesting methodology, you should look at the resources below:

these CyberArk (part one, part two and part three) articles,
Ana Calin’s presentation titled “Bulletproof Kubernetes – Learn by Hacking!“,
Appsecco’s training course content on attacking and auditing Docker containers and Kubernetes clusters,
this comprehensive white paper from Wesley McGrew which is about how an attacker looks at Docker and approaches multi-container applications,
and Mark Manning’s presentation titled “Command and KubeCTL: Real-World Kubernetes Security for Pentesters” at Shmoocon 2020 with the details of attack style in his post may help you understand how taking over a host takes place by starting from a simple POD service compromise.

Endless Opinions on Container Security

It is unbelievable to see that you have read all of the above and you got here. Now, you may think that you have a deep level of cognition in container security. But, do not be so quick and reevaluate your situation after reading the useful resources about securing containers at the bottom of this article and then decide whether you have demystified the matter of securing containers or not…

When tackling container security, it is also essential to understand the attack/threat surface of containers. The threat matrix for Kubernetes published by Microsoft and an examination of potential threats to the container environment by Trend Micro may also help you understand this surface. Although the tactics seem similar, the attack techniques against Kubernetes and containers differ from those that target host systems. They should be understood very well in order to protect your container environment effectively…

In conclusion, containers have opened up a whole new world of problems in cybersecurity that cannot be easily resolved. You will be fighting an endless battle. In this context, security should accelerate digital transformation and keep pace with business growth. Security needs to adopt containers rapidly and safely. A new security strategy should be defined to embrace containers’ broad context, native controls and scalability, in harmony with your security architecture. In doing so, your main objective should be to build priorities such as a transparent risk profile, compliance, robust network segmentation, vulnerability management, continuous configuration management and runtime threat monitoring.

Useful Container Security Resources

Enterprise Container Security Tools

Alcide: It provides continuous Kubernetes security from CD pipeline, Kubernetes audit logs analyser, and microservices firewall and anomaly detection for Kubernetes environment.
Anchore: It performs deep inspection of container images, generating a detailed software bill-of-materials and allowing you to apply specific policy gates and checks for your entire container workload on premises and in the cloud.
Aqua: It is a container security, serverless security and cloud-native security platform. It provides full dev-to-prod security across your entire CI/CD pipeline and runtime environment, giving you end-to-end visibility and protecting your applications against attacks.
Calico Enterprise: It provides zero-trust network security and continuous compliance for Kubernetes platforms.
Deepfence: Layer 7 Security for Kubernetes.
HashiCorp Vault Enterprise: It is a secrets management and sensitive data protection platform.
NeuVector: An end-to-end Kubernetes security platform which provides image vulnerability management, admission controls, container process/file system protection and layer 7 firewall.
Octarine: Continuous security and compliance for the complete lifecycle of Kubernetes applications. It will be part of VMware.
Portshift: Automated Kubernetes runtime and network security for DevOps.
Snyk Container: Find and fix vulnerabilities in container images and Kubernetes applications.
StackRox: Kubernetes Security Platform Augments Runtime Security with Streamlined Analysis and Incident Response.
Sysdig Secure: Kubernetes security and compliance for secure DevOps workloads.
Twistlock (Prisma Cloud): A very comprehensive container security and cloud native security platform.
Tenable.io Container Security: It enables DevOps processes by providing visibility into the security of container images – including vulnerabilities, malware and policy violations – through integration with the build process.
WhiteSource for Containers: It provides end-to-end open source management for containers.

The featured painting above is “The Course of Empire: Destruction” by Thomas Cole