Sensitive Data Leak via Docker Images

Imagine a scenario where you’ve just launched a shiny new web application and within minutes, your boss rushes into your office in a state of panic, shouting, ‘We’ve got a data leak!’ But wait, you think to yourself, you’ve been meticulous with security and have followed all the best practices. So what could have gone wrong?
Well, it might surprise you to know that the root cause of this disaster could be something as seemingly harmless as a Docker image. That’s right, Docker images, the very building blocks of our containerized applications, can be a source of sensitive data leaks.

Let’s take a look at an example. Here’s a Dockerfile that shows how easy it is to accidentally leak sensitive data through a Docker image:

FROM alpine:3.12
COPY secrets.txt /app/secrets.txt
CMD ["cat", "/app/secrets.txt"]

This simple Dockerfile takes a file containing sensitive information, ‘secrets.txt’, and copies it into the image. It then runs a command that displays the contents of the file. Because the file is baked into an image layer, anyone who can pull the image can read it, which makes this a perfect example of how a Docker image can leak sensitive information.

In this blog post, we’ll explore the ways in which sensitive data can leak via Docker images and what you can do to prevent it.

Purpose of the Blog Post

This post gives an overview of the problem and its consequences, then outlines best practices for handling sensitive information in Docker images, as well as strategies for secure image building and distribution. It is aimed at developers, security professionals and anyone interested in learning more about the risks associated with Docker images and how to mitigate them.

Real-World Scenario

Scenario: You are a software developer working for a financial services company. You are responsible for building and deploying a Docker image for a new payment application. You want to ensure that sensitive information such as customer credit card details, passwords and API keys are not leaked through the Docker image.

Handling sensitive information
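You create a .env file to store sensitive information and add it to your .gitignore file to prevent it from being committed to your source code repository. Because .gitignore has no effect on docker build, you also list .env in a .dockerignore file; otherwise an instruction such as COPY . . would copy it into the image.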

# .env file
API_KEY=abcdefghijklmnopqrstuvwxyz
PASSWORD=12345678

Secure image building
You use a trusted and up-to-date base image for your application and build your Docker image using a multi-stage build process.

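# Build stage: install dependencies and compile the application
FROM node:14.15.1-alpine AS builder
WORKDIR /app
# Copies the whole build context; keep .env out of it with .dockerignore
COPY . .
RUN npm install
RUN npm run build

# Final stage: only the build output is copied from the builder
FROM node:14.15.1-alpine
WORKDIR /app
COPY --from=builder /app/dist /app
ENV NODE_ENV=production
CMD ["npm", "start"]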

Secure image distribution
– You push your Docker image to a private Docker registry that is only accessible to authorized users.
– You sign your Docker image using a digital signature to verify the authenticity of the image.

docker login myregistry.com
docker build -t myregistry.com/payment-app:1.0.0 .
docker push myregistry.com/payment-app:1.0.0
docker trust sign myregistry.com/payment-app:1.0.0    # Sign the image with Docker Content Trust

Secure image deployment
You deploy your Docker image to a secure production environment and configure environment variables to securely pass sensitive information to the application.

Kubernetes deployment file

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: payment-
...

Extracting the Layers

Docker images are made up of multiple layers, each of which represents a change to the file system. These layers are combined to form a single, unified image. To extract the layers of a Docker image, you can use the docker save command to save the image to a tar archive and then extract the individual layers using standard tar tools.

Here’s an example of how to extract the layers of a Docker image:

docker save -o myimage.tar myimage:latest    # Save the Docker image to a tar archive
tar -xf myimage.tar    # Extract the tar archive

In this example, the docker save command is used to save the myimage:latest image to a tar archive named myimage.tar. The tar command is then used to extract the contents of the archive.

You can inspect the contents of the individual layers by navigating to the extracted directory and using the ls command. Depending on the Docker version, each layer appears either as a uniquely named subdirectory containing a layer.tar file or as a blob under blobs/sha256, and manifest.json lists the layers in order. Each layer is itself a tar archive, so extract it to examine the files it holds and see the changes that were made to the file system at each step in the build process.

Keep in mind that extracting the layers of a Docker image may not be necessary in most cases, but it can be useful for understanding the structure of the image or for debugging issues during image build and deployment.
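The same inspection can be scripted. The sketch below is a minimal example, assuming the archive produced by docker save contains a manifest.json whose Layers entries point at the layer archives. The SUSPICIOUS name fragments and the function names are illustrative choices, not part of any Docker tooling.

```python
import io
import json
import tarfile

# Hypothetical name fragments that often indicate secrets; tune for your project.
SUSPICIOUS = ("secret", ".env", "id_rsa", ".pem", "credentials")


def list_layer_files(image_fileobj):
    """Return {layer_path: [file names]} for a `docker save` archive.

    `image_fileobj` is the archive opened in binary mode.
    """
    result = {}
    with tarfile.open(fileobj=image_fileobj) as image:
        manifest = json.load(image.extractfile("manifest.json"))
        for entry in manifest:
            for layer_path in entry["Layers"]:
                # Each layer is itself a tar archive stored inside the image archive.
                layer_bytes = image.extractfile(layer_path).read()
                with tarfile.open(fileobj=io.BytesIO(layer_bytes)) as layer:
                    result[layer_path] = [
                        member.name for member in layer.getmembers() if member.isfile()
                    ]
    return result


def flag_suspicious(layer_files):
    """Return (layer_path, file_name) pairs whose name looks secret-like."""
    return [
        (layer_path, name)
        for layer_path, names in layer_files.items()
        for name in names
        if any(fragment in name.lower() for fragment in SUSPICIOUS)
    ]
```

For example, opening myimage.tar in binary mode and passing it to list_layer_files returns the files added or changed by each layer, and flag_suspicious narrows that down to names worth a closer look.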

How Sensitive Data Can Leak via Docker Images

Accidental Inclusion of Sensitive Information in Images

One of the most common ways sensitive data can leak via Docker images is through the accidental inclusion of sensitive information in the images. This can happen when developers include sensitive files, such as configuration files, in their images without realizing the risks involved. For example, consider the following code:

FROM alpine:3.12
COPY secrets.txt /app/secrets.txt
CMD ["cat", "/app/secrets.txt"]

In this case, the Dockerfile copies a file containing sensitive information, ‘secrets.txt’, into the image. This file is then accessible to anyone who has access to the image, including those who obtain it through distribution or deployment.
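A simple pre-build check can catch the most obvious cases. The following is a rough sketch, not a complete linter: the SECRET_NAME pattern and the function name are assumptions made for illustration, line continuations are not handled, and a broad instruction such as COPY . . is not flagged even though it can pull in secret files.

```python
import re

# Hypothetical pattern for names that often indicate secrets.
SECRET_NAME = re.compile(
    r"(secret|password|passwd|token|api[_-]?key|\.env|id_rsa|\.pem)",
    re.IGNORECASE,
)


def find_risky_instructions(dockerfile_text):
    """Return (line_number, line) for COPY/ADD/ENV/ARG lines naming secret-like items."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        instruction, _, rest = stripped.partition(" ")
        if instruction.upper() in ("COPY", "ADD", "ENV", "ARG") and SECRET_NAME.search(rest):
            findings.append((number, stripped))
    return findings
```

Run against the Dockerfile above, the check reports the COPY line and ignores the FROM and CMD lines.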

Use of Untrusted or Outdated Base Images

Another risk is the use of untrusted or outdated base images. Base images are the starting point for building Docker images, and they can contain vulnerabilities or other security risks that can be exploited by attackers. For example, a base image that is outdated and contains known vulnerabilities could be exploited to gain access to sensitive information.

Misconfigured Storage Volumes & Environment Variables

Misconfigured storage volumes and environment variables can also be a source of sensitive data leaks. For example, when mounting a storage volume in a container, it’s possible to accidentally expose sensitive information that is stored on the host machine. Similarly, environment variables that contain sensitive information, such as credentials, can be easily accessed by attackers if not properly secured.

Inadequate Protection of Images during Distribution & Deployment

Finally, inadequate protection of images during distribution and deployment can also result in sensitive data leaks. For example, images that are distributed or deployed over unencrypted channels can be intercepted and their contents disclosed. In addition, images that are not properly signed or verified can be tampered with during distribution, potentially compromising sensitive information.

In summary, sensitive data can leak via Docker images in many ways, including accidental inclusion of sensitive information in images, use of untrusted or outdated base images, misconfigured storage volumes and environment variables, and inadequate protection of images during distribution and deployment. It’s vital for organizations to be aware of these risks and take steps to mitigate them in order to protect their sensitive information.

SSH Preparation

To use Secure Shell (SSH) to access a Docker container, you need to prepare the container with an SSH server. This involves installing an SSH server in the container and configuring it to allow access over SSH. Here’s an example of how to prepare a Docker container for SSH access:

Start a new container from an Ubuntu base image

docker run -it --name ssh-container ubuntu:20.04 bash

Install the OpenSSH server in the container

root@learnoffsec:/# apt-get update
root@learnoffsec:/# apt-get install -y openssh-server

Create an SSH key for the root user in the container

root@learnoffsec:/# ssh-keygen -t rsa -f /root/.ssh/id_rsa

Configure the OpenSSH server to allow root login

root@learnoffsec:/# sed -i 's/^#PermitRootLogin.*/PermitRootLogin yes/' /etc/ssh/sshd_config
root@learnoffsec:/# service ssh restart

Exit the container and save it as an image

root@learnoffsec:/# exit
$ docker commit ssh-container ssh-image

In this example, we start a new container from the ubuntu:20.04 base image and install the OpenSSH server in the container using the apt-get command. Then, we create an SSH key for the root user in the container and configure the OpenSSH server to allow root login. Finally, we exit the container and save it as an image using the docker commit command.

With this setup, you can use SSH to access a container started from this image, provided the SSH daemon is running in it and its port is published on the Docker host. You then specify the IP address or hostname of the Docker host and the published port. For example:

$ ssh root@192.168.xxx.xxx -p 22

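Note that this is just a basic example of setting up SSH access in a Docker container, and there are many other configuration options and security considerations that you should take into account for a production environment. In particular, docker commit saves the private key generated above (/root/.ssh/id_rsa) in the image, so anyone who can pull ssh-image can read it.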
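Because docker commit captures the container’s file system, a key pair generated inside the container travels with the resulting image. As a minimal sketch of how such keys could be spotted, the function below searches the files of a single layer archive for PEM-style private key headers. The function name, the regular expression and the max_bytes limit are illustrative assumptions.

```python
import re
import tarfile

# Matches headers such as "-----BEGIN OPENSSH PRIVATE KEY-----"
# or "-----BEGIN RSA PRIVATE KEY-----".
PRIVATE_KEY_HEADER = re.compile(rb"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----")


def find_private_keys(layer_fileobj, max_bytes=1_000_000):
    """Return names of files in a layer archive that contain a private key header.

    `layer_fileobj` is a layer tar archive opened in binary mode; only the
    first `max_bytes` of each file are inspected.
    """
    hits = []
    with tarfile.open(fileobj=layer_fileobj) as layer:
        for member in layer.getmembers():
            if not member.isfile():
                continue  # skip directories, links and device nodes
            data = layer.extractfile(member).read(max_bytes)
            if PRIVATE_KEY_HEADER.search(data):
                hits.append(member.name)
    return hits
```

A public key such as id_rsa.pub is not reported, since it carries no private key header.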

Vulnerabilities of Data Leak via Docker Images

Docker images can contain sensitive data, such as secret keys, passwords, or confidential information, that can be leaked if the images are not properly secured. Here are some common vulnerabilities that can lead to sensitive data leaks via Docker images:

Accidental inclusion of sensitive information in images:
Sensitive data can be accidentally included in images if developers forget to remove it from the source code or configuration files. For example, the following Dockerfile bakes a secret key into the image through an ENV instruction:

FROM node:14
WORKDIR /app
COPY . .
ENV API_KEY=secretkey123
CMD [ "npm", "start" ]
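Values set with ENV are stored in the image configuration, which docker inspect prints as JSON. The sketch below, which assumes that output has been saved to a file or captured as a string, lists the variable names that look like secrets. The SECRET_HINTS tuple and the function name are illustrative assumptions.

```python
import json

# Hypothetical fragments of variable names that suggest a secret.
SECRET_HINTS = ("KEY", "SECRET", "TOKEN", "PASSWORD", "PASSWD")


def secret_like_env(inspect_json):
    """Return names of env vars in `docker inspect` output that look like secrets."""
    names = []
    for image in json.loads(inspect_json):
        # Config or Env may be missing or null, so fall back to empty values.
        for entry in (image.get("Config") or {}).get("Env") or []:
            name, _, _value = entry.partition("=")
            if any(hint in name.upper() for hint in SECRET_HINTS):
                names.append(name)
    return names
```

For an image built from the Dockerfile above, the result would include API_KEY, which shows that the value is available to anyone who can inspect the image.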

Use of untrusted or outdated base images:
Base images, such as those pulled from public repositories, can contain vulnerabilities that can be exploited by attackers to access sensitive data. For example, an outdated version of an operating system base image can have a known vulnerability that an attacker can exploit to gain access to the data in the image.

Misconfigured storage volumes and environment variables:
Misconfigured storage volumes or environment variables can expose sensitive data to unauthorized users. For instance, the following Compose file bind-mounts a secrets file from the host into the container and passes an API key through an environment variable:

Docker Compose file

version: '3' 
services:
  app:
    image: myapp:latest
    volumes:
      - ./secrets.env:/app/secrets.env
    environment:
      - API_KEY=${API_KEY}
...

Inadequate protection of images during distribution and deployment:
Images can be intercepted and modified during distribution and deployment, potentially exposing sensitive data to attackers. For example, an attacker can intercept an image during deployment and add malicious code that can steal sensitive data.

Unsecured communication between containers:
Communication between containers can be unsecured if encryption is not used, potentially exposing sensitive data to attackers. For example, in the following Compose file nothing encrypts the traffic between app and db, the database password is hard-coded, and the database port is published on the host:

Docker Compose file

version: '3'
services:
  app:
    image: myapp:latest
    environment:
      - API_KEY=${API_KEY}
  db:
    image: postgres:latest
    environment:
      - POSTGRES_PASSWORD=password123
    ports:
      - "5432:5432"
...

Use of images from untrusted sources:
Images from untrusted sources, such as public repositories, can contain malicious code that can steal sensitive data. For example, an attacker can create an image with malicious code and publish it on a public repository.

Insufficient image signing and verification:
Images can be tampered with during distribution and deployment if image signing and verification are not used. For example, the following deployment script pulls and runs whatever myapp:latest currently points to, with no signature or digest check, so a modified image would go unnoticed:

Deployment Script

docker pull myapp:latest 
docker run -d myapp:latest
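One way to make tampering detectable is to refer to content by its digest rather than by a mutable tag, for example by pulling myapp@sha256:... instead of myapp:latest. The comparison behind a digest check can be sketched as follows; this is a simplified illustration of the idea, not the code Docker itself runs, and the function name is an assumption.

```python
import hashlib
import hmac


def matches_digest(blob, expected):
    """Return True if `blob` (bytes) hashes to `expected`, given as 'sha256:<hex>'."""
    algorithm, _, expected_hex = expected.partition(":")
    if algorithm != "sha256" or not expected_hex:
        raise ValueError("expected a digest of the form 'sha256:<hex>'")
    actual_hex = hashlib.sha256(blob).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(actual_hex, expected_hex.lower())
```

Any change to the content, however small, produces a different hash, so the check fails for a modified blob.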

Inadequate logging and monitoring:
Insufficient logging and monitoring can make it difficult to detect and prevent sensitive data leaks. For example, the following image embeds an API key; without logging and monitoring, nobody would notice an attacker reading or misusing it:

FROM node:14
WORKDIR /app
COPY . .
ENV API_KEY=secretkey123
CMD [ "npm", "start" ]
...

It is important to take proactive measures to prevent these and other vulnerabilities from leading to sensitive data leaks via Docker images. This can be achieved by following best practices for secure image building and deployment, using tools for detecting and mitigating sensitive data leaks, and implementing robust logging and monitoring systems.

Inadequate Image Signing and Verification:
Docker images can be signed and verified to ensure their integrity and authenticity. However, if the signing and verification process is not done properly, malicious images can be introduced, leading to sensitive data leaks via Docker images.

Pull an image without verifying its signature (Docker Content Trust is off by default)

docker pull alpine

Run the unverified image

docker run alpine echo "This is an unsigned image"

Output

This is an unsigned image 
...

Inadequate Image Layers Permissions:
Docker images consist of multiple layers that are stacked on top of each other. Each layer, and the command that created it, can be read by anyone who has the image, so sensitive data written during the build can leak even without access to a running container.

# Dockerfile that writes sensitive data into a layer
FROM alpine
RUN echo "mysecret: 12345678" > /etc/secret_file

docker build -t myimage .    # Build the Docker image

docker history myimage    # Inspect the Docker image layers

# Output
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
15b1dc58c61a        3 minutes ago       /bin/sh -c echo "mysecret: 12345678" > /etc/   5B                 
<missing>           3 minutes ago       /bin/sh -c #(nop)  CMD [ "/bin/sh" ]           0B                  
<missing>           3 minutes ago       /bin/sh -c #(nop) ADD file:b7c6dc547f7c32a…   5.55MB  
...

Preventing Sensitive Data Leak via Docker Images

Best Practices for Handling Sensitive Information

To prevent sensitive data leaks via Docker images, it’s important to follow best practices for handling sensitive information. These include:

Keeping sensitive files, such as configuration files, outside of images.
Storing sensitive information in encrypted forms, such as using encrypted environment variables.
Ensuring that sensitive information is only accessible by authorized users and processes.

Here’s an example of how to use encrypted environment variables:

Use encrypted environment variables

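FROM alpine:3.12
# Store only the encrypted value; ENV values are readable by anyone who has the image
ENV SECRET_KEY="encrypted_value"
# Shell form, so that $SECRET_KEY is expanded when the container starts
CMD echo "$SECRET_KEY"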
...
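Another way to keep secrets outside of images is to have the application read them at runtime. The sketch below assumes the secret is mounted as a file under /run/secrets, which is where Docker Swarm and Compose secrets appear inside a container, and falls back to an environment variable of the same name in upper case. The function name and the fallback rule are illustrative choices.

```python
import os


def read_secret(name, secrets_dir="/run/secrets"):
    """Return the secret `name` from a mounted file, else from the environment.

    The environment fallback uses the upper-cased name, e.g. SECRET_KEY
    for the secret "secret_key".
    """
    path = os.path.join(secrets_dir, name)
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read().strip()
    except FileNotFoundError:
        pass  # no mounted secret; try the environment instead
    value = os.environ.get(name.upper())
    if value is None:
        raise KeyError(f"secret {name!r} not found in {secrets_dir} or the environment")
    return value
```

With this approach the image contains only the code that looks up the secret, while the value itself is supplied when the container is started.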

Examples of Secure Image Building & Deployment Workflows

Here are a few examples of secure image-building and deployment workflows:

Building and deploying Docker images using a secure, private registry: In this workflow, images are built and pushed to a private registry, such as a private repository on Docker Hub or Google Container Registry, and are then deployed using an automated process.

docker build -t myregistry.com/myimage:1.0 .    # Build the Docker image, tagged with the registry name

docker push myregistry.com/myimage:1.0    # Push the Docker image to a secure, private registry

kubectl apply -f deployment.yml    # Deploy the Docker image using an automated process
...

Building and deploying Docker images using image signing and verification: In this workflow, images are signed using a private key and then verified using a public key to ensure the authenticity of the images.

docker build -t myregistry.com/myimage:1.0 .    # Build the Docker image

docker trust sign myregistry.com/myimage:1.0    # Sign the Docker image using a private key

DOCKER_CONTENT_TRUST=1 docker pull myregistry.com/myimage:1.0    # Pull the image only if its signature verifies

kubectl apply -f deployment.yml    # Deploy the Docker image
...

Strategies for Secure Image Building & Distribution

To ensure secure image building and distribution, it’s important to:

Use trusted base images that are updated regularly.
Verify the integrity & authenticity of images before deploying them.
Sign images using a digital signature to ensure that they have not been tampered with.
Use a secure distribution channel, such as a private registry, to distribute images.

Tools for Detecting & Mitigating Sensitive Data Leaks

There are several tools available for detecting and mitigating sensitive data leaks in Docker images. These include:

Docker Security Scanning: This tool scans images for known vulnerabilities and security issues and provides remediation advice to help fix the issues.
Aqua Security: This tool provides a comprehensive solution for securing Docker images, including image signing, runtime protection, and security scans.
Sysdig: This tool provides a complete security solution for Docker containers, including real-time threat detection, security scans and remediation advice.

More Prevention/Mitigation Techniques

Multi-Stage Image

One mitigation strategy for sensitive data leaks via Docker images is the use of multi-stage builds. Multi-stage builds allow you to define multiple build stages within a single Dockerfile, and then copy the necessary files from one stage to another. This way, you can keep sensitive data out of the final image, reducing the risk of data leaks.
Here’s an example of how to create a multi-stage build in Docker:

FROM node:14 as build-stage
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build

FROM node:14-alpine as production-stage
WORKDIR /app
COPY --from=build-stage /app/dist /app
CMD [ "npm", "start" ]
...

In this example, the first stage (build-stage) uses the node:14 image and performs the build steps, such as copying the source code and running the npm install and build commands. The second stage (production-stage) uses the node:14-alpine image, which is smaller and has less attack surface, and copies only the necessary files (the dist directory) from the build stage. The final image does not contain the source tree or intermediate build files, so anything sensitive that stays out of dist also stays out of the image, reducing the risk of data leaks.

Use Encrypted Environment Variables

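Environment variables are a common way to store sensitive information, such as passwords and API keys. Encrypting these values before they are passed to a Docker container means that anyone who inspects the container or image configuration sees only ciphertext; the application decrypts the value at runtime with a key that is supplied separately.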

docker run -e ENCRYPTED_API_KEY=<encrypted_api_key> myimage
...

Limit Image Distribution

Only distribute Docker images to trusted parties and make sure that access to images is properly controlled and audited. Use tools like Docker Notary to sign images and verify their authenticity.

DOCKER_CONTENT_TRUST=1 docker push myregistry.com/myimage:latest    # Push and sign the image (Docker Content Trust)
...

Scan Images for Vulnerabilities

Use tools like Docker Security Scanning or its current counterpart, Docker Scout, to scan images for known vulnerabilities and security issues. This helps detect issues before they become a problem.

docker scout cves myimage:latest    # Scan the image for known vulnerabilities with Docker Scout
...

Regularly Update Base Images

Make sure to use updated and secure base images for your builds, and regularly check for and apply updates. Outdated images can contain known vulnerabilities that can be exploited.

docker pull node:14-alpine    # Docker pull command
...

By following these mitigation strategies, you can help prevent sensitive data leaks via Docker images and ensure the security and confidentiality of your data.

Summary

In this blog post, we have discussed the problem of sensitive data leaks via Docker images, which can occur due to the accidental inclusion of sensitive information in images, the use of untrusted or outdated base images, misconfigured storage volumes and environment variables, and inadequate protection of images during distribution and deployment.