DOCKERFILE

A Dockerfile is a plain text file, conventionally named Dockerfile with no extension, that contains a series of instructions defining how to build a Docker image: which base image to use, which files to copy, which commands to run, and how containers should start. Writing a Dockerfile is like scripting the blueprint for a containerized application: it tells Docker exactly how to build an image, step by step.

INSTRUCTION   PURPOSE OF THE INSTRUCTION
FROM          Sets the base image
WORKDIR       Sets the working directory inside the container
COPY          Copies files from the host (build context) into the image
RUN           Executes commands during image build
ENV           Sets environment variables
EXPOSE        Documents the port the container listens on
USER          Sets the user to run the container
CMD           Sets the default command to run
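
As a minimal sketch of how these instructions fit together, the Dockerfile below assumes a hypothetical Python application with an app.py entry point and a requirements.txt file in the build context. The file names, the port, and the user name appuser are illustrative assumptions, not part of the original text.

```dockerfile
# Base image (Debian-based slim Python image)
FROM python:3.12-slim

# Working directory for the instructions that follow
WORKDIR /app

# Copy the build context (hypothetical app.py and requirements.txt) into /app
COPY . .

# Install dependencies while the image is being built
RUN pip install --no-cache-dir -r requirements.txt

# Environment variable that is also visible at runtime
ENV PORT=8080

# Documents the listening port; it does not publish the port by itself
EXPOSE 8080

# Create a non-root user (hypothetical name) and switch to it
RUN useradd --create-home appuser
USER appuser

# Default command when a container starts
CMD ["python", "app.py"]
```

Note that EXPOSE is documentation only; the port still has to be published with -p when the container is started.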

TIPS TO REMEMBER

  • Minimize layers by combining commands:
    RUN apt-get update && apt-get install -y curl
  • Pin versions of dependencies to ensure consistent builds.
  • Use .dockerignore to exclude unnecessary files (like .git, __pycache__, etc.)
  • Use non-root users for better security (USER instruction).
  • Keep images small by using slim base images (e.g., python:3.12-slim).
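
The tips above can be sketched in a single Dockerfile. This is only an illustration: the package (curl), the pinned requests version, and the user name appuser are assumptions chosen for the example.

```dockerfile
# Slim, version-pinned base image keeps the result small and reproducible
FROM python:3.12-slim

# One RUN instruction: update, install, and clean up in the same layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Pinned dependency version (the exact version is an illustrative choice)
RUN pip install --no-cache-dir requests==2.32.3

# Run as a non-root user (hypothetical name)
RUN useradd --create-home appuser
USER appuser
```

A matching .dockerignore file would simply list the paths to exclude, such as .git and __pycache__, one per line.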

ARG AND ENV

ARG stands for argument. In a Dockerfile, ARG defines a build-time variable: a value available only during the image build process. It lets you customize image builds by passing parameters with the --build-arg flag. Unlike ENV, ARG values are not set as environment variables in containers started from the image, although they can still be seen in the image history (docker history), so they should not carry secrets. You can use ARG to control things like base image versions or conditional logic during builds. It is especially useful for creating flexible and reusable Dockerfiles.

ENV stands for environment variable. The ENV instruction in a Dockerfile sets environment variables that persist in the built image and are available at runtime. These variables can be used by applications running inside the container. You define them like ENV PORT=8080, and they are accessible via $PORT in scripts or app configs. Unlike ARG, ENV values remain in the final image. They can also be overridden when running a container using -e. This makes ENV well suited to non-sensitive configuration such as ports or environment modes; secrets such as API keys are better supplied when the container is run than baked into the image.
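
A minimal sketch of ARG and ENV working together is shown below. The variable names PYTHON_VERSION and APP_VERSION and the image tag demo used afterwards are hypothetical, picked only for illustration.

```dockerfile
# An ARG declared before FROM can be used in the FROM line
ARG PYTHON_VERSION=3.12
FROM python:${PYTHON_VERSION}-slim

# Build-time variable with a default value
ARG APP_VERSION=1.0

# Copy the build-time value into an ENV so it is also visible at runtime
ENV APP_VERSION=${APP_VERSION}

# Runtime configuration with a default value
ENV PORT=8080

# Print both values when the container starts
CMD ["sh", "-c", "echo version=$APP_VERSION port=$PORT"]
```

Building with docker build --build-arg APP_VERSION=2.0 -t demo . overrides the ARG default, and running with docker run -e PORT=9090 demo overrides the ENV default for that container only.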

DIFFERENCE BETWEEN ARG & ENV

FEATURE          ARG (ARGUMENT)                                ENV (ENVIRONMENT VARIABLE)
SCOPE            Build time only                               Build time and runtime
PERSISTENCE      Not available in running containers           Saved in the image and available to containers
DEFAULT VALUE    ARG VERSION=1.0                               ENV MODE=production
OVERRIDE METHOD  --build-arg during docker build               -e flag during docker run
USE CASE         Customize builds (e.g., base image version)   Configure app behavior (e.g., ports, modes)
SECURITY         Not for secrets (visible in docker history)   Not for secrets (values remain in the image)

MULTI-STAGE BUILDS

Multi-stage builds in Docker let you use multiple FROM statements in a single Dockerfile to create separate build stages. This allows you to compile or build your app in one stage and copy only the final output into a clean, minimal image. It helps reduce image size, improve security, and keep Dockerfiles organized. You can name a stage with FROM ... AS <name> and selectively copy artifacts from it using COPY --from=<name>. It is ideal for production-ready containers without unnecessary build tools or files.
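
As a hedged illustration, the two-stage Dockerfile below assumes a hypothetical Go project with its main package and go.mod at the root of the build context. The stage name builder and the paths /src and /out/app are assumptions made for the example.

```dockerfile
# Stage 1: build stage with the full Go toolchain, named "builder"
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
# Build a static binary (assumes a main package at the project root)
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: minimal runtime image without the compiler or the source code
FROM alpine:3.20
# Copy only the compiled binary out of the builder stage
COPY --from=builder /out/app /usr/local/bin/app
CMD ["app"]
```

Only the second stage ends up in the final image. To stop at the first stage for inspection, the build can be pointed at it with docker build --target builder.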

REASONS TO USE MULTI-STAGE BUILDS-

  • Better Security – Fewer packages mean a smaller attack surface.
  • Cleaner Dockerfiles – No need for external scripts or complex cleanup commands.
  • Smaller Images – Only the essentials go into the final image: no compilers, build tools, or temp files.
  • Improved Caching – Each stage can be cached independently, speeding up rebuilds.

ADVANTAGES OF USING MULTI-STAGE BUILDS-

  1. Improved Security: By excluding build tools and dependencies from the final image, you reduce the attack surface.
  2. Cleaner Dockerfiles: You can separate build logic from runtime logic, making the Dockerfile easier to read and maintain.
  3. Smaller Image Size: Only the necessary artifacts are copied into the final image, reducing bloat and improving performance.
  4. Better Caching: Each stage can be cached independently, speeding up rebuilds when only part of the Dockerfile changes.
  5. No Need for External Scripts: You can handle complex build workflows entirely within a single Dockerfile.

DISADVANTAGES OF USING MULTI-STAGE BUILDS-

  1. Longer Initial Build Time: The first build may take longer due to multiple stages and dependencies being installed.
  2. Debugging Challenges: Troubleshooting issues across stages can be harder, especially if intermediate stages are not preserved.
  3. Limited Visibility: Intermediate stages are discarded unless explicitly targeted, which can make it harder to inspect build artifacts.
  4. Increased Complexity: Managing multiple stages and copying artifacts between them can be confusing for beginners.
  5. Compatibility Issues: Multi-stage builds require Docker 17.05 or later, and some third-party tools may not fully support them.

DOCKER IMAGE LAYERS

Docker images are built in layers. Instructions that change the filesystem (RUN, COPY, ADD) each create a new read-only layer, while instructions such as ENV, EXPOSE, or CMD only record metadata in the image configuration. These layers stack on top of each other to form the final image. They are immutable and reusable across different images. Docker uses a union filesystem to present them as a single coherent view. This layered approach improves modularity and efficiency, and makes it possible to trace how an image was built.

FEATURES OF DOCKER IMAGE LAYERS-

  1. Layer per Instruction: Each filesystem-changing instruction (RUN, COPY, ADD) creates a new layer; FROM brings in the base image's layers, and the remaining instructions add only metadata.
  2. Immutable Structure: Each layer is read-only and cannot be changed once created, ensuring consistency across builds.
  3. Layer Reuse: Common base layers (like OS or language runtimes) can be reused across multiple images, saving space.
  4. Union Filesystem: Layers are stacked using a union filesystem, presenting a single coherent view to the container.
  5. Efficient Distribution: Only changed layers are downloaded when pulling updated images, reducing bandwidth usage.
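
To make the layering concrete, here is an annotated sketch. The files requirements.txt and app.py are hypothetical, and the comments describe the general behavior rather than the exact output of any particular Docker version.

```dockerfile
# Starts from the existing layers of the base image
FROM python:3.12-slim

# Filesystem layer: adds requirements.txt
COPY requirements.txt /app/requirements.txt

# Filesystem layer: adds the installed packages
RUN pip install --no-cache-dir -r /app/requirements.txt

# Filesystem layer: adds the application source
COPY . /app

# Metadata only: recorded in the image configuration, no file content added
ENV PORT=8080
EXPOSE 8080
CMD ["python", "/app/app.py"]
```

Running docker history on the resulting image lists these steps from newest to oldest, and the metadata-only steps appear with a size of 0B.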

DOCKER CACHING

Docker caching stores previously built image layers to speed up future builds. If a Dockerfile instruction and its context haven’t changed, Docker reuses the cached layer. Once a layer changes, all subsequent layers are rebuilt. This makes build times faster and more efficient. Proper layer ordering in Dockerfiles helps maximize cache reuse.

FEATURES OF DOCKER CACHING-

  1. Build Acceleration: Speeds up image builds by skipping unchanged layers, especially useful in iterative development.
  2. Layer Invalidation Logic: If one layer changes, all subsequent layers are rebuilt—ordering matters!
  3. Context Sensitivity: Cache is sensitive to changes in files, environment variables, and build arguments.
  4. Instruction-Level Caching: Docker caches each instruction’s result, reusing it if the command and context haven’t changed.
  5. Custom Cache Control: Advanced features like --no-cache, --build-arg, and BuildKit cache mounts give fine-grained control.
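
The effect of instruction ordering on the cache can be sketched as follows, assuming a hypothetical Python project with requirements.txt and app.py. The cache mount is a BuildKit feature, and the target path shown is pip's default cache directory for the root user.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
WORKDIR /app

# Changes rarely: copied first so the install step below stays cached
COPY requirements.txt .

# Re-runs only when requirements.txt changes; the cache mount keeps
# pip's download cache between builds
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Changes often: copied last so source edits invalidate only this step onward
COPY . .

CMD ["python", "app.py"]
```

With this ordering, editing app.py rebuilds only the final COPY step, while the dependency installation is reused from the cache. Passing --no-cache to docker build forces every step to run again.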
