Getting started with Docker: the basics I wish I knew
Docker confused me for longer than I would like to admit. The documentation throws terms like images, containers, volumes, networks, and layers at you without explaining why any of it matters. Here is what I wish someone had told me when I started.
What Docker actually is
Docker runs applications in isolated environments called containers. A container packages an application with everything it needs: the code, runtime, libraries, and system tools. It runs the same way on your laptop, your server, and your coworker's machine.
Think of it as a lightweight virtual machine, except it shares the host operating system's kernel instead of running its own. This makes containers start in seconds and use a fraction of the resources a VM would.
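You can see the kernel sharing directly. The following sketch assumes Docker is installed; both commands print the same kernel version, because the container uses the host's kernel rather than booting its own:

```shell
# Kernel version on the host
uname -r
# Kernel version inside a container: identical, because it is the same kernel
docker run --rm alpine uname -r
```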
Images vs containers
An image is a template. A container is a running instance of that template. You build an image once and create as many containers from it as you want.
# Pull an image from Docker Hub
docker pull nginx
# Create and start a container from that image
docker run -d -p 8080:80 nginx
# List running containers
docker ps
The -d flag runs the container in the background. The -p 8080:80 flag maps port 8080 on your machine to port 80 inside the container.
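To make the image/container distinction concrete, one image can back any number of independent containers. The names web1 and web2 below are just illustrative:

```shell
# Same nginx image, two containers, two host ports
docker run -d --name web1 -p 8081:80 nginx
docker run -d --name web2 -p 8082:80 nginx
# Stop and remove both when done
docker stop web1 web2 && docker rm web1 web2
```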
The Dockerfile
A Dockerfile is a recipe for building an image. Here is a simple one for a Node.js app:
FROM node:22-slim
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install
COPY . .
EXPOSE 3000
CMD ["pnpm", "start"]
Build it:
docker build -t my-app .
docker run -d -p 3000:3000 my-app
Each instruction in the Dockerfile creates a layer. Docker caches layers, so if you change your application code but not your dependencies, only the COPY . . layer and everything after it rebuilds. This is why the dependency install comes before copying the source code. Put things that change least at the top of the file.
Multi-stage builds
The Dockerfile above has a problem: the final image contains everything needed to build the app, including dev dependencies, build tools, and source files that are not needed at runtime. Multi-stage builds fix this.
A multi-stage build uses multiple FROM statements. Each one starts a fresh stage with its own filesystem. You copy only what you need from earlier stages into the final image.
# Stage 1: Build with all dependencies
FROM node:22-alpine AS builder
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install
COPY . .
RUN pnpm build
# Stage 2: Production dependencies only
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --prod
# Stage 3: Minimal runtime image
FROM node:22-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
USER appuser
CMD ["node", "dist/index.js"]
The key insight for Node.js: running pnpm install --prod in the build stage would break the TypeScript compilation, which needs dev dependencies. So you separate "build with everything" from "run with production deps only." The final image has no TypeScript compiler, no test framework, no source files. Typical size reduction: from 1.2GB (single-stage with the full node image) to 120-180MB.
For Go, multi-stage builds are even more dramatic. Go produces static binaries, so your final image can use scratch (a completely empty image) and end up at 10-15MB.
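A minimal sketch of that Go pattern, assuming the main package lives at the repository root and uses no cgo:

```dockerfile
# Stage 1: compile a fully static binary (CGO disabled so it needs no libc)
FROM golang:1.22-alpine AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/server .

# Stage 2: completely empty base image, just the binary
FROM scratch
COPY --from=builder /bin/server /server
ENTRYPOINT ["/server"]
```

Note that scratch contains nothing at all: no shell, no CA certificates, no timezone data. If your binary makes TLS calls, copy the certificates in from the builder stage.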
Build cache optimization
Understanding Docker's layer cache saves a lot of build time. Docker caches each layer and reuses it if the instruction and its inputs have not changed. But when a layer's cache invalidates, every subsequent layer rebuilds too.
This is why ordering matters. Put things that change rarely at the top:
# 1. Base image (rarely changes)
FROM node:22-alpine
# 2. System dependencies (changes occasionally)
RUN apk add --no-cache dumb-init
# 3. Package manifest (changes when deps change)
COPY package.json pnpm-lock.yaml ./
# 4. Install dependencies (cached if manifests unchanged)
RUN npm install -g pnpm && pnpm install
# 5. Source code (changes most frequently - goes LAST)
COPY . .
RUN pnpm build
If only your source code changes, steps 1-4 are cached and only step 5 runs. This can save 2-3 minutes per build.
For even better caching, BuildKit (the default since Docker Engine v23.0) supports cache mounts that persist package manager caches between builds:
RUN --mount=type=cache,target=/root/.npm npm ci
Even when the layer cache invalidates (because package.json changed), the npm download cache is preserved. Packages that have not changed do not need to be re-downloaded.
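Since the examples here use pnpm, the equivalent cache mount targets pnpm's content-addressable store instead. The path below is the default global store location on Linux for root; verify yours with pnpm store path:

```dockerfile
RUN --mount=type=cache,target=/root/.local/share/pnpm/store \
    npm install -g pnpm && pnpm install --frozen-lockfile
```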
Volumes: persistent data
By default, data inside a container is lost when the container stops. Volumes solve this:
# Named volume (Docker manages the storage)
docker run -d -v mydata:/var/lib/postgresql/data postgres
# Bind mount (you specify the host path)
docker run -d -v ./config:/etc/nginx/conf.d nginx
Use named volumes for data the application manages (databases, uploads). Use bind mounts for config files you want to edit on the host.
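Named volumes outlive the containers that use them, so it helps to know the commands for managing them:

```shell
docker volume ls               # list all named volumes
docker volume inspect mydata   # show where Docker stores the data on the host
docker volume rm mydata        # delete the volume and its data permanently
```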
Networking
Every container gets its own network namespace. By default, containers connect to a bridge network. The important thing to know: user-defined networks provide DNS resolution between containers. If two containers are on the same user-defined network, they can reach each other by container name.
The default bridge network does not provide this DNS resolution (a common gotcha). Docker Compose creates a user-defined network automatically, which is one reason everything "just works" in Compose but not with raw docker run.
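A quick way to see the DNS behavior in action (the network and container names here are illustrative):

```shell
# Containers on the same user-defined network resolve each other by name
docker network create appnet
docker run -d --name db --network appnet -e POSTGRES_PASSWORD=pass postgres:16
# "db" resolves because both containers share the user-defined network
docker run --rm --network appnet alpine ping -c 1 db
```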
Docker Compose
Running multiple containers with docker run commands gets tedious fast. Docker Compose lets you define everything in a YAML file:
services:
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/myapp
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=pass
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
volumes:
  pgdata:
Start everything with docker compose up -d. Stop everything with docker compose down.
Notice the health check on the database. Without it, depends_on only waits for the container to start, not for the database to be ready. With condition: service_healthy, the web service waits until the database is actually accepting connections. This eliminates the "connection refused" errors that plague naive Docker setups.
Compose profiles
Compose profiles let you define optional services that only start when requested:
services:
  web:
    build: .
    # No profile = always starts
  db:
    image: postgres:16
    # No profile = always starts
  prometheus:
    image: prom/prometheus
    profiles: ["monitoring"]
  grafana:
    image: grafana/grafana
    profiles: ["monitoring"]
docker compose up # starts web + db only
docker compose --profile monitoring up # starts web + db + prometheus + grafana
This keeps your default docker compose up fast while making monitoring, debugging, or testing tools available when you need them.
Compose watch
docker compose watch is a newer feature that synchronizes file changes into running containers without rebuilding:
services:
  web:
    build: .
    develop:
      watch:
        - action: sync
          path: ./src
          target: /app/src
        - action: rebuild
          path: ./package.json
Source code changes get synced instantly (for hot-reloading frameworks). Dependency changes trigger a full rebuild. This gives you a much tighter development loop than manually rebuilding containers.
Health checks
A container can have a running process but be completely unresponsive. Without health checks, docker ps shows "Up" and everything looks fine, but your application is broken.
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
CMD node -e "fetch('http://localhost:3000/health').then(r => { if (!r.ok) throw r; })"
The parameters: check every 30 seconds, fail if the check takes longer than 3 seconds, allow 10 seconds of startup grace period, and mark unhealthy after 3 consecutive failures.
For databases:
HEALTHCHECK --interval=10s --timeout=5s --start-period=30s --retries=3 \
CMD pg_isready -U myuser -d mydb || exit 1
Orchestrators like Docker Swarm automatically restart unhealthy containers. Even without an orchestrator, health checks give you visibility into whether your services are actually working.
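Once a health check is defined, the status surfaces in docker ps and can be queried directly (my-app below is a placeholder container name):

```shell
docker ps    # STATUS column shows "(healthy)" or "(unhealthy)" next to Up
docker inspect --format '{{.State.Health.Status}}' my-app
```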
Common mistakes I made
Not using .dockerignore. Without it, COPY . . copies node_modules, .git, and everything else into the image. Create a .dockerignore file:
node_modules
.git
.env
dist
Running as root. By default, containers run as root. If the container is compromised, the attacker has root access. Always add a non-root user for production:
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
Not pinning image versions. FROM node:latest means your build might break when a new version releases. Use specific versions like node:22-slim.
Logging to files instead of stdout. Docker captures stdout and stderr and routes them through its logging system. If your app writes logs to files inside the container, Docker's logging never sees them, and the logs disappear when the container is removed. Configure your app to log to stdout.
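Once your app logs to stdout, docker logs reads back everything Docker captured (my-app below is a placeholder container name):

```shell
docker logs -f my-app            # follow the log stream live
docker logs --since 10m my-app   # only entries from the last ten minutes
```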
Ignoring the PID 1 problem. The first process in a container runs as PID 1. On Linux, PID 1 does not get default signal handlers. If Docker sends SIGTERM to stop your container, your app might not receive it, leading to a forced SIGKILL after the timeout (default 10 seconds).
Fix this by using tini or dumb-init:
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/index.js"]
Or use the exec form in CMD/ENTRYPOINT (square brackets, not a string), which makes your app PID 1 directly.
Choosing the wrong base image. The size and security differences are significant:
| Image | Size (Node 22) | Notes |
|---|---|---|
| node:22 (full) | ~1.1 GB | Everything works, huge attack surface |
| node:22-slim | ~220 MB | Minimal packages, good for most apps |
| node:22-alpine | ~150 MB | Smallest common choice, uses musl libc |
| Distroless | ~50 MB | No shell, strongest security, hard to debug |
Alpine uses musl instead of glibc, which can cause compatibility issues with native Node.js modules (bcrypt, sharp, etc.). Slim is the safest default for production. Alpine is fine if you test your dependencies and they work.
Scanning for vulnerabilities
Your base image comes with an operating system, and operating systems have CVEs. Scan your images regularly:
# Trivy (free, open source)
trivy image myapp:latest --severity HIGH,CRITICAL
# In CI/CD, fail the build on critical vulnerabilities
trivy image myapp:latest --severity CRITICAL --exit-code 1
Trivy scans in about 25 seconds per GB of image. Run it in CI on every build. It catches things like outdated OpenSSL versions in your base image that you would never notice manually.
Docker alternatives
If Docker Desktop's resource usage bothers you (2GB+ RAM baseline on macOS), there are lighter alternatives:
- OrbStack (macOS): Containers start up to 10x faster, uses 200-400MB RAM. Free for personal use.
- Colima (macOS/Linux): Terminal-only, ~400MB RAM. Free and open source.
- Podman: Daemonless, rootless by default. Largely Docker CLI compatible. Backed by Red Hat.
All of these run the same container images. You can alias docker=podman and most workflows work unchanged.
When to use Docker
Docker is worth using whenever your application has dependencies that are annoying to install locally or when you want consistent environments across machines. For web development, Docker Compose makes it trivial to run a database, cache, and other services alongside your app without installing PostgreSQL, Redis, or anything else on your host machine.
If you are just starting, docker init scaffolds a production-quality Dockerfile for your project. Run it in your project directory and it detects the language, asks a few questions, and generates a Dockerfile with multi-stage builds, non-root users, and proper layer ordering.