GitLab CI/CD
Overview¶
GitLab CI/CD is a continuous integration and deployment tool built into GitLab. It uses .gitlab-ci.yml files
to define pipelines that automatically build, test, and deploy code. This guide covers GitLab CI/CD best practices
for creating maintainable, efficient pipelines.
Key Characteristics¶
- File Name: .gitlab-ci.yml
- Format: YAML
- Primary Use: CI/CD pipelines, automated testing, deployment automation
- Key Concepts: Pipelines, stages, jobs, runners, artifacts, cache
Quick Reference¶
| Category | Convention | Example | Notes |
|---|---|---|---|
| File Naming | | | |
| Pipeline Config | `.gitlab-ci.yml` | `.gitlab-ci.yml` | At repository root |
| Pipeline Structure | | | |
| `stages` | Pipeline stages | `stages: [build, test, deploy]` | Ordered execution phases |
| `image` | Docker image | `image: node:20-alpine` | Default container image |
| `before_script` | Pre-job commands | Setup commands | Runs before each job |
| `after_script` | Post-job commands | Cleanup commands | Runs after each job |
| Job Definition | | | |
| Job Name | `job_name:` | `build_app:` | Descriptive job name |
| `stage` | Job stage | `stage: build` | Which stage job belongs to |
| `script` | Commands to run | `script: - npm install` | Required commands |
| `only` / `except` | Branch filters | `only: [main]` | When job runs (legacy) |
| `rules` | Conditional logic | `rules: - if: $CI_COMMIT_BRANCH` | Modern conditional execution |
| Artifacts | | | |
| `artifacts` | Save files | `paths: [dist/]` | Persist build outputs |
| `expire_in` | Artifact retention | `expire_in: 1 week` | Auto-cleanup |
| `reports` | Test reports | `reports: junit: report.xml` | Test result integration |
| Cache | | | |
| `cache` | Cache dependencies | `paths: [node_modules/]` | Speed up builds |
| `key` | Cache key | `key: $CI_COMMIT_REF_SLUG` | Cache versioning |
| Variables | | | |
| Predefined | `$CI_COMMIT_SHA` | GitLab-provided variables | Built-in vars |
| Custom | `variables:` | `NODE_ENV: production` | User-defined vars |
| Protected | Masked variables | Secure secrets | Settings > CI/CD |
| Best Practices | | | |
| Stages | Logical grouping | `[build, test, deploy]` | Clear pipeline flow |
| Docker Images | Pin versions | `node:20.10.0-alpine` | Avoid `latest` |
| Rules | Use `rules:` | Replace `only`/`except` | Modern syntax |
| Cache | Speed up builds | Cache dependencies | Reduce build time |
Basic Pipeline Structure¶
Simple Pipeline¶
stages:
- build
- test
- deploy
variables:
NODE_VERSION: "22"
build_job:
stage: build
image: node:${NODE_VERSION}-alpine
script:
- npm ci
- npm run build
artifacts:
paths:
- dist/
expire_in: 1 day
test_job:
stage: test
image: node:${NODE_VERSION}-alpine
script:
- npm ci
- npm test
coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
deploy_job:
stage: deploy
image: alpine:latest
script:
- echo "Deploying to production"
- ./deploy.sh
only:
- main
Stages¶
Define Stages¶
## Stages execute in order
stages:
- build
- test
- package
- deploy
- cleanup
## Jobs in same stage run in parallel
## Jobs in next stage wait for previous stage to complete
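To make that concrete, here is a minimal sketch (with hypothetical job names): the two jobs sharing the `test` stage start in parallel, and the `deploy` stage waits for both to succeed.

```yaml
stages:
  - test
  - deploy

# unit_tests and lint share the test stage, so they run in parallel
unit_tests:
  stage: test
  script:
    - npm run test:unit

lint:
  stage: test
  script:
    - npm run lint

# deploy starts only after every job in the test stage has succeeded
deploy_app:
  stage: deploy
  script:
    - ./deploy.sh
```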
Jobs¶
Job Configuration¶
job_name:
stage: build
image: node:22-alpine
tags:
- docker
before_script:
- echo "Preparing environment"
script:
- npm ci
- npm run build
after_script:
- echo "Cleaning up"
only:
- main
- develop
except:
- tags
when: on_success
allow_failure: false
timeout: 1h
retry:
max: 2
when:
- runner_system_failure
- stuck_or_timeout_failure
Variables¶
Global Variables¶
variables:
POSTGRES_DB: "testdb"
POSTGRES_USER: "testuser"
POSTGRES_PASSWORD: "testpass"
NODE_ENV: "production"
DOCKER_DRIVER: overlay2
GIT_STRATEGY: clone
GIT_DEPTH: "50"
Job Variables¶
deploy_staging:
stage: deploy
variables:
DEPLOY_ENV: "staging"
API_URL: "https://staging.example.com"
script:
- echo "Deploying to $DEPLOY_ENV"
- ./deploy.sh
Protected Variables¶
Set in GitLab UI under Settings > CI/CD > Variables:
deploy_production:
stage: deploy
script:
- echo "API Key: $API_KEY" # From protected variable
- ./deploy.sh
only:
- main
Artifacts¶
Basic Artifacts¶
build:
stage: build
script:
- npm run build
artifacts:
paths:
- dist/
- build/
expire_in: 1 week
Artifact with Reports¶
test:
stage: test
script:
- npm test
artifacts:
when: always
reports:
junit: junit.xml
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days
Download Artifacts from Another Job¶
deploy:
stage: deploy
dependencies:
- build
script:
- ls dist/ # Artifacts from build job
- ./deploy.sh
Cache¶
NPM Cache¶
.node_cache: &node_cache
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .npm/
test:
<<: *node_cache
stage: test
script:
- npm ci --cache .npm
- npm test
Global Cache¶
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .npm/
policy: pull-push
build:
stage: build
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
policy: pull
script:
- npm ci
- npm run build
Docker-in-Docker (DinD)¶
Building Docker Images¶
build_image:
stage: build
image: docker:latest
services:
- docker:dind
variables:
DOCKER_TLS_CERTDIR: "/certs"
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
- docker tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA $CI_REGISTRY_IMAGE:latest
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
- docker push $CI_REGISTRY_IMAGE:latest
only:
- main
Conditional Execution¶
Only/Except¶
## Run only on specific branches
deploy_production:
stage: deploy
script:
- ./deploy-prod.sh
only:
- main
## Run except on tags
test:
stage: test
script:
- npm test
except:
- tags
## Run only on merge requests
mr_check:
stage: test
script:
- npm run lint
only:
- merge_requests
Rules (Preferred)¶
deploy:
stage: deploy
script:
- ./deploy.sh
rules:
- if: $CI_COMMIT_BRANCH == "main"
when: always
- if: $CI_COMMIT_BRANCH == "develop"
when: manual
- when: never
test:
stage: test
script:
- npm test
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_BRANCH == "develop"
Templates and Includes¶
Include External Files¶
include:
- local: '.gitlab/ci/build.yml'
- local: '.gitlab/ci/test.yml'
- local: '.gitlab/ci/deploy.yml'
- template: Security/SAST.gitlab-ci.yml
- remote: 'https://example.com/ci-templates/docker.yml'
Anchor and Aliases (YAML)¶
.job_template: &job_definition
image: node:22-alpine
before_script:
- npm ci
retry:
max: 2
test:
<<: *job_definition
stage: test
script:
- npm test
build:
<<: *job_definition
stage: build
script:
- npm run build
Extends¶
.base_job:
image: node:22-alpine
before_script:
- npm ci
retry:
max: 2
test:
extends: .base_job
stage: test
script:
- npm test
build:
extends: .base_job
stage: build
script:
- npm run build
Parallel Jobs¶
Matrix Jobs¶
test:
stage: test
parallel:
matrix:
- NODE_VERSION: ["18", "20", "22"]
OS: ["alpine", "bullseye"]
## The resulting tag must exist for every combination (e.g. node:22-alpine)
image: node:${NODE_VERSION}-${OS}
script:
- npm ci
- npm test
Simple Parallel¶
test:
stage: test
parallel: 3
script:
- npm test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
Services¶
PostgreSQL Service¶
test:
stage: test
image: node:22-alpine
services:
- postgres:15-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb"
script:
- npm ci
- npm run test:db
Multiple Services¶
integration_test:
stage: test
services:
- postgres:15-alpine
- redis:7-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_PASSWORD: testpass
REDIS_URL: redis://redis:6379
script:
- npm run test:integration
Multi-Project Pipelines¶
Trigger Downstream Pipeline¶
trigger_deploy:
stage: deploy
trigger:
project: mygroup/deployment-project
branch: main
strategy: depend
only:
- main
Parent-Child Pipelines¶
generate_child:
stage: build
script:
- echo "Generating child pipeline config"
- ./generate-pipeline.sh > child-pipeline.yml
artifacts:
paths:
- child-pipeline.yml
trigger_child:
stage: deploy
trigger:
include:
- artifact: child-pipeline.yml
job: generate_child
Complete Pipeline Example¶
Full-Stack Application Pipeline¶
workflow:
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_BRANCH == "develop"
stages:
- build
- test
- security
- package
- deploy
variables:
DOCKER_DRIVER: overlay2
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
## Reusable templates
.node_base:
image: node:22-alpine
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .npm/
before_script:
- npm ci --cache .npm
## Build stage
build_frontend:
extends: .node_base
stage: build
script:
- cd frontend
- npm run build
artifacts:
paths:
- frontend/dist/
expire_in: 1 day
build_backend:
extends: .node_base
stage: build
script:
- cd backend
- npm run build
artifacts:
paths:
- backend/dist/
expire_in: 1 day
## Test stage
test_frontend:
extends: .node_base
stage: test
script:
- cd frontend
- npm test -- --coverage
coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: frontend/coverage/cobertura-coverage.xml
paths:
- frontend/coverage/
expire_in: 30 days
test_backend:
extends: .node_base
stage: test
services:
- postgres:15-alpine
variables:
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb"
script:
- cd backend
- npm test -- --coverage
coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: backend/coverage/cobertura-coverage.xml
lint:
extends: .node_base
stage: test
script:
- npm run lint
## Security stage
sast:
stage: security
allow_failure: true
dependency_scanning:
stage: security
allow_failure: true
## Package stage
build_docker_images:
stage: package
image: docker:latest
services:
- docker:dind
variables:
DOCKER_TLS_CERTDIR: "/certs"
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker build -t $CI_REGISTRY_IMAGE/frontend:$CI_COMMIT_SHA frontend/
- docker build -t $CI_REGISTRY_IMAGE/backend:$CI_COMMIT_SHA backend/
- docker push $CI_REGISTRY_IMAGE/frontend:$CI_COMMIT_SHA
- docker push $CI_REGISTRY_IMAGE/backend:$CI_COMMIT_SHA
only:
- main
- develop
## Deploy stage
deploy_staging:
stage: deploy
image: alpine:latest
before_script:
- apk add --no-cache curl
script:
- echo "Deploying to staging"
- ./deploy-staging.sh
environment:
name: staging
url: https://staging.example.com
rules:
- if: $CI_COMMIT_BRANCH == "develop"
deploy_production:
stage: deploy
image: alpine:latest
before_script:
- apk add --no-cache curl
script:
- echo "Deploying to production"
- ./deploy-production.sh
environment:
name: production
url: https://example.com
when: manual
only:
- main
include:
- template: Security/SAST.gitlab-ci.yml
- template: Security/Dependency-Scanning.gitlab-ci.yml
Testing¶
Testing Pipelines Locally¶
Use GitLab Runner to test pipelines locally before committing:
## Install GitLab Runner
# macOS
brew install gitlab-runner
# Linux
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
sudo apt-get install gitlab-runner
## Test pipeline locally (note: `gitlab-runner exec` is deprecated in recent
## GitLab Runner releases, but remains useful where still available)
gitlab-runner exec docker test
## Test specific job
gitlab-runner exec docker build
## Test with specific Docker image
gitlab-runner exec docker --docker-image node:22-alpine test
Validating CI Configuration¶
Validate .gitlab-ci.yml syntax:
## Using the GitLab CI Lint API (expects JSON with the YAML as a string value)
jq --null-input --arg yaml "$(cat .gitlab-ci.yml)" '{content: $yaml}' \
| curl --header "PRIVATE-TOKEN: <your_access_token>" \
--header "Content-Type: application/json" \
--data @- \
"https://gitlab.com/api/v4/projects/<project_id>/ci/lint"
## Using the official GitLab CLI
glab ci lint
Pipeline Testing Job¶
Add pipeline validation as a job:
## .gitlab-ci.yml
stages:
- validate
- test
- build
- deploy
validate:pipeline:
stage: validate
image: alpine:latest
before_script:
- apk add --no-cache yamllint
script:
- yamllint .gitlab-ci.yml
- echo "Pipeline configuration is valid"
only:
changes:
- .gitlab-ci.yml
validate:dockerfile:
stage: validate
image: hadolint/hadolint:latest-alpine
script:
- hadolint Dockerfile
only:
changes:
- Dockerfile
Unit Testing in CI¶
test:unit:
stage: test
image: node:22-alpine
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
before_script:
- npm ci
script:
- npm run test:unit
coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
artifacts:
when: always
reports:
junit: junit.xml
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days
only:
- merge_requests
- main
Integration Testing¶
test:integration:
stage: test
image: node:22-alpine
services:
- name: postgres:15-alpine
alias: postgres
variables:
POSTGRES_DB: test_db
POSTGRES_USER: test_user
POSTGRES_PASSWORD: test_pass
DATABASE_URL: postgresql://test_user:test_pass@postgres:5432/test_db
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
before_script:
- npm ci
script:
- npm run test:integration
artifacts:
when: always
reports:
junit: integration-test-results.xml
expire_in: 7 days
End-to-End Testing¶
test:e2e:
stage: test
image: mcr.microsoft.com/playwright:latest
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- playwright/.cache
before_script:
- npm ci
script:
- npm run test:e2e
artifacts:
when: always
paths:
- test-results/
- playwright-report/
expire_in: 7 days
only:
- merge_requests
- main
Security Testing¶
## SAST (Static Application Security Testing)
include:
- template: Security/SAST.gitlab-ci.yml
- template: Security/Dependency-Scanning.gitlab-ci.yml
- template: Security/Secret-Detection.gitlab-ci.yml
## Container Scanning
container_scanning:
stage: test
image: docker:latest
services:
- docker:dind
variables:
DOCKER_DRIVER: overlay2
CI_APPLICATION_REPOSITORY: $CI_REGISTRY_IMAGE
CI_APPLICATION_TAG: $CI_COMMIT_SHA
script:
- docker build -t $CI_APPLICATION_REPOSITORY:$CI_APPLICATION_TAG .
- |
docker run --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy:latest \
image --exit-code 1 --severity HIGH,CRITICAL \
$CI_APPLICATION_REPOSITORY:$CI_APPLICATION_TAG
only:
- merge_requests
- main
Performance Testing¶
test:performance:
stage: test
image: grafana/k6:latest
script:
- k6 run --vus 10 --duration 30s --summary-export=k6-results.json tests/load-test.js
artifacts:
reports:
load_performance: k6-results.json
paths:
- k6-results.json
expire_in: 7 days
only:
- schedules
Parallel Testing¶
Speed up tests with parallel execution:
test:unit:parallel:
stage: test
image: node:22-alpine
parallel: 4
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
before_script:
- npm ci
script:
- npm run test:unit -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
artifacts:
when: always
reports:
junit: junit-shard-${CI_NODE_INDEX}.xml
expire_in: 7 days
Test Coverage Reporting¶
test:coverage:
stage: test
image: node:22-alpine
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
before_script:
- npm ci
script:
- npm run test:coverage
coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days
## Enforce coverage threshold
check:coverage:
stage: test
image: node:22-alpine
needs: [test:coverage]
before_script:
- apk add --no-cache jq bc # not bundled with the alpine image
script:
- |
COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
echo "Coverage: $COVERAGE%"
if [ "$(echo "$COVERAGE < 80" | bc -l)" -eq 1 ]; then
echo "Coverage below 80% threshold"
exit 1
fi
Review Apps Testing¶
Test in ephemeral environments:
review:deploy:
stage: deploy
image: alpine:latest
script:
- echo "Deploying review app..."
- echo "Review app URL: https://review-$CI_COMMIT_REF_SLUG.example.com"
environment:
name: review/$CI_COMMIT_REF_SLUG
url: https://review-$CI_COMMIT_REF_SLUG.example.com
on_stop: review:stop
only:
- merge_requests
review:test:
stage: test
needs: [review:deploy]
image: curlimages/curl:latest
script:
- curl -f https://review-$CI_COMMIT_REF_SLUG.example.com/health
- echo "Review app health check passed"
only:
- merge_requests
review:stop:
stage: deploy
image: alpine:latest
script:
- echo "Destroying review app..."
environment:
name: review/$CI_COMMIT_REF_SLUG
action: stop
when: manual
only:
- merge_requests
Testing with Child Pipelines¶
Organize tests using child pipelines:
## .gitlab-ci.yml
trigger:tests:
stage: test
trigger:
include: .gitlab/ci/tests.yml
strategy: depend
## .gitlab/ci/tests.yml
stages:
- unit
- integration
- e2e
unit:tests:
stage: unit
image: node:22-alpine
script:
- npm ci
- npm run test:unit
integration:tests:
stage: integration
image: node:22-alpine
services:
- postgres:15-alpine
script:
- npm ci
- npm run test:integration
e2e:tests:
stage: e2e
image: mcr.microsoft.com/playwright:latest
script:
- npm ci
- npx playwright install
- npm run test:e2e
Conditional Testing¶
Run tests based on changes:
test:backend:
stage: test
image: python:3.11-slim
script:
- pip install -r requirements.txt
- pytest
only:
changes:
- backend/**/*
- requirements.txt
test:frontend:
stage: test
image: node:22-alpine
script:
- npm ci
- npm test
only:
changes:
- frontend/**/*
- package.json
- package-lock.json
test:infrastructure:
stage: test
image: hashicorp/terraform:latest
script:
- terraform init
- terraform validate
- terraform plan
only:
changes:
- terraform/**/*
CI/CD Pipeline Test Metrics¶
Monitor pipeline performance:
metrics:pipeline:
stage: .post
image: alpine:latest
before_script:
- apk add --no-cache curl jq
script:
- |
# Pipeline status and duration are not predefined CI variables;
# query the pipelines API instead (GITLAB_API_TOKEN is a project
# access token stored as a CI/CD variable)
curl --silent --header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/pipelines/${CI_PIPELINE_ID}" \
| jq '{status: .status, duration: .duration}'
when: always
only:
- main
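A companion job can list which jobs failed, via the pipeline jobs API. This is a sketch: it assumes a project access token is exposed as the CI/CD variable `GITLAB_API_TOKEN` (not a predefined variable).

```yaml
metrics:failed-jobs:
  stage: .post
  image: alpine:latest
  before_script:
    - apk add --no-cache curl jq
  script:
    - |
      # List the names of failed jobs in the current pipeline
      curl --silent --header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
        "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/pipelines/${CI_PIPELINE_ID}/jobs?scope[]=failed" \
      | jq -r '.[].name'
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      when: always
```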
Tiered Pipeline Architecture¶
Implement a three-tier testing strategy that balances speed, coverage, and confidence:
Tier 1: Fast Feedback (< 2 minutes)¶
Static analysis and linting that runs on every commit:
stages:
- validate # Tier 1: Fast feedback
- test # Tier 2: Unit tests
- integration # Tier 3: Integration tests
- deploy
# Tier 1: Static Analysis - Fast feedback on every push
lint:yaml:
stage: validate
image: cytopia/yamllint:latest
script:
- yamllint -c .yamllint.yml .
rules:
- changes:
- "**/*.yml"
- "**/*.yaml"
lint:terraform:
stage: validate
image: hashicorp/terraform:latest
script:
- terraform fmt -check -recursive
- terraform init -backend=false
- terraform validate
rules:
- changes:
- "**/*.tf"
- "**/*.tfvars"
lint:ansible:
stage: validate
image: python:3.11-slim
before_script:
- pip install ansible-lint yamllint
script:
- ansible-lint --strict
- yamllint playbooks/ roles/
cache:
key: ansible-lint
paths:
- .cache/pip
rules:
- changes:
- "**/*.yml"
- "playbooks/**/*"
- "roles/**/*"
security:static:
stage: validate
image: aquasec/tfsec:latest
script:
- tfsec . --format json --out tfsec-results.json
artifacts:
# tfsec's JSON is not the GitLab SAST report schema; keep it as a plain artifact
paths:
- tfsec-results.json
expire_in: 7 days
rules:
- changes:
- "**/*.tf"
security:secrets:
stage: validate
image: trufflesecurity/trufflehog:latest
script:
- trufflehog git file://. --fail --no-update
allow_failure: false
Tier 2: Unit Tests (< 10 minutes)¶
Module-level testing that runs on pull requests:
# Tier 2: Unit Tests - Run on merge requests
test:terraform:
stage: test
image: golang:1.24
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
DOCKER_TLS_CERTDIR: ""
before_script:
- cd tests
- go mod download
script:
- go test -v -timeout 20m -parallel 4 ./...
artifacts:
when: always
reports:
junit: tests/report.xml
expire_in: 7 days
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
test:ansible:
stage: test
image: python:3.11-slim
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
DOCKER_TLS_CERTDIR: ""
before_script:
- pip install molecule[docker] ansible-lint
script:
- molecule test
cache:
key: molecule-${CI_COMMIT_REF_SLUG}
paths:
- .cache/pip
artifacts:
when: on_failure
paths:
- molecule/default/*.log
expire_in: 7 days
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
test:unit:parallel:
stage: test
image: node:22-alpine
parallel:
matrix:
- PLATFORM: [ubuntu-22.04, debian-11, rhel-9]
before_script:
- npm ci
script:
- npm run test:unit:$PLATFORM
artifacts:
when: always
reports:
junit: junit-${PLATFORM}.xml
expire_in: 7 days
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
Tier 3: Integration & Compliance (< 60 minutes)¶
Full-stack testing that runs nightly or pre-release:
# Tier 3: Integration Tests - Nightly or on main branch
integration:full-stack:
stage: integration
image: golang:1.24
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
DOCKER_TLS_CERTDIR: ""
AWS_REGION: us-east-1
before_script:
- cd tests/integration
- go mod download
script:
- go test -v -timeout 60m ./...
artifacts:
when: always
reports:
junit: integration-results.xml
paths:
- tests/integration/logs/
expire_in: 30 days
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
- if: $CI_PIPELINE_SOURCE == "schedule"
retry:
max: 1
when:
- runner_system_failure
- stuck_or_timeout_failure
compliance:security:
stage: integration
image: chef/inspec:latest
before_script:
- inspec --version
script:
- inspec exec compliance/security-baseline.rb --reporter cli junit2:compliance-results.xml json:compliance-results.json
artifacts:
reports:
junit: compliance-results.xml
paths:
- compliance-results.json
expire_in: 90 days
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
- if: $CI_PIPELINE_SOURCE == "schedule"
- if: $CI_COMMIT_TAG
Progressive Enforcement Strategy¶
Gradually introduce and enforce quality gates without disrupting development:
Phase 1: Warning Only (Weeks 1-2)¶
Start with informational feedback that doesn't block merges:
# .gitlab-ci.yml - Phase 1: Warning Only
lint:terraform:warning:
stage: validate
image: hashicorp/terraform:latest
script:
- terraform fmt -check -recursive || echo "⚠️ Terraform formatting issues detected"
- terraform init -backend=false
- terraform validate || echo "⚠️ Terraform validation failed"
allow_failure: true # Don't block merge
rules:
- if: $ENFORCEMENT_PHASE == "warning"
- if: $ENFORCEMENT_PHASE == null # Default to warning
security:scan:warning:
stage: validate
image: aquasec/tfsec:latest
script:
- tfsec . --soft-fail --format json --out tfsec-results.json # Report but don't fail
allow_failure: true
artifacts:
paths:
- tfsec-results.json
expire_in: 30 days
Phase 2: Advisory with Auto-Fix (Weeks 3-4)¶
Provide automated fixes and merge request comments:
# Phase 2: Advisory with automated fixes
lint:terraform:advisory:
stage: validate
image: hashicorp/terraform:latest
before_script:
- apk add --no-cache git curl jq
script:
- |
# Check formatting
terraform fmt -check -recursive || {
echo "Formatting issues detected. Auto-fixing..."
# Capture the offending files before fixing them
FILES=$(terraform fmt -check -recursive 2>&1 || true)
terraform fmt -recursive
# Create MR comment with suggestions (POSIX-safe string concatenation)
COMMENT="## 🔧 Terraform Formatting\n\n"
COMMENT="${COMMENT}Formatting issues were detected. Run \`terraform fmt -recursive\` to fix.\n\n"
COMMENT="${COMMENT}<details><summary>Files affected</summary>\n\n${FILES}\n</details>"
# Needs a project access token (CI_JOB_TOKEN cannot post MR notes)
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--data "body=${COMMENT}" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
exit 1 # Fail but with helpful message
}
allow_failure: true # Still advisory, but failing
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event" && $ENFORCEMENT_PHASE == "advisory"
coverage:advisory:
stage: test
image: node:22-alpine
needs: [test:unit]
before_script:
- apk add --no-cache jq bc curl # not bundled with the alpine image
script:
- |
COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
echo "Coverage: $COVERAGE%"
if [ "$(echo "$COVERAGE < 80" | bc -l)" -eq 1 ]; then
COMMENT="## 📊 Code Coverage\n\n"
COMMENT="${COMMENT}Current coverage: ${COVERAGE}%\nTarget: 80%\n\n"
COMMENT="${COMMENT}⚠️ Coverage is below threshold. Consider adding more tests."
# Needs a project access token (CI_JOB_TOKEN cannot post MR notes)
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--data "body=${COMMENT}" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
exit 1
fi
allow_failure: true
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event" && $ENFORCEMENT_PHASE == "advisory"
Phase 3: Strict Enforcement (Week 5+)¶
Full merge-blocking enforcement:
# Phase 3: Strict enforcement - blocks merges
lint:terraform:strict:
stage: validate
image: hashicorp/terraform:latest
script:
- terraform fmt -check -recursive
- terraform init -backend=false
- terraform validate
allow_failure: false # Block merge on failure
rules:
- if: $ENFORCEMENT_PHASE == "strict"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # Always strict on main
security:scan:strict:
stage: validate
image: aquasec/tfsec:latest
script:
- tfsec . --minimum-severity HIGH --force-all-dirs
allow_failure: false
rules:
- if: $ENFORCEMENT_PHASE == "strict"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
coverage:strict:
stage: test
image: node:22-alpine
needs: [test:unit]
before_script:
- apk add --no-cache jq bc # not bundled with the alpine image
script:
- |
COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
echo "Coverage: $COVERAGE%"
if [ "$(echo "$COVERAGE < 80" | bc -l)" -eq 1 ]; then
echo "❌ Coverage ${COVERAGE}% is below 80% threshold"
exit 1
fi
echo "✅ Coverage ${COVERAGE}% meets threshold"
allow_failure: false
rules:
- if: $ENFORCEMENT_PHASE == "strict"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
Enforcement Rollout Timeline¶
Configure progressive rollout with CI/CD variables:
# .gitlab-ci.yml - Dynamic enforcement based on timeline
variables:
ENFORCEMENT_PHASE: "warning" # Default phase
# Set enforcement phase based on date or manual trigger.
# NOTE: variables exported in before_script are visible only to that job's
# own commands; `rules:` are evaluated when the pipeline is created, so use
# pipeline-level CI/CD variables to drive rules.
.determine_phase: &determine_phase
before_script:
- |
# Automatic progression based on date
CURRENT_DATE=$(date +%s)
ROLLOUT_START=1704067200 # 2024-01-01
WEEKS_ELAPSED=$(( ($CURRENT_DATE - $ROLLOUT_START) / 604800 ))
if [ $WEEKS_ELAPSED -lt 2 ]; then
export ENFORCEMENT_PHASE="warning"
elif [ $WEEKS_ELAPSED -lt 4 ]; then
export ENFORCEMENT_PHASE="advisory"
else
export ENFORCEMENT_PHASE="strict"
fi
echo "Enforcement phase: $ENFORCEMENT_PHASE (Week $WEEKS_ELAPSED)"
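Because `rules:` are evaluated at pipeline creation, a variable exported in `before_script` only affects that job's own commands. A supported alternative is to set `ENFORCEMENT_PHASE` as a pipeline-level CI/CD variable (project default in Settings > CI/CD > Variables, overridden per scheduled pipeline or manual run) and let `rules:` read it directly. A sketch, with a hypothetical job name:

```yaml
lint:terraform:phased:
  stage: validate
  image: hashicorp/terraform:latest
  script:
    - terraform fmt -check -recursive
    - terraform validate
  rules:
    - if: $ENFORCEMENT_PHASE == "strict"
      allow_failure: false   # blocks the merge
    - when: on_success
      allow_failure: true    # warning/advisory phases: report but do not block
```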
Change Detection and Optimization¶
Optimize pipeline execution by testing only what changed:
Path-Based Job Execution¶
# Run jobs only when relevant files change
terraform:vpc:
stage: test
image: hashicorp/terraform:latest
script:
- cd modules/vpc
- terraform init
- terraform validate
- terraform plan
rules:
- changes:
- modules/vpc/**/*
- modules/vpc/*.tf
when: always
- when: never # Don't run if no changes
ansible:webserver:
stage: test
image: python:3.11-slim
before_script:
- pip install molecule[docker]
script:
- cd roles/webserver
- molecule test
rules:
- changes:
- roles/webserver/**/*
- playbooks/webserver.yml
when: always
- when: never
test:backend:
stage: test
script:
- pytest tests/backend/
rules:
- changes:
- backend/**/*.py
- requirements.txt
when: always
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
when: always
- when: never
Monorepo Optimization¶
Efficient testing in monorepo structures:
# .gitlab-ci.yml for monorepo
include:
- local: '/services/api/.gitlab-ci.yml'
rules:
- changes:
- services/api/**/*
- local: '/services/web/.gitlab-ci.yml'
rules:
- changes:
- services/web/**/*
- local: '/infrastructure/.gitlab-ci.yml'
rules:
- changes:
- infrastructure/**/*
# Global lint jobs still run on any change
lint:global:
stage: validate
image: python:3.11-slim
script:
- pip install pre-commit
- pre-commit run --all-files
cache:
key: pre-commit-${CI_COMMIT_REF_SLUG}
paths:
- .cache/pre-commit
Dynamic Pipeline Generation¶
Generate pipelines based on detected changes:
generate:pipeline:
stage: .pre
image: python:3.11-slim
script:
- |
# Detect changed modules
git diff --name-only $CI_MERGE_REQUEST_DIFF_BASE_SHA $CI_COMMIT_SHA > changed_files.txt
# Generate dynamic pipeline
python scripts/generate_pipeline.py changed_files.txt > generated-pipeline.yml
artifacts:
paths:
- generated-pipeline.yml
expire_in: 1 day
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
trigger:dynamic:
stage: validate
needs: [generate:pipeline]
trigger:
include:
- artifact: generated-pipeline.yml
job: generate:pipeline
strategy: depend
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
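The generator script itself (`scripts/generate_pipeline.py`) is referenced but not shown. As a rough illustration of the idea, here is a minimal shell sketch that maps changed paths under `modules/` to child-pipeline jobs; the module layout and the `terraform validate` command are assumptions, and a real generator would likely do much more.

```shell
#!/bin/sh
# Sketch of a change-to-pipeline generator: reads a list of changed files
# and emits a child pipeline with one validate job per touched module.
set -eu

generate_pipeline() {
  changed="$1"

  # Unique first path components under modules/
  modules=$(sed -n 's|^modules/\([^/]*\)/.*|\1|p' "$changed" | sort -u)

  echo "stages:"
  echo "  - validate"

  if [ -z "$modules" ]; then
    # A child pipeline must contain at least one job
    printf 'noop:\n  stage: validate\n  script:\n    - echo "no module changes"\n'
    return 0
  fi

  for m in $modules; do
    printf 'validate:%s:\n  stage: validate\n  script:\n    - cd modules/%s && terraform validate\n' "$m" "$m"
  done
}

# Example: generate a child pipeline from a list of changed paths
printf 'modules/vpc/main.tf\nmodules/rds/variables.tf\nbackend/app.py\n' > changed_files.txt
generate_pipeline changed_files.txt
```

In CI this would replace the `python scripts/generate_pipeline.py` step, with its output saved as the `generated-pipeline.yml` artifact consumed by `trigger:dynamic`.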
Parallel Execution Strategies¶
Maximize pipeline efficiency with parallelization:
Matrix-Based Parallel Testing¶
# Test across multiple platforms in parallel
test:multi-platform:
stage: test
image: python:3.11-slim
parallel:
matrix:
- PLATFORM: [ubuntu-20.04, ubuntu-22.04, debian-11]
PYTHON_VERSION: ['3.9', '3.10', '3.11']
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
TEST_PLATFORM: $PLATFORM
TEST_PYTHON: $PYTHON_VERSION
script:
- echo "Testing on $PLATFORM with Python $PYTHON_VERSION"
- docker run --rm python:$PYTHON_VERSION-slim python --version
- pytest --platform=$PLATFORM
artifacts:
when: always
reports:
junit: junit-${PLATFORM}-${PYTHON_VERSION}.xml
expire_in: 7 days
# Parallel Terraform module testing
test:terraform:modules:
stage: test
image: golang:1.24
parallel:
matrix:
- MODULE: [vpc, rds, eks, s3]
script:
- cd modules/$MODULE/tests
- go test -v -timeout 20m
artifacts:
when: always
reports:
junit: $MODULE-results.xml
expire_in: 7 days
Sharded Test Execution¶
# Split large test suite across multiple runners
test:sharded:
stage: test
image: node:22-alpine
parallel: 8 # Split into 8 shards
before_script:
- npm ci
script:
- |
echo "Running shard $CI_NODE_INDEX of $CI_NODE_TOTAL"
npm run test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
artifacts:
when: always
paths:
- coverage/ # raw per-shard coverage, needed by the merge job below
reports:
junit: junit-shard-${CI_NODE_INDEX}.xml
coverage_report:
coverage_format: cobertura
path: coverage-shard-${CI_NODE_INDEX}.xml
expire_in: 7 days
# Merge coverage from all shards
coverage:merge:
stage: .post
image: node:22-alpine
needs:
- test:sharded
script:
- npm install -g nyc
- nyc merge coverage/ .nyc_output/coverage.json
- nyc report --reporter=cobertura --reporter=html --reporter=text
coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
artifacts:
paths:
- coverage/
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
expire_in: 30 days
Parallel DAG Execution¶
# Use DAG for parallel execution with dependencies
stages:
- validate
- build
- test
- deploy
# Fast parallel validation
lint:yaml:
stage: validate
script: yamllint .
lint:terraform:
stage: validate
script: terraform fmt -check
lint:ansible:
stage: validate
script: ansible-lint
# Build can start as soon as validation passes
build:api:
stage: build
needs: [lint:yaml] # Only needs yaml lint
script: docker build -t api .
build:web:
stage: build
needs: [lint:yaml]
script: docker build -t web .
# Tests run in parallel, each with specific dependencies
test:api:unit:
stage: test
needs: [build:api]
script: pytest api/tests/
test:api:integration:
stage: test
needs: [build:api]
script: pytest api/tests/integration/
test:web:unit:
stage: test
needs: [build:web]
script: npm test
# Deploy only after ALL tests pass
deploy:staging:
stage: deploy
needs:
- test:api:unit
- test:api:integration
- test:web:unit
script: kubectl apply -f k8s/staging/
Artifact Management¶
Optimize artifact storage and retention:
Tiered Retention Policy¶
# Tier 1: Short-term artifacts (7 days)
test:unit:
stage: test
script: npm test
artifacts:
when: always
reports:
junit: junit.xml
paths:
- test-results/
expire_in: 7 days # Short retention for frequent tests
# Tier 2: Medium-term artifacts (30 days)
test:coverage:
stage: test
script: npm run coverage
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days # Keep coverage reports longer
# Tier 3: Long-term artifacts (90 days)
compliance:audit:
stage: integration
script: inspec exec compliance/
artifacts:
paths:
- compliance-results.json
- audit-evidence/
expire_in: 90 days # Compliance evidence retention
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
# Tier 4: Release artifacts (1 year)
build:release:
stage: build
script: make build
artifacts:
paths:
- dist/
- CHANGELOG.md
expire_in: 365 days # Keep release artifacts
rules:
- if: $CI_COMMIT_TAG
Selective Artifact Storage¶
# Only store artifacts on failure for debugging
test:integration:
stage: test
script: pytest tests/integration/
artifacts:
when: on_failure # Only save when test fails
paths:
- logs/
- screenshots/
- test-results/
expire_in: 14 days
# Always store artifacts but with compression
build:optimized:
stage: build
script:
- make build
- tar -czf dist.tar.gz dist/
artifacts:
paths:
- dist.tar.gz # Compressed artifact
expire_in: 30 days
Artifact Dependencies¶
# Reuse artifacts across jobs
build:
stage: build
script: npm run build
artifacts:
paths:
- dist/
expire_in: 1 day
test:e2e:
stage: test
needs:
- job: build
artifacts: true # Download build artifacts
script:
- npm run test:e2e dist/
deploy:production:
stage: deploy
needs:
- job: build
artifacts: true
script:
- cp -r dist/* /var/www/html/
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
Merge Request Integration¶
Enhance MR visibility with pipeline integration:
Coverage Badges and Metrics¶
# Generate coverage badge
coverage:badge:
stage: .post
image: alpine:latest
needs: [test:coverage]
before_script:
- apk add --no-cache jq bc curl # jq/bc/curl are not in the base image
script:
- |
COVERAGE=$(cat coverage/coverage-summary.json | jq '.total.lines.pct')
COLOR="red"
if (( $(echo "$COVERAGE >= 80" | bc -l) )); then
COLOR="green"
elif (( $(echo "$COVERAGE >= 60" | bc -l) )); then
COLOR="yellow"
fi
# Generate badge
BADGE_URL="https://img.shields.io/badge/coverage-${COVERAGE}%25-${COLOR}"
echo "Coverage badge: $BADGE_URL"
# Post to MR (CI_JOB_TOKEN cannot create notes; use a project access token stored as a CI/CD variable, e.g. GITLAB_API_TOKEN)
COMMENT="## 📊 Test Coverage\n\n![coverage](${BADGE_URL})\n\nCurrent coverage: **${COVERAGE}%**"
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--data-urlencode "body=$(printf '%b' "$COMMENT")" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
Inline MR Comments¶
# Post test results as MR comments
test:results:comment:
stage: .post
image: alpine:latest
needs: [test:unit, test:integration]
before_script:
- apk add --no-cache curl jq
script:
- |
# Aggregate test results
UNIT_PASSED=$(cat test-results/unit.json | jq '.stats.passes')
UNIT_FAILED=$(cat test-results/unit.json | jq '.stats.failures')
INTEGRATION_PASSED=$(cat test-results/integration.json | jq '.stats.passes')
INTEGRATION_FAILED=$(cat test-results/integration.json | jq '.stats.failures')
# Create formatted comment in one assignment (alpine's sh lacks bash's +=)
COMMENT="## ✅ Test Results\n\n### Unit Tests\n- ✅ Passed: ${UNIT_PASSED}\n- ❌ Failed: ${UNIT_FAILED}\n\n### Integration Tests\n- ✅ Passed: ${INTEGRATION_PASSED}\n- ❌ Failed: ${INTEGRATION_FAILED}\n"
# Post comment (CI_JOB_TOKEN cannot create MR notes; use a project access token, e.g. GITLAB_API_TOKEN)
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--header "Content-Type: application/json" \
--data "{\"body\":\"${COMMENT}\"}" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
when: always
Dashboard Links¶
# Add links to external dashboards
dashboard:link:
stage: .post
image: alpine:latest
before_script:
- apk add --no-cache curl
script:
- |
GRAFANA_URL="https://grafana.example.com/d/pipeline?var-pipeline=${CI_PIPELINE_ID}"
SONAR_URL="https://sonar.example.com/dashboard?id=${CI_PROJECT_PATH}"
# Build comment in one assignment (alpine's sh lacks bash's +=)
COMMENT="## 📊 Quality Dashboards\n\n- [Pipeline Metrics](${GRAFANA_URL})\n- [Code Quality](${SONAR_URL})\n- [Test Report](${CI_PROJECT_URL}/-/pipelines/${CI_PIPELINE_ID}/test_report)"
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--data-urlencode "body=$(printf '%b' "$COMMENT")" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
Common Pitfalls¶
Cache Key Collisions Across Branches¶
Issue: Using ${CI_COMMIT_REF_SLUG} as cache key causes cache misses when switching branches even for identical dependencies.
Example:
## Bad - Branch-specific cache keys
cache:
key: ${CI_COMMIT_REF_SLUG} # Different key for each branch
paths:
- node_modules/
build:
script:
- npm ci # Reinstalls on every branch switch
- npm run build
Solution: Use lock file hash as cache key.
## Good - Content-based cache keys
cache:
key:
files:
- package-lock.json # ✅ Changes only when dependencies change
paths:
- node_modules/
- .npm/
build:
script:
- npm ci --cache .npm
- npm run build
Key Points:
- Use lock file hashes for dependency caches
- Include package manager cache directory (.npm, .yarn)
- Consider `${CI_COMMIT_REF_SLUG}-${checksum}` for branch isolation
- Use `policy: pull` in most jobs, `pull-push` only in one job
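The `pull`/`pull-push` split in the last key point can be sketched as follows; a minimal example assuming an npm project, where one designated job populates the cache and every other job only reads it:

```yaml
# Sketch: a single cache writer plus read-only consumers (job names illustrative)
install:
  stage: .pre
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull-push   # Only this job uploads the cache
  script:
    - npm ci

test:
  stage: test
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull        # Read-only: skips the upload step entirely
  script:
    - npm test
```

Because consumers never upload, a slow cache push happens at most once per dependency change instead of once per job.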
Missing Dependencies Specification¶
Issue: Job dependencies are left implicit. By default a job downloads all artifacts from every previous stage, which is slow, and those artifacts silently disappear once needs is introduced elsewhere in the pipeline.
Example:
## Bad - Implicit dependencies
build:
stage: build
script:
- npm run build
artifacts:
paths:
- dist/
deploy:
stage: deploy
script:
- ls dist/ # ❌ Relies on implicit download of all previous-stage artifacts
- ./deploy.sh
Solution: Explicitly declare job dependencies.
## Good - Explicit dependencies
build:
stage: build
script:
- npm run build
artifacts:
paths:
- dist/
deploy:
stage: deploy
dependencies:
- build # ✅ Download artifacts from build job
script:
- ls dist/ # Now available
- ./deploy.sh
Key Points:
- Use `dependencies: [job1, job2]` to download specific artifacts
- Use `dependencies: []` to download no artifacts
- `needs` creates both a dependency and downloads artifacts
- Missing `dependencies` downloads all artifacts from previous stages
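As a minimal sketch of the `dependencies: []` key point (the job name and script are illustrative), a job that needs no build outputs can opt out of artifact downloads entirely, which speeds up job startup:

```yaml
# Sketch: a lint job needs no artifacts from earlier stages
lint:
  stage: test
  dependencies: []   # Download nothing
  script:
    - npm run lint
```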
Services Hostname Confusion¶
Issue: Trying to connect to services using localhost instead of service name causes connection failures.
Example:
## Bad - Using localhost for services
test:
services:
- postgres:15
variables:
DATABASE_URL: postgresql://user:pass@localhost:5432/db # ❌ Wrong!
script:
- npm test # Cannot connect to database
Solution: Use service name as hostname.
## Good - Use service name as hostname
test:
services:
- name: postgres:15
alias: database # Optional custom alias
variables:
DATABASE_URL: postgresql://user:pass@database:5432/db # ✅ Service alias
script:
- npm test
Key Points:
- Services are accessible by image name (postgres, redis, mongo)
- Use `alias` to customize the service hostname
- The service port is the container's internal port (not mapped)
- Wait for service readiness before running tests
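The readiness key point can be sketched with a simple port poll; this assumes the job image includes `nc` (BusyBox/alpine-based images do), and the alias and credentials are illustrative:

```yaml
# Sketch: block until the service accepts TCP connections before testing
test:
  services:
    - name: postgres:15
      alias: database
  variables:
    POSTGRES_USER: user
    POSTGRES_PASSWORD: pass
  before_script:
    - until nc -z database 5432; do echo "Waiting for database..."; sleep 1; done
  script:
    - npm test
```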
Rule Precedence Gotchas¶
Issue: Multiple rules entries create unexpected behavior due to first-match-wins semantics.
Example:
## Bad - First rule always matches
deploy:
rules:
- if: $CI_COMMIT_BRANCH # ❌ Matches any branch!
when: always
- if: $CI_COMMIT_BRANCH == "main"
when: manual
script:
- ./deploy.sh # Runs automatically on all branches, not just main
Solution: Order rules from most specific to least specific.
## Good - Specific rules first
deploy:
rules:
- if: $CI_COMMIT_BRANCH == "main"
when: manual # ✅ Manual deploy on main
- if: $CI_COMMIT_BRANCH == "develop"
when: always # Auto deploy on develop
- when: never # Don't run on other branches
script:
- ./deploy.sh
Key Points:
- Rules are evaluated top-to-bottom, first match wins
- Always end with a default rule (`when: never` or `when: on_success`)
- Use `&&` for multiple conditions in one rule
- Test rule logic locally (e.g. with `gitlab-ci-local --preview`)
Variable Expansion in Non-String Contexts¶
Issue: Variables not expanding in certain YAML contexts like only, except, or numeric values.
Example:
## Bad - Variables don't expand in only/except
deploy:
only:
- $CI_DEFAULT_BRANCH # ❌ Treated as literal string!
script:
- ./deploy.sh
## Bad - Variables in numeric contexts
test:
parallel: $PARALLEL_COUNT # ❌ Not expanded
script:
- npm test
Solution: Use rules for conditional execution, and expand variables with shell syntax inside script.
## Good - Use rules for branch matching
deploy:
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # ✅ Expands correctly
script:
- ./deploy.sh
## Good - Expand variables in script
test:
script:
- export COUNT=${PARALLEL_COUNT:-4}
- echo "Running $COUNT parallel tests"
- npm test -- --shard=$CI_NODE_INDEX/$COUNT
Key Points:
- Use `rules` instead of `only`/`except` for variable-based conditions
- Variables expand in scripts, not in YAML structure
- Use `${VAR:-default}` for default values
- Numeric YAML keys don't support variable expansion
Anti-Patterns¶
❌ Avoid: No Cache¶
## Bad - Reinstalling dependencies every time
test:
script:
- npm install
- npm test
## Good - Using cache
test:
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
script:
- npm ci
- npm test
❌ Avoid: Hardcoded Secrets¶
## Bad - Hardcoded credentials
deploy:
script:
- ssh user@server "password123"
## Good - Use CI/CD variables
deploy:
script:
- ssh $DEPLOY_USER@$DEPLOY_SERVER
❌ Avoid: No Artifact Expiration¶
## Bad - Artifacts kept forever
build:
artifacts:
paths:
- dist/
## Good - Set expiration
build:
artifacts:
paths:
- dist/
expire_in: 1 week
❌ Avoid: Using Deprecated only/except Instead of rules¶
## Bad - Using deprecated only/except
deploy:
only:
- main
except:
- schedules
script:
- ./deploy.sh
## Good - Use rules
deploy:
rules:
- if: $CI_COMMIT_BRANCH == "main" && $CI_PIPELINE_SOURCE != "schedule"
script:
- ./deploy.sh
❌ Avoid: Running All Jobs on All Branches¶
## Bad - Expensive jobs run on every branch
build-docker:
script:
- docker build -t myapp .
- docker push myapp # ❌ Pushes on every branch!
deploy-prod:
script:
- ./deploy-production.sh # ❌ Deploys from any branch!
## Good - Restrict jobs to appropriate branches
build-docker:
rules:
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_TAG
script:
- docker build -t myapp:$CI_COMMIT_SHORT_SHA .
- docker push myapp:$CI_COMMIT_SHORT_SHA
deploy-prod:
rules:
- if: $CI_COMMIT_TAG =~ /^v\d+\.\d+\.\d+$/
script:
- ./deploy-production.sh
environment:
name: production
❌ Avoid: Not Using extends for Shared Configuration¶
## Bad - Duplicated configuration
test-unit:
image: node:22-alpine
before_script:
- npm ci
script:
- npm run test:unit
test-integration:
image: node:22-alpine
before_script:
- npm ci
script:
- npm run test:integration
## Good - Use extends
.node-base:
image: node:22-alpine
before_script:
- npm ci
test-unit:
extends: .node-base
script:
- npm run test:unit
test-integration:
extends: .node-base
script:
- npm run test:integration
❌ Avoid: Not Using Retry for Flaky Jobs¶
## Bad - Flaky job fails pipeline
integration-tests:
script:
- npm run test:integration # ❌ No retry on failure
## Good - Retry flaky jobs
integration-tests:
script:
- npm run test:integration
retry:
max: 2
when:
- script_failure # Retry flaky test failures too
- runner_system_failure
- stuck_or_timeout_failure
Security Best Practices¶
Secrets Management¶
Protect sensitive data in CI/CD pipelines:
## Bad - Hardcoded secrets in pipeline
deploy:
script:
- echo "API_KEY=sk-1234567890abcdef" >> .env
- aws configure set aws_access_key_id AKIAIOSFODNN7EXAMPLE # ❌ Exposed!
## Good - Use protected CI/CD variables
deploy:
script:
- echo "API_KEY=$API_KEY" >> .env # API_KEY from protected variable
- aws configure set aws_access_key_id "$AWS_ACCESS_KEY_ID" # ✅ From variables
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # Protected branch only
## Good - Use masked variables
variables:
DATABASE_URL: ${DB_URL} # DB_URL defined as a masked variable in the GitLab UI
## Good - Use file-type variables for certificates
deploy:
before_script:
- mkdir -p ~/.ssh && chmod 700 ~/.ssh
- echo "$SSH_PRIVATE_KEY" > ~/.ssh/id_rsa
- chmod 600 ~/.ssh/id_rsa
Key Points:
- Store secrets in GitLab CI/CD Variables (Settings > CI/CD > Variables)
- Enable "Masked" to hide values in job logs
- Enable "Protected" to restrict to protected branches only
- Use "File" type for certificates and large secrets
- Never commit secrets to `.gitlab-ci.yml`
- Rotate secrets regularly
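For the "File" type key point: GitLab writes the variable's value to a temporary file and sets the variable to that file's path, so jobs can pass it directly to tools that expect a file. A minimal sketch, where `KUBECONFIG_FILE` is a hypothetical file-type variable defined under Settings > CI/CD > Variables:

```yaml
# Sketch: $KUBECONFIG_FILE expands to a path, not the secret's contents
deploy:
  script:
    - kubectl --kubeconfig "$KUBECONFIG_FILE" apply -f k8s/
```

This avoids echoing secrets into files by hand and keeps the value out of job logs.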
Protected Branches and Runners¶
Restrict pipeline execution to authorized users and branches:
## Good - Protected branch deployment
deploy_production:
stage: deploy
script:
- ./deploy-prod.sh
environment:
name: production
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # Protected branch
when: manual # Require manual approval
## Good - Use protected runners for sensitive jobs
deploy_production:
stage: deploy
tags:
- protected-runner # Runner tagged as protected in GitLab
script:
- ./deploy-prod.sh
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
Key Points:
- Configure protected branches (Settings > Repository > Protected branches)
- Restrict who can merge to protected branches
- Use protected runners for production deployments
- Require manual approval for critical deployments
- Implement approval rules for merge requests
Docker Image Security¶
Use secure, trusted container images:
## Bad - Using latest tag
test:
image: node:latest # ❌ Unpredictable, potential security issues
script:
- npm test
## Good - Pin specific versions
test:
image: node:20.10.0-alpine # ✅ Specific, minimal image
script:
- npm test
## Good - Use internal registry with scanned images
test:
image: registry.gitlab.com/myorg/secure-node:20.10.0-alpine
script:
- npm test
## Good - Scan images for vulnerabilities
build_image:
stage: build
image: docker:latest
services:
- docker:dind
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
- docker scan $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA # Deprecated; prefer Trivy or GitLab Container Scanning
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
Key Points:
- Always pin specific image versions (avoid `latest`)
- Use minimal base images (alpine, distroless)
- Scan images for vulnerabilities (Trivy, Clair, Snyk)
- Use trusted registries only
- Regularly update base images
- Verify image signatures
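The signature-verification point can be sketched with Sigstore Cosign; the image, job name, and `COSIGN_PUBLIC_KEY` (a hypothetical file-type CI/CD variable holding the public key) are assumptions, not part of GitLab itself:

```yaml
# Sketch: verify an image signature before it is promoted or deployed
verify_image:
  stage: test
  image:
    name: bitnami/cosign:latest
    entrypoint: [""]
  script:
    - cosign verify --key "$COSIGN_PUBLIC_KEY" "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```

The job fails if the image was not signed with the matching private key, blocking unsigned images from later stages.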
Code Injection Prevention¶
Prevent command injection in pipeline scripts:
## Bad - Unvalidated user input
deploy:
script:
- ssh user@$DEPLOY_SERVER "$CI_COMMIT_MESSAGE" # ❌ Injection risk!
- eval $USER_COMMAND # ❌ Never use eval!
## Good - Validate and sanitize inputs
deploy:
script:
- |
if [[ ! "$DEPLOY_ENV" =~ ^(dev|staging|prod)$ ]]; then
echo "Invalid environment"
exit 1
fi
- ./deploy.sh "$DEPLOY_ENV" # Quoted, validated variable
## Good - Use predefined commands
deploy:
variables:
ALLOWED_COMMANDS: "deploy.sh status.sh rollback.sh"
script:
- |
if [[ " $ALLOWED_COMMANDS " =~ " $COMMAND " ]]; then
./"$COMMAND"
else
echo "Unauthorized command"
exit 1
fi
Key Points:
- Never use `eval` with user-controlled input
- Validate all variables before use
- Use allow-lists for dynamic values
- Quote all variables in scripts
- Sanitize commit messages and user inputs
- Use parameterized commands
Dependency Security¶
Secure third-party dependencies:
## Good - Pin dependency versions
build:
image: node:20.10.0-alpine
script:
- npm ci # Use package-lock.json (deterministic installs)
- npm audit # Check for vulnerabilities
## Good - Verify checksums
build:
script:
- wget https://example.com/tool.tar.gz
- echo "$EXPECTED_CHECKSUM tool.tar.gz" | sha256sum -c # Verify checksum
- tar -xzf tool.tar.gz
## Good - Use dependency scanning
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
dependency_scan:
stage: test
allow_failure: false # Fail on vulnerabilities
Key Points:
- Pin all dependency versions (`package-lock.json`, `Gemfile.lock`, etc.)
- Use `npm ci` instead of `npm install`
- Run dependency audits (`npm audit`, `bundle audit`, etc.)
- Verify package checksums
- Use GitLab Dependency Scanning
- Monitor for supply chain attacks
Access Control and Least Privilege¶
Implement least privilege for pipeline execution:
## Good - Use service accounts with minimal permissions
deploy_aws:
script:
- aws s3 sync ./dist s3://my-bucket --delete
variables:
AWS_ACCESS_KEY_ID: $AWS_DEPLOY_KEY_ID # Service account with S3-only access
AWS_SECRET_ACCESS_KEY: $AWS_DEPLOY_SECRET
## Good - Restrict runner access
deploy_production:
tags:
- production-runner # Dedicated runner with limited network access
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
script:
- ./deploy.sh
Key Points:
- Use service accounts with minimum required permissions
- Separate runners by environment (dev, staging, prod)
- Restrict runner network access
- Use RBAC for pipeline access control
- Audit who can trigger pipelines
- Limit access to protected variables
Artifact Security¶
Secure build artifacts:
## Good - Set appropriate artifact expiration
build:
script:
- npm run build
artifacts:
paths:
- dist/
expire_in: 1 week # Auto-cleanup
reports:
coverage: coverage/cobertura-coverage.xml
## Good - Protect sensitive artifacts
deploy:
dependencies:
- build
script:
- |
# Encrypt sensitive artifacts before storage
tar -czf dist.tar.gz dist/
openssl enc -aes-256-cbc -salt -in dist.tar.gz -out dist.tar.gz.enc -k "$ENCRYPTION_KEY"
- ./deploy.sh dist.tar.gz.enc
Key Points:
- Set appropriate artifact expiration times
- Don't store secrets in artifacts
- Encrypt sensitive artifacts
- Use access controls for artifact download
- Validate artifact integrity (checksums)
- Clean up old artifacts regularly
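The integrity key point can be sketched with `sha256sum`; paths and file names are illustrative. A producer job records checksums next to the artifact, and the consumer verifies them before deploying:

```shell
# Producer job: record checksums alongside the build output
mkdir -p dist
echo "example artifact" > dist/app.bin        # Stand-in for a real build output
(cd dist && sha256sum app.bin > SHA256SUMS)

# Consumer job: verify integrity before using the artifact
(cd dist && sha256sum -c SHA256SUMS)          # Exits non-zero on any mismatch
```

Ship `SHA256SUMS` inside `artifacts:paths` so the verification travels with the files it protects.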
Audit Logging and Monitoring¶
Monitor pipeline activity:
## Good - Log security events
deploy:
before_script:
- echo "Deployment initiated by $GITLAB_USER_LOGIN at $(date)"
- echo "Target environment: $CI_ENVIRONMENT_NAME"
script:
- ./deploy.sh
after_script:
- |
if [ "$CI_JOB_STATUS" = "success" ]; then
./send-audit-log.sh "Deployment successful"
else
./send-alert.sh "Deployment failed - investigation required"
fi
Key Points:
- Enable audit logging for all environments
- Monitor failed pipeline runs
- Track who triggered deployments
- Alert on security policy violations
- Review access logs regularly
- Maintain pipeline execution history
Network Security¶
Secure pipeline network access:
## Good - Use VPN or private networks for sensitive operations
deploy_database:
before_script:
- openvpn --daemon --config production-vpn.conf # Connect to private network (runs in background)
script:
- psql -h $DB_HOST -U $DB_USER -d $DB_NAME < migration.sql
after_script:
- killall openvpn # Disconnect VPN
## Good - Restrict outbound connections
test:
script:
- npm test
variables:
HTTP_PROXY: "http://proxy.internal:8080" # Route through approved proxy
HTTPS_PROXY: "http://proxy.internal:8080"
Key Points:
- Use VPNs or private networks for database access
- Restrict outbound internet access from runners
- Use approved proxies for external connections
- Implement network segmentation
- Monitor network traffic from runners
- Use firewall rules to limit access
Tool Configuration¶
gitlab-ci-local - Local Pipeline Testing¶
Install and configure gitlab-ci-local for testing pipelines locally:
## Install gitlab-ci-local (npm)
npm install -g gitlab-ci-local
## Install gitlab-ci-local (brew)
brew install gitlab-ci-local
## Run entire pipeline
gitlab-ci-local
## Run specific job
gitlab-ci-local build
## List all jobs
gitlab-ci-local --list
## Run with specific file
gitlab-ci-local --file .gitlab-ci.custom.yml
## Dry run
gitlab-ci-local --preview
## Use specific variables
gitlab-ci-local --variable CI_COMMIT_REF_NAME=main
.gitlab-ci-local-variables.yml¶
## .gitlab-ci-local-variables.yml
## Local development variables
CI_PROJECT_NAME: my-project
CI_COMMIT_BRANCH: main
CI_COMMIT_REF_NAME: main
DOCKER_REGISTRY: localhost:5000
DEPLOY_ENV: development
gitlab-ci-lint - Pipeline Validation¶
## Validate .gitlab-ci.yml syntax (requires GitLab instance)
gitlab-ci-lint .gitlab-ci.yml
## Using GitLab API
curl --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
"https://gitlab.com/api/v4/projects/${PROJECT_ID}/ci/lint" \
--form "content@.gitlab-ci.yml"
## Using glab CLI
glab ci lint
VS Code Settings¶
{
"files.associations": {
".gitlab-ci*.yml": "yaml"
},
"[yaml]": {
"editor.defaultFormatter": "redhat.vscode-yaml",
"editor.formatOnSave": true
},
"yaml.schemas": {
"https://gitlab.com/gitlab-org/gitlab/-/raw/master/app/assets/javascripts/editor/schema/ci.json": [
".gitlab-ci.yml",
".gitlab-ci.*.yml"
]
},
"yaml.customTags": [
"!reference sequence"
]
}
Pre-commit Hooks¶
## .pre-commit-config.yaml
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
files: \.gitlab-ci.*\.ya?ml$
- id: check-added-large-files
- repo: https://github.com/adrienverge/yamllint
rev: v1.35.1
hooks:
- id: yamllint
files: \.gitlab-ci.*\.ya?ml$
args: ['-d', '{extends: default, rules: {line-length: {max: 120}}}']
# Optional: gitlab-ci-local validation
- repo: local
hooks:
- id: gitlab-ci-local-lint
name: GitLab CI Local Lint
entry: gitlab-ci-local --preview
language: system
files: \.gitlab-ci\.yml$
pass_filenames: false
yamllint Configuration¶
## .yamllint
extends: default
rules:
line-length:
max: 120
level: warning
indentation:
spaces: 2
indent-sequences: true
comments:
min-spaces-from-content: 1
document-start: disable
truthy:
allowed-values: ['true', 'false']
key-duplicates: enable
EditorConfig¶
## .editorconfig
[.gitlab-ci*.{yml,yaml}]
indent_style = space
indent_size = 2
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
Makefile¶
## Makefile
.PHONY: ci-local ci-list ci-validate
ci-local:
gitlab-ci-local
ci-list:
gitlab-ci-local --list
ci-validate:
gitlab-ci-local --preview
yamllint .gitlab-ci.yml
@echo "✓ GitLab CI configuration is valid"
ci-job:
gitlab-ci-local $(JOB)
## Example: make ci-job JOB=build
ci-debug:
gitlab-ci-local --shell-isolation=false $(JOB)
.gitlab-ci-include-local.yml¶
Template for reusable CI configurations:
## .gitlab-ci/templates/docker.yml
.docker_build:
image: docker:24
services:
- docker:24-dind
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
variables:
DOCKER_DRIVER: overlay2
DOCKER_TLS_CERTDIR: "/certs"
.docker_push:
extends: .docker_build
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA .
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
GitLab CI Workflow Validation Job¶
Add to your .gitlab-ci.yml:
validate:ci:
stage: .pre
image: python:3.11-slim
before_script:
- pip install yamllint
script:
- yamllint .gitlab-ci.yml
- echo "✓ Pipeline configuration is valid"
rules:
- changes:
- .gitlab-ci.yml
- .gitlab-ci/**/*
glab CLI Configuration¶
## ~/.config/glab-cli/config.yml
hosts:
gitlab.com:
user: your-username
token: glpat-xxxxxxxxxxxxx
git_protocol: ssh
api_protocol: https
pager:
ci: false
mr: less
editor: vim
browser: firefox
Docker Compose for Local GitLab Runner¶
## docker-compose.gitlab-runner.yml
version: '3.8'
services:
gitlab-runner:
image: gitlab/gitlab-runner:latest
container_name: gitlab-runner-local
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- ./gitlab-runner-config:/etc/gitlab-runner
restart: unless-stopped
Status: Active