GitLab CI/CD
Overview¶
GitLab CI/CD is a continuous integration and deployment tool built into GitLab. It uses .gitlab-ci.yml files
to define pipelines that automatically build, test, and deploy code. This guide covers GitLab CI/CD best practices
for creating maintainable, efficient pipelines.
Key Characteristics¶
- File Name: .gitlab-ci.yml
- Format: YAML
- Primary Use: CI/CD pipelines, automated testing, deployment automation
- Key Concepts: Pipelines, stages, jobs, runners, artifacts, cache
Quick Reference¶
| Category | Convention | Example | Notes |
|---|---|---|---|
| File Naming | | | |
| Pipeline Config | `.gitlab-ci.yml` | `.gitlab-ci.yml` | At repository root |
| Pipeline Structure | | | |
| `stages` | Pipeline stages | `stages: [build, test, deploy]` | Ordered execution phases |
| `image` | Docker image | `image: node:20-alpine` | Default container image |
| `before_script` | Pre-job commands | Setup commands | Runs before each job |
| `after_script` | Post-job commands | Cleanup commands | Runs after each job |
| Job Definition | | | |
| Job Name | `job_name:` | `build_app:` | Descriptive job name |
| `stage` | Job stage | `stage: build` | Which stage job belongs to |
| `script` | Commands to run | `script: - npm install` | Required commands |
| `only` / `except` | Branch filters | `only: [main]` | When job runs (legacy) |
| `rules` | Conditional logic | `rules: - if: $CI_COMMIT_BRANCH` | Modern conditional execution |
| Artifacts | | | |
| `artifacts` | Save files | `paths: [dist/]` | Persist build outputs |
| `expire_in` | Artifact retention | `expire_in: 1 week` | Auto-cleanup |
| `reports` | Test reports | `reports: junit: report.xml` | Test result integration |
| Cache | | | |
| `cache` | Cache dependencies | `paths: [node_modules/]` | Speed up builds |
| `key` | Cache key | `key: $CI_COMMIT_REF_SLUG` | Cache versioning |
| Variables | | | |
| Predefined | `$CI_COMMIT_SHA` | GitLab-provided variables | Built-in vars |
| Custom | `variables:` | `NODE_ENV: production` | User-defined vars |
| Protected | Masked variables | Secure secrets | Settings > CI/CD |
| Best Practices | | | |
| Stages | Logical grouping | `[build, test, deploy]` | Clear pipeline flow |
| Docker Images | Pin versions | `node:20.10.0-alpine` | Avoid `latest` |
| Rules | Use `rules:` | Replace `only`/`except` | Modern syntax |
| Cache | Speed up builds | Cache dependencies | Reduce build time |
Basic Pipeline Structure¶
Simple Pipeline¶
stages:
- build
- test
- deploy
variables:
NODE_VERSION: "22"
build_job:
stage: build
image: node:${NODE_VERSION}-alpine
script:
- npm ci
- npm run build
artifacts:
paths:
- dist/
expire_in: 1 day
test_job:
stage: test
image: node:${NODE_VERSION}-alpine
script:
- npm ci
- npm test
coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
deploy_job:
stage: deploy
image: alpine:latest
script:
- echo "Deploying to production"
- ./deploy.sh
only:
- main
Stages¶
Define Stages¶
## Stages execute in order
stages:
- build
- test
- package
- deploy
- cleanup
## Jobs in same stage run in parallel
## Jobs in next stage wait for previous stage to complete
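To make that concrete, here is a minimal sketch (with hypothetical job names): the two jobs sharing the `test` stage start in parallel, and the `deploy` stage waits for both to succeed.

```yaml
stages:
  - test
  - deploy

# unit_tests and lint share the test stage, so they run in parallel
unit_tests:
  stage: test
  script:
    - npm run test:unit

lint:
  stage: test
  script:
    - npm run lint

# deploy starts only after every job in the test stage has succeeded
deploy_app:
  stage: deploy
  script:
    - ./deploy.sh
```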
Jobs¶
Job Configuration¶
job_name:
stage: build
image: node:22-alpine
tags:
- docker
before_script:
- echo "Preparing environment"
script:
- npm ci
- npm run build
after_script:
- echo "Cleaning up"
only:
- main
- develop
except:
- tags
when: on_success
allow_failure: false
timeout: 1h
retry:
max: 2
when:
- runner_system_failure
- stuck_or_timeout_failure
Variables¶
Global Variables¶
variables:
POSTGRES_DB: "testdb"
POSTGRES_USER: "testuser"
POSTGRES_PASSWORD: "testpass"
NODE_ENV: "production"
DOCKER_DRIVER: overlay2
GIT_STRATEGY: clone
GIT_DEPTH: "50"
Job Variables¶
deploy_staging:
stage: deploy
variables:
DEPLOY_ENV: "staging"
API_URL: "https://staging.example.com"
script:
- echo "Deploying to $DEPLOY_ENV"
- ./deploy.sh
Protected Variables¶
Set in GitLab UI under Settings > CI/CD > Variables:
deploy_production:
stage: deploy
script:
- echo "API Key: $API_KEY" # From protected variable
- ./deploy.sh
only:
- main
Artifacts¶
Basic Artifacts¶
build:
stage: build
script:
- npm run build
artifacts:
paths:
- dist/
- build/
expire_in: 1 week
Artifact with Reports¶
test:
stage: test
script:
- npm test
artifacts:
when: always
reports:
junit: junit.xml
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days
Download Artifacts from Another Job¶
deploy:
stage: deploy
dependencies:
- build
script:
- ls dist/ # Artifacts from build job
- ./deploy.sh
Cache¶
NPM Cache¶
.node_cache: &node_cache
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .npm/
test:
<<: *node_cache
stage: test
script:
- npm ci --cache .npm
- npm test
Global Cache¶
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .npm/
policy: pull-push
build:
stage: build
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
policy: pull
script:
- npm ci
- npm run build
Docker-in-Docker (DinD)¶
Building Docker Images¶
build_image:
stage: build
image: docker:latest
services:
- docker:dind
variables:
DOCKER_TLS_CERTDIR: "/certs"
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
- docker tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA $CI_REGISTRY_IMAGE:latest
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
- docker push $CI_REGISTRY_IMAGE:latest
only:
- main
Conditional Execution¶
Only/Except¶
## Run only on specific branches
deploy_production:
stage: deploy
script:
- ./deploy-prod.sh
only:
- main
## Run except on tags
test:
stage: test
script:
- npm test
except:
- tags
## Run only on merge requests
mr_check:
stage: test
script:
- npm run lint
only:
- merge_requests
Rules (Preferred)¶
deploy:
stage: deploy
script:
- ./deploy.sh
rules:
- if: $CI_COMMIT_BRANCH == "main"
when: always
- if: $CI_COMMIT_BRANCH == "develop"
when: manual
- when: never
test:
stage: test
script:
- npm test
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_BRANCH == "develop"
Templates and Includes¶
Include External Files¶
include:
- local: '.gitlab/ci/build.yml'
- local: '.gitlab/ci/test.yml'
- local: '.gitlab/ci/deploy.yml'
- template: Security/SAST.gitlab-ci.yml
- remote: 'https://example.com/ci-templates/docker.yml'
Anchor and Aliases (YAML)¶
.job_template: &job_definition
image: node:22-alpine
before_script:
- npm ci
retry:
max: 2
test:
<<: *job_definition
stage: test
script:
- npm test
build:
<<: *job_definition
stage: build
script:
- npm run build
Extends¶
.base_job:
image: node:22-alpine
before_script:
- npm ci
retry:
max: 2
test:
extends: .base_job
stage: test
script:
- npm test
build:
extends: .base_job
stage: build
script:
- npm run build
Parallel Jobs¶
Matrix Jobs¶
test:
stage: test
parallel:
matrix:
- NODE_VERSION: ["18", "20", "22"]
OS: ["alpine", "bullseye"]
## The resulting tag must exist for every combination (e.g. node:22-alpine)
image: node:${NODE_VERSION}-${OS}
script:
- npm ci
- npm test
Simple Parallel¶
test:
stage: test
parallel: 3
script:
- npm test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
Services¶
PostgreSQL Service¶
test:
stage: test
image: node:22-alpine
services:
- postgres:15-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb"
script:
- npm ci
- npm run test:db
Multiple Services¶
integration_test:
stage: test
services:
- postgres:15-alpine
- redis:7-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_PASSWORD: testpass
REDIS_URL: redis://redis:6379
script:
- npm run test:integration
Multi-Project Pipelines¶
Trigger Downstream Pipeline¶
trigger_deploy:
stage: deploy
trigger:
project: mygroup/deployment-project
branch: main
strategy: depend
only:
- main
Parent-Child Pipelines¶
generate_child:
stage: build
script:
- echo "Generating child pipeline config"
- ./generate-pipeline.sh > child-pipeline.yml
artifacts:
paths:
- child-pipeline.yml
trigger_child:
stage: deploy
trigger:
include:
- artifact: child-pipeline.yml
job: generate_child
Complete Pipeline Example¶
Full-Stack Application Pipeline¶
workflow:
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_BRANCH == "develop"
stages:
- build
- test
- security
- package
- deploy
variables:
DOCKER_DRIVER: overlay2
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
## Reusable templates
.node_base:
image: node:22-alpine
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .npm/
before_script:
- npm ci --cache .npm
## Build stage
build_frontend:
extends: .node_base
stage: build
script:
- cd frontend
- npm run build
artifacts:
paths:
- frontend/dist/
expire_in: 1 day
build_backend:
extends: .node_base
stage: build
script:
- cd backend
- npm run build
artifacts:
paths:
- backend/dist/
expire_in: 1 day
## Test stage
test_frontend:
extends: .node_base
stage: test
script:
- cd frontend
- npm test -- --coverage
coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: frontend/coverage/cobertura-coverage.xml
paths:
- frontend/coverage/
expire_in: 30 days
test_backend:
extends: .node_base
stage: test
services:
- postgres:15-alpine
variables:
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb"
script:
- cd backend
- npm test -- --coverage
coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: backend/coverage/cobertura-coverage.xml
lint:
extends: .node_base
stage: test
script:
- npm run lint
## Security stage
sast:
stage: security
allow_failure: true
dependency_scanning:
stage: security
allow_failure: true
## Package stage
build_docker_images:
stage: package
image: docker:latest
services:
- docker:dind
variables:
DOCKER_TLS_CERTDIR: "/certs"
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker build -t $CI_REGISTRY_IMAGE/frontend:$CI_COMMIT_SHA frontend/
- docker build -t $CI_REGISTRY_IMAGE/backend:$CI_COMMIT_SHA backend/
- docker push $CI_REGISTRY_IMAGE/frontend:$CI_COMMIT_SHA
- docker push $CI_REGISTRY_IMAGE/backend:$CI_COMMIT_SHA
only:
- main
- develop
## Deploy stage
deploy_staging:
stage: deploy
image: alpine:latest
before_script:
- apk add --no-cache curl
script:
- echo "Deploying to staging"
- ./deploy-staging.sh
environment:
name: staging
url: https://staging.example.com
rules:
- if: $CI_COMMIT_BRANCH == "develop"
deploy_production:
stage: deploy
image: alpine:latest
before_script:
- apk add --no-cache curl
script:
- echo "Deploying to production"
- ./deploy-production.sh
environment:
name: production
url: https://example.com
when: manual
only:
- main
include:
- template: Security/SAST.gitlab-ci.yml
- template: Security/Dependency-Scanning.gitlab-ci.yml
Testing¶
Testing Pipelines Locally¶
Use GitLab Runner to test pipelines locally before committing:
## Install GitLab Runner
# macOS
brew install gitlab-runner
# Linux
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
sudo apt-get install gitlab-runner
## Test pipeline locally (note: `gitlab-runner exec` is deprecated in recent
## GitLab Runner releases, but remains useful where still available)
gitlab-runner exec docker test
## Test specific job
gitlab-runner exec docker build
## Test with specific Docker image
gitlab-runner exec docker --docker-image node:22-alpine test
Validating CI Configuration¶
Validate .gitlab-ci.yml syntax:
## Using the GitLab CI Lint API (expects JSON with the YAML as a string value)
jq --null-input --arg yaml "$(cat .gitlab-ci.yml)" '{content: $yaml}' \
| curl --header "PRIVATE-TOKEN: <your_access_token>" \
--header "Content-Type: application/json" \
--data @- \
"https://gitlab.com/api/v4/projects/<project_id>/ci/lint"
## Using the official GitLab CLI
glab ci lint
Pipeline Testing Job¶
Add pipeline validation as a job:
## .gitlab-ci.yml
stages:
- validate
- test
- build
- deploy
validate:pipeline:
stage: validate
image: alpine:latest
before_script:
- apk add --no-cache yamllint
script:
- yamllint .gitlab-ci.yml
- echo "Pipeline configuration is valid"
only:
changes:
- .gitlab-ci.yml
validate:dockerfile:
stage: validate
image: hadolint/hadolint:latest-alpine
script:
- hadolint Dockerfile
only:
changes:
- Dockerfile
Unit Testing in CI¶
test:unit:
stage: test
image: node:22-alpine
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
before_script:
- npm ci
script:
- npm run test:unit
coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
artifacts:
when: always
reports:
junit: junit.xml
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days
only:
- merge_requests
- main
Integration Testing¶
test:integration:
stage: test
image: node:22-alpine
services:
- name: postgres:15-alpine
alias: postgres
variables:
POSTGRES_DB: test_db
POSTGRES_USER: test_user
POSTGRES_PASSWORD: test_pass
DATABASE_URL: postgresql://test_user:test_pass@postgres:5432/test_db
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
before_script:
- npm ci
script:
- npm run test:integration
artifacts:
when: always
reports:
junit: integration-test-results.xml
expire_in: 7 days
End-to-End Testing¶
test:e2e:
stage: test
image: mcr.microsoft.com/playwright:latest
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- playwright/.cache
before_script:
- npm ci
script:
- npm run test:e2e
artifacts:
when: always
paths:
- test-results/
- playwright-report/
expire_in: 7 days
only:
- merge_requests
- main
Security Testing¶
## SAST (Static Application Security Testing)
include:
- template: Security/SAST.gitlab-ci.yml
- template: Security/Dependency-Scanning.gitlab-ci.yml
- template: Security/Secret-Detection.gitlab-ci.yml
## Container Scanning
container_scanning:
stage: test
image: docker:latest
services:
- docker:dind
variables:
DOCKER_DRIVER: overlay2
CI_APPLICATION_REPOSITORY: $CI_REGISTRY_IMAGE
CI_APPLICATION_TAG: $CI_COMMIT_SHA
script:
- docker build -t $CI_APPLICATION_REPOSITORY:$CI_APPLICATION_TAG .
- |
docker run --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy:latest \
image --exit-code 1 --severity HIGH,CRITICAL \
$CI_APPLICATION_REPOSITORY:$CI_APPLICATION_TAG
only:
- merge_requests
- main
Performance Testing¶
test:performance:
stage: test
image: grafana/k6:latest
script:
- k6 run --vus 10 --duration 30s --summary-export=k6-results.json tests/load-test.js
artifacts:
reports:
load_performance: k6-results.json
paths:
- k6-results.json
expire_in: 7 days
only:
- schedules
Parallel Testing¶
Speed up tests with parallel execution:
test:unit:parallel:
stage: test
image: node:22-alpine
parallel: 4
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
before_script:
- npm ci
script:
- npm run test:unit -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
artifacts:
when: always
reports:
junit: junit-shard-${CI_NODE_INDEX}.xml
expire_in: 7 days
Test Coverage Reporting¶
test:coverage:
stage: test
image: node:22-alpine
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
before_script:
- npm ci
script:
- npm run test:coverage
coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days
## Enforce coverage threshold
check:coverage:
stage: test
image: node:22-alpine
needs: [test:coverage]
before_script:
- apk add --no-cache jq bc # not bundled with the alpine image
script:
- |
COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
echo "Coverage: $COVERAGE%"
if [ "$(echo "$COVERAGE < 80" | bc -l)" -eq 1 ]; then
echo "Coverage below 80% threshold"
exit 1
fi
Review Apps Testing¶
Test in ephemeral environments:
review:deploy:
stage: deploy
image: alpine:latest
script:
- echo "Deploying review app..."
- echo "Review app URL: https://review-$CI_COMMIT_REF_SLUG.example.com"
environment:
name: review/$CI_COMMIT_REF_SLUG
url: https://review-$CI_COMMIT_REF_SLUG.example.com
on_stop: review:stop
only:
- merge_requests
review:test:
stage: test
needs: [review:deploy]
image: curlimages/curl:latest
script:
- curl -f https://review-$CI_COMMIT_REF_SLUG.example.com/health
- echo "Review app health check passed"
only:
- merge_requests
review:stop:
stage: deploy
image: alpine:latest
script:
- echo "Destroying review app..."
environment:
name: review/$CI_COMMIT_REF_SLUG
action: stop
when: manual
only:
- merge_requests
Testing with Child Pipelines¶
Organize tests using child pipelines:
## .gitlab-ci.yml
trigger:tests:
stage: test
trigger:
include: .gitlab/ci/tests.yml
strategy: depend
## .gitlab/ci/tests.yml
stages:
- unit
- integration
- e2e
unit:tests:
stage: unit
image: node:22-alpine
script:
- npm ci
- npm run test:unit
integration:tests:
stage: integration
image: node:22-alpine
services:
- postgres:15-alpine
script:
- npm ci
- npm run test:integration
e2e:tests:
stage: e2e
image: mcr.microsoft.com/playwright:latest
script:
- npm ci
- npx playwright install
- npm run test:e2e
Conditional Testing¶
Run tests based on changes:
test:backend:
stage: test
image: python:3.11-slim
script:
- pip install -r requirements.txt
- pytest
only:
changes:
- backend/**/*
- requirements.txt
test:frontend:
stage: test
image: node:22-alpine
script:
- npm ci
- npm test
only:
changes:
- frontend/**/*
- package.json
- package-lock.json
test:infrastructure:
stage: test
image: hashicorp/terraform:latest
script:
- terraform init
- terraform validate
- terraform plan
only:
changes:
- terraform/**/*
CI/CD Pipeline Test Metrics¶
Monitor pipeline performance:
metrics:pipeline:
stage: .post
image: alpine:latest
before_script:
- apk add --no-cache curl jq
script:
- |
# Pipeline status and duration are not predefined CI variables;
# query the pipelines API instead (GITLAB_API_TOKEN is a project
# access token stored as a CI/CD variable)
curl --silent --header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/pipelines/${CI_PIPELINE_ID}" \
| jq '{status: .status, duration: .duration}'
when: always
only:
- main
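A companion job can list which jobs failed, via the pipeline jobs API. This is a sketch: it assumes a project access token is exposed as the CI/CD variable `GITLAB_API_TOKEN` (not a predefined variable).

```yaml
metrics:failed-jobs:
  stage: .post
  image: alpine:latest
  before_script:
    - apk add --no-cache curl jq
  script:
    - |
      # List the names of failed jobs in the current pipeline
      curl --silent --header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
        "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/pipelines/${CI_PIPELINE_ID}/jobs?scope[]=failed" \
      | jq -r '.[].name'
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      when: always
```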
Tiered Pipeline Architecture¶
Implement a three-tier testing strategy that balances speed, coverage, and confidence:
Tier 1: Fast Feedback (< 2 minutes)¶
Static analysis and linting that runs on every commit:
stages:
- validate # Tier 1: Fast feedback
- test # Tier 2: Unit tests
- integration # Tier 3: Integration tests
- deploy
# Tier 1: Static Analysis - Fast feedback on every push
lint:yaml:
stage: validate
image: cytopia/yamllint:latest
script:
- yamllint -c .yamllint.yml .
rules:
- changes:
- "**/*.yml"
- "**/*.yaml"
lint:terraform:
stage: validate
image: hashicorp/terraform:latest
script:
- terraform fmt -check -recursive
- terraform init -backend=false
- terraform validate
rules:
- changes:
- "**/*.tf"
- "**/*.tfvars"
lint:ansible:
stage: validate
image: python:3.11-slim
before_script:
- pip install ansible-lint yamllint
script:
- ansible-lint --strict
- yamllint playbooks/ roles/
cache:
key: ansible-lint
paths:
- .cache/pip
rules:
- changes:
- "**/*.yml"
- "playbooks/**/*"
- "roles/**/*"
security:static:
stage: validate
image: aquasec/tfsec:latest
script:
- tfsec . --format json --out tfsec-results.json
artifacts:
# tfsec's JSON is not the GitLab SAST report schema; keep it as a plain artifact
paths:
- tfsec-results.json
expire_in: 7 days
rules:
- changes:
- "**/*.tf"
security:secrets:
stage: validate
image: trufflesecurity/trufflehog:latest
script:
- trufflehog git file://. --fail --no-update
allow_failure: false
Tier 2: Unit Tests (< 10 minutes)¶
Module-level testing that runs on pull requests:
# Tier 2: Unit Tests - Run on merge requests
test:terraform:
stage: test
image: golang:1.24
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
DOCKER_TLS_CERTDIR: ""
before_script:
- cd tests
- go mod download
script:
- go test -v -timeout 20m -parallel 4 ./...
artifacts:
when: always
reports:
junit: tests/report.xml
expire_in: 7 days
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
test:ansible:
stage: test
image: python:3.11-slim
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
DOCKER_TLS_CERTDIR: ""
before_script:
- pip install molecule[docker] ansible-lint
script:
- molecule test
cache:
key: molecule-${CI_COMMIT_REF_SLUG}
paths:
- .cache/pip
artifacts:
when: on_failure
paths:
- molecule/default/*.log
expire_in: 7 days
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
test:unit:parallel:
stage: test
image: node:22-alpine
parallel:
matrix:
- PLATFORM: [ubuntu-22.04, debian-11, rhel-9]
before_script:
- npm ci
script:
- npm run test:unit:$PLATFORM
artifacts:
when: always
reports:
junit: junit-${PLATFORM}.xml
expire_in: 7 days
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
Tier 3: Integration & Compliance (< 60 minutes)¶
Full-stack testing that runs nightly or pre-release:
# Tier 3: Integration Tests - Nightly or on main branch
integration:full-stack:
stage: integration
image: golang:1.24
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
DOCKER_TLS_CERTDIR: ""
AWS_REGION: us-east-1
before_script:
- cd tests/integration
- go mod download
script:
- go test -v -timeout 60m ./...
artifacts:
when: always
reports:
junit: integration-results.xml
paths:
- tests/integration/logs/
expire_in: 30 days
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
- if: $CI_PIPELINE_SOURCE == "schedule"
retry:
max: 1
when:
- runner_system_failure
- stuck_or_timeout_failure
compliance:security:
stage: integration
image: chef/inspec:latest
before_script:
- inspec --version
script:
- inspec exec compliance/security-baseline.rb --reporter cli junit2:compliance-results.xml json:compliance-results.json
artifacts:
reports:
junit: compliance-results.xml
paths:
- compliance-results.json
expire_in: 90 days
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
- if: $CI_PIPELINE_SOURCE == "schedule"
- if: $CI_COMMIT_TAG
Progressive Enforcement Strategy¶
Gradually introduce and enforce quality gates without disrupting development:
Phase 1: Warning Only (Weeks 1-2)¶
Start with informational feedback that doesn't block merges:
# .gitlab-ci.yml - Phase 1: Warning Only
lint:terraform:warning:
stage: validate
image: hashicorp/terraform:latest
script:
- terraform fmt -check -recursive || echo "⚠️ Terraform formatting issues detected"
- terraform init -backend=false
- terraform validate || echo "⚠️ Terraform validation failed"
allow_failure: true # Don't block merge
rules:
- if: $ENFORCEMENT_PHASE == "warning"
- if: $ENFORCEMENT_PHASE == null # Default to warning
security:scan:warning:
stage: validate
image: aquasec/tfsec:latest
script:
- tfsec . --soft-fail --format json --out tfsec-results.json # Report but don't fail
allow_failure: true
artifacts:
paths:
- tfsec-results.json
expire_in: 30 days
Phase 2: Advisory with Auto-Fix (Weeks 3-4)¶
Provide automated fixes and merge request comments:
# Phase 2: Advisory with automated fixes
lint:terraform:advisory:
stage: validate
image: hashicorp/terraform:latest
before_script:
- apk add --no-cache git curl jq
script:
- |
# Check formatting
terraform fmt -check -recursive || {
echo "Formatting issues detected. Auto-fixing..."
# Capture the offending files before fixing them
FILES=$(terraform fmt -check -recursive 2>&1 || true)
terraform fmt -recursive
# Create MR comment with suggestions (POSIX-safe string concatenation)
COMMENT="## 🔧 Terraform Formatting\n\n"
COMMENT="${COMMENT}Formatting issues were detected. Run \`terraform fmt -recursive\` to fix.\n\n"
COMMENT="${COMMENT}<details><summary>Files affected</summary>\n\n${FILES}\n</details>"
# Needs a project access token (CI_JOB_TOKEN cannot post MR notes)
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--data "body=${COMMENT}" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
exit 1 # Fail but with helpful message
}
allow_failure: true # Still advisory, but failing
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event" && $ENFORCEMENT_PHASE == "advisory"
coverage:advisory:
stage: test
image: node:22-alpine
needs: [test:unit]
before_script:
- apk add --no-cache jq bc curl # not bundled with the alpine image
script:
- |
COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
echo "Coverage: $COVERAGE%"
if [ "$(echo "$COVERAGE < 80" | bc -l)" -eq 1 ]; then
COMMENT="## 📊 Code Coverage\n\n"
COMMENT="${COMMENT}Current coverage: ${COVERAGE}%\nTarget: 80%\n\n"
COMMENT="${COMMENT}⚠️ Coverage is below threshold. Consider adding more tests."
# Needs a project access token (CI_JOB_TOKEN cannot post MR notes)
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--data "body=${COMMENT}" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
exit 1
fi
allow_failure: true
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event" && $ENFORCEMENT_PHASE == "advisory"
Phase 3: Strict Enforcement (Week 5+)¶
Full merge-blocking enforcement:
# Phase 3: Strict enforcement - blocks merges
lint:terraform:strict:
stage: validate
image: hashicorp/terraform:latest
script:
- terraform fmt -check -recursive
- terraform init -backend=false
- terraform validate
allow_failure: false # Block merge on failure
rules:
- if: $ENFORCEMENT_PHASE == "strict"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # Always strict on main
security:scan:strict:
stage: validate
image: aquasec/tfsec:latest
script:
- tfsec . --minimum-severity HIGH --force-all-dirs
allow_failure: false
rules:
- if: $ENFORCEMENT_PHASE == "strict"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
coverage:strict:
stage: test
image: node:22-alpine
needs: [test:unit]
before_script:
- apk add --no-cache jq bc # not bundled with the alpine image
script:
- |
COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
echo "Coverage: $COVERAGE%"
if [ "$(echo "$COVERAGE < 80" | bc -l)" -eq 1 ]; then
echo "❌ Coverage ${COVERAGE}% is below 80% threshold"
exit 1
fi
echo "✅ Coverage ${COVERAGE}% meets threshold"
allow_failure: false
rules:
- if: $ENFORCEMENT_PHASE == "strict"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
Enforcement Rollout Timeline¶
Configure progressive rollout with CI/CD variables:
# .gitlab-ci.yml - Dynamic enforcement based on timeline
variables:
ENFORCEMENT_PHASE: "warning" # Default phase
# Set enforcement phase based on date or manual trigger.
# NOTE: variables exported in before_script are visible only to that job's
# own commands; `rules:` are evaluated when the pipeline is created, so use
# pipeline-level CI/CD variables to drive rules.
.determine_phase: &determine_phase
before_script:
- |
# Automatic progression based on date
CURRENT_DATE=$(date +%s)
ROLLOUT_START=1704067200 # 2024-01-01
WEEKS_ELAPSED=$(( ($CURRENT_DATE - $ROLLOUT_START) / 604800 ))
if [ $WEEKS_ELAPSED -lt 2 ]; then
export ENFORCEMENT_PHASE="warning"
elif [ $WEEKS_ELAPSED -lt 4 ]; then
export ENFORCEMENT_PHASE="advisory"
else
export ENFORCEMENT_PHASE="strict"
fi
echo "Enforcement phase: $ENFORCEMENT_PHASE (Week $WEEKS_ELAPSED)"
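Because `rules:` are evaluated at pipeline creation, a variable exported in `before_script` only affects that job's own commands. A supported alternative is to set `ENFORCEMENT_PHASE` as a pipeline-level CI/CD variable (project default in Settings > CI/CD > Variables, overridden per scheduled pipeline or manual run) and let `rules:` read it directly. A sketch, with a hypothetical job name:

```yaml
lint:terraform:phased:
  stage: validate
  image: hashicorp/terraform:latest
  script:
    - terraform fmt -check -recursive
    - terraform validate
  rules:
    - if: $ENFORCEMENT_PHASE == "strict"
      allow_failure: false   # blocks the merge
    - when: on_success
      allow_failure: true    # warning/advisory phases: report but do not block
```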
Change Detection and Optimization¶
Optimize pipeline execution by testing only what changed:
Path-Based Job Execution¶
# Run jobs only when relevant files change
terraform:vpc:
stage: test
image: hashicorp/terraform:latest
script:
- cd modules/vpc
- terraform init
- terraform validate
- terraform plan
rules:
- changes:
- modules/vpc/**/*
- modules/vpc/*.tf
when: always
- when: never # Don't run if no changes
ansible:webserver:
stage: test
image: python:3.11-slim
before_script:
- pip install molecule[docker]
script:
- cd roles/webserver
- molecule test
rules:
- changes:
- roles/webserver/**/*
- playbooks/webserver.yml
when: always
- when: never
test:backend:
stage: test
script:
- pytest tests/backend/
rules:
- changes:
- backend/**/*.py
- requirements.txt
when: always
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
when: always
- when: never
Monorepo Optimization¶
Efficient testing in monorepo structures:
# .gitlab-ci.yml for monorepo
include:
- local: '/services/api/.gitlab-ci.yml'
rules:
- changes:
- services/api/**/*
- local: '/services/web/.gitlab-ci.yml'
rules:
- changes:
- services/web/**/*
- local: '/infrastructure/.gitlab-ci.yml'
rules:
- changes:
- infrastructure/**/*
# Global lint jobs still run on any change
lint:global:
stage: validate
image: python:3.11-slim
script:
- pip install pre-commit
- pre-commit run --all-files
cache:
key: pre-commit-${CI_COMMIT_REF_SLUG}
paths:
- .cache/pre-commit
Dynamic Pipeline Generation¶
Generate pipelines based on detected changes:
generate:pipeline:
stage: .pre
image: python:3.11-slim
script:
- |
# Detect changed modules
git diff --name-only $CI_MERGE_REQUEST_DIFF_BASE_SHA $CI_COMMIT_SHA > changed_files.txt
# Generate dynamic pipeline
python scripts/generate_pipeline.py changed_files.txt > generated-pipeline.yml
artifacts:
paths:
- generated-pipeline.yml
expire_in: 1 day
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
trigger:dynamic:
stage: validate
needs: [generate:pipeline]
trigger:
include:
- artifact: generated-pipeline.yml
job: generate:pipeline
strategy: depend
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
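The generator script itself (`scripts/generate_pipeline.py`) is referenced but not shown. As a rough illustration of the idea, here is a minimal shell sketch that maps changed paths under `modules/` to child-pipeline jobs; the module layout and the `terraform validate` command are assumptions, and a real generator would likely do much more.

```shell
#!/bin/sh
# Sketch of a change-to-pipeline generator: reads a list of changed files
# and emits a child pipeline with one validate job per touched module.
set -eu

generate_pipeline() {
  changed="$1"

  # Unique first path components under modules/
  modules=$(sed -n 's|^modules/\([^/]*\)/.*|\1|p' "$changed" | sort -u)

  echo "stages:"
  echo "  - validate"

  if [ -z "$modules" ]; then
    # A child pipeline must contain at least one job
    printf 'noop:\n  stage: validate\n  script:\n    - echo "no module changes"\n'
    return 0
  fi

  for m in $modules; do
    printf 'validate:%s:\n  stage: validate\n  script:\n    - cd modules/%s && terraform validate\n' "$m" "$m"
  done
}

# Example: generate a child pipeline from a list of changed paths
printf 'modules/vpc/main.tf\nmodules/rds/variables.tf\nbackend/app.py\n' > changed_files.txt
generate_pipeline changed_files.txt
```

In CI this would replace the `python scripts/generate_pipeline.py` step, with its output saved as the `generated-pipeline.yml` artifact consumed by `trigger:dynamic`.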
Parallel Execution Strategies¶
Maximize pipeline efficiency with parallelization:
Matrix-Based Parallel Testing¶
# Test across multiple platforms in parallel
test:multi-platform:
stage: test
image: python:3.11-slim
parallel:
matrix:
- PLATFORM: [ubuntu-20.04, ubuntu-22.04, debian-11]
PYTHON_VERSION: ['3.9', '3.10', '3.11']
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2375
TEST_PLATFORM: $PLATFORM
TEST_PYTHON: $PYTHON_VERSION
script:
- echo "Testing on $PLATFORM with Python $PYTHON_VERSION"
- docker run --rm python:$PYTHON_VERSION-slim python --version
- pytest --platform=$PLATFORM
artifacts:
when: always
reports:
junit: junit-${PLATFORM}-${PYTHON_VERSION}.xml
expire_in: 7 days
# Parallel Terraform module testing
test:terraform:modules:
stage: test
image: golang:1.24
parallel:
matrix:
- MODULE: [vpc, rds, eks, s3]
script:
- cd modules/$MODULE/tests
- go test -v -timeout 20m
artifacts:
when: always
reports:
junit: $MODULE-results.xml
expire_in: 7 days
Sharded Test Execution¶
# Split large test suite across multiple runners
test:sharded:
stage: test
image: node:22-alpine
parallel: 8 # Split into 8 shards
before_script:
- npm ci
script:
- |
echo "Running shard $CI_NODE_INDEX of $CI_NODE_TOTAL"
npm run test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
artifacts:
when: always
paths:
- coverage/ # raw per-shard coverage, needed by the merge job below
reports:
junit: junit-shard-${CI_NODE_INDEX}.xml
coverage_report:
coverage_format: cobertura
path: coverage-shard-${CI_NODE_INDEX}.xml
expire_in: 7 days
# Merge coverage from all shards
coverage:merge:
stage: .post
image: node:22-alpine
needs:
- test:sharded
script:
- npm install -g nyc
- nyc merge coverage/ .nyc_output/coverage.json
- nyc report --reporter=cobertura --reporter=html --reporter=text
coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
artifacts:
paths:
- coverage/
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
expire_in: 30 days
Parallel DAG Execution¶
# Use DAG for parallel execution with dependencies
stages:
- validate
- build
- test
- deploy
# Fast parallel validation
lint:yaml:
stage: validate
script: yamllint .
lint:terraform:
stage: validate
script: terraform fmt -check
lint:ansible:
stage: validate
script: ansible-lint
# Build can start as soon as validation passes
build:api:
stage: build
needs: [lint:yaml] # Only needs yaml lint
script: docker build -t api .
build:web:
stage: build
needs: [lint:yaml]
script: docker build -t web .
# Tests run in parallel, each with specific dependencies
test:api:unit:
stage: test
needs: [build:api]
script: pytest api/tests/
test:api:integration:
stage: test
needs: [build:api]
script: pytest api/tests/integration/
test:web:unit:
stage: test
needs: [build:web]
script: npm test
# Deploy only after ALL tests pass
deploy:staging:
stage: deploy
needs:
- test:api:unit
- test:api:integration
- test:web:unit
script: kubectl apply -f k8s/staging/
Artifact Management¶
Optimize artifact storage and retention:
Tiered Retention Policy¶
# Tier 1: Short-term artifacts (7 days)
test:unit:
stage: test
script: npm test
artifacts:
when: always
reports:
junit: junit.xml
paths:
- test-results/
expire_in: 7 days # Short retention for frequent tests
# Tier 2: Medium-term artifacts (30 days)
test:coverage:
stage: test
script: npm run coverage
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days # Keep coverage reports longer
# Tier 3: Long-term artifacts (90 days)
compliance:audit:
stage: integration
script: inspec exec compliance/
artifacts:
paths:
- compliance-results.json
- audit-evidence/
expire_in: 90 days # Compliance evidence retention
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
# Tier 4: Release artifacts (1 year)
build:release:
stage: build
script: make build
artifacts:
paths:
- dist/
- CHANGELOG.md
expire_in: 365 days # Keep release artifacts
rules:
- if: $CI_COMMIT_TAG
Selective Artifact Storage¶
# Only store artifacts on failure for debugging
test:integration:
stage: test
script: pytest tests/integration/
artifacts:
when: on_failure # Only save when test fails
paths:
- logs/
- screenshots/
- test-results/
expire_in: 14 days
# Always store artifacts but with compression
build:optimized:
stage: build
script:
- make build
- tar -czf dist.tar.gz dist/
artifacts:
paths:
- dist.tar.gz # Compressed artifact
expire_in: 30 days
Artifact Dependencies¶
# Reuse artifacts across jobs
build:
stage: build
script: npm run build
artifacts:
paths:
- dist/
expire_in: 1 day
test:e2e:
stage: test
needs:
- job: build
artifacts: true # Download build artifacts
script:
- npm run test:e2e dist/
deploy:production:
stage: deploy
needs:
- job: build
artifacts: true
script:
- cp -r dist/* /var/www/html/
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
Merge Request Integration¶
Enhance MR visibility with pipeline integration:
Coverage Badges and Metrics¶
# Generate coverage badge
coverage:badge:
stage: .post
image: alpine:latest
needs: [test:coverage]
before_script:
- apk add --no-cache jq bc curl # jq/bc/curl are not in the base image
script:
- |
COVERAGE=$(cat coverage/coverage-summary.json | jq '.total.lines.pct')
COLOR="red"
if (( $(echo "$COVERAGE >= 80" | bc -l) )); then
COLOR="green"
elif (( $(echo "$COVERAGE >= 60" | bc -l) )); then
COLOR="yellow"
fi
# Generate badge
BADGE_URL="https://img.shields.io/badge/coverage-${COVERAGE}%25-${COLOR}"
echo "Coverage badge: $BADGE_URL"
# Post to MR (CI_JOB_TOKEN cannot create notes; use a project access token stored as a CI/CD variable, e.g. GITLAB_API_TOKEN)
COMMENT="## 📊 Test Coverage\n\n![coverage](${BADGE_URL})\n\nCurrent coverage: **${COVERAGE}%**"
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--data-urlencode "body=$(printf '%b' "$COMMENT")" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
Inline MR Comments¶
# Post test results as MR comments
test:results:comment:
stage: .post
image: alpine:latest
needs: [test:unit, test:integration]
before_script:
- apk add --no-cache curl jq
script:
- |
# Aggregate test results
UNIT_PASSED=$(cat test-results/unit.json | jq '.stats.passes')
UNIT_FAILED=$(cat test-results/unit.json | jq '.stats.failures')
INTEGRATION_PASSED=$(cat test-results/integration.json | jq '.stats.passes')
INTEGRATION_FAILED=$(cat test-results/integration.json | jq '.stats.failures')
# Create formatted comment in one assignment (alpine's sh lacks bash's +=)
COMMENT="## ✅ Test Results\n\n### Unit Tests\n- ✅ Passed: ${UNIT_PASSED}\n- ❌ Failed: ${UNIT_FAILED}\n\n### Integration Tests\n- ✅ Passed: ${INTEGRATION_PASSED}\n- ❌ Failed: ${INTEGRATION_FAILED}\n"
# Post comment (CI_JOB_TOKEN cannot create MR notes; use a project access token, e.g. GITLAB_API_TOKEN)
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--header "Content-Type: application/json" \
--data "{\"body\":\"${COMMENT}\"}" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
when: always
Dashboard Links¶
# Add links to external dashboards
dashboard:link:
stage: .post
image: alpine:latest
before_script:
- apk add --no-cache curl
script:
- |
GRAFANA_URL="https://grafana.example.com/d/pipeline?var-pipeline=${CI_PIPELINE_ID}"
SONAR_URL="https://sonar.example.com/dashboard?id=${CI_PROJECT_PATH}"
# Build comment in one assignment (alpine's sh lacks bash's +=)
COMMENT="## 📊 Quality Dashboards\n\n- [Pipeline Metrics](${GRAFANA_URL})\n- [Code Quality](${SONAR_URL})\n- [Test Report](${CI_PROJECT_URL}/-/pipelines/${CI_PIPELINE_ID}/test_report)"
curl --request POST \
--header "PRIVATE-TOKEN: ${GITLAB_API_TOKEN}" \
--data-urlencode "body=$(printf '%b' "$COMMENT")" \
"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes"
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
Common Pitfalls¶
Cache Key Collisions Across Branches¶
Issue: Using ${CI_COMMIT_REF_SLUG} as cache key causes cache misses when switching branches even for identical dependencies.
Example:
## Bad - Branch-specific cache keys
cache:
key: ${CI_COMMIT_REF_SLUG} # Different key for each branch
paths:
- node_modules/
build:
script:
- npm ci # Reinstalls on every branch switch
- npm run build
Solution: Use lock file hash as cache key.
## Good - Content-based cache keys
cache:
key:
files:
- package-lock.json # ✅ Changes only when dependencies change
paths:
- node_modules/
- .npm/
build:
script:
- npm ci --cache .npm
- npm run build
Key Points:
- Use lock file hashes for dependency caches
- Include package manager cache directory (.npm, .yarn)
- Consider `${CI_COMMIT_REF_SLUG}-${checksum}` for branch isolation
- Use `policy: pull` in most jobs, `pull-push` only in one job
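The `pull`/`pull-push` split in the last key point can be sketched as follows; a minimal example assuming an npm project, where one designated job populates the cache and every other job only reads it:

```yaml
# Sketch: a single cache writer plus read-only consumers (job names illustrative)
install:
  stage: .pre
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull-push   # Only this job uploads the cache
  script:
    - npm ci

test:
  stage: test
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull        # Read-only: skips the upload step entirely
  script:
    - npm test
```

Because consumers never upload, a slow cache push happens at most once per dependency change instead of once per job.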
Missing Dependencies Specification¶
Issue: Job dependencies are left implicit. By default a job downloads all artifacts from every previous stage, which is slow, and those artifacts silently disappear once needs is introduced elsewhere in the pipeline.
Example:
## Bad - Implicit dependencies
build:
stage: build
script:
- npm run build
artifacts:
paths:
- dist/
deploy:
stage: deploy
script:
- ls dist/ # ❌ Relies on implicit download of all previous-stage artifacts
- ./deploy.sh
Solution: Explicitly declare job dependencies.
## Good - Explicit dependencies
build:
stage: build
script:
- npm run build
artifacts:
paths:
- dist/
deploy:
stage: deploy
dependencies:
- build # ✅ Download artifacts from build job
script:
- ls dist/ # Now available
- ./deploy.sh
Key Points:
- Use `dependencies: [job1, job2]` to download specific artifacts
- Use `dependencies: []` to download no artifacts
- `needs` creates both a dependency and downloads artifacts
- Missing `dependencies` downloads all artifacts from previous stages
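As a minimal sketch of the `dependencies: []` key point (the job name and script are illustrative), a job that needs no build outputs can opt out of artifact downloads entirely, which speeds up job startup:

```yaml
# Sketch: a lint job needs no artifacts from earlier stages
lint:
  stage: test
  dependencies: []   # Download nothing
  script:
    - npm run lint
```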
Services Hostname Confusion¶
Issue: Trying to connect to services using localhost instead of service name causes connection failures.
Example:
## Bad - Using localhost for services
test:
services:
- postgres:15
variables:
DATABASE_URL: postgresql://user:pass@localhost:5432/db # ❌ Wrong!
script:
- npm test # Cannot connect to database
Solution: Use service name as hostname.
## Good - Use service name as hostname
test:
services:
- name: postgres:15
alias: database # Optional custom alias
variables:
DATABASE_URL: postgresql://user:pass@database:5432/db # ✅ Service alias
script:
- npm test
Key Points:
- Services are accessible by image name (postgres, redis, mongo)
- Use `alias` to customize the service hostname
- The service port is the container's internal port (not mapped)
- Wait for service readiness before running tests
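The readiness key point can be sketched with a simple port poll; this assumes the job image includes `nc` (BusyBox/alpine-based images do), and the alias and credentials are illustrative:

```yaml
# Sketch: block until the service accepts TCP connections before testing
test:
  services:
    - name: postgres:15
      alias: database
  variables:
    POSTGRES_USER: user
    POSTGRES_PASSWORD: pass
  before_script:
    - until nc -z database 5432; do echo "Waiting for database..."; sleep 1; done
  script:
    - npm test
```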
Rule Precedence Gotchas¶
Issue: Multiple rules entries create unexpected behavior due to first-match-wins semantics.
Example:
## Bad - First rule always matches
deploy:
rules:
- if: $CI_COMMIT_BRANCH # ❌ Matches any branch!
when: always
- if: $CI_COMMIT_BRANCH == "main"
when: manual
script:
- ./deploy.sh # Runs automatically on all branches, not just main
Solution: Order rules from most specific to least specific.
## Good - Specific rules first
deploy:
rules:
- if: $CI_COMMIT_BRANCH == "main"
when: manual # ✅ Manual deploy on main
- if: $CI_COMMIT_BRANCH == "develop"
when: always # Auto deploy on develop
- when: never # Don't run on other branches
script:
- ./deploy.sh
Key Points:
- Rules are evaluated top-to-bottom, first match wins
- Always end with a default rule (`when: never` or `when: on_success`)
- Use `&&` for multiple conditions in one rule
- Test rule logic locally (e.g. with `gitlab-ci-local --preview`)
Variable Expansion in Non-String Contexts¶
Issue: Variables not expanding in certain YAML contexts like only, except, or numeric values.
Example:
## Bad - Variables don't expand in only/except
deploy:
only:
- $CI_DEFAULT_BRANCH # ❌ Treated as literal string!
script:
- ./deploy.sh
## Bad - Variables in numeric contexts
test:
parallel: $PARALLEL_COUNT # ❌ Not expanded
script:
- npm test
Solution: Use rules for conditional execution, and expand variables with shell syntax inside script.
## Good - Use rules for branch matching
deploy:
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # ✅ Expands correctly
script:
- ./deploy.sh
## Good - Expand variables in script
test:
script:
- export COUNT=${PARALLEL_COUNT:-4}
- echo "Running $COUNT parallel tests"
- npm test -- --shard=$CI_NODE_INDEX/$COUNT
Key Points:
- Use `rules` instead of `only`/`except` for variable-based conditions
- Variables expand in scripts, not in YAML structure
- Use `${VAR:-default}` for default values
- Numeric YAML keys don't support variable expansion
Anti-Patterns¶
❌ Avoid: No Cache¶
## Bad - Reinstalling dependencies every time
test:
script:
- npm install
- npm test
## Good - Using cache
test:
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
script:
- npm ci
- npm test
❌ Avoid: Hardcoded Secrets¶
## Bad - Hardcoded credentials
deploy:
script:
- ssh user@server "password123"
## Good - Use CI/CD variables
deploy:
script:
- ssh $DEPLOY_USER@$DEPLOY_SERVER
❌ Avoid: No Artifact Expiration¶
## Bad - Artifacts kept forever
build:
artifacts:
paths:
- dist/
## Good - Set expiration
build:
artifacts:
paths:
- dist/
expire_in: 1 week
❌ Avoid: Using Deprecated only/except Instead of rules¶
## Bad - Using deprecated only/except
deploy:
only:
- main
except:
- schedules
script:
- ./deploy.sh
## Good - Use rules
deploy:
rules:
- if: $CI_COMMIT_BRANCH == "main" && $CI_PIPELINE_SOURCE != "schedule"
script:
- ./deploy.sh
❌ Avoid: Running All Jobs on All Branches¶
## Bad - Expensive jobs run on every branch
build-docker:
script:
- docker build -t myapp .
- docker push myapp # ❌ Pushes on every branch!
deploy-prod:
script:
- ./deploy-production.sh # ❌ Deploys from any branch!
## Good - Restrict jobs to appropriate branches
build-docker:
rules:
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_TAG
script:
- docker build -t myapp:$CI_COMMIT_SHORT_SHA .
- docker push myapp:$CI_COMMIT_SHORT_SHA
deploy-prod:
rules:
- if: $CI_COMMIT_TAG =~ /^v\d+\.\d+\.\d+$/
script:
- ./deploy-production.sh
environment:
name: production
❌ Avoid: Not Using extends for Shared Configuration¶
## Bad - Duplicated configuration
test-unit:
image: node:22-alpine
before_script:
- npm ci
script:
- npm run test:unit
test-integration:
image: node:22-alpine
before_script:
- npm ci
script:
- npm run test:integration
## Good - Use extends
.node-base:
image: node:22-alpine
before_script:
- npm ci
test-unit:
extends: .node-base
script:
- npm run test:unit
test-integration:
extends: .node-base
script:
- npm run test:integration
❌ Avoid: Not Using Retry for Flaky Jobs¶
## Bad - Flaky job fails pipeline
integration-tests:
script:
- npm run test:integration # ❌ No retry on failure
## Good - Retry flaky jobs
integration-tests:
script:
- npm run test:integration
retry:
max: 2
when:
- script_failure # Retry flaky test failures too
- runner_system_failure
- stuck_or_timeout_failure
Security Best Practices¶
Secrets Management¶
Protect sensitive data in CI/CD pipelines:
## Bad - Hardcoded secrets in pipeline
deploy:
script:
- echo "API_KEY=sk-1234567890abcdef" >> .env
- aws configure set aws_access_key_id AKIAIOSFODNN7EXAMPLE # ❌ Exposed!
## Good - Use protected CI/CD variables
deploy:
script:
- echo "API_KEY=$API_KEY" >> .env # API_KEY from protected variable
- aws configure set aws_access_key_id "$AWS_ACCESS_KEY_ID" # ✅ From variables
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # Protected branch only
## Good - Use masked variables
variables:
DATABASE_URL: ${DB_URL} # DB_URL defined as a masked variable in the GitLab UI
## Good - Use file-type variables for certificates
deploy:
before_script:
- mkdir -p ~/.ssh && chmod 700 ~/.ssh
- echo "$SSH_PRIVATE_KEY" > ~/.ssh/id_rsa
- chmod 600 ~/.ssh/id_rsa
Key Points:
- Store secrets in GitLab CI/CD Variables (Settings > CI/CD > Variables)
- Enable "Masked" to hide values in job logs
- Enable "Protected" to restrict to protected branches only
- Use "File" type for certificates and large secrets
- Never commit secrets to `.gitlab-ci.yml`
- Rotate secrets regularly
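For the "File" type key point: GitLab writes the variable's value to a temporary file and sets the variable to that file's path, so jobs can pass it directly to tools that expect a file. A minimal sketch, where `KUBECONFIG_FILE` is a hypothetical file-type variable defined under Settings > CI/CD > Variables:

```yaml
# Sketch: $KUBECONFIG_FILE expands to a path, not the secret's contents
deploy:
  script:
    - kubectl --kubeconfig "$KUBECONFIG_FILE" apply -f k8s/
```

This avoids echoing secrets into files by hand and keeps the value out of job logs.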
Protected Branches and Runners¶
Restrict pipeline execution to authorized users and branches:
## Good - Protected branch deployment
deploy_production:
stage: deploy
script:
- ./deploy-prod.sh
environment:
name: production
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # Protected branch
when: manual # Require manual approval
## Good - Use protected runners for sensitive jobs
deploy_production:
stage: deploy
tags:
- protected-runner # Runner tagged as protected in GitLab
script:
- ./deploy-prod.sh
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
Key Points:
- Configure protected branches (Settings > Repository > Protected branches)
- Restrict who can merge to protected branches
- Use protected runners for production deployments
- Require manual approval for critical deployments
- Implement approval rules for merge requests
Docker Image Security¶
Use secure, trusted container images:
## Bad - Using latest tag
test:
image: node:latest # ❌ Unpredictable, potential security issues
script:
- npm test
## Good - Pin specific versions
test:
image: node:20.10.0-alpine # ✅ Specific, minimal image
script:
- npm test
## Good - Use internal registry with scanned images
test:
image: registry.gitlab.com/myorg/secure-node:20.10.0-alpine
script:
- npm test
## Good - Scan images for vulnerabilities
build_image:
stage: build
image: docker:latest
services:
- docker:dind
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
- docker scan $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA # Deprecated; prefer Trivy or GitLab Container Scanning
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
Key Points:
- Always pin specific image versions (avoid `latest`)
- Use minimal base images (alpine, distroless)
- Scan images for vulnerabilities (Trivy, Clair, Snyk)
- Use trusted registries only
- Regularly update base images
- Verify image signatures
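The signature-verification point can be sketched with Sigstore Cosign; the image, job name, and `COSIGN_PUBLIC_KEY` (a hypothetical file-type CI/CD variable holding the public key) are assumptions, not part of GitLab itself:

```yaml
# Sketch: verify an image signature before it is promoted or deployed
verify_image:
  stage: test
  image:
    name: bitnami/cosign:latest
    entrypoint: [""]
  script:
    - cosign verify --key "$COSIGN_PUBLIC_KEY" "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```

The job fails if the image was not signed with the matching private key, blocking unsigned images from later stages.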
Code Injection Prevention¶
Prevent command injection in pipeline scripts:
## Bad - Unvalidated user input
deploy:
script:
- ssh user@$DEPLOY_SERVER "$CI_COMMIT_MESSAGE" # ❌ Injection risk!
- eval $USER_COMMAND # ❌ Never use eval!
## Good - Validate and sanitize inputs
deploy:
script:
- |
if [[ ! "$DEPLOY_ENV" =~ ^(dev|staging|prod)$ ]]; then
echo "Invalid environment"
exit 1
fi
- ./deploy.sh "$DEPLOY_ENV" # Quoted, validated variable
## Good - Use predefined commands
deploy:
variables:
ALLOWED_COMMANDS: "deploy.sh status.sh rollback.sh"
script:
- |
if [[ " $ALLOWED_COMMANDS " =~ " $COMMAND " ]]; then
./"$COMMAND"
else
echo "Unauthorized command"
exit 1
fi
Key Points:
- Never use `eval` with user-controlled input
- Validate all variables before use
- Use allow-lists for dynamic values
- Quote all variables in scripts
- Sanitize commit messages and user inputs
- Use parameterized commands
Dependency Security¶
Secure third-party dependencies:
## Good - Pin dependency versions
build:
image: node:20.10.0-alpine
script:
- npm ci # Use package-lock.json (deterministic installs)
- npm audit # Check for vulnerabilities
## Good - Verify checksums
build:
script:
- wget https://example.com/tool.tar.gz
- echo "$EXPECTED_CHECKSUM tool.tar.gz" | sha256sum -c # Verify checksum
- tar -xzf tool.tar.gz
## Good - Use dependency scanning
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
dependency_scan:
stage: test
allow_failure: false # Fail on vulnerabilities
Key Points:
- Pin all dependency versions (`package-lock.json`, `Gemfile.lock`, etc.)
- Use `npm ci` instead of `npm install`
- Run dependency audits (`npm audit`, `bundle audit`, etc.)
- Verify package checksums
- Use GitLab Dependency Scanning
- Monitor for supply chain attacks
Access Control and Least Privilege¶
Implement least privilege for pipeline execution:
## Good - Use service accounts with minimal permissions
deploy_aws:
script:
- aws s3 sync ./dist s3://my-bucket --delete
variables:
AWS_ACCESS_KEY_ID: $AWS_DEPLOY_KEY_ID # Service account with S3-only access
AWS_SECRET_ACCESS_KEY: $AWS_DEPLOY_SECRET
## Good - Restrict runner access
deploy_production:
tags:
- production-runner # Dedicated runner with limited network access
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
script:
- ./deploy.sh
Key Points:
- Use service accounts with minimum required permissions
- Separate runners by environment (dev, staging, prod)
- Restrict runner network access
- Use RBAC for pipeline access control
- Audit who can trigger pipelines
- Limit access to protected variables
Artifact Security¶
Secure build artifacts:
## Good - Set appropriate artifact expiration
build:
script:
- npm run build
artifacts:
paths:
- dist/
expire_in: 1 week # Auto-cleanup
reports:
coverage: coverage/cobertura-coverage.xml
## Good - Protect sensitive artifacts
deploy:
dependencies:
- build
script:
- |
# Encrypt sensitive artifacts before storage
tar -czf dist.tar.gz dist/
openssl enc -aes-256-cbc -salt -in dist.tar.gz -out dist.tar.gz.enc -k "$ENCRYPTION_KEY"
- ./deploy.sh dist.tar.gz.enc
Key Points:
- Set appropriate artifact expiration times
- Don't store secrets in artifacts
- Encrypt sensitive artifacts
- Use access controls for artifact download
- Validate artifact integrity (checksums)
- Clean up old artifacts regularly
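The integrity key point can be sketched with `sha256sum`; paths and file names are illustrative. A producer job records checksums next to the artifact, and the consumer verifies them before deploying:

```shell
# Producer job: record checksums alongside the build output
mkdir -p dist
echo "example artifact" > dist/app.bin        # Stand-in for a real build output
(cd dist && sha256sum app.bin > SHA256SUMS)

# Consumer job: verify integrity before using the artifact
(cd dist && sha256sum -c SHA256SUMS)          # Exits non-zero on any mismatch
```

Ship `SHA256SUMS` inside `artifacts:paths` so the verification travels with the files it protects.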
Audit Logging and Monitoring¶
Monitor pipeline activity:
## Good - Log security events
deploy:
before_script:
- echo "Deployment initiated by $GITLAB_USER_LOGIN at $(date)"
- echo "Target environment: $CI_ENVIRONMENT_NAME"
script:
- ./deploy.sh
after_script:
- |
if [ "$CI_JOB_STATUS" = "success" ]; then
./send-audit-log.sh "Deployment successful"
else
./send-alert.sh "Deployment failed - investigation required"
fi
Key Points:
- Enable audit logging for all environments
- Monitor failed pipeline runs
- Track who triggered deployments
- Alert on security policy violations
- Review access logs regularly
- Maintain pipeline execution history
Network Security¶
Secure pipeline network access:
## Good - Use VPN or private networks for sensitive operations
deploy_database:
before_script:
- openvpn --daemon --config production-vpn.conf # Connect to private network (runs in background)
script:
- psql -h $DB_HOST -U $DB_USER -d $DB_NAME < migration.sql
after_script:
- killall openvpn # Disconnect VPN
## Good - Restrict outbound connections
test:
script:
- npm test
variables:
HTTP_PROXY: "http://proxy.internal:8080" # Route through approved proxy
HTTPS_PROXY: "http://proxy.internal:8080"
Key Points:
- Use VPNs or private networks for database access
- Restrict outbound internet access from runners
- Use approved proxies for external connections
- Implement network segmentation
- Monitor network traffic from runners
- Use firewall rules to limit access
Tool Configuration¶
gitlab-ci-local - Local Pipeline Testing¶
Install and configure gitlab-ci-local for testing pipelines locally:
## Install gitlab-ci-local (npm)
npm install -g gitlab-ci-local
## Install gitlab-ci-local (brew)
brew install gitlab-ci-local
## Run entire pipeline
gitlab-ci-local
## Run specific job
gitlab-ci-local build
## List all jobs
gitlab-ci-local --list
## Run with specific file
gitlab-ci-local --file .gitlab-ci.custom.yml
## Dry run
gitlab-ci-local --preview
## Use specific variables
gitlab-ci-local --variable CI_COMMIT_REF_NAME=main
.gitlab-ci-local-variables.yml¶
## .gitlab-ci-local-variables.yml
## Local development variables
CI_PROJECT_NAME: my-project
CI_COMMIT_BRANCH: main
CI_COMMIT_REF_NAME: main
DOCKER_REGISTRY: localhost:5000
DEPLOY_ENV: development
gitlab-ci-lint - Pipeline Validation¶
## Validate .gitlab-ci.yml syntax (requires GitLab instance)
gitlab-ci-lint .gitlab-ci.yml
## Using GitLab API
curl --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
"https://gitlab.com/api/v4/projects/${PROJECT_ID}/ci/lint" \
--form "content@.gitlab-ci.yml"
## Using glab CLI
glab ci lint
VS Code Settings¶
{
"files.associations": {
".gitlab-ci*.yml": "yaml"
},
"[yaml]": {
"editor.defaultFormatter": "redhat.vscode-yaml",
"editor.formatOnSave": true
},
"yaml.schemas": {
"https://gitlab.com/gitlab-org/gitlab/-/raw/master/app/assets/javascripts/editor/schema/ci.json": [
".gitlab-ci.yml",
".gitlab-ci.*.yml"
]
},
"yaml.customTags": [
"!reference sequence"
]
}
Pre-commit Hooks¶
## .pre-commit-config.yaml
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
files: \.gitlab-ci.*\.ya?ml$
- id: check-added-large-files
- repo: https://github.com/adrienverge/yamllint
rev: v1.35.1
hooks:
- id: yamllint
files: \.gitlab-ci.*\.ya?ml$
args: ['-d', '{extends: default, rules: {line-length: {max: 120}}}']
# Optional: gitlab-ci-local validation
- repo: local
hooks:
- id: gitlab-ci-local-lint
name: GitLab CI Local Lint
entry: gitlab-ci-local --preview
language: system
files: \.gitlab-ci\.yml$
pass_filenames: false
yamllint Configuration¶
## .yamllint
extends: default
rules:
line-length:
max: 120
level: warning
indentation:
spaces: 2
indent-sequences: true
comments:
min-spaces-from-content: 1
document-start: disable
truthy:
allowed-values: ['true', 'false']
key-duplicates: enable
EditorConfig¶
## .editorconfig
[.gitlab-ci*.{yml,yaml}]
indent_style = space
indent_size = 2
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
Makefile¶
## Makefile
.PHONY: ci-local ci-list ci-validate
ci-local:
gitlab-ci-local
ci-list:
gitlab-ci-local --list
ci-validate:
gitlab-ci-local --preview
yamllint .gitlab-ci.yml
@echo "✓ GitLab CI configuration is valid"
ci-job:
gitlab-ci-local $(JOB)
## Example: make ci-job JOB=build
ci-debug:
gitlab-ci-local --shell-isolation=false $(JOB)
.gitlab-ci-include-local.yml¶
Template for reusable CI configurations:
## .gitlab-ci/templates/docker.yml
.docker_build:
image: docker:24
services:
- docker:24-dind
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
variables:
DOCKER_DRIVER: overlay2
DOCKER_TLS_CERTDIR: "/certs"
.docker_push:
extends: .docker_build
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA .
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
GitLab CI Workflow Validation Job¶
Add to your .gitlab-ci.yml:
validate:ci:
stage: .pre
image: python:3.11-slim
before_script:
- pip install yamllint
script:
- yamllint .gitlab-ci.yml
- echo "✓ Pipeline configuration is valid"
rules:
- changes:
- .gitlab-ci.yml
- .gitlab-ci/**/*
glab CLI Configuration¶
## ~/.config/glab-cli/config.yml
hosts:
gitlab.com:
user: your-username
token: glpat-xxxxxxxxxxxxx
git_protocol: ssh
api_protocol: https
pager:
ci: false
mr: less
editor: vim
browser: firefox
Docker Compose for Local GitLab Runner¶
## docker-compose.gitlab-runner.yml
version: '3.8'
services:
gitlab-runner:
image: gitlab/gitlab-runner:latest
container_name: gitlab-runner-local
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- ./gitlab-runner-config:/etc/gitlab-runner
restart: unless-stopped
Status: Active