
Pass/Fail Thresholds: Security Gates for CI/CD Pipelines

Configurable security thresholds, CLI exit codes, and automated enforcement

What Are Pass/Fail Thresholds

Pass/fail thresholds are configurable rules that determine when security scans should fail a CI/CD build based on the severity and quantity of vulnerabilities detected. Instead of treating all vulnerabilities equally, thresholds allow teams to define acceptable risk levels and enforce security gates that block merges or deployments when those levels are exceeded.

A typical threshold policy might be: "Fail the build if any CRITICAL vulnerabilities are introduced, or if more than 3 HIGH-severity vulnerabilities are detected." This prevents code with unacceptable security risk from reaching production while allowing teams to address lower-severity issues through normal development workflows.
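The example policy can be sketched in a few lines of Python. This is only an illustration of the decision logic, not CodeSlick's implementation; the `MAX_ALLOWED` table and the `evaluate` function are hypothetical names, and the limits are read as "maximum tolerated count" so that 0 means "fail on any" and 3 means "fail on more than 3".

```python
from collections import Counter

# Hypothetical policy table: maximum number of findings tolerated per severity.
# 0 means "fail on any finding"; 3 means "fail on more than 3".
MAX_ALLOWED = {"CRITICAL": 0, "HIGH": 3}


def evaluate(severities, max_allowed=MAX_ALLOWED):
    """Return (passed, violations) for a list of finding severities."""
    counts = Counter(s.upper() for s in severities)
    violations = [
        f"{sev}: {counts[sev]} found, {limit} allowed"
        for sev, limit in max_allowed.items()
        if counts[sev] > limit
    ]
    return (not violations, violations)


if __name__ == "__main__":
    passed, violations = evaluate(["HIGH", "HIGH", "MEDIUM", "critical"])
    print("PASS" if passed else "FAIL", violations)
```

Severities that have no entry in the table (MEDIUM and LOW here) never fail the build, which matches the idea of handling them through normal development workflows.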

Without thresholds, security scans are purely informational—they report findings but do not enforce remediation. Developers see vulnerability reports, acknowledge them, and merge anyway due to deadline pressure. Thresholds transform security scanning from advisory to mandatory, making security violations a build failure equivalent to failing tests.

Why Thresholds Matter: Security Gates in CI/CD

The Problem: Security Theater

Many organizations run security scans in CI/CD pipelines but do not fail builds when vulnerabilities are detected. This creates security theater: scans run, reports are generated, vulnerabilities accumulate, but nothing prevents vulnerable code from deploying to production.

Developers see hundreds of vulnerability alerts, become desensitized, and begin ignoring security warnings entirely. A CRITICAL SQL injection vulnerability appears alongside 50 LOW-severity findings, and the entire report is dismissed as noise.

The Solution: Automated Enforcement

Pass/fail thresholds enforce security standards automatically, without requiring manual code review gates or security team approvals. The CI/CD pipeline blocks merges when thresholds are exceeded, creating fast feedback loops:

  • Developer commits SQL injection: CI/CD fails within seconds with clear error message
  • Developer sees failure: Links directly to the vulnerable code and remediation guidance
  • Developer fixes issue: Re-runs CI/CD, build passes, merge proceeds

The entire cycle takes minutes instead of days waiting for security review.

Business Impact

Organizations with pass/fail thresholds reduce production vulnerabilities by 60-80% compared to advisory-only scanning. Why? Because vulnerabilities are caught and fixed in development rather than discovered in production through penetration tests or, worse, active exploitation.

Thresholds also create audit trails. When compliance auditors ask "How do you prevent SQL injection from reaching production?" the answer is concrete: "Our CI/CD fails if SQL injection is detected. Here are 200 blocked deployments from last quarter."

Threshold Configuration Strategies

1. Zero Tolerance (Maximum Security)

Block any introduction of vulnerabilities above a minimum severity level:

# Fail on any CRITICAL or HIGH severity
codeslick analyze --fail-on critical,high

When to use: Financial services, healthcare, government systems with strict compliance requirements (PCI-DSS, HIPAA, FedRAMP). Organizations that prioritize security over velocity.

Tradeoffs: High false-positive rate from SAST tools can block legitimate development. Requires security team availability to triage and approve exceptions.

2. Critical Only (Balanced)

Block only the most severe vulnerabilities that pose immediate exploitation risk:

# Fail only on CRITICAL severity
codeslick analyze --fail-on critical

When to use: Most organizations. Blocks SQL injection, command injection, hardcoded secrets, RCE vulnerabilities while allowing teams to address HIGH/MEDIUM findings through normal backlog prioritization.

Tradeoffs: HIGH-severity vulnerabilities (XSS, authentication bypass) may reach production if not addressed proactively.

3. Count-Based Thresholds

Fail when vulnerability count exceeds a threshold, preventing accumulation of technical security debt:

# Fail if 1+ CRITICAL or 3+ HIGH vulnerabilities
codeslick analyze \
  --fail-on-count critical:1,high:3

When to use: Legacy codebases with existing HIGH-severity findings that cannot be remediated immediately. Prevents new HIGH findings from accumulating while allowing gradual remediation.

Tradeoffs: Requires baseline measurement. Teams must track existing vulnerability counts and reduce them over time.
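To make the count semantics concrete, here is a minimal sketch of how a spec such as `critical:1,high:3` could be parsed and applied. It assumes the "N or more" reading given in the comment of the example above; how the CodeSlick CLI actually parses this flag is not documented here, and `parse_count_spec` and `exceeds` are hypothetical helpers.

```python
def parse_count_spec(spec):
    """Parse 'critical:1,high:3' into {'critical': 1, 'high': 3}."""
    limits = {}
    for part in spec.split(","):
        severity, _, raw = part.strip().partition(":")
        if not severity or not raw.isdigit() or int(raw) < 1:
            raise ValueError(f"invalid threshold entry: {part!r}")
        limits[severity.lower()] = int(raw)
    return limits


def exceeds(counts, limits):
    """True if any severity has at least as many findings as its limit."""
    return any(counts.get(sev, 0) >= limit for sev, limit in limits.items())


if __name__ == "__main__":
    limits = parse_count_spec("critical:1,high:3")
    print(exceeds({"critical": 0, "high": 2}, limits))  # below both limits
    print(exceeds({"critical": 0, "high": 3}, limits))  # reaches the HIGH limit
```

Rejecting malformed entries up front matters for a gate: a typo in the spec should surface as a configuration error rather than silently disabling a threshold.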

4. Delta Thresholds (New Vulnerabilities Only)

Fail only when new vulnerabilities are introduced by the current change, ignoring pre-existing issues:

# Fail if this PR introduces new CRITICAL or HIGH vulnerabilities
codeslick analyze \
  --baseline main \
  --fail-on-new critical,high

When to use: Brownfield projects with legacy security debt. Prevents making the problem worse while allowing time to address existing vulnerabilities.

Tradeoffs: Existing vulnerabilities remain unaddressed unless explicitly prioritized. Requires SARIF or SBOM diff comparison.
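The core of a delta threshold is a set difference between the findings on the baseline branch and the findings on the current change. The sketch below assumes each finding is a dictionary with `rule`, `file`, `severity`, and an optional `snippet` field; that shape and the `fingerprint` scheme are illustrative assumptions, not the format CodeSlick uses.

```python
def fingerprint(finding):
    """Identity of a finding across scans.

    The line number is deliberately left out so that unrelated edits which
    shift code up or down do not make an old finding look new.
    """
    return (finding["rule"], finding["file"], finding.get("snippet", "").strip())


def new_findings(baseline, current, severities=("critical", "high")):
    """Findings in `current` that are absent from `baseline` and gate-relevant."""
    known = {fingerprint(f) for f in baseline}
    return [
        f
        for f in current
        if f["severity"].lower() in severities and fingerprint(f) not in known
    ]


if __name__ == "__main__":
    old = {"rule": "sql-injection", "file": "a.js", "severity": "HIGH", "snippet": "q(x)"}
    new = {"rule": "hardcoded-secret", "file": "b.js", "severity": "CRITICAL", "snippet": "key="}
    introduced = new_findings([old], [old, new])
    print(len(introduced), "new blocking finding(s)")
```

The choice of fingerprint is the main design decision: too strict and every refactor resurfaces old findings, too loose and a genuinely new vulnerability in the same file can hide behind an existing one.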

5. Category-Based Thresholds

Different thresholds for different vulnerability categories based on organizational risk appetite:

# Strict on injection, lenient on info disclosure
codeslick analyze \
  --fail-on-category "injection:critical,high" \
  --fail-on-category "secrets:critical" \
  --warn-on-category "info-disclosure:high"

When to use: Organizations with differentiated risk tolerance. Injection and secrets are zero-tolerance; information disclosure is advisory.

Tradeoffs: Complex configuration. Requires security expertise to define appropriate category mappings.
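A category-based policy is essentially a lookup from (category, severity) to an action. The following sketch mirrors the three rules in the example above; the `FAIL_ON` and `WARN_ON` tables, the finding fields, and the function names are assumptions made for illustration.

```python
# Hypothetical rule tables mirroring the example flags above.
FAIL_ON = {"injection": {"critical", "high"}, "secrets": {"critical"}}
WARN_ON = {"info-disclosure": {"high"}}


def classify(finding):
    """Return 'fail', 'warn', or 'ignore' for a single finding."""
    category = finding["category"]
    severity = finding["severity"].lower()
    if severity in FAIL_ON.get(category, set()):
        return "fail"
    if severity in WARN_ON.get(category, set()):
        return "warn"
    return "ignore"


def gate(findings):
    """Return (exit_code, warnings): exit code 1 if any finding is blocking."""
    verdicts = [(classify(f), f) for f in findings]
    warnings = [f for verdict, f in verdicts if verdict == "warn"]
    blocked = any(verdict == "fail" for verdict, _ in verdicts)
    return (1 if blocked else 0, warnings)


if __name__ == "__main__":
    code, warnings = gate([
        {"category": "info-disclosure", "severity": "HIGH"},
        {"category": "injection", "severity": "HIGH"},
    ])
    print("exit", code, "with", len(warnings), "warning(s)")
```

Keeping the rules in data rather than in branching code makes the category mappings easier for a security team to review and change.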

Real-World Threshold Policies

Startup (Velocity Focused)

# Block only the worst vulnerabilities
fail_on:
  - CRITICAL (SQL injection, command injection, RCE)
  - Hardcoded secrets (API keys, passwords)

warn_on:
  - HIGH (XSS, authentication bypass)
  - MEDIUM (path traversal, insecure config)

Rationale: Early-stage startups prioritize velocity. Block vulnerabilities that enable immediate data breaches or credential theft. Treat HIGH/MEDIUM as backlog items.

Enterprise Financial Services (Compliance Focused)

# PCI-DSS Level 1 requirements
fail_on:
  - CRITICAL (all categories)
  - HIGH (injection, secrets, authentication)
  - Any SQL injection (regardless of severity)
  - Any hardcoded secrets (regardless of severity)

warn_on:
  - HIGH (other categories)
  - MEDIUM (all categories)

require_manual_approval:
  - Any changes to payment processing code
  - Any changes to authentication logic

Rationale: Regulatory compliance (PCI-DSS) mandates strict controls on vulnerabilities affecting cardholder data. SQL injection and authentication issues are zero-tolerance.

SaaS Product (Customer Trust Focused)

# Balance security and development velocity
fail_on:
  - CRITICAL (all categories)
  - HIGH (injection, secrets, authentication, XSS)

warn_on:
  - HIGH (other categories)
  - MEDIUM (all categories)

exceptions:
  - Allow HIGH findings with security team approval
  - Document exceptions in JIRA with remediation plan

Rationale: Customer trust is paramount. Block vulnerabilities that lead to account takeover or data breaches. Allow HIGH findings in less sensitive areas with documented approval.

Open Source Project (Community Trust Focused)

# Maintain community trust and security reputation
fail_on:
  - CRITICAL (all categories)
  - HIGH (injection, secrets)
  - AI-generated code with hallucinations (CVSS 8.5+)

warn_on:
  - HIGH (other categories)
  - MEDIUM (all categories)

public_disclosure:
  - Document security fixes in changelog
  - Credit security researchers in release notes

Rationale: Open-source projects depend on community trust. Proactive security prevents CVE disclosures that damage reputation. AI hallucination detection prevents runtime errors.

CLI Exit Codes and Automation

CodeSlick CLI Exit Codes

The CodeSlick CLI uses standard Unix exit codes to signal scan results to CI/CD systems:

  • Exit 0: No vulnerabilities above configured thresholds (build passes)
  • Exit 1: Vulnerabilities exceed thresholds (build fails)
  • Exit 2: Analysis error (tool failure, invalid configuration)

CI/CD systems interpret non-zero exit codes as build failures, automatically blocking merges or deployments.
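When a scan is launched from a wrapper script rather than directly from the pipeline, it helps to keep the three exit codes distinct so that a tool failure is not reported to developers as a vulnerability. A minimal sketch, assuming only the exit-code meanings listed above; `run_gate` and the outcome labels are hypothetical:

```python
import subprocess
import sys

# Exit-code meanings as listed above; any other value is treated as unknown.
OUTCOMES = {0: "pass", 1: "threshold_exceeded", 2: "analysis_error"}


def run_gate(command):
    """Run a scanner command and translate its exit code into an outcome."""
    completed = subprocess.run(command, check=False)
    return OUTCOMES.get(completed.returncode, "unknown_error")


if __name__ == "__main__":
    # Stand-in command that exits with code 1, used here instead of a real scan.
    fake_scan = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_gate(fake_scan))
```

In a real pipeline the command list would be the scanner invocation itself. Every outcome other than "pass" should still block the merge, so the gate fails closed when the tool breaks.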

Basic Threshold Configuration

# Fail on CRITICAL only
codeslick analyze --fail-on critical
echo $?  # Exit code: 0 (pass) or 1 (fail)

# Fail on CRITICAL or HIGH
codeslick analyze --fail-on critical,high

# Fail with count thresholds
codeslick analyze --fail-on-count critical:1,high:3,medium:10

Advanced Configuration File

Define thresholds in .codeslick.yml for consistent enforcement across the team:

# .codeslick.yml
thresholds:
  fail_on:
    - critical
    - high
  fail_on_count:
    medium: 10
    low: 50
  fail_on_category:
    - "injection:critical,high"
    - "secrets:critical,high,medium"
  ignore_paths:
    - "test/**"
    - "scripts/**"
  exceptions:
    - rule: "js-sql-injection"
      file: "src/legacy/report-generator.js"
      reason: "Legacy code, planned refactor Q2 2026"
      approved_by: "security@company.com"
      expires: "2026-06-30"
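The `ignore_paths` and `exceptions` entries imply two checks before a finding counts against a threshold: does its path match an ignored pattern, and is there an unexpired exception for that rule and file? The sketch below shows one plausible way to apply them. It is not CodeSlick's matching logic; in particular, Python's `fnmatch` lets `*` match across `/`, which only approximates glob semantics for patterns like `test/**`.

```python
from datetime import date
from fnmatch import fnmatch


def is_suppressed(finding, exceptions, ignore_paths, today):
    """True if the finding is in an ignored path or covered by a live exception."""
    if any(fnmatch(finding["file"], pattern) for pattern in ignore_paths):
        return True
    for exc in exceptions:
        same_target = exc["rule"] == finding["rule"] and exc["file"] == finding["file"]
        # An exception stays valid through its expiry date, then lapses.
        if same_target and today <= date.fromisoformat(exc["expires"]):
            return True
    return False


if __name__ == "__main__":
    exceptions = [{
        "rule": "js-sql-injection",
        "file": "src/legacy/report-generator.js",
        "expires": "2026-06-30",
    }]
    finding = {"rule": "js-sql-injection", "file": "src/legacy/report-generator.js"}
    print(is_suppressed(finding, exceptions, ["test/**"], date(2026, 7, 1)))
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the expiry behavior easy to test. Once the date passes, the finding counts against the threshold again without anyone having to remember to remove the exception.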

GitHub Actions Integration

name: Security Scan with Pass/Fail Thresholds

on:
  pull_request:
    branches: [main]

# Token scopes required by the steps below
permissions:
  contents: read          # check out the repository
  security-events: write  # upload SARIF to code scanning
  pull-requests: write    # comment on the pull request

jobs:
  security-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install CodeSlick CLI
        run: npm install -g codeslick-cli

      # A non-zero exit code from this step fails the job
      - name: Security scan with thresholds
        run: |
          codeslick analyze \
            --fail-on critical,high \
            --format sarif \
            --output results.sarif

      - name: Upload SARIF (even if scan fails)
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif

      - name: Comment on PR with results
        if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: 'Security scan failed: CRITICAL or HIGH vulnerabilities detected. Review the Security tab for details.'
            })
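Because the workflow writes its results as SARIF, a follow-up step or local script can summarize the file without any vendor-specific tooling. The sketch below counts results by their SARIF `level` field (`error`, `warning`, `note`, or `none`). How CodeSlick maps CRITICAL/HIGH/MEDIUM onto SARIF levels is not stated here, so the counts are reported by level only. As a simplification, a result without an explicit level is counted as `warning`, and any rule-level default configuration is ignored.

```python
import json
from collections import Counter


def count_sarif_levels(sarif_text):
    """Count results per SARIF level across all runs in a SARIF document."""
    doc = json.loads(sarif_text)
    counts = Counter()
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("level", "warning")] += 1
    return counts


if __name__ == "__main__":
    sample = json.dumps({
        "version": "2.1.0",
        "runs": [{"results": [
            {"ruleId": "sql-injection", "level": "error"},
            {"ruleId": "weak-hash", "level": "warning"},
            {"ruleId": "todo-comment"},
        ]}],
    })
    for level, n in sorted(count_sarif_levels(sample).items()):
        print(level, n)
```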

GitLab CI Integration

security-scan:
  stage: test
  image: node:18
  script:
    - npm install -g codeslick-cli
    - codeslick analyze --fail-on critical,high --output results.json
  artifacts:
    when: always
    reports:
      codequality: results.json
  only:
    - merge_requests
  allow_failure: false  # Block merge if scan fails

Jenkins Pipeline Integration

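pipeline {
  agent any
  stages {
    stage('Security Scan') {
      steps {
        sh 'npm install -g codeslick-cli'
        script {
          // returnStatus keeps the step from aborting so the exit code can be inspected
          def exitCode = sh(
            script: 'codeslick analyze --fail-on critical,high --output codeslick-report.json',
            returnStatus: true
          )
          if (exitCode == 1) {
            error("Security scan failed: vulnerabilities exceed thresholds")
          } else if (exitCode != 0) {
            error("Security scan could not complete (exit code ${exitCode})")
          }
        }
      }
    }
  }
  post {
    always {
      // Tolerate a missing report, e.g. when the scanner itself failed to run
      archiveArtifacts artifacts: 'codeslick-report.json', allowEmptyArchive: true
    }
  }
}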

GitHub PR Status Checks

How GitHub Status Checks Work

When the CodeSlick GitHub App scans a pull request, it posts a status check indicating pass or fail based on the configured thresholds. Status checks appear at the bottom of the PR and prevent merging when they fail.

Status Check Display

CodeSlick Security Scan — Failing
  12 vulnerabilities detected (2 CRITICAL, 5 HIGH, 5 MEDIUM)
  Threshold policy: Fail on CRITICAL or HIGH
  Details: View in Security tab

The status check links to the GitHub Security tab with full vulnerability details, or to the CodeSlick dashboard for team-level analytics.

Branch Protection Rules

Enable branch protection to enforce threshold-based status checks:

Repository Settings → Branches → Branch protection rules

✓ Require status checks to pass before merging
  ✓ CodeSlick Security Scan
✓ Require branches to be up to date before merging

With branch protection enabled, developers cannot merge PRs until the CodeSlick status check passes. The "Merge" button is disabled until vulnerabilities are remediated.

Configuring GitHub App Thresholds

Navigate to CodeSlick dashboard → Team Settings → Security Policies:

Pass/Fail Policy:
  ○ Advisory only (report findings, do not block merges)
  ● Fail on CRITICAL
  ○ Fail on CRITICAL or HIGH
  ○ Custom policy

Custom Policy:
  Fail if:
    - Any CRITICAL vulnerabilities
    - 3 or more HIGH vulnerabilities
    - Any SQL injection or command injection (regardless of severity)
    - Hardcoded secrets detected

  Ignore:
    - Test files (test/**, **/*.test.js)
    - Scripts (scripts/**)

  Exceptions:
    - [Add file/rule exceptions with expiration dates]

Status Check Override

For urgent hotfixes, repository admins can override failed status checks with documented justification:

Override security check — Reason required
  Reason: Hotfix for P0 production incident
  Approved by: security-team@company.com
  Remediation plan: Create JIRA SEC-1234 for follow-up fix

[Override and Merge]

Overrides are logged in the CodeSlick audit trail for compliance reviews.

Threshold Metrics and Refinement

Measuring Threshold Effectiveness

Track these metrics to optimize threshold policies:

1. Build Failure Rate

Percentage of builds that fail due to threshold violations:

  • Too high (30%+): Thresholds are too strict or the false-positive rate is excessive. Developers route around checks or request frequent overrides.
  • Too low (under 5%): Thresholds are too lenient. Vulnerabilities are reaching production.
  • Target (10-20%): Catching real issues without excessive friction.
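The bands above translate directly into a small classifier that can run against build history. This is a sketch under two stated assumptions: the 5% figure is treated as "below 5%", and rates that fall between the named bands (5-10% and 20-30%), which are not labeled above, are reported as "monitor". The function names are hypothetical.

```python
def failure_rate(total_builds, threshold_failures):
    """Percentage of builds that failed because of threshold violations."""
    if total_builds <= 0:
        raise ValueError("total_builds must be positive")
    return 100.0 * threshold_failures / total_builds


def classify_failure_rate(rate_percent):
    """Map a failure rate to one of the bands described above."""
    if rate_percent >= 30:
        return "too strict"
    if rate_percent < 5:
        return "too lenient"
    if 10 <= rate_percent <= 20:
        return "on target"
    return "monitor"  # between the named bands: not alarming, but worth watching


if __name__ == "__main__":
    rate = failure_rate(total_builds=200, threshold_failures=36)
    print(f"{rate:.1f}% -> {classify_failure_rate(rate)}")
```

Only failures caused by threshold violations should be counted. Builds that failed because the scanner itself errored belong in a separate tool-reliability metric.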

2. Mean Time to Remediation (MTTR)

Time from vulnerability detection to fix:

  • CRITICAL: Target < 24 hours (immediate fix)
  • HIGH: Target < 1 week (sprint priority)
  • MEDIUM: Target < 1 month (backlog prioritization)

If MTTR exceeds targets, thresholds may be too strict for the team's security maturity.
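MTTR per severity can be computed from detection and fix timestamps and compared with the targets above. In this sketch the record shape `(severity, detected_at, fixed_at)` is an assumption, "1 month" is approximated as 30 days, and `mttr_report` is a hypothetical helper rather than part of any CodeSlick API.

```python
from datetime import datetime, timedelta
from statistics import mean

# Targets from the list above; "1 month" is approximated as 30 days.
TARGETS = {
    "CRITICAL": timedelta(hours=24),
    "HIGH": timedelta(weeks=1),
    "MEDIUM": timedelta(days=30),
}


def mttr_report(records):
    """Map each severity to (mean time to remediation, within_target)."""
    durations = {}
    for severity, detected_at, fixed_at in records:
        elapsed = (fixed_at - detected_at).total_seconds()
        durations.setdefault(severity, []).append(elapsed)
    report = {}
    for severity, seconds in durations.items():
        mttr = timedelta(seconds=mean(seconds))
        target = TARGETS.get(severity)
        # Severities without a target are never flagged as late.
        report[severity] = (mttr, target is None or mttr < target)
    return report


if __name__ == "__main__":
    start = datetime(2026, 1, 5, 9, 0)
    records = [
        ("CRITICAL", start, start + timedelta(hours=12)),
        ("CRITICAL", start, start + timedelta(hours=24)),
        ("HIGH", start, start + timedelta(days=10)),
    ]
    for severity, (mttr, ok) in mttr_report(records).items():
        print(severity, mttr, "within target" if ok else "over target")
```

A mean can hide a few very slow fixes behind many fast ones, so a median or a 90th percentile may be worth tracking alongside it.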

3. Override Frequency

Number of threshold overrides (admin bypass of security gates):

  • High override rate (10%+): Thresholds misaligned with business needs. Indicates process problem.
  • Zero overrides: Good sign, but verify developers are not working around checks.

4. Production Vulnerability Escape Rate

Vulnerabilities detected in production that passed CI/CD checks:

  • Goal (<5%): Most vulnerabilities are caught in CI/CD
  • Escape analysis: Were they false negatives (tool missed them) or threshold gaps (below severity threshold)?

Refining Thresholds Over Time

Start with a narrow policy and tighten it incrementally based on data:

Quarter 1: Fail on CRITICAL only
  - Establish baseline, build muscle memory
  - MTTR: 18 hours (within target)
  - Failure rate: 8% (acceptable)

Quarter 2: Add HIGH for injection and secrets
  - Expand to highest-risk categories
  - MTTR: 2.5 days (above target, needs improvement)
  - Failure rate: 15% (acceptable, within range)

Quarter 3: Add HIGH for all categories
  - Full HIGH coverage
  - MTTR: 1.8 days (improved, within target)
  - Failure rate: 22% (slightly high, monitor)

Quarter 4: Add count threshold for MEDIUM (10+)
  - Prevent accumulation of security debt
  - MTTR: 1.2 days (excellent)
  - Failure rate: 18% (target achieved)

Avoid changing thresholds based on individual incidents. Use quarterly reviews with aggregate metrics.
