Boundary note: The Security Architecture KRI domain measures whether your environment is configured to use current cryptographic standards: TLS versions, cipher suites, key lengths, and algorithm selection. This file measures something different: whether your cryptographic infrastructure is operationally sound. That means whether certificates are discovered and managed before they expire, whether keys are rotated and governed, whether your CA hierarchy is healthy, and whether your organization is positioned for the post-quantum transition. In short: design intent vs. operational execution.
The KRIs in this domain measure the operational health of your cryptographic program: not the algorithms themselves (covered in Security Architecture), but the management, lifecycle, and governance layer that keeps those algorithms working safely in production. For most organizations, this is the least instrumented security domain. It shouldn't be.
If you are standing this up from scratch, start with how to build a KRI program and the consolidated KRI reference library, which maps every domain to one CIS-aligned catalog.
KRI inventory
1. Certificate expiry exposure rate
What to measure. Percentage of active certificates in the environment that are within 30 days of expiration and have not been queued for automated renewal, across internal PKI, public CA-issued, and code signing certificates.
Why it matters. Certificate expiration is one of the most avoidable operational security failures in enterprise environments, and one of the most common. When certificates expire, services break immediately and visibly. The downstream effects include application downtime, broken API integrations, failed CI/CD pipelines, and in some cases customers receiving certificate errors that trigger security alerts. Beyond availability, expired certificates can create windows where services fall back to insecure communication modes. The $2M+ Microsoft Teams outage in 2020 was certificate-related. Many organizations still discover certificates only through the outages they cause.
- Certificate management platforms (Venafi, DigiCert CertCentral, Sectigo, AppViewX): expiry tracking and renewal queue status across managed certificates
- ACME/Let's Encrypt auto-renewal status: renewal success/failure logs, failed renewal alerts
- Cloud provider certificate managers (AWS ACM, Azure Key Vault, GCP Certificate Manager): certificate expiry dashboard and renewal automation status
- Active Directory Certificate Services (ADCS): internal CA-issued certificate inventory and expiry reporting
- Certificate discovery scanning (Censys, Shodan, SSL Labs, internal scanners like Nmap with the ssl-cert script): discovery of certificates not enrolled in management platforms. These are the dangerous unknowns
- CI/CD systems: code signing certificate expiry alerts, pipeline signing key rotation status
How to calculate. (Certificates expiring within 30 days with no renewal queued) ÷ (total active certificates in inventory) × 100. Track separately: internet-facing certificates, internal service certificates, code signing certificates, client authentication certificates.
| Status | Criteria |
|---|---|
| Green | <1% of certificates in expiry window with no renewal queued; automated renewal in place for >90% of certificates; certificate discovery scan covers full environment; no unmanaged certificates in production |
| Amber | 1–5% in expiry window unqueued; or >10% of certificates unmanaged (not enrolled in a management platform); or certificate discovery not run in past 90 days |
| Red | >5% in expiry window unqueued; or a certificate expiration has caused a production outage in the past 12 months; or no certificate management platform in use; or significant dark inventory (certificates in production not tracked anywhere) |
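As a rough illustration of the calculation above, the sketch below computes the exposure rate from an inventory export. The `Cert` record, its field names, and the `band` helper are assumptions made for this example, not the schema of any particular platform. It also assumes "active" means not yet expired, and it models only the percentage criterion from the table, not the other conditions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Cert:
    # Hypothetical inventory record; field names are illustrative only.
    common_name: str
    not_after: date
    renewal_queued: bool


def expiry_exposure_rate(certs, today, window_days=30):
    """Percent of active certs expiring within the window with no renewal queued."""
    active = [c for c in certs if c.not_after >= today]  # assumption: expired certs are not "active"
    if not active:
        return 0.0
    cutoff = today + timedelta(days=window_days)
    exposed = [c for c in active if c.not_after <= cutoff and not c.renewal_queued]
    return 100.0 * len(exposed) / len(active)


def band(rate):
    """Apply only the percentage thresholds from the table."""
    if rate < 1:
        return "Green"
    if rate <= 5:
        return "Amber"
    return "Red"


today = date(2025, 1, 1)
inventory = [
    Cert("api.example.com", date(2025, 1, 10), renewal_queued=False),      # in window, unqueued
    Cert("www.example.com", date(2025, 1, 20), renewal_queued=True),       # in window, queued
    Cert("db.example.internal", date(2025, 6, 1), renewal_queued=False),   # outside window
    Cert("old.example.com", date(2024, 12, 1), renewal_queued=False),      # already expired
]
rate = expiry_exposure_rate(inventory, today)
print(f"{rate:.1f}% -> {band(rate)}")
```

Running the same function once per certificate class (internet-facing, internal, code signing, client authentication) gives the separate tracking the calculation calls for.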
2. Certificate discovery coverage rate
What to measure. Percentage of the organization's total certificate footprint that is enrolled in a certificate management platform vs. discovered only through scanning, measuring the "dark certificate" problem: certificates issued and deployed without going through managed processes.
Why it matters. Most organizations have a managed certificate population (the ones IT knows about) and an unmanaged certificate population (the ones developers self-issued, the ones a vendor installed, the ones a project team obtained on a corporate card three years ago). Unmanaged certificates are the ones that expire without warning because no one is watching them. They're also the ones most likely to use deprecated algorithms, be issued from unauthorized CAs, or have been issued to the wrong entity. Certificate sprawl in cloud environments and microservices architectures is severe: a medium-sized organization routinely has thousands of certificates it doesn't know about.
- Certificate management platform enrollment count vs. environment scanning results: the gap is your dark inventory
- Cloud provider certificate discovery: AWS ACM, Azure Key Vault, GCP Certificate Manager inventories vs. what's actually deployed in load balancers, API gateways, and compute instances
- DNS enumeration + certificate transparency (CT) log monitoring (crt.sh, Facebook CT Search, Google Transparency Report): CT logs record every publicly trusted certificate issued for your domains; anything in CT logs not in your management platform is dark inventory
- Internal network certificate scanning: Nmap, Masscan, or dedicated certificate scanners across RFC 1918 address space, to find internal services with unmanaged certificates
- Kubernetes cluster scanning: certificates in ingress controllers, service meshes (Istio, Linkerd), and pod-to-pod mTLS configurations
How to calculate. (Certificates enrolled in management platform) ÷ (certificates discovered in environment through all scanning methods) × 100. The denominator counts every certificate found by any method, including those already enrolled, so it should always be the larger number. A 100% rate usually means your discovery isn't working.
| Status | Criteria |
|---|---|
| Green | >85% of discovered certificates enrolled in management platform; CT log monitoring active for all owned domains; cloud certificate inventory reconciled monthly; new certificate issuance requires management platform enrollment |
| Amber | 70–85% enrolled; or CT log monitoring not active; or last discovery scan >90 days ago; or no formal certificate issuance policy |
| Red | <70% enrolled; or no active discovery program; or unknown certificate footprint; or certificate-related outages attributed to unmanaged certificates |
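The coverage calculation can be sketched as a set comparison. The example below assumes certificates are identified by a fingerprint string and that the denominator is the union of the platform inventory and every scan result; the function names and sample fingerprints are hypothetical.

```python
def discovery_coverage(enrolled, discovered):
    """Return (coverage percent, dark inventory) from two sets of fingerprints."""
    footprint = enrolled | discovered      # everything known by any method
    if not footprint:
        return None, set()                 # nothing known: the rate is undefined
    dark = discovered - enrolled           # found by scanning, absent from the platform
    return 100.0 * len(enrolled) / len(footprint), dark


def band(rate):
    """Apply only the enrollment thresholds from the table."""
    if rate > 85:
        return "Green"
    if rate >= 70:
        return "Amber"
    return "Red"


enrolled = {"aa01", "bb02", "cc03", "dd04"}      # platform inventory
discovered = {"aa01", "bb02", "cc03", "ee05"}    # union of all scan results
rate, dark = discovery_coverage(enrolled, discovered)
print(f"{rate:.1f}% enrolled ({band(rate)}), dark inventory: {sorted(dark)}")
```

Keying on fingerprints rather than common names avoids treating a renewed certificate and its predecessor as the same object.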
3. Certificate authority (CA) infrastructure health score
What to measure. Operational integrity of the organization's public key infrastructure hierarchy, including root CA offline status, intermediate CA certificate validity and configuration, CRL/OCSP availability, CA policy compliance, and unauthorized CA issuance detection.
Why it matters. Your CA hierarchy is the trust anchor for everything in your environment. A compromised or misconfigured CA doesn't just affect one certificate; it undermines the validity of every certificate that CA has ever issued. The DigiNotar breach in 2011 destroyed the company after its CA was compromised and used to issue fraudulent certificates for major domains. Internal PKI failures are equally damaging: a root CA that comes online when it shouldn't, an intermediate CA with an overly broad issuance policy, or a CRL that goes offline all create cascading trust failures across the environment.
- Root CA audit logs: when the root CA was powered on, what was issued, and who was present. Root CAs should be offline except for controlled issuance ceremonies
- Intermediate CA configuration review: validity period, path length constraints, name constraints, key usage extensions. Are they appropriately scoped?
- CRL/OCSP infrastructure health monitoring: revocation service uptime, certificate validity, response time. Clients that can't reach revocation services will either fail open or hard-fail, depending on configuration
- ADCS audit: issuance policy review, template configuration, who can issue what types of certificates, enrollment agent controls
- Certificate transparency monitoring for unexpected issuance: if a certificate appears in CT logs for your domain that your CA didn't issue, you have a problem
- Hardware Security Module (HSM) status: CA private keys should be HSM-protected; HSM health, backup status, and key ceremony procedures
How to calculate. Composite score across: (1) root CA offline status compliance, (2) intermediate CA configuration compliance, (3) CRL/OCSP availability %, (4) no unauthorized issuance in CT log monitoring, (5) HSM protection for all CA private keys. Each element scores pass/fail; composite = (passing elements) ÷ 5.
| Status | Criteria |
|---|---|
| Green | All five elements passing; root CA offline and access-logged; HSM-protected for all CA private keys; CRL/OCSP availability >99.9%; no unauthorized issuance detected; CA policy reviewed in past 12 months |
| Amber | One element failing; or CRL/OCSP availability 99.0–99.9%; or CA policy not reviewed in 12+ months; or HSM backup not tested in past 12 months |
| Red | Two or more elements failing; or root CA online without a logged ceremony; or CA private key not HSM-protected; or unauthorized certificate issuance detected; or CRL/OCSP outage affecting client authentication |
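One way to turn the five pass/fail elements into a score and a status is sketched below. The element names are made up for this example. Treating a root CA, HSM, or unauthorized-issuance failure as Red on its own follows the Red row of the table; treating revocation availability below 99.0% as Red is an assumption, since the table does not assign that range.

```python
def ca_health(root_ca_offline, intermediate_compliant, revocation_availability_pct,
              no_unauthorized_issuance, ca_keys_in_hsm):
    """Return (composite score, status, failing elements) for the five CA checks."""
    elements = {
        "root_ca_offline": root_ca_offline,
        "intermediate_compliant": intermediate_compliant,
        "revocation_available": revocation_availability_pct > 99.9,
        "no_unauthorized_issuance": no_unauthorized_issuance,
        "ca_keys_in_hsm": ca_keys_in_hsm,
    }
    failing = [name for name, ok in elements.items() if not ok]
    score = (len(elements) - len(failing)) / len(elements)

    # Failures that the table lists as Red even when nothing else is failing.
    auto_red = {"root_ca_offline", "no_unauthorized_issuance", "ca_keys_in_hsm"}
    if len(failing) >= 2 or auto_red & set(failing) or revocation_availability_pct < 99.0:
        status = "Red"
    elif failing:
        status = "Amber"
    else:
        status = "Green"
    return score, status, failing


print(ca_health(True, True, 99.95, True, True))    # all five passing
print(ca_health(True, False, 99.95, True, True))   # intermediate CA misconfigured
print(ca_health(True, True, 99.95, True, False))   # CA key outside an HSM
```

The policy-review and HSM-backup-test conditions from the table are left out here; they would be additional inputs rather than part of the five-element composite.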
4. Cryptographic key rotation compliance rate
What to measure. Percentage of active cryptographic keys (signing keys, encryption keys, API keys used in cryptographic contexts, TLS private keys, symmetric encryption keys) that have been rotated within policy-defined intervals, and the percentage of key rotation events that were automated vs. manual.
Why it matters. Key rotation limits the blast radius of key compromise. A private key that has never been rotated has existed since the day it was created, and every day it exists is another day it could be stolen, leaked, or brute-forced. Signing keys for code, tokens, or certificates that are never rotated create single points of long-term trust: if the key is ever compromised, every artifact it ever signed is suspect. Beyond compromise risk, key rotation is a regulatory requirement in PCI DSS (keys protecting cardholder data must be changed at the end of their defined cryptoperiod, which many organizations set to one year) and is increasingly expected by cyber insurance underwriters.
- Secrets management platform (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, GCP Secret Manager): key creation date, last rotation date, rotation policy configuration, automated rotation status
- KMS (AWS KMS, Azure Key Vault, GCP Cloud KMS): key rotation status, automatic rotation enabled/disabled per key, key age reporting
- JWT signing key rotation (authentication platforms such as Auth0, Okta, Azure AD B2C): signing key age and rotation cadence for access tokens and refresh tokens
- Code signing certificate rotation: how old is the current code signing key? When was it last rotated?
- SSH key inventory: last rotation date for infrastructure SSH keys, developer SSH keys; age distribution of keys in authorized_keys files
- Database encryption key rotation: column-level encryption keys, TDE (Transparent Data Encryption) key rotation cadence
How to calculate. (Keys rotated within policy interval) ÷ (total keys requiring rotation per policy) × 100. Track automation rate separately: (auto-rotated keys) ÷ (total keys requiring rotation) × 100.
| Status | Criteria |
|---|---|
| Green | >95% of keys rotated within policy interval; >80% of rotations automated (not manual); rotation policy defined for all key types; secrets management platform with rotation enforcement in use |
| Amber | 80–95% within interval; or <50% automated rotation; or rotation policy undefined for some key types; or manual rotation with no tracking mechanism |
| Red | <80% within interval; or keys with no rotation history since creation; or no secrets management platform; or rotation failures causing application breaks; or a key compromise detected where rotation would have limited impact |
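The two rates can be computed together from a key inventory, as in the sketch below. The `Key` record, the key type names, and the intervals in `POLICY_DAYS` are hypothetical; substitute the intervals your own policy defines. Keys of a type with no defined interval are excluded from the denominator here, which is itself a finding under the Amber criteria.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Key:
    # Hypothetical inventory record; field names are illustrative only.
    name: str
    key_type: str
    last_rotated: date     # use the creation date for keys that were never rotated
    auto_rotated: bool


# Example policy intervals in days; these values are placeholders, not recommendations.
POLICY_DAYS = {"kms": 365, "jwt_signing": 90, "ssh": 180}


def rotation_kris(keys, today, policy_days):
    """Return (compliance percent, automation percent) for keys covered by policy."""
    in_scope = [k for k in keys if k.key_type in policy_days]
    if not in_scope:
        return None, None
    compliant = sum((today - k.last_rotated).days <= policy_days[k.key_type] for k in in_scope)
    automated = sum(k.auto_rotated for k in in_scope)
    return 100.0 * compliant / len(in_scope), 100.0 * automated / len(in_scope)


today = date(2025, 1, 1)
keys = [
    Key("payments-cmk", "kms", date(2024, 6, 1), auto_rotated=True),
    Key("token-signer", "jwt_signing", date(2024, 11, 15), auto_rotated=True),
    Key("bastion-root", "ssh", date(2023, 1, 1), auto_rotated=False),    # overdue
    Key("deploy-bot", "ssh", date(2024, 10, 1), auto_rotated=False),
]
compliance, automation = rotation_kris(keys, today, POLICY_DAYS)
print(f"within interval: {compliance:.1f}%, automated: {automation:.1f}%")
```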
5. Secrets exposure rate in code and infrastructure
What to measure. Rate at which cryptographic keys, API credentials, certificates, and secrets are discovered in unauthorized locations: source code repositories, CI/CD pipeline logs, container images, infrastructure-as-code files, configuration management systems, and document stores.
Why it matters. Secrets exposure in code is endemic and severely underreported. The 2023 GitHub Secret Scanning research found millions of secrets exposed in public repositories. Internal repositories are not immune: developer habits of hardcoding credentials, committing .env files, or leaving secrets in CI/CD environment variable logs create persistent exposure windows. A private key committed to a git repository should be treated as compromised from the moment of the commit, even if the file was later deleted, because it remains retrievable from git history. This KRI measures the rate at which secrets are finding their way into places they shouldn't be.
- GitHub Advanced Security Secret Scanning / GitLab Secret Detection: detected secrets in repositories, alert volume, time-to-remediation per alert
- Third-party secret scanning (Trufflehog, Gitleaks, Detect-Secrets, Orca, Wiz): historical repository scanning, pre-commit hooks, CI/CD pipeline scanning
- Container image scanning: secrets embedded in Docker layers, environment variables baked into images (Trivy, Grype, Snyk Container)
- CI/CD log scanning: secrets appearing in build/deploy logs (many CI systems will redact known patterns, but custom secrets or misconfigured masking leaks them)
- Cloud storage scanning: S3 buckets, Azure Blob Storage, GCS buckets scanned for credential files, .env files, private keys (Macie, Defender for Storage, DLP)
- IaC scanning (Checkov, tfsec, KICS): Terraform state files, CloudFormation templates, Ansible playbooks with embedded secrets
How to calculate. (Secrets exposure incidents per quarter); track absolute count and trend. Track by type: code repository secrets, CI/CD secrets, container secrets, cloud storage secrets. Track time-to-remediation: how quickly are detected secrets rotated after detection?
| Status | Criteria |
|---|---|
| Green | Secret scanning active in all repositories and CI/CD pipelines; pre-commit hooks deployed to developer workstations; <5 new exposures per quarter; mean time to rotate after detection <4 hours; no secrets in container images or IaC templates |
| Amber | Secret scanning active but not comprehensive; 5–20 exposures per quarter; or mean time to rotate >24 hours; or container images not scanned; or git history scans not performed |
| Red | >20 exposures per quarter; or no active secret scanning; or secrets found in public repositories; or rotation not performed after detection; or a secrets exposure has led to an incident |
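A minimal sketch of the quarterly count and mean time to rotate follows, assuming each finding is a pair of detection and rotation timestamps exported from a scanner. The table leaves a mean time to rotate between 4 and 24 hours unassigned; this sketch treats anything at or above 4 hours as Amber, and any finding with no rotation recorded as Red. Both choices are assumptions, and the scanning-coverage criteria are not modeled.

```python
from datetime import datetime
from statistics import mean


def secrets_kri(findings):
    """findings: list of (detected_at, rotated_at or None) for one quarter."""
    count = len(findings)
    unrotated = sum(1 for _, rotated in findings if rotated is None)
    hours = [(rotated - detected).total_seconds() / 3600
             for detected, rotated in findings if rotated is not None]
    mean_hours = mean(hours) if hours else None
    return count, unrotated, mean_hours


def band(count, unrotated, mean_hours):
    """Apply the count and time-to-rotate thresholds (see assumptions above)."""
    if count > 20 or unrotated > 0:
        return "Red"
    if count >= 5 or (mean_hours is not None and mean_hours >= 4):
        return "Amber"
    return "Green"


findings = [
    (datetime(2025, 1, 5, 9, 0), datetime(2025, 1, 5, 11, 0)),     # rotated after 2 hours
    (datetime(2025, 2, 10, 14, 0), datetime(2025, 2, 10, 15, 0)),  # rotated after 1 hour
    (datetime(2025, 3, 1, 8, 0), datetime(2025, 3, 1, 11, 0)),     # rotated after 3 hours
]
count, unrotated, mean_hours = secrets_kri(findings)
print(count, unrotated, mean_hours, band(count, unrotated, mean_hours))
```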
6. Post-quantum cryptography readiness score
What to measure. Progress toward an organization's ability to migrate cryptographic implementations to quantum-resistant algorithms, including cryptographic inventory completeness, algorithm agility assessment, and active migration planning against NIST PQC standards (CRYSTALS-Kyber/ML-KEM, CRYSTALS-Dilithium/ML-DSA, SPHINCS+/SLH-DSA).
Why it matters. NIST finalized its first post-quantum cryptography (PQC) standards in August 2024. RSA and elliptic curve cryptography, the foundation of TLS, code signing, and PKI as implemented today, are vulnerable to quantum attack. The NSA's CNSA 2.0 suite sets phased deadlines for national security systems to adopt PQC algorithms, with the first system categories expected to use them exclusively by 2030. "Harvest now, decrypt later" collection is widely assumed to be underway already: adversaries capture encrypted traffic today with the intent to decrypt it once quantum capability arrives. Organizations with long data retention obligations (healthcare records, financial data, government data) face the earliest risk. Getting from "we should look into this" to "our critical systems are quantum-resistant" takes years.
- Cryptographic inventory audit: what algorithms are in use, where, and in what systems. This is the foundation of PQC readiness, and most organizations don't have it
- TLS deployment assessment: key exchange and cipher suite negotiation on all internet-facing and internal services, identifying which negotiate post-quantum hybrid key exchange (X25519MLKEM768) and which are classical-only
- Certificate infrastructure: CA algorithm types; timeline for migration from RSA/ECC to PQC signing algorithms
- Code signing: current algorithm type; migration roadmap
- VPN and remote access infrastructure: which products have PQC roadmaps, which are already shipping PQC support
- CISA/NSA PQC migration guidance: CISA Post-Quantum Cryptography Initiative roadmap as reference framework
- Vendor dependency mapping: which vendors your cryptographic dependencies rely on, and whether they have published PQC roadmaps
Maturity scoring (1 to 5 scale).
- 1. Unaware: No organizational awareness of PQC requirements; no inventory begun
- 2. Inventorying: Cryptographic inventory underway; executive awareness established; NIST standards reviewed
- 3. Assessed: Full cryptographic inventory complete; algorithm agility assessment done; critical systems identified; migration budget requested
- 4. Planning: PQC migration roadmap approved and funded; pilot implementations underway in non-critical systems; vendor roadmaps reviewed and incorporated
- 5. Migrating: Active migration of critical systems; hybrid classical/PQC deployments in production; timeline to full migration defined
Target by vertical.
- Regulated/government: Score 4+ by 2026, Score 5 migration in progress by 2028
- Healthcare / long-retention data: Score 3+ now; begin migration planning immediately
- Commercial: Score 3+ by 2026; migration roadmap by 2027
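The maturity scale can be approximated as a ladder of milestones where each level only counts if every level below it is also met. In the sketch below, each level is condensed to a single flag, and the milestone names are invented for illustration; a real assessment would check the full criteria for each level.

```python
# One hypothetical milestone flag per level above 1 (level 1 is the default).
LEVELS = [
    (2, "inventory_started"),
    (3, "inventory_complete"),
    (4, "roadmap_funded"),
    (5, "critical_systems_migrating"),
]


def pqc_maturity(milestones):
    """Return the highest level reached without skipping any lower level."""
    score = 1
    for level, milestone in LEVELS:
        if not milestones.get(milestone, False):
            break              # a missing milestone caps the score here
        score = level
    return score


def gap_to_target(score, target):
    """Levels still to climb to reach the target for your vertical."""
    return max(0, target - score)


status = {
    "inventory_started": True,
    "inventory_complete": True,
    "roadmap_funded": False,
    "critical_systems_migrating": True,   # does not count: level 4 is not met
}
score = pqc_maturity(status)
print(score, gap_to_target(score, target=4))
```

Capping the score at the first missing milestone keeps a pilot migration from masking an incomplete inventory or an unfunded roadmap.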
7. Certificate revocation effectiveness rate
What to measure. Percentage of certificate revocation actions that are completed and propagated within defined SLAs, covering time from revocation trigger (compromise, key loss, employee departure, vendor change) to confirmed revocation across all relying parties, plus the operational health of CRL and OCSP infrastructure.
Why it matters. Revocation is PKI's emergency stop button. When a private key is compromised, a certificate is mis-issued, or an employee with a client certificate leaves the organization, revocation is the mechanism that should prevent that certificate from continuing to work. In practice, revocation is PKI's most frequently broken component. OCSP stapling is inconsistently deployed. CRL distribution points go offline. Client applications check revocation inconsistently. Browser vendors have largely moved to pushed revocation lists (CRLSets in Chrome, CRLite and OneCRL in Firefox) for public trust certificates because online revocation checking is unreliable. This KRI measures whether your revocation program actually works, not just whether revocation is technically configured.
- Revocation SLA audit: when a revocation was triggered, how long until the certificate was confirmed invalid across relying parties
- CRL publication frequency and distribution point health: how often are CRLs published vs. policy; are CRL distribution points available and returning current CRLs
- OCSP responder health: availability monitoring, response time, response validity period. OCSP responses that are too old are functionally equivalent to no revocation checking
- OCSP stapling deployment rate: percentage of servers with TLS certificates where OCSP stapling is configured and functioning, which reduces the client-side revocation checking burden
- ADCS (internal PKI): revocation log review, showing when revocations were triggered for terminated employees, compromised systems, or decommissioned servers
- Revocation trigger tracking: was there a defined process that triggered revocation, or were revocations discovered reactively
How to calculate. (Revocations completed within SLA) ÷ (total revocations triggered in period) × 100. Define SLA tiers: compromise events = <4 hours; routine (employee departure, system decommission) = <24 hours.
| Status | Criteria |
|---|---|
| Green | >95% of revocations within tier SLA; CRL availability >99.9%; OCSP stapling deployed on >80% of internet-facing servers; revocation process documented and tested; no revocation failures caused by CRL/OCSP outages |
| Amber | 85–95% within SLA; or CRL availability 99.0–99.9%; or OCSP stapling <50% deployment; or revocation process not formally tested |
| Red | <85% within SLA; or CRL/OCSP outages affecting revocation; or revocations discovered reactively (not triggered by a defined process); or a compromised certificate remained valid past a 24-hour window |
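The tiered SLA calculation above might look like the following sketch. The event tuple layout is an assumption made for this example, and a revocation with no confirmation recorded is counted as a miss, which is a simplification for revocations still in flight.

```python
from datetime import datetime, timedelta

# SLA tiers from the calculation above.
SLA = {"compromise": timedelta(hours=4), "routine": timedelta(hours=24)}


def revocation_sla_rate(events):
    """events: list of (tier, triggered_at, confirmed_at or None)."""
    if not events:
        return None    # no revocations in the period: the rate is undefined
    met = 0
    for tier, triggered_at, confirmed_at in events:
        # Unconfirmed revocations count as misses (simplifying assumption).
        if confirmed_at is not None and confirmed_at - triggered_at < SLA[tier]:
            met += 1
    return 100.0 * met / len(events)


events = [
    ("compromise", datetime(2025, 3, 1, 10, 0), datetime(2025, 3, 1, 12, 30)),  # 2.5 h, met
    ("compromise", datetime(2025, 3, 2, 9, 0), datetime(2025, 3, 2, 15, 0)),    # 6 h, missed
    ("routine", datetime(2025, 3, 3, 9, 0), datetime(2025, 3, 3, 20, 0)),       # 11 h, met
    ("routine", datetime(2025, 3, 4, 9, 0), None),                              # never confirmed
]
rate = revocation_sla_rate(events)
print(f"{rate:.1f}% of revocations within SLA")
```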
Deriving these KRIs by source type
Certificate Management Platforms
# Venafi: export the certificate inventory, then list unqueued expiries and unmanaged certificates
venafi-cli get certificates --format csv --fields "CommonName,ExpirationDate,IssuingCA,ManagedStatus,AutoRenew"
venafi-cli get certificates --filter "ExpirationDate LT +30d AND AutoRenew=false"
venafi-cli get certificates --filter "ManagedStatus=Unmanaged"
# AWS ACM: list certificates, filter by expiry, check renewal eligibility
aws acm list-certificates --query 'CertificateSummaryList[*].[DomainName,CertificateArn,Status]'
# The date literal is a placeholder; replace it with the date 30 days from today
aws acm list-certificates \
  --query 'CertificateSummaryList[?NotAfter<`2024-02-01`]'
aws acm describe-certificate --certificate-arn <arn> \
  --query 'Certificate.RenewalEligibility'
# Azure Key Vault: list certificates with expiry dates, then show the default policy
az keyvault certificate list --vault-name <vault> \
  --query '[*].{Name:name,Expires:attributes.expires}' -o table
az keyvault certificate get-default-policy
Certificate Transparency Monitoring
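# All CT-logged certificates for the domain (%25 is the URL-encoded % wildcard)
curl -s "https://crt.sh/?q=%25.yourdomain.com&output=json" | \
  jq '.[] | {issuer: .issuer_name, not_after: .not_after, common_name: .common_name}'
# Certificates issued in the last 30 days (GNU date syntax)
curl -s "https://crt.sh/?q=%25.yourdomain.com&output=json" | \
  jq --arg since "$(date -d '30 days ago' +%Y-%m-%d)" '.[] | select(.not_before > $since)'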
Compare CT log results to managed inventory. Any certificate that appears in CT logs for your domain that is NOT in your certificate management platform is a dark certificate requiring investigation.
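The comparison can be scripted once both sides are exported. The sketch below works on a saved crt.sh JSON response and a set of serial numbers from the management platform. The field names (`serial_number`, `common_name`, `issuer_name`, `not_after`) reflect crt.sh's JSON output as commonly observed and should be checked against a live response; the sample data and function name are made up.

```python
import json


def dark_certificates(ct_json, managed_serials, today):
    """Return unexpired CT entries whose serial is missing from the managed inventory."""
    managed = {s.lower() for s in managed_serials}
    seen = set()
    dark = []
    for entry in json.loads(ct_json):
        serial = entry["serial_number"].lower()
        if serial in seen:
            continue    # the same certificate is often listed more than once (e.g. precertificate)
        seen.add(serial)
        if entry["not_after"][:10] < today:
            continue    # already expired; ISO dates compare correctly as strings
        if serial not in managed:
            dark.append(entry)
    return dark


# Hypothetical sample of a saved crt.sh response.
ct_json = json.dumps([
    {"serial_number": "0A1B", "common_name": "www.example.com",
     "issuer_name": "Example CA", "not_after": "2025-09-01T00:00:00"},
    {"serial_number": "0a1b", "common_name": "www.example.com",
     "issuer_name": "Example CA", "not_after": "2025-09-01T00:00:00"},
    {"serial_number": "77ff", "common_name": "shadow.example.com",
     "issuer_name": "Other CA", "not_after": "2025-08-15T00:00:00"},
    {"serial_number": "1234", "common_name": "legacy.example.com",
     "issuer_name": "Other CA", "not_after": "2024-01-01T00:00:00"},
])
dark = dark_certificates(ct_json, managed_serials={"0a1b"}, today="2025-06-01")
for entry in dark:
    print(entry["common_name"], entry["issuer_name"], entry["not_after"])
```

Serial numbers are only unique per issuer, so a stricter comparison would key on the issuer and serial together or on the certificate fingerprint.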
Secrets Scanning
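# TruffleHog: scan one repository (verified secrets only), a whole GitHub org, or a container image
trufflehog git https://github.com/yourorg/repo --only-verified
trufflehog github --org=yourorg --token=$GITHUB_TOKEN
trufflehog docker --image=yourimage:tag
# Gitleaks: scan the repository's commit history and write a JSON report
gitleaks detect --source . --report-format json --report-path gitleaks-report.json
# Same scan extended to every branch and ref
gitleaks detect --source . --log-opts="--all" --report-format json
# GitHub Actions workflow step (YAML, not a shell command)
- uses: gitleaks/gitleaks-action@v2
# GitHub secret scanning: open alerts for the org
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
  "https://api.github.com/orgs/YOUR_ORG/secret-scanning/alerts?state=open"
# Alert counts by secret type (results are paginated; this counts the first page only)
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
  "https://api.github.com/orgs/YOUR_ORG/secret-scanning/alerts?per_page=100" | \
  jq 'group_by(.secret_type) | map({type: .[0].secret_type, count: length})'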
Key Management and Rotation
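# Vault: list secrets engines, read one transit key, list audit devices
vault secrets list -format=json
vault read transit/keys/<keyname>
vault audit list
# Latest version and per-version creation times for every transit key
# (JSON output avoids parsing the header rows of the default table format)
vault list -format=json transit/keys | jq -r '.[]' | while read -r key; do
  vault read -format=json "transit/keys/$key" | \
    jq --arg key "$key" '{key: $key, latest_version: .data.latest_version, versions: .data.keys}'
done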
# AWS KMS: report automatic rotation status for every key
aws kms list-keys --query 'Keys[*].KeyId' --output text | \
  tr '\t' '\n' | \
  while read -r key; do
    status=$(aws kms get-key-rotation-status --key-id "$key" \
      --query 'KeyRotationEnabled' --output text)
    echo "$key: rotation=$status"
  done
# AWS KMS: list only the keys with automatic rotation disabled
aws kms list-keys --query 'Keys[*].KeyId' --output text | \
  tr '\t' '\n' | \
  while read -r key; do
    aws kms get-key-rotation-status --key-id "$key" --output json | \
      jq --arg key "$key" '{KeyId: $key, RotationEnabled: .KeyRotationEnabled}'
  done | jq 'select(.RotationEnabled == false)'
PKI Infrastructure Health
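# ADCS: export issued certificates with expiry and disposition
certutil -view -out "RequestId,CommonName,NotAfter,DispositionMessage" csv
# PowerShell: intermediate CA certificates in the local store that have already expired
Get-ChildItem -Path cert:\LocalMachine\CA |
  Where-Object { $_.NotAfter -lt (Get-Date) } |
  Select-Object Subject, NotAfter, Thumbprint
# Note: this publishes a new CRL; it is not a read-only check
certutil -CRL
# Verify a certificate's chain and fetch its CRL/OCSP URLs
certutil -verify -urlfetch <certificate.cer>
# Revoked certificates (Disposition 21)
certutil -view -restrict "Disposition=21" -out "RequestId,CommonName,RevocationDate"
# OpenSSL: server certificate validity dates, chain verification, OCSP query, CRL freshness
echo | openssl s_client -connect hostname:443 2>/dev/null | \
  openssl x509 -noout -dates
openssl verify -CAfile ca-bundle.crt certificate.crt
openssl ocsp -issuer issuer.crt -cert cert.crt \
  -url http://ocsp.yourdomain.com -text
openssl crl -in crl.pem -noout -text | grep -E "(Next Update|Last Update)"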
Post-Quantum Readiness Assessment
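# Attempt a hybrid post-quantum key exchange (requires an OpenSSL build that supports this group)
openssl s_client -connect hostname:443 -groups X25519MLKEM768 </dev/null 2>&1 | \
  grep -iE "(Server Temp Key|group)"
# Enumerate the TLS versions and cipher suites the server offers
nmap --script ssl-enum-ciphers -p 443 hostname
# Confirm TLS 1.3 is negotiated
curl -v --tlsv1.3 https://hostname 2>&1 | grep "SSL connection"
# Public key algorithms of certificates in the local trust store
find /etc/ssl/certs -name "*.pem" -exec \
  openssl x509 -noout -text -in {} \; 2>/dev/null | \
  grep -E "(Public Key Algorithm|RSA Public-Key|EC Public-Key)"
# Public key algorithm of a server's certificate
echo | openssl s_client -connect hostname:443 2>/dev/null | \
  openssl x509 -noout -text | grep -E "(Public Key Algorithm|Public-Key)"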
Draxis turns these KRIs into a live signal
Draxis connects to the tools you already run (certificate management platforms, secrets managers, KMS, and CT log monitoring) and computes these cryptography and PKI KRIs automatically, with the green/amber/red bands, trend lines, and drift alerts described above. No spreadsheets, no manual stitching.