Scope note: This file covers cloud security *runtime posture*, the operational state of your cloud environment as it exists right now. Design-time decisions (IaC scanning, zero trust architecture, network segmentation design) are covered in the Security Architecture KRI file. The distinction matters because runtime posture changes continuously with every deployment, configuration change, and access grant, and requires continuous monitoring, not periodic assessment.
The KRIs in this domain measure the runtime state of your cloud security controls: whether CSPM findings are being remediated, whether cloud identities are operating within expected scope, whether cloud-native threat detection is producing signal, and whether your multi-cloud environment has the visibility coverage it needs to catch threats before they become incidents.
If you are standing this up from scratch, start with how to build a KRI program and the consolidated KRI reference library, which maps every domain to one CIS-aligned catalog.
KRI inventory
1. CSPM critical finding rate and age
What to measure. Count of open critical and high-severity findings from Cloud Security Posture Management tooling, segmented by account/subscription, service category, and finding age (open <7 days, 7–30 days, >30 days).
Why it matters. CSPM findings are confirmed misconfigurations with known exploitation paths, not theoretical risks. Every critical finding open for more than 30 days is an acknowledged vulnerability that nobody has closed. Finding age distribution tells you whether your remediation process is functioning or whether findings accumulate indefinitely.
- AWS Security Hub: findings aggregated from GuardDuty, Inspector, Macie, Config, Access Analyzer
- Microsoft Defender for Cloud: secure score recommendations by severity and resource
- GCP Security Command Center: findings by severity, resource, and category
- Third-party CSPM (Wiz, Prisma Cloud, Lacework, Orca Security): findings API with age, severity, and resource owner
- Cloud provider native: AWS Config rules, Azure Policy compliance, GCP Organization Policy violations
How to calculate.
- Finding count by severity and age bucket: export via CSPM API, group by `severity × age_days`
- Remediation velocity: (Findings resolved in period) ÷ (findings opened in period); a ratio >1.0 means the backlog is shrinking
- Mean time to remediate by severity: (Resolution date − Creation date) averaged by severity
| Status | Criteria |
|---|---|
| Green | Zero critical findings >7 days; high findings <30 days; remediation velocity >1.0 |
| Amber | Critical findings 7–30 days with active remediation; or remediation velocity 0.5–1.0 (backlog flat or growing) |
| Red | Critical findings >30 days; or remediation velocity <0.5 (backlog growing rapidly); or CSPM not deployed |
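
The three calculations above can be sketched in Python. This is a minimal illustration, assuming findings have already been exported from the CSPM API into dictionaries; the keys `severity`, `created_at`, and `resolved_at` are hypothetical, and real field names vary by tool. Returning `None` when no findings were opened in the period is also an assumption, made to avoid a division by zero.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def age_bucket(created_at, now):
    """Map a finding's age in days onto the <7, 7-30, >30 day buckets."""
    age_days = (now - created_at).days
    if age_days < 7:
        return "<7d"
    if age_days <= 30:
        return "7-30d"
    return ">30d"


def open_counts(findings, now):
    """Count open findings grouped by (severity, age bucket)."""
    return Counter(
        (f["severity"], age_bucket(f["created_at"], now))
        for f in findings
        if f["resolved_at"] is None
    )


def remediation_velocity(resolved, opened):
    """Resolved / opened in the period; None when nothing was opened."""
    return resolved / opened if opened else None


def mean_days_to_remediate(findings):
    """Average (resolution date - creation date) in days, per severity."""
    durations = defaultdict(list)
    for f in findings:
        if f["resolved_at"] is not None:
            delta = f["resolved_at"] - f["created_at"]
            durations[f["severity"]].append(delta.total_seconds() / 86400)
    return {sev: sum(d) / len(d) for sev, d in durations.items()}


# Hypothetical sample data, for illustration only.
UTC = timezone.utc
now = datetime(2024, 6, 30, tzinfo=UTC)
findings = [
    {"severity": "CRITICAL", "created_at": datetime(2024, 6, 28, tzinfo=UTC),
     "resolved_at": None},
    {"severity": "CRITICAL", "created_at": datetime(2024, 5, 1, tzinfo=UTC),
     "resolved_at": None},
    {"severity": "HIGH", "created_at": datetime(2024, 6, 10, tzinfo=UTC),
     "resolved_at": datetime(2024, 6, 20, tzinfo=UTC)},
]
counts = open_counts(findings, now)
velocity = remediation_velocity(resolved=12, opened=8)
mttr = mean_days_to_remediate(findings)
```

With this sample, the critical finding opened on 1 May falls in the >30 day bucket, which on its own would put the KRI in Red regardless of the 1.5 velocity.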
2. Cloud account and subscription governance health
What to measure. Percentage of cloud accounts/subscriptions that are registered in the central account inventory, have a documented owner, are covered by organizational guardrails (SCPs, Azure Policy, GCP Org Policy), and have cloud security tooling active.
Why it matters. Shadow cloud accounts (development sandboxes, project accounts created without IT involvement, acquired subsidiary accounts) are the cloud equivalent of unmanaged devices. Each unregistered account is outside your security tooling, outside your CSPM coverage, and outside your logging pipeline. Organizations routinely discover 20–40% more cloud accounts than they believe they have when they run a full inventory.
- AWS Organizations: `aws organizations list-accounts`, complete account inventory with status
- Azure Management Groups: subscription inventory under root management group
- GCP Resource Manager: `gcloud projects list --filter="lifecycleState:ACTIVE"` at organization level
- Cloud management platform (Apptio Cloudability, CloudHealth, Spot.io): billing-derived account inventory
- Shadow account detection: cross-reference accounts billed to corporate email domains against the central inventory; any billed account missing from the inventory is a shadow account
How to calculate. (Accounts with owner + guardrails + security tooling active) ÷ (total accounts discovered) × 100
| Status | Criteria |
|---|---|
| Green | 100% of accounts in inventory; organizational guardrails applied at root level; CSPM active on all accounts |
| Amber | 95–99% inventory; guardrails applied but with known exceptions; new accounts provisioned without automatic security tool enrollment |
| Red | <95% inventory; unregistered accounts discovered; or no organizational guardrails |
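
A minimal sketch of the governance percentage and the shadow-account cross-reference, assuming the inventory has been flattened into simple records. The `Account` fields and the account IDs are hypothetical; in practice they would be populated from the sources listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    account_id: str
    has_owner: bool
    has_guardrails: bool
    has_security_tooling: bool


def governance_health_pct(accounts):
    """(Accounts with owner + guardrails + tooling) / (total discovered) x 100."""
    if not accounts:
        return 0.0
    healthy = sum(
        1 for a in accounts
        if a.has_owner and a.has_guardrails and a.has_security_tooling
    )
    return 100.0 * healthy / len(accounts)


def shadow_accounts(discovered_ids, inventory_ids):
    """Accounts seen in billing or provider listings but absent from the inventory."""
    return set(discovered_ids) - set(inventory_ids)


# Hypothetical sample data, for illustration only.
accounts = [
    Account("111", True, True, True),
    Account("222", True, True, True),
    Account("333", True, False, True),    # guardrails missing
    Account("444", False, False, False),  # discovered via billing only
]
health_pct = governance_health_pct(accounts)
shadow = shadow_accounts({"111", "222", "333", "444"}, {"111", "222", "333"})
```

The denominator is every account discovered, not every account registered, so a newly found shadow account lowers the score until it is onboarded.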
3. Cloud identity runtime posture
What to measure. Count of cloud IAM entities (users, roles, service principals) actively operating outside their documented and approved permission scope, including unused permissions, overpermissioned service roles, and any human user with console access using long-term credentials rather than federated access.
Why it matters. Cloud IAM permissions drift from intent rapidly. A service role granted S3 FullAccess for a one-time migration three years ago still has that permission. A developer IAM user created before SSO was configured still has a long-term access key. IAM Access Analyzer and equivalent tools now make this measurable without manual review, but only if someone is looking at the data.
- AWS IAM Access Analyzer: unused access findings, IAM entities with permissions unused in last 90 days
- AWS Credential Report: human IAM users with active access keys (signal: should be near zero in mature SSO environments)
- Azure Entra ID: service principal permission audits; Privileged Identity Management standing vs. JIT assignments
- GCP IAM Recommender: `gcloud recommender recommendations list --recommender=google.iam.policy.Recommender`, unused permission recommendations
- Wiz / Prisma Cloud: effective permissions analysis showing what each identity can actually do vs. what is intended
How to calculate.
- Overpermissioned entity rate: (Entities with IAM Recommender or Access Analyzer findings) ÷ (total IAM entities) × 100
- Human users with long-term credentials: count (should approach zero in SSO-mature environments)
- Service roles with wildcard permissions in production: absolute count
| Status | Criteria |
|---|---|
| Green | <5% of entities with unused/excess permissions; zero human users with long-term API credentials; all wildcard roles in production remediated |
| Amber | 5–15% with unused permissions under review; or some human IAM users with long-term credentials being migrated to SSO |
| Red | >15% overpermissioned; or human IAM users with long-term credentials as standard practice; or wildcard permissions in production with no remediation plan |
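
The rate and the numeric parts of the bands could be expressed as below. This is a sketch, not a complete policy: the qualitative criteria in the table ("as standard practice", "no remediation plan") need human judgment and are deliberately not encoded, and the function and parameter names are assumptions.

```python
def overpermissioned_rate(entities_with_findings, total_entities):
    """(Entities with Recommender / Access Analyzer findings) / (total) x 100."""
    if total_entities == 0:
        return 0.0
    return 100.0 * entities_with_findings / total_entities


def identity_status(rate_pct, human_long_term_creds, prod_wildcard_roles):
    """Apply only the numeric thresholds from the status table."""
    if rate_pct > 15:
        return "Red"
    if rate_pct < 5 and human_long_term_creds == 0 and prod_wildcard_roles == 0:
        return "Green"
    return "Amber"


# Hypothetical counts, for illustration only.
rate = overpermissioned_rate(entities_with_findings=42, total_entities=600)
status = identity_status(rate, human_long_term_creds=3, prod_wildcard_roles=0)
```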
4. Cloud workload protection coverage
What to measure. Percentage of cloud compute workloads (EC2/Azure VMs/GCP Compute instances, and container workloads) with cloud workload protection platform (CWPP) or cloud-native endpoint protection active and reporting.
Why it matters. Cloud VMs and containers are endpoints. They need runtime protection for the same reason on-premises servers do: attackers target them, malware runs on them, and without runtime protection a compromise goes undetected. Cloud instances without CWPP are invisible to threat detection regardless of what CSPM says about their configuration.
- AWS: Systems Manager inventory for managed instances; GuardDuty Runtime Monitoring for EC2 and ECS/EKS
- Azure: Microsoft Defender for Servers enrollment rate (Log Analytics agent / Azure Monitor Agent coverage)
- GCP: Security Command Center VM Threat Detection and Container Threat Detection coverage
- Third-party CWPP (CrowdStrike Falcon for Cloud, SentinelOne Cloud Workload Security, Wiz Runtime): agent deployment report
- Infrastructure inventory (Terraform state, CloudFormation stacks): total provisioned compute vs. protected compute
How to calculate. (Cloud compute instances with CWPP agent active and reporting) ÷ (total cloud compute instances) × 100. Track separately: VMs, containers (pods/tasks), and serverless (where applicable).
| Status | Criteria |
|---|---|
| Green | >98% VM coverage; container runtime protection active; auto-enrollment on instance launch |
| Amber | 90–98% VM coverage; or container workloads unprotected; or auto-enrollment not configured |
| Red | <90% VM coverage; or CWPP not deployed to cloud workloads; or coverage gaps concentrated in production workloads |
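
Tracking coverage separately per workload type might look like the following sketch. It assumes the infrastructure inventory and the agent deployment report have already been joined into one list of records; the keys `id`, `type`, `env`, and `agent_reporting` are hypothetical.

```python
from collections import defaultdict


def coverage_by_type(workloads):
    """Percentage of workloads with an active, reporting agent, per workload type."""
    totals = defaultdict(int)
    protected = defaultdict(int)
    for w in workloads:
        totals[w["type"]] += 1
        if w["agent_reporting"]:
            protected[w["type"]] += 1
    return {t: 100.0 * protected[t] / totals[t] for t in totals}


def unprotected_in(workloads, environment="production"):
    """IDs of workloads in the given environment with no reporting agent."""
    return [
        w["id"] for w in workloads
        if w["env"] == environment and not w["agent_reporting"]
    ]


# Hypothetical sample data, for illustration only.
workloads = [
    {"id": "vm-1", "type": "vm", "env": "production", "agent_reporting": True},
    {"id": "vm-2", "type": "vm", "env": "production", "agent_reporting": True},
    {"id": "vm-3", "type": "vm", "env": "production", "agent_reporting": False},
    {"id": "vm-4", "type": "vm", "env": "dev", "agent_reporting": True},
    {"id": "pod-1", "type": "container", "env": "production", "agent_reporting": True},
    {"id": "pod-2", "type": "container", "env": "dev", "agent_reporting": False},
]
coverage = coverage_by_type(workloads)
prod_gaps = unprotected_in(workloads)
```

Listing the unprotected production workloads alongside the percentage addresses the Red criterion about gaps concentrated in production, which a single aggregate number would hide.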
5. Cloud logging and audit trail health
What to measure. Operational status of cloud-native audit logging across all accounts/subscriptions, specifically whether management plane logs (CloudTrail, Azure Activity Log, GCP Audit Logs), data plane logs (S3 data events, Azure Blob access, GCP Data Access), and network flow logs are enabled, centralized, and generating alerts on gaps.
Why it matters. Cloud audit logs are the forensic foundation for every cloud incident investigation. An attacker who compromises a cloud account and finds CloudTrail disabled has effectively erased their tracks in advance. Gaps in cloud logging, frequently caused by accounts created without the standard security baseline, create blind spots that are discovered during post-incident forensics, not before.
- AWS CloudTrail: `aws cloudtrail get-trail-status` per trail; `aws cloudtrail describe-trails` for all accounts (via Organizations)
- AWS Config: `cloudtrail-enabled` managed rule compliance status
- Azure Monitor / Activity Log: diagnostic settings per subscription; Log Analytics workspace ingestion health
- GCP Audit Logs: `gcloud logging sinks list` at organization level; data access log enablement status per project
- SIEM ingestion health: cloud log source heartbeat, last event received per account per log type
How to calculate.
- Management plane logging: (Accounts with CloudTrail/Activity Log/Audit Log enabled and centralized) ÷ (total accounts) × 100
- Log gap detection: (Log sources with automated alerting on ingestion gap) ÷ (total cloud log sources) × 100
| Status | Criteria |
|---|---|
| Green | 100% of accounts with management plane logging enabled and centralized; data plane logging on sensitive services; automated alerting on log gaps |
| Amber | 95–99% management plane coverage; or data plane logging absent on some sensitive services |
| Red | <95%; or log gaps discovered during incident investigation rather than proactively; or no centralized cloud logging |
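
A sketch of the coverage calculation and a heartbeat-style gap check, assuming the SIEM can report the last event received per account and log type. The dictionary shapes and the one-hour gap threshold are assumptions chosen for illustration; the right threshold depends on how often each log type is delivered.

```python
from datetime import datetime, timedelta, timezone


def management_plane_coverage(accounts):
    """(Accounts with logging enabled AND centralized) / (total accounts) x 100."""
    if not accounts:
        return 0.0
    covered = sum(
        1 for a in accounts.values() if a["enabled"] and a["centralized"]
    )
    return 100.0 * covered / len(accounts)


def stale_sources(last_event, now, max_gap=timedelta(hours=1)):
    """Return (account, log type) pairs whose newest event is older than max_gap."""
    return sorted(key for key, ts in last_event.items() if now - ts > max_gap)


# Hypothetical sample data, for illustration only.
now = datetime(2024, 6, 30, 12, 0, tzinfo=timezone.utc)
accounts = {
    "111": {"enabled": True, "centralized": True},
    "222": {"enabled": True, "centralized": True},
    "333": {"enabled": True, "centralized": False},  # logs stay in the account
    "444": {"enabled": True, "centralized": True},
}
last_event = {
    ("111", "cloudtrail"): now - timedelta(minutes=5),
    ("222", "cloudtrail"): now - timedelta(hours=3),
    ("222", "vpc_flow"): now - timedelta(minutes=10),
}
coverage = management_plane_coverage(accounts)
gaps = stale_sources(last_event, now)
```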
6. Cloud data exposure posture
What to measure. Count of cloud storage resources (S3 buckets, Azure Blob containers, GCP Storage buckets, databases with public endpoints) that are publicly accessible or accessible beyond their documented and approved scope, and encryption status of sensitive data stores.
Why it matters. Publicly accessible cloud storage containing sensitive data is among the most consistently discovered cloud security failures of the last decade. This KRI measures the live runtime state of data exposure (actual accessibility, not policy intent), which changes every time a permissions configuration is modified.
- AWS: S3 Block Public Access settings per bucket and account; S3 Access Analyzer; RDS public accessibility flag; Macie sensitive data findings in public buckets
- Azure: Storage account public access settings; Azure Defender for Storage; Azure Policy compliance for "no public access" policy
- GCP: Cloud Storage uniform bucket-level access; GCP Asset Inventory public access analysis; Cloud SQL public IP flag
- CSPM (Wiz, Prisma Cloud): data exposure findings, public storage with sensitive data detected
- DLP integration: Macie, Purview, or CSPM DLP for sensitive data classification within cloud storage
How to calculate.
- Public storage with sensitive data: count (should be zero; any nonzero count is a critical finding)
- Public storage without sensitive data: count with documented business justification
- Encryption coverage: (Cloud data stores with encryption at rest enabled) ÷ (total cloud data stores) × 100
| Status | Criteria |
|---|---|
| Green | Zero public storage with sensitive data; all storage encrypted; public storage without sensitive data <5 with documented justification |
| Amber | Public storage without sensitive data but without documented justification; or encryption gaps in non-sensitive stores |
| Red | Any public storage with sensitive data; or encryption disabled on sensitive data stores; or no automated public access detection |
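
The three values for this KRI can be derived from one pass over a data store inventory, as in this sketch. It assumes CSPM and DLP results have been merged into records with hypothetical `public`, `sensitive`, `encrypted`, and `justification` fields.

```python
def exposure_summary(stores):
    """Compute the three KRI values from a merged data store inventory."""
    public_sensitive = [
        s["name"] for s in stores if s["public"] and s["sensitive"]
    ]
    public_unjustified = [
        s["name"] for s in stores
        if s["public"] and not s["sensitive"] and not s["justification"]
    ]
    encrypted = sum(1 for s in stores if s["encrypted"])
    encryption_pct = 100.0 * encrypted / len(stores) if stores else 0.0
    return {
        "public_sensitive": public_sensitive,
        "public_unjustified": public_unjustified,
        "encryption_pct": encryption_pct,
    }


# Hypothetical sample data, for illustration only.
stores = [
    {"name": "hr-records", "public": False, "sensitive": True,
     "encrypted": True, "justification": None},
    {"name": "marketing-site", "public": True, "sensitive": False,
     "encrypted": True, "justification": "static website hosting"},
    {"name": "tmp-share", "public": True, "sensitive": False,
     "encrypted": False, "justification": None},
    {"name": "customer-exports", "public": True, "sensitive": True,
     "encrypted": True, "justification": None},
]
summary = exposure_summary(stores)
```

In this sample, `customer-exports` alone is enough to put the KRI in Red, since any public store holding sensitive data is a critical finding.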
7. Cloud-native threat detection alert rate and quality
What to measure. Monthly volume of cloud-native threat detection findings (GuardDuty, Microsoft Defender for Cloud Alerts, GCP Security Command Center Threats) by severity, and the percentage that result in investigation versus are suppressed or auto-closed as noise.
Why it matters. Cloud-native threat detection tools catch things SIEM rules miss: IAM credential exfiltration, unusual API call patterns, crypto mining on compromised instances, C2 communication from cloud workloads, and anomalous data access patterns. Alert quality, the ratio of actionable alerts to noise, determines whether the SOC trusts and acts on cloud threat signals or learns to ignore them.
- AWS GuardDuty: finding severity distribution, finding type breakdown, suppression rule inventory
- Microsoft Defender for Cloud: security alert volume, alert type, dismissal/suppression rates
- GCP Security Command Center: threat finding volume by severity and category
- SIEM: cloud-native findings ingested via GuardDuty → Security Hub → SIEM pipeline; alert disposition rates
- SOAR: automated playbook activation rates for cloud-native threat findings
How to calculate.
- Alert fidelity rate: (Cloud threat alerts investigated by SOC) ÷ (total cloud threat alerts) × 100
- High/critical alert rate: (High + critical findings per month), track as trend
- Suppression quality: (Suppression rules with documented rationale) ÷ (total active suppression rules) × 100
| Status | Criteria |
|---|---|
| Green | Cloud-native threat detection enabled in all accounts; alert fidelity >15%; all suppression rules documented and reviewed quarterly |
| Amber | 10–15% fidelity; or cloud threat detection not deployed in all accounts; or suppression rules undocumented |
| Red | <10% fidelity (noise-dominated); or cloud-native threat detection disabled; or cloud alerts not ingested into SIEM |
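
A sketch of the fidelity and suppression-quality ratios, assuming alert dispositions have been exported from the SIEM. The disposition labels and record shapes are hypothetical, and `fidelity_band` applies only the numeric fidelity thresholds; deployment coverage and SIEM ingestion are separate checks.

```python
def pct(numerator, denominator):
    """Percentage helper that returns 0.0 for an empty denominator."""
    return 100.0 * numerator / denominator if denominator else 0.0


def alert_fidelity(alerts):
    """(Alerts investigated by the SOC) / (total alerts) x 100."""
    investigated = sum(1 for a in alerts if a["disposition"] == "investigated")
    return pct(investigated, len(alerts))


def suppression_quality(rules):
    """(Suppression rules with documented rationale) / (total active rules) x 100."""
    documented = sum(1 for r in rules if r["rationale"])
    return pct(documented, len(rules))


def fidelity_band(fidelity_pct):
    """Numeric fidelity thresholds only: >15 Green, 10-15 Amber, <10 Red."""
    if fidelity_pct > 15:
        return "Green"
    if fidelity_pct >= 10:
        return "Amber"
    return "Red"


# Hypothetical sample data, for illustration only.
alerts = (
    [{"disposition": "investigated"}] * 36
    + [{"disposition": "suppressed"}] * 150
    + [{"disposition": "auto_closed"}] * 14
)
rules = [{"rationale": "known scanner"}] * 8 + [{"rationale": ""}] * 2
fidelity = alert_fidelity(alerts)
quality = suppression_quality(rules)
band = fidelity_band(fidelity)
```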
8. Cloud spend anomaly rate (security signal)
What to measure. Rate of cloud spend anomalies (unexpected resource creation or compute usage spikes) that are triaged as potential security events such as cryptojacking, unauthorized resource provisioning, or data exfiltration visible as egress cost.
Why it matters. Cryptomining attacks and unauthorized resource provisioning produce a financial signal before a security alert. Egress cost spikes can indicate large-scale data exfiltration. Organizations that monitor cloud spend anomalies through a security lens catch attacker activity that bypasses detection tools. Cloud spend is a free sensor that most organizations aren't using.
- AWS Cost Explorer: anomaly detection alerts (`aws ce get-anomaly-monitors`); unusual instance type launches
- Azure Cost Management: budget alerts and anomaly detection; unexpected VM series deployments
- GCP Billing: budget alert anomalies; Recommender cost insights showing unusual resource usage
- Cloud SIEM / SOAR: correlation between spend anomaly alert and concurrent GuardDuty/Defender findings
- FinOps platform (CloudHealth, Apptio): anomaly detection with security team notification integration
KRI values.
- Monthly cloud spend anomaly alerts generated: count by severity
- Anomalies triaged by security team: percentage routed to SOC investigation
- Confirmed security-related spend anomalies: count per quarter (cryptomining, exfiltration, unauthorized provisioning)
| Status | Criteria |
|---|---|
| Green | Cloud spend anomaly detection active; anomaly alerts reviewed by security team within 24 hours; confirmed security events <1 per quarter |
| Amber | Anomaly detection active but routed to FinOps only (no security team review); or review SLA >48 hours |
| Red | No cloud spend anomaly detection; or confirmed cryptomining/unauthorized provisioning with delayed detection |
Deriving these KRIs by source type
From AWS (Security Hub, GuardDuty, IAM Access Analyzer, Config)
- Security Hub aggregated findings: `aws securityhub get-findings --filters '{"SeverityLabel":[{"Value":"CRITICAL","Comparison":"EQUALS"}],"RecordState":[{"Value":"ACTIVE","Comparison":"EQUALS"}]}'`, pipe through jq for age calculation (`CreatedAt` vs. now)
- GuardDuty findings: `aws guardduty list-findings --detector-id <id> --finding-criteria '{"Criterion":{"severity":{"Gte":7}}}'`, high + critical findings
- IAM Access Analyzer unused access: `aws accessanalyzer list-findings-v2 --analyzer-arn <arn> --filter '{"findingType":{"eq":["UnusedPermission","UnusedIAMRole","UnusedIAMUserAccessKey"]}}'`
- CloudTrail health: `aws cloudtrail get-trail-status --name <trail-name>`, check the `IsLogging` field per account; use AWS Organizations + CloudFormation StackSets to query all accounts
- Account inventory: `aws organizations list-accounts --query "Accounts[?Status=='ACTIVE']"`
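
As an alternative to jq, the age calculation on Security Hub output can be sketched in Python. This assumes the JSON printed by `aws securityhub get-findings` has been captured as a string; the embedded sample is hypothetical and trimmed to the fields the sketch reads (`Id`, `CreatedAt`, and the top-level `Findings` list).

```python
import json
from datetime import datetime, timezone


def finding_ages(get_findings_json, now):
    """Return {finding Id: age in whole days} from get-findings JSON output."""
    findings = json.loads(get_findings_json)["Findings"]
    ages = {}
    for f in findings:
        # CreatedAt is an ISO 8601 timestamp; normalise a trailing "Z" so that
        # datetime.fromisoformat accepts it on older Python versions too.
        created = datetime.fromisoformat(f["CreatedAt"].replace("Z", "+00:00"))
        ages[f["Id"]] = (now - created).days
    return ages


# Hypothetical, trimmed sample of CLI output, for illustration only.
sample_output = json.dumps({
    "Findings": [
        {"Id": "finding-a", "CreatedAt": "2024-06-01T00:00:00.000Z"},
        {"Id": "finding-b", "CreatedAt": "2024-04-01T12:00:00.000Z"},
    ]
})
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
ages = finding_ages(sample_output, now)
over_30_days = sorted(fid for fid, days in ages.items() if days > 30)
```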
From Azure (Defender for Cloud, Monitor, Entra ID)
- Defender for Cloud secure score: `GET /subscriptions/{subscriptionId}/providers/Microsoft.Security/secureScores` via ARM API or `az security secure-scores list`
- Defender alerts: `az security alert list --query "[?properties.status=='Active']"` filtered by severity
- Activity log health: `az monitor diagnostic-settings subscription list --subscription <subscription-id>`, verify Log Analytics workspace destination per subscription
- Service principal permissions audit: Microsoft Graph `GET /servicePrincipals?$select=id,displayName,appRoles`, plus effective permissions via Defender for Cloud identity workload
From GCP (Security Command Center, Cloud Logging, IAM Recommender)
- SCC findings: `gcloud scc findings list organizations/ORG_ID --filter='state="ACTIVE" AND severity="CRITICAL"'`, pipe to count and age calculation
- IAM Recommender: `gcloud recommender recommendations list --recommender=google.iam.policy.Recommender --location=global --project=PROJECT_ID`
- Audit log status: `gcloud logging sinks list --organization=ORG_ID`, verify audit log sink exists and destination is healthy
- Public storage: Cloud Asset API, `gcloud asset search-all-iam-policies --scope=organizations/ORG_ID --asset-types=storage.googleapis.com/Bucket --query='policy:(allUsers OR allAuthenticatedUsers)'`
From Third-Party CSPM (Wiz, Prisma Cloud, Orca, Lacework)
- Finding export API: Most CSPM platforms expose a findings API; schedule daily export, calculate age distribution
- Coverage gaps: Wiz and Orca both show unscanned/unconnected cloud accounts, direct feed for account governance KRI
- Effective permissions (Wiz): "Identity Attack Path" graph, service identities with paths to sensitive data or lateral movement capability
- Data exposure: Wiz Data Security module; Orca data exposure findings, public storage with sensitive data classified
From Cloud Spend / FinOps Platforms
- AWS Cost Anomaly Detection: subscribe an SNS topic that notifies the security team; `aws ce get-anomaly-subscriptions` to verify active notifications
- Azure Anomaly Alerts: Configure budget alerts in Cost Management; add security team email to alert recipients
- Correlation enrichment: In SOAR, enrich GuardDuty finding with concurrent cost anomaly query, correlated signals increase confidence in cryptomining/exfiltration findings
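
The correlation step could be sketched as a time-window join between spend anomalies and threat findings in the same account. This is an illustration rather than a SOAR playbook: the record shapes, IDs, and the 24-hour window are assumptions.

```python
from datetime import datetime, timedelta, timezone


def correlate(anomalies, findings, window=timedelta(hours=24)):
    """Pair each spend anomaly with threat findings in the same account
    whose timestamps fall within the window on either side."""
    pairs = []
    for a in anomalies:
        matches = [
            f["id"] for f in findings
            if f["account"] == a["account"]
            and abs(f["time"] - a["time"]) <= window
        ]
        if matches:
            pairs.append((a["id"], matches))
    return pairs


# Hypothetical sample data, for illustration only.
UTC = timezone.utc
anomalies = [
    {"id": "A1", "account": "111", "time": datetime(2024, 6, 10, 12, tzinfo=UTC)},
    {"id": "A2", "account": "333", "time": datetime(2024, 6, 11, 9, tzinfo=UTC)},
]
findings = [
    {"id": "F1", "account": "111", "time": datetime(2024, 6, 10, 20, tzinfo=UTC)},
    {"id": "F2", "account": "111", "time": datetime(2024, 6, 14, 8, tzinfo=UTC)},
    {"id": "F3", "account": "222", "time": datetime(2024, 6, 10, 13, tzinfo=UTC)},
]
correlated = correlate(anomalies, findings)
```

An anomaly with no matching finding, such as `A2` here, is not necessarily benign; it may be the case the text describes, where the financial signal arrives before any detection tool fires.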
*See also: Security Architecture KRI file for IaC security scanning, network segmentation design validation, and zero trust maturity, the design-time controls that complement the runtime posture measured here.*
Draxis turns these KRIs into a live signal
Draxis connects to the tools you already run (CSPM, CWPP, cloud IAM, and cloud-native logging) and computes these cloud security KRIs automatically, with the green/amber/red bands, trend lines, and drift alerts described above. No spreadsheets, no manual stitching.