KRI Series: AI Security

In the last 18 months most organizations stood up LLM-powered tools, wired in third-party AI services, and shipped AI-assisted features faster than their security program adapted. The result is a real-time gap: AI is processing sensitive data and, increasingly, taking actions, while security has no live picture of where it lives or what it can do.

The KRIs below measure outcomes, not intentions. Where does AI actually run? What can it reach? Do its guardrails hold under attack? These KRIs are newer and less standardized than those in other domains, and their derivation sources are less mature, but the risk is real. An organization that does not measure AI risk is accepting unknown exposure, not low exposure. These signals pair with the broader KRI reference library and the steps in how to build a KRI program. If you are watching these signals move over time, reading the signal in drifting KRIs covers velocity and drift.

In this guide

Why AI security KRIs require a different approach
The eight KRIs
Deriving these KRIs by source type

Why AI security KRIs require a different approach

Traditional security KRIs assume deterministic systems and measure the state of controls around data and assets. AI shifts the question. The same prompt can produce different outputs, guardrails can be bypassed with creative phrasing, and behavior depends on training data you did not control. AI security KRIs have to measure how often guardrails held, not whether a single control is on or off.

The second complication is that the AI system spans risks a traditional program has not had to connect. You are measuring the AI system itself as an attack vector through prompt injection and jailbreaks, the data it can reach through RAG pipelines and tool use, the actions it can take as an agent, the supply chain behind its model weights and fine-tuning data, and the integrity of what it emits. The first several KRIs here focus on discovery and inventory, because you cannot govern AI you have not found.

An employee pasting customer PII into an unsanctioned LLM is an uncontrolled data egress that no DLP rule caught. Shadow AI is shadow IT with autonomous action capability.

Framework mapping

CIS Controls v8

The KRIs in this domain implement and measure these CIS Critical Security Controls:

CIS 3, Data Protection. Model and training-data access scope and output leakage.
CIS 16, Application Software Security. AI application security and the model supply chain.
CIS 15, Service Provider Management. Third-party model and API provider risk.

AI is not yet a dedicated CIS control. These are the closest v8 mappings; the domain also aligns with the NIST AI Risk Management Framework.

The eight KRIs

1. AI system inventory completeness

What to measure. The percentage of AI systems in production (models, LLM integrations, agentic tools, AI-powered features in applications) documented in the enterprise AI inventory with ownership, data access scope, capability scope, and risk classification, measured against systems found through technical discovery.

Why it matters. You cannot govern AI risk you have not inventoried. AI is deployed by engineering teams, business units, and individual employees through SaaS tools with AI features, at a rate that outpaces most programs’ visibility. Undiscovered AI integrations reaching sensitive data are the equivalent of shadow IT, except the system can act on its own.

Where this comes from

Code repositories: AI SDK imports and API calls (grep for openai, anthropic, bedrock, vertex, Hugging Face usage).
SaaS application inventory: tools with AI features that may process company data (Copilot, Gemini for Workspace, Salesforce Einstein).
Network and proxy logs: calls to known AI provider endpoints (api.openai.com, api.anthropic.com, bedrock-runtime.us-east-1.amazonaws.com).
Cloud IAM: API keys and service accounts holding permissions for AI provider services.
Procurement records: AI tool purchases and subscriptions.

How to calculate. (AI systems in the formal inventory with owner, data scope, capability scope, and risk classification) ÷ (AI systems discovered through technical discovery) × 100.

Status	Criteria
Green	≥95% of discovered AI systems in the formal inventory; discovery runs continuously.
Amber	80% to <95%; or an inventory exists but has not been updated in the last 90 days.
Red	<80%; or no formal AI inventory process; or discovery is surfacing significant undocumented AI systems.
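
A minimal sketch of the calculation above, assuming discovery results arrive as a set of system identifiers and the inventory is a dictionary of records. The field names, function names, and the choice to treat exactly 95% as Green are illustrative assumptions. The non-percentage criteria (continuous discovery, inventory age) are not modeled here.

```python
# Fields an inventory record needs before it counts as complete.
# These field names are hypothetical; map them to your own inventory schema.
REQUIRED_FIELDS = ("owner", "data_scope", "capability_scope", "risk_class")


def inventory_completeness(discovered: set[str], inventory: dict[str, dict]) -> float:
    """Percent of discovered AI systems that have a complete inventory record."""
    if not discovered:
        return 100.0  # assumption: nothing discovered means nothing is missing
    complete = [
        system_id
        for system_id in discovered
        if all(inventory.get(system_id, {}).get(field) for field in REQUIRED_FIELDS)
    ]
    return 100.0 * len(complete) / len(discovered)


def percent_band(pct: float) -> str:
    """Map the percentage to a status band (boundary handling is an assumption)."""
    if pct >= 95:
        return "Green"
    if pct >= 80:
        return "Amber"
    return "Red"
```

A system that discovery finds but the inventory lacks, or holds with an empty owner or scope, counts against the score, which keeps the denominator tied to what is actually running.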

2. AI data access scope compliance

What to measure. The percentage of AI systems operating within their documented and approved data access scope, specifically systems that hold no access to sensitive data categories (PII, financial data, PHI, IP) beyond what was reviewed and approved at deployment.

Why it matters. LLM-integrated systems with RAG pipelines, tool use, or database access get over-permissioned the same way human identities do, and the consequences can be worse. An over-scoped AI system can surface sensitive information to users who should not see it, or be manipulated through prompt injection to exfiltrate data it was never meant to touch. A data security posture tool, such as the one behind the Varonis Data Security Platform integration, can map which sensitive repositories an AI identity can actually reach.

Where this comes from

Data access review records: the approved scope documented at AI system deployment.
IAM and service account permissions: the actual grants on the service identity the AI system runs as.
RAG pipeline configuration: document collections, database queries, and API endpoints the retrieval layer can reach.
LLM observability platforms (LangSmith, Langfuse, Helicone, Weights & Biases): logged tool calls and retrievals showing actual runtime access.

How to calculate. Compare the approved scope from the deployment review against actual access from runtime observability logs, and flag any access beyond approved scope.

Status	Criteria
Green	100% of AI systems within approved scope; quarterly access review; observability active for all production AI systems.
Amber	Known scope creep with remediation in progress; or observability present but not reviewed regularly.
Red	Any AI system reaching sensitive data beyond approved scope; or no access review at deployment; or no runtime observability.
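
The comparison of approved scope against observed access can be sketched as a set difference per system. This assumes both sides have already been normalized to the same resource names. The function names and data shapes are hypothetical, not the output format of any observability platform.

```python
def scope_violations(
    approved: dict[str, set[str]], observed: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Map each AI system to the resources it touched beyond its approved scope."""
    violations = {}
    for system, resources in observed.items():
        # A system with no approved entry has an empty scope, so every access is flagged.
        extra = resources - approved.get(system, set())
        if extra:
            violations[system] = extra
    return violations


def scope_compliance_pct(
    approved: dict[str, set[str]], observed: dict[str, set[str]]
) -> float:
    """Percent of known AI systems with no access beyond approved scope."""
    systems = set(approved) | set(observed)
    if not systems:
        return 100.0  # assumption: no systems means nothing out of scope
    out_of_scope = scope_violations(approved, observed)
    return 100.0 * (len(systems) - len(out_of_scope)) / len(systems)
```

Because the Green threshold is 100%, the list of violations is usually more useful than the percentage itself: each entry is a concrete access to review or revoke.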

3. Prompt injection testing coverage

What to measure. The percentage of customer-facing and internal AI applications tested for prompt injection, covering both direct injection (user input steering model behavior) and indirect injection (malicious content in the data sources the model retrieves).

Why it matters. Prompt injection ranks first in the OWASP Top 10 for LLM Applications and is widely treated as the most commonly exploited AI-specific vulnerability class. It lets an attacker bypass an application’s intended behavior, exfiltrate data from the context window, drive unauthorized actions, or manipulate AI-powered controls. Untested systems should be assumed vulnerable. The same secure-pipeline signal feeds the wider application security KRIs, where injection testing already has a long history.

Where this comes from

Security test records: prompt injection cases in assessment reports or automated test suites.
Red team records: AI red teaming for prompt injection, jailbreaking, and indirect injection via RAG.
Bug bounty and VDP submissions: AI-specific reports for injection, jailbreak, and data leakage.
LLM security testing tools (Garak, Promptmap, PyRIT): automated adversarial prompt testing reports.
Penetration test reports: scope confirmation that AI application testing included prompt injection.

How to calculate. (AI applications with documented prompt injection testing in the last 12 months) ÷ (total customer-facing and high-risk internal AI applications) × 100.

Status	Criteria
Green	≥90% of high-risk AI applications tested; testing covers indirect injection via RAG and agentic tool misuse; findings tracked and remediated.
Amber	60% to <90%; or testing covers direct injection only; or the last test is more than 12 months old.
Red	<60%; or no formal prompt injection testing; or customer-facing AI applications untested.
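
As a rough sketch, the coverage figure can be computed from an application register that records scope flags and the last test date. The record fields and the 365-day window are assumptions chosen to match the 12-month rule above.

```python
from datetime import date, timedelta


def injection_test_coverage(
    apps: dict[str, dict], today: date, window_days: int = 365
) -> float:
    """Percent of in-scope AI applications tested for prompt injection in the window."""
    cutoff = today - timedelta(days=window_days)
    # In scope: customer-facing applications and high-risk internal ones.
    in_scope = [a for a in apps.values() if a["customer_facing"] or a["high_risk"]]
    if not in_scope:
        return 100.0  # assumption: nothing in scope means nothing untested
    tested = [
        a for a in in_scope
        if a.get("last_tested") is not None and a["last_tested"] >= cutoff
    ]
    return 100.0 * len(tested) / len(in_scope)
```

An application with no recorded test date counts as untested, in line with the assumption that untested systems are vulnerable.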

4. Agentic AI action scope and oversight rate

What to measure. The percentage of AI agent deployments (systems that take autonomous actions such as browsing, writing files, calling APIs, sending email, or executing code) with documented action scope limits, human-in-the-loop review for high-consequence actions, and audit trails of every action taken.

Why it matters. An agent that takes actions on behalf of users is a privileged account with non-deterministic behavior. Manipulated through external prompt injection, it can perform actions the legitimate user never authorized. Without scope limits and review gates, the blast radius of a hijacked agent scales with everything it is allowed to do.

Where this comes from

AI system deployment records: documented action inventories and scope limits for each agent.
LLM orchestration platforms (LangGraph, AutoGen, CrewAI, OpenAI Assistants): tool definitions and capability inventories.
Audit log systems: action logs from agentic systems, where every external action should be recorded.
SOAR and workflow automation platforms: the review queue for high-consequence steps where AI is integrated.
Security review records: human-in-the-loop gates designed in at deployment time.

How to calculate. Track three values: action scope documentation rate (the share of agentic deployments with documented limits), high-consequence action review rate (the share of actions above the risk threshold that received human approval before execution), and audit trail completeness (the share of agentic systems with complete action logs).

Status	Criteria
Green	100% of agentic deployments with documented action scope; human review required for all destructive or external-communication actions; full audit trail active.
Amber	Scope documentation present but human review gates inconsistent; or audit trails incomplete.
Red	Any agentic system with no documented scope limits or audit trail; or agentic write access to sensitive systems with no human review.
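
The three values can be derived from deployment records and an action log. The sketch below assumes simple boolean fields on each record. The field names are hypothetical, and an empty denominator is treated as 100%, which is itself an assumption.

```python
def _pct(numerator: int, denominator: int) -> float:
    # Assumption: an empty population scores 100% rather than raising an error.
    return 100.0 * numerator / denominator if denominator else 100.0


def agent_oversight_rates(
    deployments: list[dict], actions: list[dict]
) -> dict[str, float]:
    """Compute the three agentic oversight values from deployment and action records."""
    scoped = sum(1 for d in deployments if d["scope_documented"])
    logged = sum(1 for d in deployments if d["audit_log_complete"])
    # Only actions above the risk threshold count toward the review rate.
    high = [a for a in actions if a["high_consequence"]]
    reviewed = sum(1 for a in high if a["human_approved"])
    return {
        "scope_documentation_rate": _pct(scoped, len(deployments)),
        "high_consequence_review_rate": _pct(reviewed, len(high)),
        "audit_trail_completeness": _pct(logged, len(deployments)),
    }
```

Reporting the three values separately matters because a deployment can have perfect documentation and still execute destructive actions with no approval step.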

5. Model supply chain integrity

What to measure. The percentage of production models (open-source, fine-tuned, and API-accessed) with documented provenance, integrity verification, and a known, acceptable license and usage policy.

Why it matters. Model supply chain attacks are emerging but credible. Poisoned open-source weights on a public hub, fine-tuning datasets with backdoors, and integrations from unverified sources carry risk analogous to malicious npm packages, with more significant consequences when the model drives security-relevant decisions. Many model formats rely on Python pickle deserialization, which can execute arbitrary code the moment the file is loaded. The Protect AI integration feeds model scan results and provenance signals into Draxis.

Where this comes from

Model registry (Hugging Face, MLflow, AWS SageMaker Model Registry, Azure ML): inventory with source, version, and provenance metadata.
CI/CD pipeline: model artifact integrity steps such as hash verification of downloaded weights.
Training data documentation: lineage records for fine-tuning datasets.
Vendor security assessments: review of third-party AI API providers (OpenAI, Anthropic, Google, Cohere).
License compliance review: models with licenses restricting commercial use or requiring specific data handling.

How to calculate. Track the percentage of production models with documented provenance, automated integrity verification (for example Protect AI’s modelscan or Hugging Face malware scanning), and a completed license review, and flag any model loaded from an unverified source.

Status	Criteria
Green	100% of production models with documented provenance, integrity verification, and license review; no models from unverified sources.
Amber	Known models without provenance documentation; or license review incomplete; or integrity verification manual rather than automated.
Red	Any production model from an unverified source with no integrity check; or a model with a license incompatible with its production use case.
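
Hash verification of downloaded weights is the simplest of these checks to automate. A minimal sketch using only the standard library follows. It assumes the expected SHA-256 digest was recorded when the model was reviewed, and the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large weight files do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path: Path, expected_sha256: str) -> bool:
    """Return True only if the artifact matches the hash pinned at review time."""
    return sha256_of(path) == expected_sha256.lower()
```

A pipeline step would call verify_pinned before the model is loaded and fail the build on a mismatch. A matching hash proves the file is the one that was reviewed, not that the reviewed file was safe, so this complements scanning rather than replacing it.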

6. Sensitive data leakage rate in AI outputs

What to measure. The rate at which AI systems emit outputs containing sensitive data (PII, credentials, internal system information, confidential business data) that should not appear in responses, detected through output monitoring and red team testing.

Why it matters. A model with access to sensitive data can leak it through direct retrieval in RAG, training data memorization, or manipulation via prompt injection. Output monitoring is the runtime control that catches these leaks before they become regulatory exposure or customer harm.

Where this comes from

LLM observability platform: output scanning for PII patterns (regex plus ML detection in Langfuse, Helicone, or custom logging). A grep for sk- or AKIA in logged outputs is a crude but useful starting point.
DLP integration on AI API responses: DLP scanning applied to output streams.
Red team testing records: documented sensitive-data extraction attempts and their results.
User feedback and abuse reports: customer or employee reports of unexpected sensitive content in responses.
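Outbound email gateway or DLP logs, where AI generates email: unexpected data in AI-generated email content. DMARC aggregate reports carry only sending sources and authentication results, so they cannot reveal message content.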

How to calculate. (AI responses flagged for sensitive data content) ÷ (total AI responses in the period) × 100. Hold a near-zero tolerance: any measurable leakage rate from production AI systems is a signal that warrants investigation.

Status	Criteria
Green	Output monitoring active on all production AI systems; zero confirmed leakage incidents; DLP controls applied to output streams.
Amber	Output monitoring present but with coverage gaps; or isolated leakage events under investigation.
Red	Confirmed sensitive data leakage from a production AI system; or no output monitoring on systems with access to sensitive data.
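
A crude version of output scanning and the leakage rate can be sketched with regular expressions. The patterns below are illustrative assumptions, not a complete detector: they cover keys starting with sk-, AWS access key IDs starting with AKIA, and email addresses, and they will produce both false positives and false negatives.

```python
import re

# Illustrative patterns only; a production scanner would add ML-based PII detection.
PATTERNS = {
    "api_key_sk": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def flag_output(text: str) -> set[str]:
    """Return the names of every pattern that matches a single AI response."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


def leakage_rate(outputs: list[str]) -> float:
    """Percent of responses flagged for at least one sensitive-data pattern."""
    if not outputs:
        return 0.0
    flagged = sum(1 for text in outputs if flag_output(text))
    return 100.0 * flagged / len(outputs)
```

Flagged responses are candidates, not confirmed incidents. The status table keys on confirmed leakage, so each flag still needs triage.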

7. AI security training completion for developers and AI teams

What to measure. The percentage of engineers building AI-powered features, AI researchers, and data scientists who have completed training on AI-specific security risks: prompt injection, model supply chain, data poisoning, and output integrity.

Why it matters. Developers are trained on the OWASP Top 10 and secure coding, but those topics do not cover the AI threat model. Engineers who build AI systems without AI security training concatenate user input into prompts, grant agents broad permissions, and log sensitive data, the same dynamic that preceded the web application security crisis in the 2000s. Training is the cheapest control for the most common mistakes. Completion data often comes from the same platform behind the KnowBe4 integration.

Where this comes from

Learning management system: completion records for AI security training modules.
Security awareness platform (KnowBe4, Proofpoint Security Awareness Training, SANS): AI security course completion rates.
Engineering org chart cross-referenced with the LMS: identify engineers on AI features who have not completed training.
Team inventory from AI platform and MLOps tooling: identify all users with AI development access.

How to calculate. (AI engineers, researchers, and data scientists with current AI security training) ÷ (total staff in those roles) × 100.

Status	Criteria
Green	≥90% of AI engineers with current training; content updated for new threat classes within 6 months of publication.
Amber	70% to <90%; or training content not updated in more than 18 months.
Red	<70%; or no AI-specific security training for engineers building AI systems.
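
Cross-referencing the role roster with LMS completions can be sketched as below. The 365-day definition of "current" is an assumption, as are the function name and data shapes. Substitute whatever validity period your training policy sets.

```python
from datetime import date, timedelta


def training_completion(
    roster: set[str],
    completions: dict[str, date],
    today: date,
    valid_days: int = 365,
) -> tuple[float, list[str]]:
    """Return the completion percentage and the people still missing current training."""
    cutoff = today - timedelta(days=valid_days)
    # Only people on the roster count; extra LMS records are ignored.
    current = {
        person for person in roster
        if person in completions and completions[person] >= cutoff
    }
    pct = 100.0 * len(current) / len(roster) if roster else 100.0
    return pct, sorted(roster - current)
```

Returning the names alongside the percentage turns the KRI into a follow-up list for engineering managers.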

8. AI regulatory compliance posture

What to measure. Readiness across applicable AI-specific regulatory requirements (EU AI Act risk classification, emerging US state AI laws, and sector guidance such as the FDA on AI in medical devices and the SEC on AI in financial advice), expressed as the percentage of applicable requirements with documented compliance procedures.

Why it matters. EU AI Act obligations phase in from 2025 through 2027, with requirements on providers and deployers of high-risk systems covering conformity assessments, registration, transparency, and human oversight. Knowing each system’s risk tier is the starting point for the auditor, the regulator, the insurer, and, when it comes up, the board. Deploying AI without regulatory mapping accumulates compliance debt that gets expensive to retrofit. A GRC platform like the one behind the Vanta integration can carry the AI compliance control mapping.

Where this comes from

Legal and compliance team: inventory of applicable AI regulations by jurisdiction and use case.
EU AI Act risk classification: self-assessment of each system’s tier (prohibited, high-risk, limited-risk, minimal-risk).
EU AI Act Article 6 and Annex III: the high-risk use case checklist.
Sector regulator guidance: FDA (AI/ML-based Software as a Medical Device), OCC and CFPB (AI in financial services), EEOC (AI in hiring).
GRC platform: AI compliance control mapping.

How to calculate. (Applicable AI requirements with documented compliance procedures) ÷ (total applicable requirements) × 100, with high-risk systems tracked separately for completed conformity documentation.

Status	Criteria
Green	Applicable regulations identified; AI systems classified by risk tier; high-risk systems with a compliance roadmap in place.
Amber	Regulatory inventory in progress; or AI systems deployed without classification.
Red	AI systems in production in regulated domains (hiring, medical, financial advice) with no regulatory compliance assessment.

Deriving these KRIs by source type

The same AI risk picture can be assembled from several vantage points. Most programs combine a few of these because no single source sees everything.

From code repositories and CI/CD

The codebase is where AI usage first appears, often before anyone files a ticket.

AI API call discovery: run grep -r "openai\|anthropic\|bedrock\|vertex" --include="*.py" --include="*.ts" --include="*.js" . in every repository.
Model weight downloads: identify Hugging Face download calls in code, then confirm integrity checks exist in the pipeline.
Prompt template audit: extract every system and user prompt template and review it for injection resistance.
Environment variable audit: AI API keys in the secrets scanner scope, with rotation age tracked.
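
Where a shell is not convenient, the same discovery can be sketched in Python so the results feed directly into an inventory. The keyword list mirrors the grep above. The function name and return shape are assumptions, and a plain keyword match will also hit comments and unrelated identifiers.

```python
import re
from pathlib import Path

# Same keywords as the grep-based discovery; extend for other providers.
AI_SDK_PATTERN = re.compile(r"openai|anthropic|bedrock|vertex")
SOURCE_SUFFIXES = {".py", ".ts", ".js"}


def find_ai_usage(root: Path) -> dict[str, list[int]]:
    """Map each source file under root to the line numbers that mention an AI SDK."""
    hits: dict[str, list[int]] = {}
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        try:
            lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        matched = [n for n, line in enumerate(lines, 1) if AI_SDK_PATTERN.search(line)]
        if matched:
            hits[path.relative_to(root).as_posix()] = matched
    return hits
```

Each file in the result is a candidate AI system to reconcile against the formal inventory.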

From LLM observability platforms

Platforms like Langfuse, LangSmith, Helicone, and Weights & Biases capture runtime behavior the code review cannot predict.

Token usage and request logs: an inventory of production LLM calls, model versions, and prompt patterns.
Tool call logs for agentic systems: every external action an agent takes, including file writes, API calls, and browser actions.
Output content analysis: regex and ML classifiers run over response content for PII, credentials, and internal data patterns.
Latency anomalies: unusual response patterns can indicate prompt injection driving extended reasoning loops.

From cloud AI platform consoles

AWS Bedrock, Azure OpenAI, and Google Vertex AI consoles expose who is using which models and where inference runs.

Service usage logs: CloudTrail, Azure Monitor, and Cloud Logging for AI service API calls, showing who is calling from where against which models.
IAM permissions: service identities with AI service access, reviewed for least privilege.
Data residency: the regions where inference runs, a direct regulatory compliance signal.
Model access logs: which foundation models are accessed and which custom fine-tuned versions are in use.

From network and proxy logs

For SaaS AI tools and direct API use, the network is often the only place shadow AI shows up.

Outbound API calls to AI providers: proxy and firewall logs revealing employee direct use of AI APIs without IT oversight.
Data exfiltration risk: large outbound payloads to AI providers from endpoints with sensitive data access.
Unauthorized AI SaaS: AI-powered SaaS tools connecting to company data sources without IT approval.
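
A sketch of the proxy-log pass, assuming each log entry has already been parsed into a dictionary with user, host, and bytes_out keys. That record shape, the sanctioned-identity set, and the 1 MB payload threshold are all assumptions. The host checks cover only the provider endpoints named earlier in this guide.

```python
from collections import Counter


def is_ai_host(host: str) -> bool:
    """Match the AI provider endpoints named in this guide."""
    host = host.lower().rstrip(".")
    if host in {"api.openai.com", "api.anthropic.com"}:
        return True
    # Bedrock runtime endpoints are regional, e.g. bedrock-runtime.us-east-1.amazonaws.com
    return host.startswith("bedrock-runtime.") and host.endswith(".amazonaws.com")


def shadow_ai_summary(
    records: list[dict],
    sanctioned_users: set[str],
    large_payload_bytes: int = 1_000_000,
) -> tuple[Counter, list[dict]]:
    """Count unsanctioned AI API calls per user and collect large outbound payloads."""
    calls: Counter = Counter()
    large: list[dict] = []
    for record in records:
        if not is_ai_host(record["host"]) or record["user"] in sanctioned_users:
            continue
        calls[record["user"]] += 1
        if record["bytes_out"] >= large_payload_bytes:
            large.append(record)
    return calls, large
```

The per-user counts feed the inventory KRI, while the large-payload list is the data egress signal worth escalating first.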

From red team and purple team exercises

Adversarial testing is the only way to know whether the guardrails hold against a motivated attacker.

Prompt injection test results: structured testing against the OWASP Top 10 for LLM Applications using Garak and PyRIT.
RAG poisoning tests: adversarial documents injected into retrieval sources to test indirect injection.
Agentic misuse tests: attempts to manipulate agents into unauthorized actions through crafted inputs.
Output filtering bypass tests: attempts to extract sensitive data from context windows through creative prompting.

The common mistake: treating a model file as inert data

Teams download a model from a public hub and load it like a CSV, when many model formats deserialize pickle objects that can execute code on load. Scan models before deployment, pin them to a hash, and verify provenance, the same discipline you already apply to third-party packages.
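
To make the risk concrete, the sketch below inspects a pickle stream without loading it, using the standard library's pickletools to list opcodes that import or call objects. The opcode set and function name are illustrative assumptions. Legitimate model pickles also use these opcodes, so dedicated scanners compare the imported names against an allowlist instead of flagging every occurrence.

```python
import pickle
import pickletools

# Opcodes that import a name or invoke a callable during unpickling.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def suspicious_opcodes(data: bytes) -> set[str]:
    """List code-invoking opcodes in a pickle stream without ever loading it."""
    return {
        opcode.name
        for opcode, _arg, _pos in pickletools.genops(data)
        if opcode.name in SUSPICIOUS_OPCODES
    }


class Payload:
    """Stand-in for a tampered object: unpickling it would call len('abc')."""

    def __reduce__(self):
        # A real attack would return something like os.system here; len is harmless.
        return (len, ("abc",))


plain = pickle.dumps({"layers": [1, 2, 3]})  # pure data, no callables
tampered = pickle.dumps(Payload())           # serializing does not execute anything
```

Calling suspicious_opcodes(plain) returns an empty set, while the tampered stream reports a global import and a REDUCE call, which is exactly the hook an attacker uses to run code at load time.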

See your AI security KRIs live

Draxis maps your AI system inventory, tracks data access scope against approved permissions, ingests model scan results, and watches for output leakage, computing each KRI continuously instead of once a quarter when someone builds a slide. No new agents, no manual data pulls.