Appendix D: Risk Scoring Matrix

A comprehensive methodology for assessing AI system risks based on probability of occurrence and severity of impact, aligned with enterprise risk management frameworks and EU AI Act risk classifications.

D.1 Introduction to AI Risk Scoring

The Risk Scoring Matrix provides a standardized methodology for evaluating and prioritizing AI-related risks across the organization. This framework enables consistent risk assessment across the AI lifecycle, from initial ideation through deployment and monitoring.

Purpose of Risk Scoring

  • Prioritization: Determine which risks require immediate attention vs. monitoring
  • Resource Allocation: Guide investment in risk mitigation based on severity
  • Governance Triggers: Define escalation thresholds for RAI Council review
  • Regulatory Alignment: Map internal risk levels to EU AI Act classifications
  • Communication: Provide a common language for discussing risk across stakeholders

📐 Core Risk Calculation Formula

Risk Score = Likelihood × Impact × Detectability Factor

Where:

  • Likelihood (L): Probability of the risk event occurring (1-5)
  • Impact (I): Severity of consequences if the risk materializes (1-5)
  • Detectability Factor (D): Ability to detect the risk before harm occurs (0.5-1.5; lower values reflect stronger detection and reduce the score)

This produces a composite score ranging from 0.5 to 37.5, which is then mapped to risk levels.
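Expressed as code, the formula is a one-liner. A minimal Python sketch (the function name and range checks are illustrative, not part of the framework):

```python
def risk_score(likelihood: int, impact: int, detectability: float = 1.0) -> float:
    """Composite score: Likelihood (1-5) x Impact (1-5) x Detectability (0.5-1.5)."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("likelihood and impact must be between 1 and 5")
    if not (0.5 <= detectability <= 1.5):
        raise ValueError("detectability factor must be between 0.5 and 1.5")
    return likelihood * impact * detectability

# The extremes quoted above: 1 x 1 x 0.5 = 0.5 and 5 x 5 x 1.5 = 37.5
print(risk_score(1, 1, 0.5), risk_score(5, 5, 1.5))  # 0.5 37.5
```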

D.2 The 5×5 Risk Matrix

The foundational risk matrix maps Likelihood against Impact to produce a base risk score. This is the primary tool for initial risk classification.

Likelihood ↓ / Impact →    1 Negligible    2 Minor        3 Moderate     4 Major        5 Catastrophic
5 Almost Certain (>90%)    5  MEDIUM       10 HIGH        15 HIGH        20 CRITICAL    25 CRITICAL
4 Likely (70-90%)          4  LOW           8 MEDIUM      12 HIGH        16 HIGH        20 CRITICAL
3 Possible (30-70%)        3  LOW           6 MEDIUM       9 MEDIUM      12 HIGH        15 HIGH
2 Unlikely (10-30%)        2  MINIMAL       4 LOW          6 MEDIUM       8 MEDIUM      10 HIGH
1 Rare (<10%)              1  MINIMAL       2 MINIMAL      3 LOW          4 LOW          5 MEDIUM

(Each cell shows the base score, Likelihood × Impact, and its corresponding risk level.)

Risk Level Definitions

CRITICAL (score ≥20; can reach 37.5 after the detectability adjustment): Unacceptable risk requiring immediate action. May trigger "stop the line" authority. Requires executive approval to proceed.
HIGH (score 10 to <20): Significant risk requiring active management and mitigation. RAI Council review is mandatory. Enhanced monitoring required.
MEDIUM (score 5 to <10): Moderate risk requiring documented mitigation plans and regular monitoring. Standard governance processes apply.
LOW (score 3 to <5): Low risk that can be managed through standard procedures. Periodic review recommended.
MINIMAL (score <3): Negligible risk requiring no special action. Document and continue normal operations.
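These bands translate directly into a lookup. A minimal sketch assuming the half-open thresholds above (so a fractional adjusted score such as 4.5 lands in LOW, matching Example 2 in D.9):

```python
def classify(score: float) -> str:
    """Map a base or adjusted risk score to a risk level."""
    for threshold, level in [(20, "CRITICAL"), (10, "HIGH"), (5, "MEDIUM"), (3, "LOW")]:
        if score >= threshold:
            return level
    return "MINIMAL"

print(classify(16))   # HIGH
print(classify(4.5))  # LOW (matches Example 2 in D.9)
```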

D.3 Likelihood Assessment Criteria

Use the following criteria to assess the probability that a specific AI risk will materialize:

Level 5: Almost Certain (>90% probability)

Indicators
  • Similar incidents have occurred frequently in the organization or industry
  • No controls or mitigations currently in place
  • Known active threats or vulnerabilities exist
  • Regulatory or technical changes make occurrence inevitable
  • Historical data shows >90% occurrence rate in similar systems
AI Examples
  • LLM generating factually incorrect information without RAG grounding
  • Biased outcomes from model trained on historically biased data without mitigation
  • Privacy violations when processing PII without anonymization controls

Level 4: Likely (70-90% probability)

Indicators
  • Similar incidents have occurred multiple times in the past
  • Partial controls exist but are known to be insufficient
  • High-risk environment with known threat actors
  • Significant operational complexity increases failure probability
AI Examples
  • Model drift causing performance degradation without monitoring
  • Prompt injection attacks on customer-facing chatbots
  • User misuse of generative AI tools for prohibited purposes

Level 3: Possible (30-70% probability)

Indicators
  • Incidents have occurred occasionally in similar contexts
  • Reasonable controls exist but gaps remain
  • Environmental factors create periodic risk windows
  • Moderate operational complexity
AI Examples
  • Disparate impact emerging in edge cases for well-tested models
  • Third-party model updates causing unexpected behavior
  • Data quality issues affecting model performance

Level 2: Unlikely (10-30% probability)

Indicators
  • Incidents are rare but have occurred in the industry
  • Strong controls exist with minor gaps
  • Well-understood and stable operating environment
  • Active monitoring and early warning systems in place
AI Examples
  • Adversarial attacks on well-defended enterprise systems
  • Model extraction from rate-limited APIs
  • Rare edge case failures in thoroughly tested models

Level 1: Rare (<10% probability)

Indicators
  • No known incidents in comparable contexts
  • Comprehensive, defense-in-depth controls in place
  • Theoretical risk with no practical exploitation path
  • Requires multiple simultaneous failures to occur
AI Examples
  • Complete model failure in systems with robust redundancy
  • Sophisticated state-actor attacks on non-critical systems
  • Black swan events affecting AI infrastructure
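Teams automating assessments can encode this scale directly. A sketch (enum and helper names are illustrative; boundary values are assigned to the lower band, since the ranges above meet at their endpoints):

```python
from enum import IntEnum

class Likelihood(IntEnum):
    """Likelihood levels with the probability bands defined above."""
    RARE = 1            # <10%
    UNLIKELY = 2        # 10-30%
    POSSIBLE = 3        # 30-70%
    LIKELY = 4          # 70-90%
    ALMOST_CERTAIN = 5  # >90%

def likelihood_from_probability(p: float) -> Likelihood:
    """Bucket an estimated probability of occurrence into a likelihood level."""
    if p > 0.90:
        return Likelihood.ALMOST_CERTAIN
    if p > 0.70:
        return Likelihood.LIKELY
    if p > 0.30:
        return Likelihood.POSSIBLE
    if p > 0.10:
        return Likelihood.UNLIKELY
    return Likelihood.RARE

print(likelihood_from_probability(0.75).name)  # LIKELY
```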

D.4 Impact Assessment Criteria

Assess impact across multiple dimensions, using the highest applicable severity level:

Level 5: Catastrophic Impact

Financial >$10M direct loss, or material impact to market value
Operational Complete business unit shutdown, >30 days to recover critical systems
Reputational International media coverage, sustained public outrage, executive resignation
Regulatory Criminal prosecution, license revocation, regulatory shutdown order
Safety/Rights Loss of life, permanent disability, mass civil rights violations
AI Examples
  • Autonomous vehicle AI causing fatal accident
  • AI-enabled medical misdiagnosis leading to patient death
  • Massive data breach exposing millions of sensitive records
  • Systematic discrimination affecting protected classes at scale

Level 4: Major Impact

Financial $1M-$10M direct loss, significant profit impact
Operational Major process disruption, 7-30 days recovery, significant resource diversion
Reputational National media coverage, customer trust significantly damaged, partner concerns
Regulatory Significant fines, formal enforcement action, mandatory remediation
Safety/Rights Serious injury, significant civil rights violation, documented harm to individuals
AI Examples
  • AI hiring system causing systematic employment discrimination
  • Credit scoring AI denying services based on protected characteristics
  • Facial recognition false positives leading to wrongful detentions
  • Generative AI producing large-scale misinformation

Level 3: Moderate Impact

Financial $100K-$1M direct loss, noticeable budget impact
Operational Process degradation, 1-7 days recovery, overtime required
Reputational Trade press coverage, customer complaints, internal morale impact
Regulatory Formal warnings, compliance audit triggered, documentation required
Safety/Rights Minor injury, individual rights violation, distress caused
AI Examples
  • Customer-facing AI providing incorrect professional advice
  • Recommendation system showing inappropriate content
  • AI chatbot sharing confidential information inappropriately
  • Model performance degradation affecting service quality

Level 2: Minor Impact

Financial $10K-$100K direct loss, within operational budget
Operational Temporary inconvenience, <24 hours recovery, workarounds available
Reputational Limited external awareness, some customer complaints, quickly resolved
Regulatory Informal inquiry, self-reported issue, no formal action
Safety/Rights No physical harm, inconvenience or frustration, easily remediated
AI Examples
  • AI assistant providing unhelpful but harmless responses
  • Personalization algorithm showing irrelevant content
  • Temporary AI service outage with graceful degradation
  • Minor data quality issues affecting analytics accuracy

Level 1: Negligible Impact

Financial <$10K direct loss, absorbed in normal operations
Operational Minimal disruption, immediate recovery, no workarounds needed
Reputational No external awareness, internal only, quickly forgotten
Regulatory No regulatory implications
Safety/Rights No harm, no complaints
AI Examples
  • Internal AI tool producing slightly suboptimal results
  • Spam filter with occasional false positives
  • Development model failing during testing (as expected)
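The financial dimension has crisp thresholds and lends itself to automated bucketing. A sketch covering only that dimension (helper name is illustrative; recall that the overall impact level is the highest across all applicable dimensions):

```python
# Upper bound of direct financial loss for each impact level, in USD (from D.4).
FINANCIAL_THRESHOLDS = [
    (10_000, 1),      # Negligible: <$10K
    (100_000, 2),     # Minor: $10K-$100K
    (1_000_000, 3),   # Moderate: $100K-$1M
    (10_000_000, 4),  # Major: $1M-$10M
]

def impact_from_financial_loss(loss_usd: float) -> int:
    """Map an estimated direct financial loss to an impact level (1-5).
    Covers the financial dimension only; take the max across dimensions."""
    for bound, level in FINANCIAL_THRESHOLDS:
        if loss_usd < bound:
            return level
    return 5  # Catastrophic: >$10M

print(impact_from_financial_loss(250_000))  # 3
```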

D.5 Detectability Factor (Risk Modifier)

The Detectability Factor adjusts the base risk score based on how quickly and reliably the risk can be detected before significant harm occurs.

1.5× (Undetectable): No mechanism to detect the risk before harm occurs; discovery only after significant damage. Examples: subtle bias emerging over time; slow model drift; privacy violations with no audit trail.
1.25× (Difficult to Detect): Detection is possible but requires specialized effort, or occurs only after partial harm. Examples: complex adversarial attacks; sophisticated prompt injection; data poisoning in training.
1.0× (Moderately Detectable): Standard monitoring will likely detect the risk, but not immediately. Examples: performance degradation; obvious hallucinations; standard security events.
0.75× (Easily Detectable): Robust monitoring and alerting systems provide early warning. Examples: input validation failures; rate limit violations; known attack signatures.
0.5× (Immediately Detectable): Real-time detection with automatic response; the risk is caught before any harm. Examples: hard-coded guardrails blocking prohibited content; circuit breakers; kill switches.

📊 Adjusted Risk Score Calculation

Adjusted Score = (Likelihood × Impact) × Detectability Factor

Example: A risk with Likelihood=4, Impact=3, and Difficult Detectability:

(4 × 3) × 1.25 = 12 × 1.25 = 15 → HIGH RISK

The same risk with robust detection (Factor=0.75):

(4 × 3) × 0.75 = 12 × 0.75 = 9 → MEDIUM RISK
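The same arithmetic in code, with the factors from the table above as named constants (dictionary keys are illustrative):

```python
# Detectability factors from the D.5 table.
DETECTABILITY = {
    "undetectable": 1.5,
    "difficult": 1.25,
    "moderate": 1.0,
    "easy": 0.75,
    "immediate": 0.5,
}

likelihood, impact = 4, 3
base = likelihood * impact                 # 12

# Difficult detection pushes the risk up a band.
print(base * DETECTABILITY["difficult"])   # 15.0 -> HIGH (>=10)

# Robust detection pulls the same risk down a band.
print(base * DETECTABILITY["easy"])        # 9.0 -> MEDIUM (>=5, <10)
```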

D.6 Risk Response Actions by Level

CRITICAL
Response Actions:
  • Immediate escalation to executive leadership
  • Consider "stop the line" authority
  • Emergency mitigation plan required
  • Daily status reporting until resolved
  • May require project cancellation
Governance Requirements:
  • CAIO/CEO approval required to proceed
  • RAI Council emergency session
  • Board notification (if material)
  • Legal/Compliance review mandatory
Timeline: Immediate action within 24 hours

HIGH
Response Actions:
  • Active risk management required
  • Detailed mitigation plan with milestones
  • Enhanced monitoring and controls
  • Regular status reporting
  • Human oversight requirements
Governance Requirements:
  • RAI Council review and approval
  • Executive sponsor accountability
  • Risk owner assigned
  • Documented acceptance if residual risk accepted
Timeline: Mitigation plan within 7 days

MEDIUM
Response Actions:
  • Documented mitigation plan
  • Standard controls and monitoring
  • Periodic risk review
  • Include in project risk register
Governance Requirements:
  • Model Owner approval sufficient
  • RAI Council notification
  • Included in routine governance reporting
Timeline: Mitigation plan within 30 days

LOW
Response Actions:
  • Accept or implement simple mitigations
  • Standard operating procedures
  • Routine monitoring
Governance Requirements:
  • Team-level approval
  • Documented in risk register
Timeline: Address in normal project cycle

MINIMAL
Response Actions:
  • Accept the risk
  • No special action required
  • Document and proceed
Governance Requirements:
  • No governance requirements
  • Optional documentation
Timeline: No deadline
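Workflow or ticketing systems can key off the risk level directly. A sketch encoding the timelines and RAI Council actions from the table above (field names are illustrative):

```python
# Mitigation deadlines by risk level, from the D.6 table.
RESPONSE_TIMELINE = {
    "CRITICAL": "Immediate action within 24 hours",
    "HIGH": "Mitigation plan within 7 days",
    "MEDIUM": "Mitigation plan within 30 days",
    "LOW": "Address in normal project cycle",
    "MINIMAL": "No deadline",
}

# RAI Council involvement required at each level (None = no involvement).
RAI_COUNCIL_ACTION = {
    "CRITICAL": "emergency session",
    "HIGH": "review and approval",
    "MEDIUM": "notification",
    "LOW": None,
    "MINIMAL": None,
}

level = "HIGH"  # e.g., the output of classify() from the D.2 sketch
print(RESPONSE_TIMELINE[level])   # Mitigation plan within 7 days
print(RAI_COUNCIL_ACTION[level])  # review and approval
```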

D.7 AI-Specific Risk Categories

When applying the risk matrix, consider these AI-specific risk categories:

🎯 Fairness & Discrimination Risks

Risk Type
  • Disparate impact on protected groups
  • Historical bias perpetuation
  • Proxy discrimination
  • Feedback loop amplification
Key Factors
  • Likelihood Increase: Training data imbalances, lack of fairness testing
  • Impact Increase: High-stakes decisions (hiring, credit, healthcare)
  • Detectability Challenge: Subtle biases may emerge gradually

🔒 Security & Adversarial Risks

Risk Type
  • Adversarial input attacks
  • Prompt injection / jailbreaking
  • Model extraction / theft
  • Data poisoning
  • Privacy inference attacks
Key Factors
  • Likelihood Increase: Public-facing systems, valuable IP
  • Impact Increase: Sensitive data, critical infrastructure
  • Detectability Challenge: Sophisticated attacks may evade detection

⚡ Reliability & Performance Risks

Risk Type
  • Model drift (data and concept)
  • Performance degradation
  • Hallucination / confabulation
  • Edge case failures
  • Cascading failures
Key Factors
  • Likelihood Increase: Dynamic environments, long deployment cycles
  • Impact Increase: Mission-critical applications
  • Detectability Opportunity: Can be monitored with proper tooling

🔐 Privacy & Data Protection Risks

Risk Type
  • Training data memorization
  • Unauthorized data exposure
  • Re-identification attacks
  • Cross-context data use
  • Inadequate consent
Key Factors
  • Likelihood Increase: Large training datasets, PII processing
  • Impact Increase: Sensitive categories (health, financial)
  • Regulatory Multiplier: GDPR violations can increase severity

📋 Compliance & Legal Risks

Risk Type
  • EU AI Act non-compliance
  • Sector-specific regulation violations
  • IP infringement in training/outputs
  • Contractual breaches
  • Lack of required documentation
Key Factors
  • Likelihood Increase: Unclear regulatory guidance, rapid deployment
  • Impact Increase: Regulated industries, EU market exposure
  • Evolution Risk: Regulatory landscape rapidly changing
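For risk registers, the categories above can be held as a simple taxonomy. A sketch containing only the category names and risk types from this section (key names are illustrative):

```python
# AI-specific risk categories and their constituent risk types (from D.7).
AI_RISK_CATEGORIES = {
    "fairness_discrimination": [
        "disparate impact", "historical bias perpetuation",
        "proxy discrimination", "feedback loop amplification",
    ],
    "security_adversarial": [
        "adversarial input attacks", "prompt injection / jailbreaking",
        "model extraction / theft", "data poisoning", "privacy inference attacks",
    ],
    "reliability_performance": [
        "model drift", "performance degradation", "hallucination / confabulation",
        "edge case failures", "cascading failures",
    ],
    "privacy_data_protection": [
        "training data memorization", "unauthorized data exposure",
        "re-identification attacks", "cross-context data use", "inadequate consent",
    ],
    "compliance_legal": [
        "EU AI Act non-compliance", "sector-specific regulation violations",
        "IP infringement in training/outputs", "contractual breaches",
        "lack of required documentation",
    ],
}

# Tag each assessed risk with a category so registers can be filtered.
assert "prompt injection / jailbreaking" in AI_RISK_CATEGORIES["security_adversarial"]
```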

D.8 Mapping to EU AI Act Risk Categories

The internal risk scoring framework aligns with EU AI Act classifications:

CRITICAL (score ≥20) → Prohibited or High-Risk (Annex III)
  • If prohibited → Do not deploy
  • If high-risk → Full conformity assessment, CE marking, EU database registration
  • Risk management system required
  • Human oversight mandatory

HIGH (score 10 to <20) → High-Risk (Annex III)
  • Conformity assessment (self or third-party)
  • Quality management system
  • Technical documentation
  • Logging and monitoring
  • Human oversight

MEDIUM (score 5 to <10) → Limited Risk
  • Transparency requirements (AI disclosure)
  • User notification when interacting with AI
  • Deepfake labeling requirements

LOW (score 3 to <5) → Minimal Risk
  • No mandatory requirements
  • Voluntary codes of conduct encouraged

MINIMAL (score <3) → Minimal Risk
  • No mandatory requirements
D.9 Worked Examples

Example 1: AI Resume Screening Tool

Context: Automated resume screening for technical positions, affects hiring decisions.

Likelihood: 4 (Likely). Historical data shows gender and ethnic bias in resume screening is common without mitigation.
Impact: 4 (Major). Employment discrimination carries significant legal, reputational, and individual harm potential.
Base Score: 4 × 4 = 16
Detectability Factor: 1.25× (Difficult to Detect). Bias may emerge gradually and requires specialized analysis to surface.
Final Score: 16 × 1.25 = 20 → CRITICAL RISK

Required Actions: RAI Council review, comprehensive bias testing across all protected characteristics, mandatory human review of AI decisions, continuous fairness monitoring, EU AI Act high-risk compliance.

Example 2: Customer Service Chatbot

Context: LLM-powered chatbot for general customer inquiries, no financial transactions.

Likelihood: 3 (Possible). LLMs can occasionally provide incorrect information.
Impact: 2 (Minor). Customer inconvenience, easily corrected, no financial impact.
Base Score: 3 × 2 = 6
Detectability Factor: 0.75× (Easily Detectable). Customer feedback, conversation logs, and quality sampling provide early warning.
Final Score: 6 × 0.75 = 4.5 → LOW RISK

Required Actions: Standard guardrails, AI disclosure to users, escalation path to human agents, periodic quality review.

Example 3: Medical Diagnosis Support

Context: AI system providing diagnostic suggestions to physicians for rare diseases.

Likelihood: 3 (Possible). Rare disease diagnosis is inherently difficult.
Impact: 5 (Catastrophic). An incorrect diagnosis could lead to patient harm or death.
Base Score: 3 × 5 = 15
Detectability Factor: 1.0× (Moderately Detectable). Physician review catches most errors, but subtle ones may be missed.
Final Score: 15 × 1.0 = 15 → HIGH RISK

Required Actions: Mandatory HITL with physician final decision, comprehensive clinical validation, EU AI Act high-risk compliance, FDA/CE medical device requirements, robust uncertainty quantification.
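All three worked examples can be reproduced in a few lines. A minimal self-contained sketch restating the scoring and classification helpers from D.1 and D.2:

```python
def risk_score(l: int, i: int, d: float) -> float:
    """Likelihood x Impact x Detectability Factor (see D.1)."""
    return l * i * d

def classify(score: float) -> str:
    """Risk level bands from D.2."""
    for threshold, level in [(20, "CRITICAL"), (10, "HIGH"), (5, "MEDIUM"), (3, "LOW")]:
        if score >= threshold:
            return level
    return "MINIMAL"

examples = [
    ("AI resume screening",       4, 4, 1.25),  # Example 1
    ("Customer service chatbot",  3, 2, 0.75),  # Example 2
    ("Medical diagnosis support", 3, 5, 1.0),   # Example 3
]

for name, l, i, d in examples:
    s = risk_score(l, i, d)
    print(f"{name}: ({l} x {i}) x {d} = {s:g} -> {classify(s)}")

# AI resume screening: (4 x 4) x 1.25 = 20 -> CRITICAL
# Customer service chatbot: (3 x 2) x 0.75 = 4.5 -> LOW
# Medical diagnosis support: (3 x 5) x 1.0 = 15 -> HIGH
```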

🧮 Interactive Risk Calculator

The scoring logic above can be wrapped in a small command-line calculator for quickly assessing AI system risks. A self-contained Python sketch (prompts and helper names are illustrative, restating the D.2 bands):

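```python
def classify(score: float) -> str:
    """Risk level bands from D.2."""
    for threshold, level in [(20, "CRITICAL"), (10, "HIGH"), (5, "MEDIUM"), (3, "LOW")]:
        if score >= threshold:
            return level
    return "MINIMAL"

def ask(prompt: str, lo: float, hi: float) -> float:
    """Prompt until the user enters a number within [lo, hi]."""
    while True:
        try:
            value = float(input(prompt))
            if lo <= value <= hi:
                return value
        except ValueError:
            pass
        print(f"Please enter a number between {lo:g} and {hi:g}.")

if __name__ == "__main__":
    likelihood = ask("Likelihood (1-5): ", 1, 5)
    impact = ask("Impact (1-5): ", 1, 5)
    factor = ask("Detectability factor (0.5-1.5): ", 0.5, 1.5)
    base = likelihood * impact
    adjusted = base * factor
    print(f"Base score: {base:g}")
    print(f"Adjusted score: {adjusted:g} -> {classify(adjusted)} RISK")
```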
