Appendix E: Glossary of AI Ethics Terms
Comprehensive reference for responsible AI terminology, regulations, and concepts
Quick Reference Guide
🏛️ Key Regulations
- EU AI Act
- GDPR
- US Executive Order 14110
- NIST AI RMF
- EEOC Guidelines
⚖️ Core Principles
- Fairness & Non-discrimination
- Transparency & Explainability
- Privacy & Data Protection
- Safety & Reliability
- Accountability & Oversight
📊 Risk Categories
- Prohibited AI
- High-Risk AI
- Limited Risk AI
- Minimal Risk AI
🛡️ Key Controls
- Human-in-the-Loop (HITL)
- Algorithmic Impact Assessment
- Model Cards
- Red Teaming
- Bias Testing
Categories: Regulatory · Technical · Governance · Fairness · Privacy · Security · GenAI/LLM
A
Accountability
The obligation of individuals, organizations, and systems to be answerable for actions, decisions, and their consequences. In AI contexts, this includes clear assignment of responsibility for AI system outcomes, mechanisms for redress, and documentation of decision-making processes.
Adversarial Attack
A technique that attempts to manipulate AI systems by introducing malicious inputs designed to cause incorrect outputs. Types include evasion attacks (manipulating inputs to avoid detection), poisoning attacks (corrupting training data), and model extraction attacks (stealing model parameters).
AI BOM — AI Bill of Materials
A comprehensive inventory documenting all components of an AI system, including training datasets, model architectures, third-party components, APIs, and dependencies. Analogous to a software bill of materials (SBOM), it enables supply chain transparency and risk assessment.
AIA — Algorithmic Impact Assessment
A systematic evaluation process that identifies, analyzes, and documents the potential impacts of an AI system before deployment. Includes assessment of affected populations, potential harms, fairness implications, and mitigation measures. Required under various regulations for high-risk AI systems.
AGI — Artificial General Intelligence
A hypothetical form of AI that possesses human-level cognitive abilities across all domains, capable of performing any intellectual task that a human can. Distinguished from current "narrow AI" systems that excel at specific tasks. Not yet achieved as of 2025.
Audit Trail
A chronological record that provides documentary evidence of the sequence of activities affecting an AI system at any time. Includes logs of data access, model changes, predictions made, and human interventions. Essential for compliance and incident investigation.
Automation Bias
The tendency of humans to over-rely on automated systems, often ignoring contradictory information or failing to critically evaluate algorithmic outputs. A key concern when implementing human-in-the-loop controls, as human reviewers may rubber-stamp AI recommendations.
B
Bias (Algorithmic)
Systematic errors in AI outputs that create unfair outcomes, typically favoring or disadvantaging particular groups. Types include: Historical bias (patterns from biased past data), Representation bias (underrepresentation of groups in training data), Measurement bias (features that proxy for protected characteristics), and Evaluation bias (testing on non-representative datasets).
Black Box Model
An AI system whose internal decision-making process is opaque or not interpretable to users. Complex models like deep neural networks and large language models are often considered black boxes because the relationship between inputs and outputs cannot be easily explained in human-understandable terms.
Biometric Data
Personal data resulting from specific technical processing relating to physical, physiological, or behavioral characteristics (e.g., fingerprints, facial images, voice patterns, gait). Under GDPR and EU AI Act, biometric data is considered a special category requiring enhanced protections. Biometric identification in public spaces is heavily regulated.
C
Concept Drift
A change in the statistical relationship between input features and target variables over time, causing model performance degradation. Unlike data drift, concept drift means the underlying patterns the model learned are no longer valid. Requires retraining or continuous learning approaches.
Conformity Assessment
Under the EU AI Act, the process of verifying that an AI system meets the requirements set out in the regulation. For high-risk AI, this includes internal controls and, for certain categories, third-party assessment by a notified body. Results in CE marking for compliant products.
Constitutional AI
An AI alignment technique where models are trained to follow a set of principles (a "constitution") during training. The model critiques and revises its own outputs based on these principles, reducing harmful content without extensive human feedback. Pioneered by Anthropic.
Counterfactual Fairness
A fairness criterion stating that a decision is fair if it would have been the same had the individual belonged to a different demographic group, all else being equal. Based on causal reasoning: what would the outcome be in a counterfactual world where only the protected attribute changed?
D
Data Drift
A change in the statistical properties of the input data used by an AI model compared to the training data distribution. Can cause model performance degradation even if the underlying relationships remain unchanged. Monitored through statistical tests and distribution comparisons.
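A minimal sketch of one common drift check, assuming NumPy and SciPy are available: a two-sample Kolmogorov–Smirnov test comparing a feature's training distribution against recent production data (variable names are illustrative).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # feature values seen at training time
live_feature = rng.normal(loc=0.4, scale=1.0, size=1_000)   # recent production values, shifted

result = stats.ks_2samp(train_feature, live_feature)        # two-sample KS test
if result.pvalue < 0.01:
    print(f"Possible data drift: KS statistic={result.statistic:.3f}, p={result.pvalue:.2e}")
```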
Data Lineage
A comprehensive record of data's origins, movements, characteristics, and transformations throughout its lifecycle. Includes source systems, processing steps, quality checks, and downstream uses. Essential for debugging, compliance, and understanding data quality issues.
Data Minimization
A privacy principle under GDPR (Article 5) requiring that personal data collected and processed be limited to what is directly necessary for the specified purpose. In AI contexts, this means using only the features strictly required for model functionality and not collecting data "just in case."
DPIA — Data Protection Impact Assessment
A process mandated under GDPR Article 35 for assessing the risks of data processing activities, particularly those using new technologies or involving systematic evaluation, large-scale processing of special categories, or public monitoring. Required before processing that poses high risk to individuals' rights.
Deep Learning
A subset of machine learning using artificial neural networks with multiple layers (hence "deep") to learn hierarchical representations from data. Powers modern AI breakthroughs in computer vision, natural language processing, and speech recognition. Generally less interpretable than traditional ML methods.
Demographic Parity
A fairness metric requiring that the proportion of individuals receiving a positive outcome be equal across different demographic groups. Also called "statistical parity" or "group fairness." Formula: P(Ŷ=1|A=0) = P(Ŷ=1|A=1). May conflict with other fairness metrics like equalized odds.
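A minimal sketch, assuming only NumPy, of how the demographic parity gap between two groups can be computed from binary predictions (variable names are illustrative):

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between groups 0 and 1."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate_0 = y_pred[group == 0].mean()  # P(Y_hat = 1 | A = 0)
    rate_1 = y_pred[group == 1].mean()  # P(Y_hat = 1 | A = 1)
    return abs(rate_0 - rate_1)

# A gap of 0 means both groups receive positive outcomes at the same rate.
print(demographic_parity_gap(y_pred=[1, 0, 1, 1, 0, 1], group=[0, 0, 0, 1, 1, 1]))  # 0.0
```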
Differential Privacy
A mathematical framework for ensuring that statistical queries on a dataset do not reveal information about any individual. Achieved by adding calibrated noise to outputs. Provides provable privacy guarantees with a privacy budget (ε - epsilon). Used by Apple, Google, and US Census Bureau.
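A minimal sketch of the Laplace mechanism for a counting query (sensitivity 1), assuming only NumPy; epsilon is the privacy budget, with smaller values giving stronger privacy and noisier answers:

```python
import numpy as np

def private_count(indicator_values, epsilon, sensitivity=1.0):
    """Return a count with Laplace noise calibrated to satisfy epsilon-differential privacy."""
    true_count = float(np.sum(indicator_values))
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

over_40 = [1, 0, 1, 1, 0, 1, 1]            # illustrative 0/1 records
print(private_count(over_40, epsilon=0.5))  # true count is 5; the output is noisy
```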
Disparate Impact
An unintentional discrimination pattern where a seemingly neutral policy or practice adversely affects a protected group. In US employment law, the "four-fifths rule" indicates potential disparate impact if a selection rate for any group is less than 80% of the highest group's rate. Key legal standard for AI fairness.
E
EEOC — Equal Employment Opportunity Commission
US federal agency enforcing civil rights laws against workplace discrimination. Has issued guidance on AI in hiring, warning that employers remain liable for discriminatory outcomes even when using third-party AI tools. Key enforcer of Title VII compliance for AI employment systems.
Embedding
A dense vector representation of data (text, images, etc.) in a continuous vector space where semantically similar items are mapped to nearby points. Used in LLMs to represent words/sentences and in recommendation systems. Critical for RAG (Retrieval-Augmented Generation) implementations.
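A minimal sketch, assuming only NumPy, of comparing embeddings with cosine similarity; the vectors here are toy values, whereas real embeddings come from an embedding model or API:

```python
import numpy as np

def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cat    = [0.90, 0.10, 0.30]   # toy embedding vectors, not from a real model
kitten = [0.85, 0.15, 0.35]
car    = [0.10, 0.90, 0.20]

print(cosine_similarity(cat, kitten))  # close to 1.0: semantically similar
print(cosine_similarity(cat, car))     # noticeably lower: less similar
```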
Equalized Odds
A fairness metric requiring that true positive rates and false positive rates be equal across demographic groups. More nuanced than demographic parity as it considers predictive accuracy. Formula: P(Ŷ=1|A=0,Y=y) = P(Ŷ=1|A=1,Y=y) for y ∈ {0,1}.
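A minimal sketch, assuming only NumPy, of the per-group true positive and false positive rate gaps; both gaps are 0 under perfect equalized odds (variable names are illustrative):

```python
import numpy as np

def equalized_odds_gaps(y_true, y_pred, group):
    """Return (TPR gap, FPR gap) between groups 0 and 1."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    def rates(g):
        yt, yp = y_true[group == g], y_pred[group == g]
        tpr = yp[yt == 1].mean()  # P(Y_hat = 1 | Y = 1, A = g)
        fpr = yp[yt == 0].mean()  # P(Y_hat = 1 | Y = 0, A = g)
        return tpr, fpr
    (tpr0, fpr0), (tpr1, fpr1) = rates(0), rates(1)
    return abs(tpr0 - tpr1), abs(fpr0 - fpr1)
```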
EU AI Act
The European Union's comprehensive AI regulation (Regulation 2024/1689), effective August 2024 with phased implementation through 2027. Establishes a risk-based framework classifying AI systems as Prohibited, High-Risk, Limited Risk, or Minimal Risk. Sets requirements for transparency, human oversight, data governance, and conformity assessment.
Explainability (XAI)
The ability to describe an AI system's internal mechanics and decision-making process in human-understandable terms. Ranges from global explanations (overall model behavior) to local explanations (specific predictions). Techniques include SHAP, LIME, attention visualization, and decision trees as surrogates.
F
Fairness Metrics
Quantitative measures used to evaluate whether an AI system treats different groups equitably. Key metrics include demographic parity, equalized odds, predictive parity, calibration, and disparate impact ratio. Note: different fairness metrics can be mathematically incompatible—perfect performance on one may preclude achieving another.
Feature Importance
A measure of how much each input variable (feature) contributes to a model's predictions. Methods include permutation importance, mean decrease in impurity (for tree-based models), and SHAP values. Critical for understanding which factors drive AI decisions and identifying potential bias sources.
Federated Learning
A privacy-preserving machine learning approach where models are trained across decentralized devices or servers holding local data, without exchanging raw data. Only model updates (gradients) are shared. Used by Google (keyboard prediction) and healthcare consortia to enable collaboration while protecting data privacy.
Fine-Tuning
The process of further training a pre-trained AI model on a specific dataset to adapt it for a particular task or domain. Common for LLMs, where a general-purpose model is fine-tuned on domain-specific data. Requires careful governance as fine-tuning can introduce new biases or capabilities.
Four-Fifths Rule (80% Rule)
A legal guideline from the EEOC stating that the selection rate for any protected group should be at least 80% of the rate for the group with the highest selection rate to avoid a presumption of disparate impact. Formula: (selection rate for the group in question) ÷ (selection rate for the highest-selected group) ≥ 0.8.
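A minimal sketch, assuming only NumPy, of applying the four-fifths rule to selection outcomes; it returns each group's impact ratio against the highest-selected group and whether that ratio clears the 0.8 threshold:

```python
import numpy as np

def four_fifths_check(selected, group):
    """Impact ratio (group rate / highest group rate) and pass/fail for each group."""
    selected, group = np.asarray(selected), np.asarray(group)
    rates = {g: selected[group == g].mean() for g in np.unique(group)}
    highest = max(rates.values())
    return {g: (rate / highest, rate / highest >= 0.8) for g, rate in rates.items()}

# Group "B" is selected at 20% vs. 60% for group "A": ratio ~0.33, below 0.8.
print(four_fifths_check(selected=[1, 1, 0, 1, 0, 0, 1, 0, 0, 0],
                        group=["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]))
```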
Foundation Model
A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks. Examples include GPT-4, Claude, LLaMA, and BERT. Under the EU AI Act, providers of foundation models have specific transparency and documentation obligations regardless of downstream use case risk.
G
GDPR — General Data Protection Regulation
EU regulation (2016/679) establishing comprehensive data protection requirements, in effect since May 2018. Key AI-relevant provisions include Article 22 (the right not to be subject to solely automated decisions with legal or similarly significant effects), Articles 13–14 (the right to information about the logic of automated decision-making), and Articles 17 and 21 (the rights to erasure and to object).
Generative AI
AI systems capable of creating new content (text, images, audio, video, code) rather than just analyzing or classifying existing data. Includes Large Language Models (LLMs), image generators (DALL-E, Midjourney), and multimodal models. Raises unique ethical issues around content authenticity, IP, and misuse.
GPAI — General-Purpose AI
Under the EU AI Act, AI models trained with large amounts of data using self-supervision at scale that display significant generality and can competently perform a wide range of distinct tasks. GPAI models have specific transparency obligations, and those posing systemic risks face additional requirements including adversarial testing and incident reporting.
Green AI
An approach to AI development prioritizing energy efficiency and environmental sustainability. Involves measuring and reducing the carbon footprint of training and inference, choosing efficient architectures, using renewable energy, and reporting environmental impact. Contrasted with "Red AI" that optimizes only for accuracy regardless of compute.
Guardrails
Technical controls and policies that constrain AI system behavior within acceptable boundaries. For LLMs, includes input filters (blocking harmful prompts), output filters (preventing toxic content), topic restrictions, and safety classifiers. Part of a defense-in-depth approach to AI safety.
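An illustrative sketch of layered guardrails around an LLM call. The llm_generate and moderation_check callables are hypothetical stand-ins for whatever model API and safety classifier are in use, and the keyword list is only a placeholder for a real input classifier:

```python
BLOCKED_TOPICS = {"weapons manufacturing", "self-harm instructions"}  # placeholder list

def guarded_completion(user_prompt, llm_generate, moderation_check):
    # Input filter: reject prompts touching blocked topics before they reach the model.
    if any(topic in user_prompt.lower() for topic in BLOCKED_TOPICS):
        return "I can't help with that request."
    response = llm_generate(user_prompt)          # call the underlying model
    # Output filter: run the draft response through a safety/moderation classifier.
    if not moderation_check(response):
        return "The response was withheld by the safety filter."
    return response
```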
H
Hallucination
When an AI system, particularly an LLM, generates content that is factually incorrect, nonsensical, or fabricated but presented with apparent confidence. A key reliability challenge for generative AI. Mitigated through techniques like RAG (Retrieval-Augmented Generation), fact-checking layers, and citation requirements.
High-Risk AI
Under the EU AI Act, AI systems in sensitive domains requiring compliance with strict requirements. Includes: biometric identification, critical infrastructure management, educational/vocational assessment, employment/HR decisions, essential services access, law enforcement, migration/asylum, justice administration. Must undergo conformity assessment before market placement.
HITL — Human-in-the-Loop
A system design where human judgment is required at specific decision points before AI recommendations are executed. The human reviews and approves (or overrides) each AI output. Provides strong oversight but may create bottlenecks and is subject to automation bias. Required for many high-risk AI applications.
HOTL — Human-over-the-Loop
A system design where humans monitor AI operations and can intervene when necessary, but the AI operates autonomously for routine decisions. Humans set parameters, review aggregated outcomes, and handle exceptions. Balances efficiency with oversight but requires robust monitoring systems to detect issues.
Human Rights Impact Assessment
An evaluation of how an AI system may affect fundamental human rights as defined by international frameworks (the Universal Declaration of Human Rights, the European Convention on Human Rights). Considers impacts on dignity, privacy, freedom of expression, non-discrimination, access to justice, and other protected rights. Increasingly integrated into AI governance.
I
Inference
The process of using a trained AI model to make predictions or generate outputs on new data. Distinguished from training (learning from data). Inference costs (compute, latency, energy) are a key operational consideration, especially for deployed models at scale. Also refers to logical reasoning in symbolic AI.
Interpretability
The degree to which a human can understand the cause of a model's decision. Intrinsically interpretable models (linear regression, decision trees) provide direct insight into decision logic. Complex models require post-hoc interpretation methods. Distinct from but related to explainability (the ability to communicate understanding to others).
K
Know Your Algorithm (KYA)
A governance principle requiring organizations to thoroughly understand the AI systems they deploy, including training data, model architecture, limitations, and failure modes. Analogous to "Know Your Customer" in finance. Essential for vendor AI procurement and regulatory compliance.
L
LLM — Large Language Model
A type of AI model trained on massive text corpora to understand and generate human language. Based on transformer architecture with billions of parameters. Examples include GPT-4, Claude, LLaMA, and Gemini. Capable of various tasks including writing, coding, analysis, and conversation. Raises unique governance challenges around hallucination, prompt injection, and content moderation.
LIME — Local Interpretable Model-agnostic Explanations
An explainability technique that explains individual predictions by approximating the complex model locally with a simpler, interpretable model. Creates perturbations of the input and observes how predictions change. Model-agnostic—works with any classifier. Provides feature importance for specific decisions.
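A brief sketch assuming the open-source lime and scikit-learn packages are installed (APIs can vary between versions): a tabular explainer approximates a random forest locally around one prediction and reports the most influential features.

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data, feature_names=data.feature_names, class_names=list(data.target_names))
explanation = explainer.explain_instance(data.data[0], model.predict_proba, num_features=4)
print(explanation.as_list())  # feature contributions for this single prediction
```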
M
Machine Learning (ML)
A subset of AI where systems learn patterns from data rather than being explicitly programmed. Types include supervised learning (labeled data), unsupervised learning (pattern discovery), and reinforcement learning (reward-based). Modern AI advances are largely driven by ML techniques.
Model Card
A standardized documentation format for ML models, pioneered by Google and adopted by Hugging Face. Includes model details, intended use, limitations, training data, performance metrics, ethical considerations, and deployment guidance. Promotes transparency and enables informed use decisions. Required documentation under EU AI Act for high-risk systems.
Model Extraction Attack
A security attack where an adversary queries an AI model repeatedly to reconstruct a functional copy of the model. Threatens intellectual property and enables subsequent attacks. Defenses include rate limiting, query monitoring, output perturbation, and watermarking. Also called model stealing.
Model Monitoring
Continuous surveillance of deployed AI models to detect performance degradation, drift, bias emergence, and anomalies. Includes tracking accuracy metrics, input/output distributions, fairness measures, and system health. Essential for maintaining reliable, compliant AI systems in production.
Membership Inference Attack
A privacy attack that determines whether a specific data record was used to train a model. Exploits the fact that models often behave differently on training data vs. unseen data. Particularly concerning for models trained on sensitive personal data. Mitigated by differential privacy and regularization.
N
NIST AI RMF — NIST AI Risk Management Framework
A voluntary framework published by the US National Institute of Standards and Technology (2023) for managing AI risks throughout the AI lifecycle. Organized around four core functions: Govern, Map, Measure, and Manage. Widely referenced in US government procurement and corporate AI governance programs.
O
Overfitting
A modeling error where a model learns the training data too well, including noise and outliers, resulting in poor generalization to new data. Indicators include high training accuracy but low validation/test accuracy. Addressed through regularization, cross-validation, and early stopping. A reliability concern for production AI.
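A brief sketch assuming scikit-learn: an unconstrained decision tree memorizes the training split, so training accuracy is near perfect while validation accuracy lags, the classic overfitting signature.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # no depth limit
print("train accuracy:     ", model.score(X_train, y_train))  # ~1.0 (memorized)
print("validation accuracy:", model.score(X_val, y_val))      # noticeably lower
```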
P
Poisoning Attack (Data Poisoning)
An attack where adversaries corrupt training data to manipulate model behavior. Types include backdoor attacks (triggering specific misclassifications) and availability attacks (degrading overall performance). Particularly concerning for systems using user-generated or web-scraped data. Requires data quality controls and anomaly detection.
Predictive Parity
A fairness metric requiring that the positive predictive value (precision) be equal across demographic groups. Meaning: among those predicted positive, the proportion who are actually positive should be the same regardless of group membership. May conflict with equalized odds.
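A minimal sketch, assuming only NumPy, of the predictive parity gap, i.e. the difference in precision between two groups (variable names are illustrative):

```python
import numpy as np

def predictive_parity_gap(y_true, y_pred, group):
    """Difference in positive predictive value (precision) between groups 0 and 1."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    def precision(g):
        predicted_positive = (group == g) & (y_pred == 1)
        return y_true[predicted_positive].mean()  # share of predicted positives that are truly positive
    return abs(precision(0) - precision(1))
```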
Prohibited AI
Under the EU AI Act, AI applications that are banned entirely due to unacceptable risks to fundamental rights. Includes: social scoring by governments, real-time biometric identification in public spaces (with exceptions), manipulation exploiting vulnerabilities, and emotion recognition in workplaces/schools. Violations subject to significant fines.
Prompt Injection
An attack technique specific to LLMs where malicious instructions are embedded in user inputs or external data to override system prompts and manipulate model behavior. Types include direct injection (explicit override attempts) and indirect injection (hidden instructions in retrieved content). A critical security concern for LLM applications.
Protected Characteristics
Personal attributes that legally cannot be used for discriminatory treatment. Varies by jurisdiction but typically includes: race, color, religion, sex, national origin, age, disability, genetic information, and sexual orientation. AI systems must be tested for disparate impact across protected groups.
Proxy Variable
A feature that correlates with protected characteristics and can serve as an indirect proxy for discrimination. Examples: ZIP code proxying for race, name proxying for gender/ethnicity. Even when protected attributes are excluded, proxies can perpetuate bias. Requires careful feature analysis during model development.
R
RACI Matrix
A responsibility assignment tool defining who is Responsible (does the work), Accountable (ultimately answerable), Consulted (provides input), and Informed (kept updated) for each task or decision. Essential for AI governance to clarify roles across technical, legal, ethics, and business stakeholders.
RAG — Retrieval-Augmented Generation
A technique combining LLMs with external knowledge retrieval to ground responses in factual, up-to-date information. The model retrieves relevant documents from a knowledge base before generating responses. Reduces hallucination, enables domain-specific applications, and provides source citations. Key architecture for enterprise LLM deployments.
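A minimal sketch of the RAG flow. The embed, vector_store, and llm_generate names are hypothetical stand-ins for whatever embedding model, vector database, and LLM API an implementation uses:

```python
def answer_with_rag(question, embed, vector_store, llm_generate, top_k=3):
    query_vector = embed(question)                           # embed the user question
    documents = vector_store.search(query_vector, k=top_k)   # retrieve the most relevant passages
    context = "\n\n".join(doc.text for doc in documents)
    prompt = (
        "Answer using only the context below and cite your sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm_generate(prompt), documents                   # grounded answer plus its sources
```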
Red Teaming
A structured adversarial testing practice where a team attempts to find vulnerabilities, biases, and failure modes in an AI system. For LLMs, includes testing for jailbreaks, harmful content generation, and prompt injections. For ML models, includes adversarial examples and edge cases. Required under EU AI Act for GPAI with systemic risk.
RLHF — Reinforcement Learning from Human Feedback
A training technique where an AI model is fine-tuned using human preferences as a reward signal. Humans rank model outputs, a reward model learns these preferences, and the main model is optimized to maximize the learned reward. A key technique for aligning LLMs with human values, developed by researchers at OpenAI and DeepMind and now used widely, including by Anthropic.
Responsible AI (RAI)
An approach to developing, deploying, and using AI that emphasizes ethical considerations, transparency, accountability, and societal benefit. Encompasses fairness, privacy, safety, explainability, and governance. Distinguished from "AI ethics" (philosophical principles) by its focus on practical implementation frameworks.
Right to Explanation
The legal right of individuals to receive meaningful information about the logic involved in automated decisions affecting them. Derived from GDPR Articles 13-15 and 22. Contested in scope—may require only system-level descriptions or individual decision explanations depending on interpretation. Growing legal recognition globally.
S
Shadow AI
The use of AI tools and services by employees without formal organizational approval, visibility, or governance. Includes use of public GenAI tools (ChatGPT, Claude, Midjourney) for work tasks. Creates risks around data leakage, compliance violations, and inconsistent outputs. Requires policies, training, and sanctioned alternatives.
SHAP — SHapley Additive exPlanations
An explainability method based on game theory (Shapley values) that assigns each feature an importance value for a particular prediction. Provides consistent, locally accurate explanations for any model. Computationally expensive but considered one of the most theoretically sound explanation methods.
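A brief sketch assuming the open-source shap and scikit-learn packages are installed (APIs differ slightly across versions): Shapley-value attributions for a tree ensemble, summarized across a sample of predictions.

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)               # fast exact method for tree ensembles
shap_values = explainer.shap_values(X.iloc[:200])   # per-feature contribution for each prediction
shap.summary_plot(shap_values, X.iloc[:200])        # global view of which features drive outputs
```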
Social Scoring
The use of AI to evaluate or classify individuals based on their social behavior or personal characteristics, resulting in benefits or detriments. Prohibited under the EU AI Act where the resulting score leads to detrimental or unfavorable treatment in social contexts unrelated to those in which the data was generated, or treatment that is unjustified or disproportionate to the behavior.
Synthetic Data
Artificially generated data that mimics the statistical properties of real data without containing actual personal information. Used for privacy-preserving model training, testing, and data augmentation. Must be carefully validated to ensure it captures relevant patterns without perpetuating biases from original data.
System Card
Documentation describing an AI system's safety properties, usage policies, and evaluation results. More comprehensive than model cards—covers the full deployed system including integrations, guardrails, and operational context. Introduced by OpenAI for GPT-4 and now a best practice for production AI systems.
T
Three Lines of Defense
A governance model separating risk ownership across three levels: First Line (operational teams who own and manage risk daily), Second Line (risk management and compliance functions providing oversight), Third Line (internal audit providing independent assurance). Applied to AI governance by mapping technical, RAI council, and audit roles.
Training Data
The dataset used to teach an AI model patterns and relationships. Quality, representativeness, and bias in training data directly impact model behavior. Under EU AI Act, high-risk AI providers must use training data that is "relevant, sufficiently representative, and to the best extent possible, free of errors and complete."
Transformer
A neural network architecture based on self-attention mechanisms, introduced in the 2017 paper "Attention Is All You Need." Foundation of modern LLMs (GPT, BERT, Claude). Enables parallel processing and captures long-range dependencies in sequences. Revolutionary for NLP and increasingly applied to vision and multimodal tasks.
Transparency
The principle that AI systems should be open about their nature, capabilities, limitations, and decision-making processes. Under EU AI Act, includes: disclosing AI interaction (Article 50), marking AI-generated content, providing meaningful explanations, and documenting system design. Distinct from explainability, which focuses on technical understanding.
U
US Executive Order 14110
Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, signed by President Biden in October 2023. Establishes AI safety standards, addresses algorithmic discrimination, promotes innovation, and mandates actions across federal agencies. Key US policy framework for AI governance alongside NIST AI RMF.
V
Validation
The process of evaluating a trained model on held-out data to assess generalization performance and tune hyperparameters. Distinguished from testing (final evaluation) and training. Cross-validation uses multiple validation folds for more robust estimates. Essential for avoiding overfitting and ensuring reliable real-world performance.
Vendor Due Diligence
The process of evaluating third-party AI providers before procurement, covering technical capabilities, data practices, security controls, bias testing, compliance certifications, and contractual protections. Under EU AI Act, deployers of high-risk AI systems remain responsible even when using third-party solutions.
W
Watermarking (AI Content)
Techniques for embedding imperceptible markers in AI-generated content to enable later detection and attribution. Can be applied to text (statistical patterns), images (steganography), audio, and video. Growing regulatory interest as a tool for combating misinformation. Challenges include robustness to modification and standardization.
Z
Zero-Shot Learning
The ability of an AI model to perform tasks it was not explicitly trained on, using only task descriptions or instructions. A key capability of modern LLMs—they can follow natural language instructions for novel tasks without task-specific examples. Related: few-shot learning uses a small number of examples.
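Illustrative prompts only (any chat-style LLM would accept text like this): the zero-shot version gives just an instruction, while the few-shot version adds a handful of worked examples.

```python
zero_shot_prompt = (
    "Classify the sentiment of this review as positive or negative:\n"
    "'Battery died after a week.'"
)

few_shot_prompt = (
    "Classify the sentiment of each review as positive or negative.\n"
    "Review: 'Love it, works perfectly.' -> positive\n"
    "Review: 'Stopped working after two days.' -> negative\n"
    "Review: 'Battery died after a week.' ->"
)
```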