1.2 Core Ethical Principles (The North Star)
These five foundational principles serve as the ethical foundation for all AI activities within your organization. They should be embedded in every policy, process, and decision involving AI systems—from initial ideation through deployment and retirement.
These principles align with major international frameworks including the OECD AI Principles (2019/2024), UNESCO Recommendation on AI Ethics, EU AI Act fundamental requirements, NIST AI RMF trustworthiness characteristics, and the G7 Code of Conduct for AI developers.
1.2.1 Fairness & Inclusiveness
Principle: AI systems must treat all individuals and groups equitably, avoiding unfair discrimination and promoting inclusive outcomes across diverse populations.
What This Means in Practice
- Non-Discrimination: AI systems must not produce outcomes that unfairly disadvantage individuals or groups based on protected characteristics (race, gender, age, disability, religion, national origin, etc.)
- Representation: Training data must adequately represent the populations the system will serve, including historically marginalized groups
- Accessibility: AI systems must be designed to be usable by people with disabilities and across diverse technical capabilities
- Equitable Access: Benefits of AI should be distributed fairly, not concentrated among privileged groups
Types of Bias to Address
| Bias Type | Description | Mitigation Strategy |
|---|---|---|
| Historical Bias | Data reflects past discriminatory practices | Audit historical data for patterns; consider reweighting or supplementing |
| Representation Bias | Training data underrepresents certain groups | Stratified sampling; synthetic data augmentation; active data collection |
| Measurement Bias | Features or labels systematically vary across groups | Validate measurements across demographics; use multiple proxy measures |
| Aggregation Bias | Single model fails to account for group differences | Develop subgroup-specific models or calibration; disaggregated evaluation |
| Evaluation Bias | Testing data doesn't reflect deployment population | Ensure test sets are representative; conduct subgroup performance analysis |
Fairness Metrics to Track
- Demographic Parity: Equal selection/approval rates across groups
- Equalized Odds: Equal true positive and false positive rates across groups
- Predictive Parity: Equal positive predictive value across groups
- Individual Fairness: Similar individuals receive similar outcomes
- Counterfactual Fairness: Outcomes would be the same if protected attributes differed
Different fairness metrics can be mathematically incompatible (the "impossibility theorem"). Organizations must make explicit choices about which fairness criteria to prioritize based on use case context, stakeholder input, and legal requirements. Document these decisions and their rationale.
1.2.2 Reliability & Safety
Principle: AI systems must perform consistently and dependably within their intended operating conditions, while minimizing the potential for harm to individuals, organizations, and society.
Reliability Requirements
- Robustness: Systems maintain performance across expected input variations and edge cases
- Consistency: Similar inputs produce similar outputs over time
- Graceful Degradation: Systems fail safely when encountering unexpected conditions
- Reproducibility: Results can be recreated given the same inputs and conditions
- Resilience: Systems recover appropriately from failures and attacks
Safety Considerations by Risk Level
| Risk Level | Safety Requirements | Testing Approach |
|---|---|---|
| Minimal | Basic quality assurance, standard testing | Unit tests, integration tests, user acceptance testing |
| Limited | Enhanced testing, monitoring, disclosure requirements | A/B testing, canary deployments, performance monitoring |
| High | Rigorous validation, human oversight, audit trails | Red teaming, adversarial testing, fairness audits, third-party review |
| Prohibited | System cannot be deployed | N/A |
Safety Engineering Practices
Failure Mode Analysis
Systematically identify how the system could fail and the consequences of each failure mode. Design mitigations for high-severity scenarios.
Adversarial Testing
Test system behavior under hostile conditions including malicious inputs, data poisoning attempts, and model extraction attacks.
Boundary Definition
Clearly define operational boundaries and implement guardrails to prevent operation outside safe parameters.
Incident Response
Establish procedures for detecting, responding to, and learning from safety incidents when they occur.
1.2.3 Privacy & Security
Principle: AI systems must protect individual privacy rights and secure sensitive data throughout the AI lifecycle, from training data collection through model deployment and retirement.
Privacy-by-Design Requirements
- Data Minimization: Collect and retain only data necessary for the stated purpose
- Purpose Limitation: Use data only for specified, explicit, and legitimate purposes
- Consent & Notice: Obtain appropriate consent and provide clear notice of AI-powered processing
- Individual Rights: Enable data access, correction, deletion, and portability rights
- Retention Limits: Delete data when no longer necessary; implement automated retention policies
Privacy-Preserving Techniques
| Technique | Description | Use Cases |
|---|---|---|
| Differential Privacy | Add calibrated noise to protect individual records while preserving aggregate insights | Analytics, model training, data sharing |
| Federated Learning | Train models on decentralized data without centralizing raw information | Multi-party collaboration, mobile devices |
| Homomorphic Encryption | Perform computations on encrypted data without decryption | Cloud processing of sensitive data |
| Secure Multi-Party Computation | Multiple parties jointly compute a function without revealing inputs | Collaborative analytics, benchmarking |
| Synthetic Data | Generate artificial data preserving statistical properties without real records | Testing, development, external sharing |
AI-Specific Security Concerns
- Model Theft: Protect models from extraction through query-based attacks
- Data Poisoning: Prevent manipulation of training data to compromise model behavior
- Adversarial Inputs: Detect and filter inputs designed to cause misclassification
- Prompt Injection: Guard against malicious instructions embedded in LLM inputs
- Membership Inference: Prevent attackers from determining if specific data was used in training
- Model Memorization: Mitigate unintentional reproduction of training data in outputs
1.2.4 Transparency & Explainability
Principle: AI systems must operate in a manner that enables stakeholders to understand how decisions are made, with appropriate levels of transparency for different audiences.
Levels of Transparency
| Audience | Information Needed | Delivery Mechanism |
|---|---|---|
| End Users | That AI is being used; general purpose; how to contest decisions | Clear notices, disclosure statements, help resources |
| Affected Individuals | Factors considered; how to seek human review | Explanation interfaces, appeal processes |
| Business Stakeholders | System purpose, limitations, performance metrics | Model cards, system documentation, dashboards |
| Regulators | Technical details, training data, validation results | Conformity assessments, audit logs, technical documentation |
| Developers | Architecture, code, hyperparameters, reproduction steps | Model cards, code repositories, experiment logs |
Explainability Techniques
SHAP Values
Attribute predictions to features using game-theoretic approach. Provides global and local explanations for any model type.
LIME
Generate local interpretable explanations by approximating complex models with simpler ones around specific predictions.
Attention Visualization
For neural networks, visualize which inputs the model "attended to" when making predictions.
Counterfactual Explanations
Show what would need to change for a different outcome: "If X were Y, the decision would have been Z."
EU AI Act Transparency Requirements
- AI System Disclosure: Users must be informed when interacting with AI (chatbots, emotion recognition)
- Deep Fake Labeling: AI-generated or manipulated content must be disclosed
- High-Risk Documentation: Detailed technical documentation, instructions for use, and audit logs
- GPAI Transparency: Training data summaries, model capabilities and limitations, downstream provider information
1.2.5 Accountability & Human Oversight
Principle: Clear responsibility must be assigned for AI systems and their outcomes, with appropriate mechanisms for human oversight, intervention, and redress.
Accountability Framework
| Level | Responsible Party | Accountabilities |
|---|---|---|
| Strategic | Board of Directors / C-Suite | Setting AI ethics policy, risk appetite, resource allocation |
| Tactical | AI Ethics Board / CAIO | Framework implementation, cross-functional coordination, escalations |
| Operational | Model Owners / Product Managers | Individual system compliance, risk assessments, documentation |
| Technical | Data Scientists / Engineers | Implementation quality, testing, monitoring, technical safeguards |
Human Oversight Models
Human-in-the-Loop (HITL)
Human approval required for every AI decision. Appropriate for highest-risk decisions (medical diagnosis, criminal sentencing).
Human-on-the-Loop (HOTL)
Humans monitor AI decisions in real-time with ability to intervene. Suitable for time-sensitive decisions with moderate risk.
Human-over-the-Loop
Periodic human review and system tuning without real-time involvement. Appropriate for lower-risk, high-volume decisions.
Essential Accountability Mechanisms
- Designated Model Owner with documented responsibilities for each AI system
- Audit trails capturing all significant decisions and model changes
- Clear escalation pathways for ethical concerns and incidents
- "Stop the line" authority enabling any team member to halt deployment
- Redress mechanisms for individuals harmed by AI decisions
- Regular reporting to leadership on AI governance metrics
- Third-party audits for high-risk systems
- Insurance coverage for AI-related liabilities
Implementation Steps
Draft Your Ethical Principles Statement
Customize these five principles with organization-specific context, examples, and commitments. Ensure executive sign-off and Board endorsement.
Timeline: 2-3 weeks | Owner: CAIO / Legal / Ethics Board
Translate Principles to Policies
Develop operational policies that implement each principle. Include specific requirements, thresholds, and approval processes.
Timeline: 4-6 weeks | Owner: AI Governance Team
Create Assessment Criteria
Define how systems will be evaluated against each principle. Establish metrics, testing requirements, and acceptance thresholds.
Timeline: 3-4 weeks | Owner: Data Science / Quality Assurance
Integrate into Development Lifecycle
Embed principles checkpoints into your AI development process—from ideation through deployment. Create templates and checklists.
Timeline: 4-6 weeks | Owner: Development Teams / PMO
Communicate and Train
Launch organization-wide communication of principles. Develop role-specific training modules for developers, executives, and end users.
Timeline: 4-8 weeks | Owner: HR / Communications / Training
These principles should not remain abstract ideals. Each AI project should document how it addresses each principle, with specific evidence and testing results. This documentation forms the foundation of your compliance posture and audit readiness.