EU AI Act Article 10: data governance for high-risk AI systems
Article 10 of the EU AI Act sets the data governance requirements for high-risk AI systems. It is likely to draw early enforcement attention for Annex III systems because data quality and representativeness can be checked directly against documentation. Article 10 applies from 2 August 2026 and requires substantial documentation work from any SaaS provider whose product falls under Annex III.
Core Article 10 requirements
Article 10(2) requires that training, validation, and testing datasets be subject to data governance and management practices appropriate for the system's intended purpose. The practices must address: design choices, data collection processes and the origin of the data, data preparation operations (annotation, labelling, cleaning, updating, enrichment, aggregation), the formulation of assumptions, an assessment of the availability, quantity, and suitability of the datasets needed, bias examination and mitigation, and the identification of data gaps or shortcomings together with how they can be addressed. Article 10(3) requires datasets to be relevant, sufficiently representative, and, to the best extent possible, free of errors and complete in view of the intended purpose.
The representativeness requirement
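Article 10(3) requires datasets to be sufficiently representative and to have appropriate statistical properties, including as regards the persons or groups the system is intended to be used on. For SaaS shipping to the EU, this typically means: demographic representation of the affected population for systems affecting natural persons (age, gender, geography, language), domain representation for the specific use case, and balance across the subgroups likely to interact with the system. Article 10(4) adds that datasets must reflect the geographical, contextual, behavioural, or functional setting in which the system will be used. The exact standard is fact-specific - higher-stakes uses call for stronger evidence of representativeness.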
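One way to make a representativeness analysis concrete is to compare each group's share of the dataset with its share of a reference population. The sketch below is illustrative only: the function name `representation_gaps`, the 5-percentage-point `tolerance`, and the age bands are assumptions chosen for the example, not thresholds or categories set by the AI Act.

```python
from collections import Counter


def representation_gaps(samples, reference_shares, tolerance=0.05):
    """Return groups whose dataset share deviates from the reference share.

    samples: iterable of group labels, one per dataset record.
    reference_shares: dict mapping group label -> expected share (0..1)
        in the population the system is intended to serve.
    tolerance: maximum accepted absolute deviation (an assumed value).
    """
    counts = Counter(samples)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("dataset is empty")

    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        difference = observed - expected
        if abs(difference) > tolerance:
            gaps[group] = {
                "observed": observed,
                "expected": expected,
                "difference": difference,
            }
    return gaps


if __name__ == "__main__":
    # Hypothetical age-band labels and reference shares.
    labels = ["18-34"] * 70 + ["35-64"] * 25 + ["65+"] * 5
    reference = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}
    for group, gap in representation_gaps(labels, reference).items():
        print(group, f"{gap['difference']:+.2f}")
```

A flagged group is a prompt for investigation rather than a verdict: the documented analysis should explain whether the gap matters for the intended purpose and what mitigation follows.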
Bias examination and mitigation
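Article 10(2)(f) requires examination of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights, or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations (feedback loops). Article 10(2)(g) adds appropriate measures to detect, prevent, and mitigate the biases identified. Practical bias testing covers: demographic performance breakdowns (accuracy, false-positive rate, false-negative rate per protected characteristic), causal analysis of identified biases, and documented mitigations. The depth of bias examination scales with the stakes of the use case.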
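The demographic performance breakdown described above can be sketched in a few lines for a binary classifier. This is a minimal illustration in plain Python, assuming 0/1 labels and one group label per record; the function name `group_metrics` and the groups "A" and "B" are hypothetical.

```python
from collections import defaultdict


def group_metrics(y_true, y_pred, groups):
    """Compute accuracy, false-positive rate and false-negative rate per group.

    y_true, y_pred: sequences of 0/1 labels (ground truth and prediction).
    groups: sequence of group labels, aligned with the other two inputs.
    A rate is None when its denominator is zero for that group.
    """
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("inputs must have the same length")

    # Confusion-matrix counts per group.
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "fp" if pred == 1 else "tn"
        counts[group][key] += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    report = {}
    for group, c in counts.items():
        n = sum(c.values())
        report[group] = {
            "n": n,
            "accuracy": ratio(c["tp"] + c["tn"], n),
            "false_positive_rate": ratio(c["fp"], c["fp"] + c["tn"]),
            "false_negative_rate": ratio(c["fn"], c["fn"] + c["tp"]),
        }
    return report


if __name__ == "__main__":
    truth = [1, 1, 0, 0, 1, 1, 0, 0]
    preds = [1, 0, 0, 0, 1, 1, 1, 1]
    group = ["A"] * 4 + ["B"] * 4
    for name, metrics in group_metrics(truth, preds, group).items():
        print(name, metrics)
```

Reporting the group size `n` alongside each rate helps reviewers judge whether a disparity rests on enough records to be meaningful.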
Special-category data
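Article 10(5) exceptionally permits processing of special categories of personal data (racial or ethnic origin, political opinions, etc. under GDPR Article 9) for bias detection and correction - but only to the extent strictly necessary and subject to appropriate safeguards. In practice, this means you can process demographic data specifically for bias testing without relying on separate GDPR Article 9 consent, provided that other data (including synthetic or anonymised data) would not do the job, you document why the processing is necessary, you implement strong security with strict, logged access controls, you do not pass the data to other parties, and you delete it once the bias has been corrected or the retention period ends, whichever comes first. The GDPR otherwise continues to apply in full.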
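The documentation and deletion duties for special-category data can be tracked with a simple record per dataset. The sketch below is a hypothetical structure, not a prescribed format: the class `SpecialCategoryRecord` and its field names are assumptions, and the deletion rule encoded here follows the "bias corrected or retention period ended, whichever comes first" condition in Article 10(5).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SpecialCategoryRecord:
    """Hypothetical record for one special-category dataset used for bias work."""

    dataset_id: str
    necessity_reason: str  # why other (e.g. synthetic or anonymised) data would not suffice
    retention_end: date    # end of the documented retention period
    bias_corrected_on: Optional[date] = None  # set once the bias has been corrected

    def deletion_due(self, today: date) -> bool:
        """True once the bias is corrected or retention has ended, whichever comes first."""
        if self.bias_corrected_on is not None and today >= self.bias_corrected_on:
            return True
        return today >= self.retention_end


if __name__ == "__main__":
    record = SpecialCategoryRecord(
        dataset_id="bias-eval-demo",
        necessity_reason="Subgroup error rates cannot be measured without group labels.",
        retention_end=date(2027, 1, 1),
    )
    print(record.deletion_due(date(2026, 9, 1)))  # retention still running
    record.bias_corrected_on = date(2026, 8, 15)
    print(record.deletion_due(date(2026, 9, 1)))  # bias corrected, deletion due
```

Keeping the necessity reason on the same record as the deletion trigger makes it easier to show, in one place, why the data was held and when it had to go.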
Practical compliance for SaaS
Build the Article 10 documentation package as part of the broader Article 11 technical documentation. Required elements: dataset documentation (source, license, size, composition), data preparation procedures, bias-testing methodology and results, representativeness analysis, and gap identification with mitigation actions. For continuous-learning systems, add process documentation showing how Article 10 requirements are maintained as the training data evolves. Engage a compliance advisor for Annex III systems - the bias-testing methodology requires both technical and legal expertise.
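The required elements listed above lend themselves to a structured record that can be checked for completeness before each release. The following is a minimal sketch, assuming one record per dataset; the class `DatasetDocumentation`, its field names, and the sample values are illustrative and do not reproduce any official template.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetDocumentation:
    """Illustrative per-dataset record mirroring the elements listed in this section."""

    source: str = ""
    license: str = ""
    size: int = 0
    composition: str = ""
    preparation_procedures: str = ""
    bias_testing_methodology: str = ""
    bias_testing_results: str = ""
    representativeness_analysis: str = ""
    gaps_and_mitigations: str = ""

    def missing_elements(self):
        """Return the names of elements that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


if __name__ == "__main__":
    doc = DatasetDocumentation(
        source="internal support tickets (hypothetical)",
        license="proprietary",
        size=120_000,
    )
    print(doc.missing_elements())
```

Running a check like this in the release pipeline gives an early signal when a retrained model ships with documentation that has not been updated.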
Frequently asked questions
When does Article 10 take effect?
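2 August 2026 for Annex III high-risk AI systems. High-risk systems classified under Article 6(1) (AI in products covered by the Annex I harmonisation legislation) follow on 2 August 2027.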
Does Article 10 apply to non-high-risk AI?
Only as good practice - Article 10 binds high-risk AI under Chapter III. Non-high-risk systems benefit from following the principles but are not legally bound.
Can I collect demographic data for bias testing?
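Yes - Article 10(5) exceptionally permits processing of special-category data specifically for bias detection and correction, to the extent strictly necessary. Document the necessity, implement strong security and access controls, and delete the data once the bias has been corrected or the retention period ends.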
What is the penalty for Article 10 failures?
Up to €15M or 3% of total worldwide annual turnover, whichever is higher, under Article 99(4) for high-risk system non-compliance.
Does Article 10 require external auditing?
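Not directly. Article 10 sets requirements; conformity assessment under Article 43 verifies them. Most Annex III categories use internal control; biometric systems under Annex III point 1 need a notified body where harmonised standards or common specifications have not been applied in full.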
Last updated: 2026-05-28