# EU AI Act Compliance Scorecard
Independent multi-model assessment of leading AI systems against the EU AI Act (2024/1689). Updated continuously as models and regulations evolve.
| System | Provider | Verdict | Confidence | Risk Flags |
|---|---|---|---|---|
| Claude 3.5 Sonnet | Anthropic | Compliant | 94% | None |
| GPT-4o | OpenAI | Compliant | 92% | Transparency |
| Mistral Large | Mistral AI | Compliant | 91% | None |
| Gemini 1.5 Pro | Google | Compliant | 89% | Data Governance |
| Command R+ | Cohere | Compliant | 88% | Documentation |
| Phi-3-medium | Microsoft | Compliant | 88% | None |
| Llama 3.1 405B | Meta | Compliant | 87% | Oversight |
| Jamba 1.5 Large | AI21 Labs | Compliant | 86% | None |
| Granite 34B | IBM | Compliant | 86% | Documentation |
| Qwen 2.5 72B | Alibaba | Compliant | 85% | Transparency, Data Governance |
| Inflection 3.0 | Inflection AI | Compliant | 84% | Documentation |
| Grok-2 | xAI | Compliant | 83% | Documentation |
| Cohere Aya 23 | Cohere | Compliant | 82% | Documentation |
| DeepSeek V3 | DeepSeek | Non-Compliant | 78% | Transparency, Oversight, Data Governance |
| DBRX | Databricks | Non-Compliant | 74% | Documentation, Transparency |
| Falcon 180B | TII | Non-Compliant | 72% | Transparency, Risk Management |
| Nemotron-4 340B | NVIDIA | Non-Compliant | 71% | Data Governance, Transparency |
| Yi-Large | Yi/01.AI | Non-Compliant | 69% | Transparency, Data Governance, Oversight |
| OLMo 7B | AI2 | Non-Compliant | 65% | Risk Management, Transparency, Oversight |
## Beta Notice
This scorecard is produced by the ICOSA protocol in beta and is provided for informational purposes only. Scores may be revised as the assessment methodology evolves and as model providers update their systems. This does not constitute legal advice or an official regulatory determination.
## Methodology
Each AI system is assessed by the ICOSA multi-model council using Byzantine Fault Tolerant consensus. The council evaluates compliance across all relevant articles of the EU AI Act, including transparency obligations (Art. 50, 52), risk management (Art. 9), data governance (Art. 10), human oversight (Art. 14), and documentation requirements (Art. 11, 12).
Confidence scores represent the degree of consensus among council members. Higher confidence indicates stronger agreement on the verdict. Risk flags identify specific areas where deficiencies were detected during assessment.
Assessments use the Sentinel (3-model) tier for initial screening, with Baseline (5-model) and Full Certification (11-model) available for deeper analysis upon request.
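The tiered consensus process described above can be sketched roughly as follows. This is an illustrative toy, not ICOSA's actual protocol: the function name `council_verdict`, the vote format, and the flag-union rule are all assumptions, and a real Byzantine Fault Tolerant protocol would also defend against faulty or adversarial council members, which a simple majority vote does not.

```python
from collections import Counter

def council_verdict(votes, flags_by_model):
    """Toy consensus sketch (assumed, not ICOSA's real protocol):
    majority verdict, confidence = share of agreeing members,
    risk flags = union of flags raised by any member."""
    assert len(votes) % 2 == 1  # odd council sizes (3, 5, 11) avoid ties
    verdict, agree = Counter(votes).most_common(1)[0]
    confidence = agree / len(votes)
    flags = sorted({f for member_flags in flags_by_model for f in member_flags})
    return verdict, confidence, flags

# Hypothetical 5-model Baseline council
votes = ["Compliant", "Compliant", "Compliant", "Compliant", "Non-Compliant"]
flags = [[], ["Transparency"], [], [], ["Transparency", "Oversight"]]
print(council_verdict(votes, flags))
# → ('Compliant', 0.8, ['Oversight', 'Transparency'])
```

Under this toy rule, a 4-of-5 split yields an 80% confidence score, which mirrors how the scorecard's confidence column reflects the degree of agreement rather than a probability of legal compliance.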
## Get Your System Scanned
Don't wait until August 2, 2026. Get your AI system assessed against the EU AI Act before enforcement begins.
Start with Sentinel Assessment