
EU AI Act Article 12: what "automatic recording" means for agent deployments

Article 12 requires high-risk AI systems to automatically log events throughout their lifecycle. Here's what that actually requires — and why your agent's conversation history doesn't qualify.

Jan Szymanski
Founder, theup

The EU AI Act entered into force on August 1, 2024. Its provisions phase in through 2027. High-risk system obligations under Annex III, which covers AI in healthcare, finance, employment, and law enforcement, apply from August 2, 2026. High-risk AI embedded in regulated products (Annex I) follows by August 2, 2027.

Article 12 is one of the requirements that enterprises are least prepared for. Not because it's obscure — it's straightforward to read. But because the gap between what it requires and what most AI systems actually record is enormous.

What Article 12 actually says

The full text requires that high-risk AI systems include "logging capabilities that enable the automatic recording of events ('logs') while the high-risk AI systems are operating." Specifically, the logs must enable:

  1. Identifying situations that may result in the system presenting a risk or requiring substantial modification
  2. Facilitating post-market monitoring (Article 72)
  3. Monitoring the operation of high-risk AI systems by deployers (Article 26(5))

For certain high-risk systems (notably remote biometric identification under Annex III, point 1(a)), Article 12 specifies minimum log contents: period of each use, the reference database checked, input data that led to a match, and the identity of natural persons involved in verifying results.

For other high-risk systems — including those in healthcare, finance, and legal — Article 12 is less prescriptive about exact log fields but requires logging capabilities that are "appropriate to the intended purpose of the system." This is both more flexible and more demanding: you need to determine what's appropriate for your use case, and defend that determination to regulators.

Key implication: The "appropriate to the intended purpose" standard means you can't just log prompts and responses. For an agent that acts on knowledge, regulators will ask: what knowledge did the agent consult? Were there conflicts? What was the confidence level? Who reviewed the output?

Why conversation logs don't qualify

Most AI agent deployments today log the conversation: the user's prompt, the agent's response, maybe the retrieved documents. This is necessary but nowhere near sufficient for Article 12 compliance. Here's what's missing:

  • No knowledge state snapshot. You know what the agent said, but not what the knowledge graph looked like when it said it. If a conflict existed in the data at query time, was the agent aware of it?
  • No decision provenance. The agent made a recommendation — but what was the chain of evidence that led to it? Which sources contributed? What were their authority scores? Were any sources contested?
  • No constraint verification record. Did the agent check whether the proposed action complied with applicable policies? If so, what was the result? If not, why not?
  • No human-in-the-loop record. Article 14 requires human oversight. If a human reviewed the decision, when? Who? What was their determination? This needs to be linked to the specific decision, not just "a human was involved."
  • No tamper-proof guarantee. Conversation logs in a database can be edited. Article 12 doesn't explicitly require immutability, but any auditor will ask how you can guarantee log integrity.
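To make the constraint-verification point concrete, here is a minimal Python sketch. Every rule name, field, and verdict string is hypothetical; the point is that the gate records the full per-rule result chain, not just a final allow/block decision.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict

@dataclass
class ConstraintResult:
    rule: str
    result: str  # "PASS" or "FAIL"

def verify_constraints(action: dict, rules: Dict[str, Callable[[dict], bool]]) -> dict:
    # Run every applicable rule and keep each individual result.
    checked = [
        ConstraintResult(name, "PASS" if check(action) else "FAIL")
        for name, check in rules.items()
    ]
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "constraints_checked": [vars(c) for c in checked],
        "gate_verdict": "BLOCK" if any(c.result == "FAIL" for c in checked) else "ALLOW",
    }

# Hypothetical policy rules for a pharmacy-write action.
rules = {
    "physician_order_valid": lambda a: a.get("order_id") is not None,
    "allergy_cross_check": lambda a: a.get("allergy_verified", False),
}

# Valid order, but no verified allergy record: the allergy rule fails
# and the gate blocks the action.
record = verify_constraints({"order_id": "o-123"}, rules)
```

An auditor reading `record` sees not only that the action was blocked, but which rule blocked it and which rules passed.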

What compliant logging looks like

A system that satisfies Article 12 for autonomous agent deployments needs to record, for each decision:

```jsonc
// Required decision record (Article 12 compliant)
{
  "decision_id": "d-7f3a2b...",
  "timestamp": "2026-03-20T14:32:07.412Z",
  "agent_id": "clinical-agent-7",
  "action": "write → pharmacy_record",
  "gate_verdict": "BLOCK",

  // Knowledge state at decision time
  "knowledge_state": {
    "entities_consulted": 47,
    "active_conflicts": 2,
    "min_consensus": 0.42,
    "conflict_ids": ["c-drug_interaction", "c-allergy_check"]
  },

  // Constraint verification chain
  "constraints_checked": [
    {"rule": "HIPAA_minimum_necessary", "result": "PASS"},
    {"rule": "allergy_cross_check", "result": "FAIL"},
    {"rule": "physician_order_valid", "result": "PASS"}
  ],

  // Human oversight (Article 14)
  "escalation": {
    "escalated_to": "dr.smith@hospital.org",
    "escalated_at": "2026-03-20T14:32:08.001Z",
    "reviewed_at": "2026-03-20T14:35:22.847Z",
    "determination": "BLOCK_UPHELD",
    "review_comment": "Correct — allergy record not verified"
  },

  // Integrity
  "prev_hash": "a1b2c3d4...",
  "record_hash": "e5f6g7h8..."
}
```

This is what a compliant audit record looks like. Every decision links to the knowledge state, the constraint verification chain, and the human oversight record. The hash chain makes the log tamper-evident: altering any historical record invalidates every hash that follows it.
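The hash-chain mechanism itself is simple. Here is an illustrative Python sketch (field names assumed, not a real schema): each record is hashed together with its predecessor's hash, so editing any past record breaks verification of everything after it.

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel prev_hash for the first record

def append_record(chain: list, record: dict) -> None:
    # Link the new record to the previous one, then hash a canonical
    # serialisation of the record (sorted keys) to fix its contents.
    entry = dict(record)
    entry["prev_hash"] = chain[-1]["record_hash"] if chain else GENESIS
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["record_hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(entry)

def verify_chain(chain: list) -> bool:
    # Recompute every hash and check both the content hash and the link
    # to the previous record.
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "record_hash"}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev_hash"] != prev or entry["record_hash"] != expected:
            return False
        prev = entry["record_hash"]
    return True

chain = []
append_record(chain, {"decision_id": "d-1", "gate_verdict": "ALLOW"})
append_record(chain, {"decision_id": "d-2", "gate_verdict": "BLOCK"})
assert verify_chain(chain)

chain[0]["gate_verdict"] = "ALLOW_OVERRIDE"  # tamper with history
assert not verify_chain(chain)
```

Note this makes tampering detectable, not impossible; pairing the chain with append-only storage or periodic anchoring of the latest hash closes the remaining gap.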

How this compares to current approaches

| Requirement | Conversation logs | Guardrails SDK | Brain |
| --- | --- | --- | --- |
| Period of use | Yes | Yes | Yes |
| Reference database state | No | No | Yes (consensus snapshot) |
| Decision provenance chain | No | Partial | Yes (full chain) |
| Constraint verification | No | Yes | Yes (per gate) |
| Human oversight record | No | No | Yes (linked to decision) |
| Tamper-proof integrity | No | No | Yes (hash-chained) |

Articles 11 and 13: the documentation stack

Article 12 doesn't exist in isolation. Article 11 requires technical documentation describing the system's design, development, and intended purpose. Article 13 requires transparency — the ability to explain to users how the system works and what its limitations are.

Together, Articles 11, 12, and 13 form a documentation stack:

  • Article 11 — How the system works (design documentation)
  • Article 12 — What the system did (automatic event logging)
  • Article 13 — How users can understand it (transparency obligations)

Brain addresses all three. The consensus scoring model is fully explainable (Article 13 — every score can be decomposed into source authority, evidence weight, and temporal decay). The hash-chained audit trail satisfies Article 12. And the system's design is documented with full provenance metadata (Article 11).
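What "decomposable" means in practice can be shown with a small Python sketch. This is not Brain's actual scoring formula, only an illustration of the explainability property: every factor is returned alongside the final number, so a reviewer can see why a score is what it is.

```python
def consensus_score(source_authority: float, evidence_weight: float,
                    age_days: float, half_life_days: float = 180.0) -> dict:
    # Temporal decay: evidence loses half its weight every half-life.
    temporal_decay = 0.5 ** (age_days / half_life_days)
    return {
        "source_authority": source_authority,
        "evidence_weight": evidence_weight,
        "temporal_decay": round(temporal_decay, 4),
        "score": round(source_authority * evidence_weight * temporal_decay, 4),
    }

# A highly authoritative source with strong evidence, but 180 days old:
breakdown = consensus_score(source_authority=0.9, evidence_weight=0.8, age_days=180)
# one half-life elapsed, so decay is 0.5 and the score is 0.9 * 0.8 * 0.5 = 0.36
```

The returned breakdown, not just the scalar score, is what goes into the audit record, which is how a single mechanism can serve both the Article 12 log and the Article 13 explanation.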

The timeline

For Annex III high-risk systems (healthcare, finance, employment, law enforcement), obligations apply from August 2, 2026. For regulated product AI (Annex I), the deadline is August 2, 2027.

Non-compliance penalties for high-risk system obligations: up to €15M or 3% of global annual turnover, whichever is higher. (Violations of prohibited AI practices under Article 5 carry the higher tier: €35M or 7%.)

If you're deploying autonomous agents in domains covered by Annex III, you have months — not years — to get Article 12 compliant logging in place.


Get Article 12 ready

Brain's audit trail is designed for EU AI Act compliance out of the box. See it with your agent stack.

Get a Demo