The EU AI Act says you need human oversight. A system prompt isn't it.
The EU AI Act asks for human oversight and record-keeping over high-risk AI systems. If your only control is an instruction in a prompt, you don't have an answer. Here's what real oversight looks like in an agent.
Regulation has caught up with the demo. For the AI systems it classifies as high-risk, the EU AI Act asks deployers for things that sound reasonable until you try to point at where they live in your stack: human oversight, record-keeping, transparency about what the system did. (If your agent isn’t high-risk, these are best practice rather than law — but the place you’d put them is the same.)
So here’s the question a compliance lead will eventually put to your agent team:
“Show me the human oversight. Not the intention — the mechanism.”
If the answer is “we told it in the system prompt to ask before doing anything risky,” that’s not oversight. That’s a hope, written down. And as we covered earlier in this series, a prompt is something the model is free to ignore — which is the one thing a regulator’s definition of a control cannot be.
(This isn’t legal advice, and FlowDrop isn’t a compliance product. What it is: the building blocks that make these obligations something you can actually demonstrate, instead of assert.)
Why a prompt fails the test on its face
A recurring theme in AI regulation is that a control has to be something the system does, not something the model is asked to do. A system prompt fails that bar in three separate ways:
- It isn’t binding. The model can misread it, lose it in a long conversation, or be talked out of it. A control that the controlled party can override isn’t a control.
- It isn’t evidenced. “We instructed it to ask first” produces no record that it ever did ask, on any given action. There’s nothing to show.
- It isn’t reviewable. Whether the instruction held depends on the model’s behavior that day — not on a mechanism anyone can inspect, test, and prove was in place.
You can write the most careful prompt in the world. It still answers the auditor’s “show me the mechanism” with “we asked nicely.”
Oversight as a step the system enforces
In FlowDrop, human oversight isn’t a request to the model — it’s a gate in the workflow the model cannot get past on its own. Because the model only proposes an action and the system runs it, you can require that a person clears the consequential ones before anything happens. The agent states plainly what it intends to do; a human approves or declines; only then does it proceed. Decline, and it doesn’t retry — it asks what you’d prefer.
The distinction the AI Act cares about is exactly this one: not “the model chose to ask permission” but “the system would not let the action through until a person decided.” That’s the difference between oversight you can demonstrate and oversight you’re hoping happened.
Human oversight, in any sense worth the name, means a person can actually intervene before the consequential thing occurs. A gate the system enforces is that. A line in a prompt is not.
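To make the gate concrete, here is a minimal sketch in Python. It is not FlowDrop’s API: `ProposedAction`, `CONSEQUENTIAL_ACTIONS`, `run_with_oversight`, and the `ask_human` callback are hypothetical names chosen for illustration. It shows only the shape of the idea: the model’s output is a proposal, and the code that executes it sits behind a check the model cannot route around.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class Decision(Enum):
    APPROVED = "approved"
    DECLINED = "declined"


@dataclass(frozen=True)
class ProposedAction:
    """What the model asks for. It is data, not an executed call."""
    name: str
    arguments: dict
    reason: str  # the rationale the model gave with the proposal


# Assumption: the system, not the model, decides which actions need a person.
CONSEQUENTIAL_ACTIONS = {"issue_refund", "delete_account"}


def run_with_oversight(
    action: ProposedAction,
    execute: Callable[[ProposedAction], Any],
    ask_human: Callable[[ProposedAction], Decision],
) -> dict:
    """Run an action only after any required human decision has been made."""
    if action.name in CONSEQUENTIAL_ACTIONS:
        decision = ask_human(action)
        if decision is not Decision.APPROVED:
            # Declined: nothing runs and nothing is retried.
            return {"status": "declined", "executed": False, "result": None}
    result = execute(action)
    return {"status": "executed", "executed": True, "result": result}
```

The model never receives `execute`. It can only produce a `ProposedAction`, so skipping the gate is not something a misread or overridden prompt can cause.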
Record-keeping you can actually produce
The flip side of oversight is the record that it happened — which is the subject of the first post in this series. Because every proposed action, policy check, approval, and outcome is written to your own database, the record-keeping obligation becomes a query rather than a vendor export:
- Which actions required a human, and who signed off?
- Which proposed actions were blocked by policy, and why?
- What did the system do on a given case, end to end?
You’re not reconstructing this from a chat transcript after the fact. It’s captured as the agent runs, in a store you control and can retain on your own schedule.
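As a rough illustration of “a query rather than a vendor export,” the sketch below answers the three questions above against a SQLite table. The `agent_events` table, its columns, and the sample rows are assumptions made up for this example; FlowDrop’s actual schema may look quite different.

```python
import sqlite3

# Hypothetical schema: one row per proposed action and what became of it.
SCHEMA = """
CREATE TABLE agent_events (
    id            INTEGER PRIMARY KEY,
    case_id       TEXT NOT NULL,
    action        TEXT NOT NULL,
    outcome       TEXT NOT NULL,  -- 'executed', 'declined', or 'blocked'
    approver      TEXT,           -- NULL when no human decision was required
    policy_reason TEXT,           -- set only when policy blocked the action
    created_at    TEXT NOT NULL   -- ISO 8601 timestamp
)
"""

# Invented sample data, for illustration only.
SAMPLE_ROWS = [
    ("case-1", "lookup_order", "executed", None, None, "2025-01-10T09:00:00Z"),
    ("case-1", "issue_refund", "executed", "alice", None, "2025-01-10T09:02:00Z"),
    ("case-2", "delete_account", "blocked", None, "retention hold", "2025-01-11T14:00:00Z"),
    ("case-2", "close_ticket", "declined", "bob", None, "2025-01-11T14:05:00Z"),
]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the sample events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO agent_events "
        "(case_id, action, outcome, approver, policy_reason, created_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        SAMPLE_ROWS,
    )
    conn.commit()
    return conn


def human_decisions(conn: sqlite3.Connection) -> list:
    """Which actions required a human, and who decided?"""
    return conn.execute(
        "SELECT action, outcome, approver FROM agent_events "
        "WHERE approver IS NOT NULL ORDER BY id"
    ).fetchall()


def blocked_by_policy(conn: sqlite3.Connection) -> list:
    """Which proposed actions were blocked by policy, and why?"""
    return conn.execute(
        "SELECT action, policy_reason FROM agent_events "
        "WHERE outcome = 'blocked' ORDER BY id"
    ).fetchall()


def case_timeline(conn: sqlite3.Connection, case_id: str) -> list:
    """What did the system do on a given case, end to end?"""
    return conn.execute(
        "SELECT created_at, action, outcome FROM agent_events "
        "WHERE case_id = ? ORDER BY created_at, id",
        (case_id,),
    ).fetchall()
```

Each question becomes one function over a table you own, so retention and access follow your own database policies.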
Transparency: the reason travels with the action
Regulation increasingly wants to know not just what an automated system did but on what basis. The propose-then-run split gives you this for free: the model’s proposal includes the reason it gave, and that reason is captured alongside the action it asked for. When someone asks “why did the agent do this,” the answer isn’t a re-run of the model hoping for the same output — it’s the recorded rationale from the moment it acted.
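One way to picture “the reason travels with the action” is a record written at proposal time, before anything runs. In this sketch, `record_proposal`, `why`, and the entry fields are hypothetical, and a plain Python list stands in for the database; it assumes the model returns a reason alongside each proposed action.

```python
import json
from datetime import datetime, timezone


def record_proposal(log: list, action: str, arguments: dict, reason: str) -> int:
    """Store the proposal and its stated reason together, as one entry."""
    entry = {
        "id": len(log) + 1,
        "action": action,
        "arguments": arguments,
        "reason": reason,  # captured when proposed, never regenerated later
        "proposed_at": datetime.now(timezone.utc).isoformat(),
    }
    log.append(json.dumps(entry, sort_keys=True))
    return entry["id"]


def why(log: list, action_id: int) -> str:
    """Answer 'why did the agent do this' from the stored record."""
    for line in log:
        entry = json.loads(line)
        if entry["id"] == action_id:
            return entry["reason"]
    raise KeyError(action_id)
```

Calling `why` reads the rationale back from storage, so the answer does not depend on the model producing the same output twice.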
You can’t outsource accountability — so don’t outsource the controls
Under the AI Act, the deployer using the system carries its own obligations (Article 26). You can’t hand that to a vendor whose platform happens to run your agent. Which makes it a strange bet to put the oversight, the records, and the transparency inside that vendor’s black box, where you can neither verify nor control them.
FlowDrop’s stance is the opposite, and it’s the same one that runs through the whole series: you hold the parts. The oversight gate runs in your infrastructure. The records sit in your database. The controls are written into a workflow you can review, change deliberately, and prove were in place on the day that matters.
The obligations land on you. The mechanisms that satisfy them should too — not on a vendor’s roadmap, and not in a prompt.
Need oversight you can demonstrate, not just assert?
FlowDrop is open source and yours to self-host, with human-in-the-loop and record-keeping built into how it runs. When you need a managed platform, custom integrations, or enterprise support, the team behind it — Factorial.io — can build and run it with you.
Talk to us about enterprise →
Series: Building agents that pass compliance
- Your auditor is going to ask where the agent’s decisions are logged
- The EU AI Act says you need human oversight — a system prompt isn’t it (you’re here)
- GDPR for AI agents: can you delete data you don’t control?
Previously: Your auditor is going to ask where the agent’s decisions are logged.
Next in this series → GDPR for AI agents: can you delete data you don’t control?