Most agent access models are static. The agent has scope X. The policy that grants scope X was written six months ago. The agent has had scope X for every session since. The only change happens when someone runs a review, which most teams run quarterly at best, and which produces approve-everything as the default answer.
This is the model Anthropic's framework is gradually moving away from: Part III's continuous authorization, Part IV's Just-In-Time access, and behavioral monitoring as a layer of its own. Each step takes the static grant and softens it toward something the system can adjust.
The piece that's missing in most deployments is using behavior as an input to the access decision, not just to the monitoring view. A working model treats trust as a continuous variable per actor, updated by what the actor actually does. Behavior earns scope. Drift removes it. This post is the mechanics.
The trust score as a primitive
For each actor (human or agent, since the model applies to both with different signal weights), the access layer carries a trust score updated continuously. The score isn't a single number; it's a structured object with components: history depth, session completion patterns, scope minimality, alternative acceptance rate, behavior consistency. The output of the score is a default friction level for that actor's next session.
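A minimal sketch of what that structured object could look like. The field names, the [0, 1] normalization, and the validation are assumptions made for illustration; the components themselves are the ones listed above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrustScore:
    """Hypothetical per-actor trust state. Ratios are normalized to [0, 1]."""
    history_depth: int             # completed sessions on record
    session_completion: float      # share of sessions that closed inside declared intent
    scope_minimality: float        # share of requested scope the work actually needed
    alternative_acceptance: float  # share of offered safer alternatives that were accepted
    behavior_consistency: float    # similarity of recent sessions to the actor's baseline

    def __post_init__(self) -> None:
        if self.history_depth < 0:
            raise ValueError("history_depth must be non-negative")
        for name, value in asdict(self).items():
            if name != "history_depth" and not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


# An actor with no history: nothing earned yet.
new_actor = TrustScore(0, 0.0, 0.0, 0.0, 0.0)
```

Keeping the components separate, rather than collapsing them into one number at write time, is what lets a reviewer later see which signal moved.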
Three friction levels are usually enough.
Strict mode. Every mutation routes through approval. Sessions are short. Intent re-validation happens at fine intervals during the session. New scope requests require a human.
Standard mode. Auto-resolution for routine sessions matching prior patterns. Mutations within the declared envelope proceed. Scope expansion requests get expedited routing with the recommendation already structured.
Trusted mode. Most routine work auto-resolves without notification. Approval comes only for scope changes or anomalous patterns. The actor experiences minimal friction for the patterns they've established.
A new actor starts in strict. As clean sessions accumulate, the level relaxes. An incident or a drift event pulls the level tighter. The actor's experience tracks their behavioral pattern. The IT lead doesn't decide the level; the system computes it from the audit trail.
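The three levels above can be sketched as a pure function of the actor's history. The thresholds, the single aggregate `score`, and the `open_incident` flag are all illustrative assumptions; a real deployment would tune them and would likely look at the individual components.

```python
from enum import Enum


class Friction(Enum):
    STRICT = "strict"
    STANDARD = "standard"
    TRUSTED = "trusted"


# Hypothetical thresholds, chosen only to make the example concrete.
MIN_SESSIONS_STANDARD = 10
MIN_SESSIONS_TRUSTED = 50
STANDARD_SCORE = 0.6
TRUSTED_SCORE = 0.85


def friction_level(clean_sessions: int, score: float, open_incident: bool = False) -> Friction:
    """Compute the default friction level for an actor's next session."""
    # New actors, low scores, and unresolved incidents all land in strict.
    if open_incident or clean_sessions < MIN_SESSIONS_STANDARD or score < STANDARD_SCORE:
        return Friction.STRICT
    # Trusted requires both deep history and a high score.
    if clean_sessions >= MIN_SESSIONS_TRUSTED and score >= TRUSTED_SCORE:
        return Friction.TRUSTED
    return Friction.STANDARD
```

Because the function has no side effects, the same inputs from the audit trail always reproduce the same level, which is what makes the level explainable after the fact.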
What earns scope
The signals the system weights positively are the patterns of good behavior most security teams already wish they could measure systematically.
Sessions that complete inside the declared intent envelope without scope expansion. The actor asked to do X, did X, and the session closed. This is the baseline of competence.
Scope requests that are consistently the minimum the task needed. The actor asks for read on three tables when read on three tables is what the work requires, rather than asking for read on all tables because it's easier. Over a quarter, this signal separates engineers who think about access from engineers who don't.
Acceptance of safer alternatives when offered. The system suggests "use this tenant-scoped debug dataset instead of the full table." The actor accepts. Each acceptance is a small signal that the actor will work with the system rather than against it.
Mutation patterns that are reversible. Read-then-write-then-revert is a less risky pattern than write-and-walk-away. The system can tell which the actor's habits favor.
Cross-session consistency. The actor's sessions look similar to each other. Same time windows, same systems, same shape of intent. Variance is fine, but a sudden change is a signal.
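Of the positive signals above, scope minimality is the easiest to make concrete. One possible measure, assumed here rather than taken from any specific product, is the share of requested scopes the session actually used.

```python
def scope_minimality(requested: set[str], used: set[str]) -> float:
    """Share of the requested scopes that the session actually exercised.

    1.0 means the actor asked for exactly what the work needed;
    values near 0 mean most of the grant went unused.
    """
    if not requested:
        return 1.0  # nothing requested, so nothing was over-requested
    return len(requested & used) / len(requested)


# Read on three tables requested, read on three tables used.
tight = scope_minimality({"orders", "users", "events"}, {"orders", "users", "events"})

# Read on ten tables requested because it was easier; three used.
all_tables = {f"table_{i}" for i in range(10)}
loose = scope_minimality(all_tables, {"table_0", "table_1", "table_2"})
```

Averaged over a quarter of sessions, a ratio like this is one way to get the separation the signal describes.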
What removes scope
The negative signals are the inverse of the above, plus some that don't have positive counterparts.
Declared intent that doesn't match the actions taken. The session started as "debug latency" and ended as "export customer records." The runtime catches this in real time; the trust score absorbs the lesson after the fact.
Repeated scope expansion requests within a session. The actor declared intent A, then needed to expand to A', then to A''. Each expansion is fine in isolation. The pattern of "the original declaration is consistently insufficient" is information about how the actor declares intent.
Accessing systems unrelated to the stated purpose. A debug session that reaches for HR data is anomalous. The trust score absorbs the anomaly even if a human ultimately approves the expansion.
Working hours drift. The actor's sessions usually run between 9am and 7pm local time. A session at 3am with the same actor's credential is either a legitimate late-night incident or the start of an account compromise. The system can't tell which from the timestamp alone; it can apply tighter mode and route the session for confirmation.
These signals are weighted, not binary. A single anomaly doesn't crash the trust score. A pattern of anomalies does. How heavily each signal counts, and how quickly an old anomaly decays out of the score, is calibrated to the team's tolerance for false positives vs. false negatives.
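One way to get "weighted, not binary" is a per-signal penalty that decays with age. The weights and the 30-day half-life below are made-up values, and exponential decay is one assumed choice among several; the signal names mirror the list above.

```python
# Hypothetical weights per negative signal.
PENALTY = {
    "intent_mismatch": 0.30,
    "repeated_expansion": 0.10,
    "unrelated_system": 0.20,
    "off_hours": 0.05,
}
HALF_LIFE_DAYS = 30.0  # assumed: an anomaly counts half as much after 30 days


def anomaly_penalty(events: list[tuple[str, float]]) -> float:
    """Total penalty for a list of (signal name, age in days), capped at 1.0."""
    total = 0.0
    for signal, age_days in events:
        decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
        total += PENALTY[signal] * decay
    return min(total, 1.0)


single = anomaly_penalty([("off_hours", 0.0)])
pattern = anomaly_penalty([("off_hours", float(d)) for d in range(6)])
```

A shorter half-life forgives faster and produces fewer false positives; a longer one remembers more and misses less. That is the calibration knob.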
How drift is caught mid-session
The trust score is the steady-state input. Drift is the within-session signal. The two work together.
Each tool call the agent makes is evaluated against the originally declared intent. The runtime computes a drift score: how far does this action sit from what the session said it would do. Small drift is tolerated (intent declarations aren't precise enough to constrain every action). Large drift triggers a check.
The check has three possible outcomes. The action is in scope after all, and the runtime updates its understanding. The action is borderline, and the runtime asks the actor to re-declare or to accept a narrower alternative. The action is clearly out of scope, and the runtime blocks it.
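A rough sketch of the per-call evaluation, under heavy simplification. Drift is measured here as the share of touched resources outside the declared envelope, which is a stand-in for whatever distance measure a real runtime uses. The thresholds are hypothetical, and tolerated drift and "in scope after all" are collapsed into a single ALLOW outcome.

```python
from enum import Enum


class DriftOutcome(Enum):
    ALLOW = "allow"          # tolerated, or in scope after all
    REDECLARE = "redeclare"  # borderline: ask the actor to re-declare or narrow
    BLOCK = "block"          # clearly out of scope


# Hypothetical thresholds.
TOLERATED = 0.2
BLOCK_AT = 0.7


def drift_score(declared: set[str], touched: set[str]) -> float:
    """Share of resources touched by a tool call that fall outside the declared set."""
    if not touched:
        return 0.0
    return len(touched - declared) / len(touched)


def evaluate_call(declared: set[str], touched: set[str]) -> DriftOutcome:
    """Classify one tool call against the session's declared intent."""
    score = drift_score(declared, touched)
    if score <= TOLERATED:
        return DriftOutcome.ALLOW
    if score < BLOCK_AT:
        return DriftOutcome.REDECLARE
    return DriftOutcome.BLOCK
```

For a session declared against `{"latency_metrics", "service_logs"}`, a call touching only `service_logs` is allowed, one touching `service_logs` and `deploy_history` is borderline, and one touching only `customer_records` is blocked.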
The blocked actions are usually the most informative. They show where the actor's mental model and the system's intent model diverge. Most are legitimate workflow gaps; the intent template gets updated and the next session covers the case cleanly. A few are not. Those become incidents.
The trust score updates after the session. Clean sessions raise the score. Drift events lower it. Incidents lower it sharply.
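The post-session update can be sketched as an asymmetric step: small gains for clean sessions, larger losses for drift, a sharp loss for incidents. The step sizes and the single scalar score are assumptions for illustration.

```python
# Hypothetical step sizes; the asymmetry is the point, not the exact values.
CLEAN_GAIN = 0.02
DRIFT_LOSS = 0.05
INCIDENT_LOSS = 0.40


def update_after_session(score: float, drift_events: int, incident: bool) -> float:
    """Return the new trust score in [0, 1] after one closed session."""
    if incident:
        score -= INCIDENT_LOSS
    elif drift_events > 0:
        score -= DRIFT_LOSS * drift_events
    else:
        score += CLEAN_GAIN
    # Clamp so the score stays in range no matter how long the history runs.
    return max(0.0, min(1.0, score))
```

With these values it takes twenty clean sessions to earn back what one incident removes, which matches the intuition that trust is slow to build and quick to lose.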
Where this fits in Anthropic's framework
Three pieces of the Anthropic whitepaper connect to behavioral baselines as access input.
Part III, Permission Models. The whitepaper tiers permission models: RBAC at Foundation, ABAC with context-aware policies at Enterprise, and continuous authorization with real-time policy evaluation at Advanced. The behavioral baseline is one of the attributes that feeds ABAC and that gets re-evaluated under continuous authorization. The framework establishes the layer; behavioral signals are what populate it.
Part III, Behavioral Monitoring. The whitepaper covers baseline establishment, anomaly detection, and automated response. The framing treats behavior as monitoring. The extension here is making behavior an input to the access decision rather than only to the alert pipeline.
Part IV, Phase 8 (Measure What Matters). The whitepaper recommends measuring dwell time, coverage, detection speed, explainability, and behavioral conformance. The behavioral baseline is what behavioral conformance is measured against. Without it, the measurement is descriptive at best.
The framework anticipates this layer. Most current deployments don't have it built out. The opportunity is to ship it as part of the access stack, not as a separate analytics product bolted on later.
What this isn't, and where it's hard
A few clarifications, and the parts that get hard in practice.
Not a credit score for employees. The signals are about how access is used, not about the person. A team lead can override the trust level if they understand the actor's context better than the system. The score is an input, not a verdict.
Not a replacement for roles. RBAC sets the outer boundary of what the actor can do at all. Trust score decides how the actor moves inside that boundary today. Both layers are necessary.
Not autonomous decision-making. The system computes the score and proposes the friction level. Material decisions still route to humans. Anthropic's framing applies: automate the bookkeeping around incidents, not the decisions. The score is the bookkeeping.
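The layering in these three points (roles as the outer boundary, trust deciding the mode inside it, humans deciding anything material) can be sketched as one function. The parameter names and the string outcomes are hypothetical.

```python
def decide(role_permissions: set[str], action: str, mode: str,
           is_mutation: bool, in_envelope: bool) -> str:
    """Return "deny", "needs_approval", or "allow" for a single action."""
    # RBAC is the outer boundary: no trust level can widen it.
    if action not in role_permissions:
        return "deny"
    # In strict mode every mutation routes to a human.
    if mode == "strict" and is_mutation:
        return "needs_approval"
    # Anything outside the declared envelope is a scope change and routes
    # to a human in every mode.
    if not in_envelope:
        return "needs_approval"
    return "allow"


role = {"db.read", "db.write"}
```

With that role, `decide(role, "db.write", "trusted", True, True)` is allowed, the same call in strict mode needs approval, and `decide(role, "hr.read", "trusted", False, True)` is denied regardless of mode. The function never approves a scope change on its own.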
Where this is harder than the model makes it look. Three real problems worth naming.
The bootstrap problem. When you turn the system on for the first time, nobody has history. Every actor starts in strict mode. The first week is friction-heavy, and the team that just paid for the system feels it. Plan for a transition period where you carry imported trust state from prior systems (existing role assignments, time-on-team, manager attestation) so that day one isn't "every senior engineer is in strict mode and everything routes for approval."
Override abuse. If team leads can override the trust score, some will override liberally for their own people. The audit on this is straightforward (overrides are logged, frequency by lead is measurable, anomalies surface) but the social dynamic matters more than the audit. Treat overrides as a metric the security team reviews, not a private decision a lead makes alone.
Behavioral data implications. The system is now logging more behavioral data per user than the previous access stack did. The privacy and labor-relations conversation that produces is real, especially in environments with works councils or strong privacy regimes. Get the data retention policy and the access-to-the-data policy written before you turn the score on, not after the first time a manager asks to see "everyone's score on my team."
None of these is a deal-breaker. They're the work the model doesn't show. Skipping the work doesn't make the system fail; it makes the rollout harder than the architecture diagram suggested.
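Two of the three problems above are small enough to sketch: seeding day-one state from imported attributes, and surfacing leads whose override frequency stands out. The field names, the six-month cutoff, and the per-period limit are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportedState:
    """Hypothetical trust state carried over from prior systems."""
    months_on_team: int
    has_existing_role: bool
    manager_attested: bool


def initial_mode(state: ImportedState) -> str:
    """Seed a day-one friction level; anything without evidence stays strict."""
    if state.has_existing_role and state.manager_attested and state.months_on_team >= 6:
        return "standard"
    return "strict"


def leads_to_review(overrides: list[str], max_per_period: int = 5) -> list[str]:
    """Given one entry per logged override (the lead's name), list leads over the limit."""
    counts = Counter(overrides)
    return sorted(lead for lead, n in counts.items() if n > max_per_period)
```

Imported state only ever seeds standard mode in this sketch, never trusted, so the last step still has to be earned through observed behavior.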
What this changes for the IT team
The IT lead's queue shrinks. The Trusted-mode actors stop generating routine approval requests. The Strict-mode actors generate enough events that anomalies stand out. The middle case (Standard mode) sees expedited routing with recommendations attached. The lead applies judgment to what the system can't decide.
The audit trail gets richer. Every session is annotated with the trust state at evaluation time and the signals that produced it. A reviewer can reconstruct not just what happened but why the policy responded the way it did. This is meaningfully stronger evidence for SOC 2 CC6.2, CC6.3, and the equivalent ISO 27001 controls than what static role-based systems produce.
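A sketch of what one annotated audit record might carry. The schema is hypothetical; the point is that the trust state and the signals behind it are captured at evaluation time instead of being recomputed later.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SessionAuditRecord:
    """Hypothetical audit entry: what happened, plus why the policy responded as it did."""
    session_id: str
    actor_id: str
    declared_intent: str
    friction_level: str                # trust state at evaluation time
    trust_components: dict[str, float]
    contributing_signals: list[str]    # signals that produced that state
    decision: str


record = SessionAuditRecord(
    session_id="s-001",
    actor_id="agent-42",
    declared_intent="debug latency",
    friction_level="standard",
    trust_components={"scope_minimality": 0.9, "behavior_consistency": 0.8},
    contributing_signals=["clean_session_streak", "accepted_safer_alternative"],
    decision="auto_resolved",
)

# One JSON line per session keeps the trail append-only and easy to query.
audit_line = json.dumps(asdict(record), sort_keys=True)
```

A reviewer reading `audit_line` months later sees the level that applied and the signals behind it without needing the scoring code as it existed that day.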
The contractor workflow changes most. A new contractor in their first week is in strict mode by default. As their sessions accumulate clean history, they earn standard mode without any IT intervention. By month two they're running with the friction of a full-time engineer doing the same work. None of this required a manual ramp by the IT team; the system did the ramp by reading the audit trail.
The visible work for IT shifts. Less grant-decision-at-provisioning, more override-decision-when-the-score-disagrees-with-your-judgment. The override case is rare if the signals are right; when it happens, it's a real conversation about a specific actor in a specific situation, not a batch of fifteen approvals at 9am.