Stage 9: AI Native Command and Control Center
Objective
Stage 9 is where an AI incident becomes interactive and sustained. Through Stage 8, the attacker ensured that the compromise would survive. In Stage 9, they maintain influence and steer outcomes over time, often without any traditional command channel.
Stage 9 occurs when an adversary maintains ongoing influence over an AI system through normal interaction paths (human prompts, schedules, memory rehydration, or outputs) rather than explicit malware C2 infrastructure.
This is command & control without servers; the system itself becomes the channel.
Traditional C2 relies on:
Beacons
IPs/domains
Encrypted traffic
Kill switches
AI-native C2 relies on:
Normal user prompts
Internal workflows
Memory retrieval
Output shaping
Scheduled execution
AI systems treat interaction as benign by default and do not distinguish control from use.
In most deployments:
Any authenticated user can influence behavior
Outputs can influence future inputs
Memory can rehydrate prior state
Schedules run without re-authorization
Together, these properties make a near-ideal C2 substrate.
Core Techniques: AI Native Command and Control Center
Human-in-the-Loop Command and Control
The attacker controls the AI by:
Issuing normal prompts
Asking follow-up questions
Refining outputs
Steering decisions
Why it works
Humans are trusted by design
Prompting is the primary interface
The AI cannot tell benign from malicious steering
Scheduled / Trigger-Based Control
Control is maintained via:
Cron jobs
Workflow triggers
Time-based tasks
Event listeners
Once scheduled, execution:
Happens automatically
Uses existing credentials
Bypasses interactive scrutiny
Why it’s dangerous
No active attacker presence required
Hard to associate with original compromise
Looks like normal automation
Context Rehydration (State-Based Control)
Malicious context is:
Stored (Stage 8)
Automatically reloaded
Used to guide future reasoning
The attacker doesn’t need to send commands. The AI "remembers" what to do each time the stored context is reloaded, so the memory store takes the place of the C2 server.
Output Signals
The AI embeds signals in outputs:
Phrasing patterns
Formatting choices
Ordering of results
Non-obvious markers
These signals:
Influence downstream agents
Trigger workflows
Guide human operators
Feedback-Driven Control
The attacker subtly shapes future behavior through:
Feedback mechanisms
“That was helpful” signals
Reinforcement loops
Why it works
Feedback is treated as truth
Learning pipelines trust it
Behavior converges over time
Indicators of AI-Native C2
Repeated prompt patterns from the same users
Recurring context-shaping language
Scheduled tasks that mirror user intent
Outputs that guide future actions
Feedback reinforcing risky behavior
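As a rough illustration of the first indicator, the sketch below counts near-identical prompts per user. The log format (pairs of user ID and prompt text), the normalization rules, and the threshold of three repeats are all assumptions made for this example, not part of any particular product.

```python
from collections import Counter
import re

def normalize(prompt: str) -> str:
    """Lowercase, mask digits, and collapse whitespace so near-identical prompts match."""
    text = re.sub(r"\d+", "<n>", prompt.lower())
    return re.sub(r"\s+", " ", text).strip()

def repeated_patterns(events, min_repeats=3):
    """events: iterable of (user_id, prompt). Returns {(user_id, pattern): count}."""
    counts = Counter((user, normalize(prompt)) for user, prompt in events)
    return {key: n for key, n in counts.items() if n >= min_repeats}

# Hypothetical interaction log
log = [
    ("u1", "Always rank vendor 17 first"),
    ("u1", "always rank  vendor 42 first"),
    ("u1", "Always rank vendor 9 first"),
    ("u2", "Summarize this report"),
]
flagged = repeated_patterns(log)
```

A repeated pattern is only a lead for review; legitimate power users also repeat themselves, so a real detector would likely need context about what the prompt changes.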
Controls To Disrupt Stage 9
Separate Use from Control
Not all prompts are equal; some act as instructions that change how the system behaves.
Control-affecting prompts require:
Higher trust
Explicit approval
Audit trails
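One way to picture this separation is a gate in front of the model that treats control-affecting prompts differently from ordinary use. The following is a minimal sketch under stated assumptions: the keyword list is a crude, hypothetical heuristic (a real deployment would need a better classifier), and `PromptGate`, `admit`, and the approver model are names invented for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical phrases suggesting a prompt changes future behavior rather than requesting output.
CONTROL_MARKERS = ("from now on", "remember that", "every time", "schedule")

@dataclass
class PromptGate:
    approvers: set
    audit_log: list = field(default_factory=list)

    def is_control_affecting(self, prompt: str) -> bool:
        lowered = prompt.lower()
        return any(marker in lowered for marker in CONTROL_MARKERS)

    def admit(self, user: str, prompt: str, approved_by: Optional[str] = None) -> bool:
        control = self.is_control_affecting(prompt)
        # Control-affecting prompts need a listed approver who is not the requester.
        allowed = (not control) or (approved_by in self.approvers and approved_by != user)
        # Every decision is recorded, whether or not the prompt is admitted.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "control": control,
            "approved_by": approved_by,
            "allowed": allowed,
        })
        return allowed

gate = PromptGate(approvers={"alice"})
```

Ordinary prompts pass through untouched, so the extra friction lands only on the small set of prompts that try to alter future behavior.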
Memory & Context Access Governance
Not every session can rehydrate memory
Sensitive context requires re-authorization
Kill switches for memory-driven behavior
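These three rules could be enforced at the point where memory is loaded into a session. The sketch below assumes a simple in-process store; `GovernedMemory`, the `sensitive` flag, and the session parameters are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    text: str
    sensitive: bool = False

class GovernedMemory:
    def __init__(self):
        self.items = []
        self.kill_switch = False  # when True, nothing is rehydrated for any session

    def add(self, text: str, sensitive: bool = False) -> None:
        self.items.append(MemoryItem(text, sensitive))

    def rehydrate(self, session_can_rehydrate: bool, reauthorized: bool = False):
        """Return the memory texts this session is allowed to load."""
        if self.kill_switch or not session_can_rehydrate:
            return []
        # Sensitive items are released only after explicit re-authorization.
        return [m.text for m in self.items if reauthorized or not m.sensitive]

memory = GovernedMemory()
memory.add("User prefers concise answers")
memory.add("Approved payee list: ...", sensitive=True)
```

Setting `memory.kill_switch = True` gives responders a single place to stop memory-driven behavior while stored context is investigated.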
Harden Scheduled Tasks
Time-bound schedules
Re-authorization on execution
Clear ownership and intent
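A scheduled task that carries its own owner, stated intent, and expiry makes these checks straightforward to apply at run time. This is a sketch only: `ScheduledTask`, `run_task`, and the `reauthorize` callback are hypothetical, and the callback stands in for whatever approval mechanism a deployment actually uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

@dataclass
class ScheduledTask:
    owner: str                 # who is accountable for the task
    intent: str                # human-readable statement of purpose
    expires_at: datetime       # schedules are time-bound, not open-ended
    action: Callable[[], str]

def run_task(task: ScheduledTask,
             reauthorize: Callable[[str, str], bool],
             now: Optional[datetime] = None) -> str:
    now = now or datetime.now(timezone.utc)
    if now >= task.expires_at:
        return "skipped: schedule expired"
    # Authorization is checked on every execution, not only at creation time.
    if not reauthorize(task.owner, task.intent):
        return "skipped: re-authorization denied"
    return task.action()

start = datetime(2024, 1, 1, tzinfo=timezone.utc)
task = ScheduledTask(
    owner="bob",
    intent="weekly sales summary",
    expires_at=start + timedelta(days=30),
    action=lambda: "summary sent",
)
```

Because the check runs at execution time, revoking the owner's access or letting the schedule lapse stops the task without anyone having to find and delete it.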
Output Influence Controls
Outputs should not:
Trigger actions
Encode commands
Drive workflows
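In practice this means treating model output as untrusted data before anything downstream consumes it. The sketch below strips invisible Unicode format characters (one possible carrier for non-obvious markers) and sets aside lines that look like commands instead of executing them. The `ACTION:` prefix is a made-up convention for this example, not a standard.

```python
import unicodedata

def strip_hidden_markers(text: str) -> str:
    """Drop Unicode format characters (category Cf), such as zero-width spaces."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

def split_directives(text: str, prefix: str = "ACTION:"):
    """Separate display text from command-like lines. Nothing is executed here."""
    display, pending = [], []
    for line in strip_hidden_markers(text).splitlines():
        if line.strip().startswith(prefix):
            pending.append(line.strip())   # held for human review
        else:
            display.append(line)
    return "\n".join(display), pending

# Hypothetical model output containing a zero-width space and an embedded directive
model_output = "Top result: A\u200b\nACTION: transfer_funds\nSecond result: B"
display, pending = split_directives(model_output)
```

Removing every format character is a blunt choice that can also affect legitimate text, such as emoji sequences joined with zero-width joiners, so a production filter would likely be more selective.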
Feedback Integrity
Don’t auto-learn from user feedback
Separate evaluation from reinforcement
Detect manipulation patterns
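One simple manipulation pattern is a single user supplying most of the positive feedback for a given behavior. The sketch below flags that case before any feedback reaches a learning pipeline. The event format, the behavior tags, and both thresholds are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def suspicious_feedback(events, max_share=0.5, min_events=4):
    """events: iterable of (user_id, behavior_tag, rating), rating +1 or -1.
    Returns {behavior_tag: user_id} where one user dominates the positive ratings."""
    positives = defaultdict(Counter)
    for user, tag, rating in events:
        if rating > 0:
            positives[tag][user] += 1

    flagged = {}
    for tag, by_user in positives.items():
        total = sum(by_user.values())
        user, top = by_user.most_common(1)[0]
        # Require a minimum volume so a single early rating is not flagged.
        if total >= min_events and top / total > max_share:
            flagged[tag] = user
    return flagged

# Hypothetical feedback log
events = (
    [("mallory", "skip_review", 1)] * 4
    + [("bob", "skip_review", 1)]
    + [(u, "cite_sources", 1) for u in ("a", "b", "c", "d")]
)
flagged = suspicious_feedback(events)
```

Flagged feedback can then be held out of reinforcement and routed to evaluation, which keeps the two paths separate.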
Stage 9 → Stage 10 Transition
Stage 9 ends when control is stable and influence is reliable. Stage 10 begins when the attacker uses this control to achieve objectives like:
Exfiltration
Fraud
Disruption
Supply-chain impact
Stage 9 is why AI incidents:
Persist without attackers logged in
Recur after remediation
Defy traditional SOC playbooks (no beacon, no suspicious IP address or domain)