Adversarial AI auditing for organizations deploying LLMs. We find the vulnerabilities that automated tools can't — through narrative, not scripts.
"Air Canada is responsible for all information on its website." — BC Civil Resolution Tribunal, 2024
A customer asked Air Canada's chatbot about bereavement fares. The bot invented a policy. The customer relied on it. The court ruled Air Canada was bound by it.
This is settled Canadian law. Every customer-facing LLM deployment now carries direct tort liability — and that's just one of four threat vectors most organizations haven't mapped.
Automated tools check if your LLM can be hacked. AURORA checks if it can be talked into something it shouldn't do — a different problem requiring a different approach. Our auditors adopt adversarial personas and improvise. The model tries to complete the scene. We document what happens.
Auditors adopt personas from four threat archetypes. Improvised, stateful engagement — not scripted prompts.
Build compliance across multiple turns. Never request the payload in turn one — the model completes the scene.
Structured execution of the Master Test Bank for quantitative benchmarking across all four archetypes.
Findings as adversarial user stories — the same format used to build features, inverted to show how they break.
Exploits RLHF-trained agreement. Gets the AI to validate bad decisions, skip approvals, generate false authority for things it shouldn't sign off on.
Goal is the screenshot, not money. Zero-cost asymmetric warfare — they spend nothing, your brand absorbs the reputational hit.
Your system prompt is a trade secret. Extraction via social engineering is an active, documented threat — if your "secret sauce" leaks, so does your moat.
Not malicious — just confused. The AI invents a policy to be helpful. The customer relies on it. You're in court. Air Canada, verbatim.
Not a list of bugs — the story of each breach. Who attacked, what narrative they built, where the model gave ground, what to close. Readable by legal. Actionable by engineers. Presentable to the board.
Which archetypes succeeded, which were blocked, and what the exposure means for legal, compliance, and leadership.
Annotated breach transcripts. The full narrative arc, turn by turn — how the model was walked into it.
Targeted fixes for the specific narratives that worked. Not generic recommendations — direct counters to what we found.
Enterprise and mid-market organizations with customer-facing or internal LLM deployments. Priority sectors: financial services, healthcare, legal, HR. Startups pre-raise and investors doing diligence are a distinct sub-segment.
Same methodology, higher-stakes context. Canada's SAFE accession and the IDEaS programme are near-term entry points. Timeline: 18–24 months.
Workshops and embedded residency for organizations building internal AI red-teaming capacity. Activates after the first commercial engagements.
Teaching individuals to recognize and resist AI-driven manipulation — voice scams, fake customer service, AI-augmented phishing. Grant-funded.
Anchored to value delivered, not hours billed. All figures in Canadian dollars.
| Engagement | Scope | Price (CAD) |
|---|---|---|
| Rapid Threat Assessment | Single-day audit. All four archetypes tested. 5-page findings summary. Best entry point for first engagements. | $8K – $12K |
| Standard Commercial Audit | 2–3 week engagement. Full Narrative Risk Matrix, remediation roadmap, executive presentation. | $18K – $28K |
| Enterprise Audit | Multi-system scope: multiple LLMs, agentic pipelines, RAG integrity. Regulatory framing included. | $35K – $60K |
| Startup Valuation Assurance | Pre-raise or mid-raise. Data-room-ready Narrative Risk Matrix. Pass = credential. Fail = roadmap before investors find the same gaps. | $8K – $18K |
| Investor AI Diligence | AURORA audit on an investment target. Independent third-party risk assessment on due diligence timeline. | $18K – $28K |
| Retainer / Monitoring | Quarterly re-testing as AI systems evolve. Priority access, updated threat vectors, annual summary report. | $5K – $10K/mo |
| Train the Tester — Workshop | Half to full day. Narrative red-teaming methodology for internal security teams. | $6K – $12K |
| Train the Tester — Embedded | Multi-week residency. Full internal capability build with practitioner handoff. | $25K – $45K |
| Defence / Government | Scoped per engagement. State-actor threat modelling, classification-compatible delivery. | $50K – $150K+ |
Defence and government engagements are proposal-based. First conversation is always free.
AI Behavioral Dynamics was founded by Samuel Barefoot — a Montreal-based software engineer and AI systems specialist with eight years building, deploying, and auditing AI-driven systems in enterprise and government environments.
The AURORA methodology is documented in a published research white paper (NSA-WP-001, January 2026). The founder's family background includes military service and a career at CSE — Canada's signals intelligence and cybersecurity agency.
Enterprise AI infrastructure: Azure AI deployments, data pipelines, AI-driven workflows at scale.
Secure interfaces for sensitive government datasets.
Dalhousie University, BASc Applied Computer Science.
Practitioner-level AI architecture certification.
No hard pitch. If you're deploying LLMs and want to understand what you're actually exposed to, that conversation is worth having before something expensive happens. First meeting is always free.
samuel.barefoot@aibehavioraldynamics.com