Real-time AI impersonation is turning Zoom/Teams calls into a new social-engineering weapon. Here’s what’s happening, how it works, and a verification protocol that stops it — without paranoia.
Introduction: Video is no longer proof
For years, a video call has been treated as “highest confidence” communication. If you can see a face and hear a voice, it must be real… right?
That assumption is now a liability.
In May 2024, UK engineering firm Arup confirmed it had been hit by a deepfake video-call scam in which an employee in its Hong Kong office was tricked into sending HK$200 million (about US$25M) after what looked like a legitimate internal meeting.
This isn’t a “future risk.” It’s an operational one — affecting businesses, families, and anyone who treats live video as verification.
New rule: If money, access, or secrets are involved, video calls do not verify identity. Only a protocol does.
Quick Summary: What you need to know in 60 seconds
- Deepfake fraud is scaling fast — Entrust reports deepfake attempts occurring on average every five minutes in 2024.
- The most dangerous attacks combine AI impersonation + pressure: urgency, secrecy, authority, and “do it now.”
- “Spot the glitch” is unreliable. The fix is out-of-band verification plus no-exception rules and two-person approvals.
- For families: a safe word and a hang-up/call-back habit prevent emotional blackmail scams.
The new threat: “Deepfake meetings” and real-time AI impersonation
Deepfakes used to be edited videos posted online. Now the threat has moved into real-time communication — the moment where people authorize payments, approve access, or make decisions under pressure.
The Arup case is the template: a believable internal video conference, familiar faces, business context, then a request that triggers irreversible action.
At the same time, official warnings increasingly flag impersonation campaigns that use AI-generated voice and messaging to build trust, then extract access or money.
How the scam actually works (attacker playbook)
Most deepfake video-call frauds aren’t “movie magic.” They’re process attacks.
1) They prepare your brain to comply
- Authority (“CEO”, “CFO”, “senior official”)
- Urgency (“wire must go today”, “this is confidential”)
- Isolation (“don’t loop anyone else in”)
2) They control the environment
- Camera angle excuses (“webcam issue”)
- Lighting excuses (“traveling”, “bad connection”)
- Audio excuses (“lag”, “mic problems”)
3) They target high-impact actions
- Wire transfers / payment approvals
- Credential/MFA resets
- Remote access
- Sensitive documents / strategic info
This is why detection-by-eyeballing fails. The attacker doesn’t need perfection — just enough realism for long enough to trigger action.
The human factor: why smart people still fall for it
Deepfake fraud works because it exploits three predictable realities:
- We equate “face on screen” with authenticity.
- We obey urgency when social pressure is high.
- We avoid embarrassment (people don’t want to “challenge the boss”).
Even if a deepfake contains minor tells, pressure collapses scrutiny.
Real-world case studies (what’s already happened)
Case 1: The “internal meeting” that moved $25 million (corporate)
The Arup incident showed the highest-risk scenario: an employee seeing what appeared to be senior leaders in a video call and authorizing large transfers.
Key takeaway: If your payment controls allow “approval by call,” your company is already exposed.
Case 2: Deepfake endorsements used to drain victims (consumer)
In New Zealand, a Taranaki grandmother reportedly lost NZ$224,000 after being pulled into a cryptocurrency investment scam featuring an AI deepfake of Prime Minister Christopher Luxon.
Key takeaway: Deepfakes don’t just impersonate you. They impersonate trusted public figures to hijack credibility.
Case 3: Emotional coercion scams are evolving fast (families)
Official warnings about AI-enabled “proof” tactics (cloned voice, images, video) increasingly recommend the same two habits: contact the person directly through a known channel, and agree on a family safe word in advance.
Key takeaway: The defense is not “spotting deepfake artifacts.” It’s verification habits that work under stress.
The Lifesaving Verification Protocol (practical, fast, enforceable)
This is the core: a protocol that assumes video can be forged and makes the request prove itself.
The rule that stops most deepfake fraud
No high-risk action is approved inside the same channel where the request was received.
(Video call request → verify via known number, trusted chat, or in person.)
That’s classic out-of-band verification: using a separate, independent channel to confirm identity and intent.
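To make the rule concrete, here is a minimal sketch in Python. Everything in it (the `Request` shape, the channel names, `may_proceed`) is illustrative rather than a real library; the point it encodes is that a request can never be verified by the channel it arrived on.

```python
from dataclasses import dataclass

# Illustrative only: the Request shape and channel names are assumptions
# for this sketch, not a real API.
APPROVED_VERIFICATION_CHANNELS = {"directory_callback", "verified_chat", "in_person"}

@dataclass
class Request:
    action: str              # e.g. "wire_transfer"
    received_via: str        # channel the request arrived on, e.g. "video_call"
    verified_via: set[str]   # channels used so far to confirm identity and intent

def may_proceed(req: Request) -> bool:
    """A high-risk action needs at least one trusted verification channel
    that is different from the channel the request came in on."""
    out_of_band = req.verified_via & (APPROVED_VERIFICATION_CHANNELS - {req.received_via})
    return bool(out_of_band)

req = Request("wire_transfer", "video_call", {"video_call"})
print(may_proceed(req))   # False: the call cannot vouch for itself
req.verified_via.add("directory_callback")
print(may_proceed(req))   # True: confirmed on an independent, known channel
```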
Protocol table: “If X happens, do Y”
| Situation | What you do (no exceptions) | Why it works |
|---|---|---|
| “Urgent transfer” on a call | Hang up. Call back using a known directory number. Require 2-person approval. | Breaks the attacker’s control and urgency loop. |
| “Reset my MFA / password now” | Refuse during the call. Use formal IT workflow + ticketing + verified identity steps. | Attackers often target MFA resets to take over accounts. |
| “This is confidential—don’t tell anyone” | Treat as a red flag. Escalate to a second verifier. | Secrecy is a fraud accelerant. |
| “I’m your family member—help now” | Ask for family safe word. Call the person directly. | Beats voice/video cloning with a pre-shared secret. |
| “Click this meeting link / open this doc” | Verify the sender out-of-band. Use known channels. | Meeting invites can be used to pivot into account compromise. |
The 5-layer protocol you can actually implement
Layer 0 — Non-negotiables (write these down)
- No payment approvals solely via video call.
- No MFA codes shared with anyone, ever.
- No remote-access installs “because the caller said so.”
Layer 1 — Out-of-band call back (default)
Call the person using a pre-registered or known-good number, not one they provide in the moment. (This mirrors formal out-of-band verification guidance, which relies on pre-registered channels.)
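A sketch of what “pre-registered only” means in practice, assuming a directory captured at onboarding; the `DIRECTORY` contents and the failure behaviour are illustrative:

```python
# Illustrative Layer 1 sketch: the callback number comes from a directory
# recorded at onboarding, never from the caller. DIRECTORY is assumed data.
DIRECTORY = {
    "jane.doe@example.com": "+44 20 7946 0000",
}

def callback_number(identity: str) -> str:
    """Return the pre-registered number, or fail loudly. Anything supplied
    in the moment (a 'new number', a fresh meeting link) is ignored by design."""
    try:
        return DIRECTORY[identity]
    except KeyError:
        raise LookupError(f"{identity} has no registered number; escalate, don't improvise")

print(callback_number("jane.doe@example.com"))
```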
Layer 2 — Challenge-response (fast identity check)
Use one of the following (see the sketch after this list):
- Company passphrase (rotated quarterly)
- Private verification question (not guessable from social media)
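A minimal sketch of the check, assuming the current passphrase lives in a secrets manager (shown here as a constant only for brevity):

```python
import hmac

# Sketch of Layer 2. In practice the passphrase would come from a secrets
# manager and rotate quarterly; the constant here is a placeholder.
CURRENT_PASSPHRASE = "emerald-harbour-42"  # hypothetical value

def passes_challenge(response: str) -> bool:
    # hmac.compare_digest compares in constant time, which avoids leaking
    # information through timing if the check is ever automated.
    return hmac.compare_digest(response.strip(), CURRENT_PASSPHRASE)

print(passes_challenge("emerald-harbour-42"))  # True
print(passes_challenge("nice try"))            # False
```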
Layer 3 — Two-person integrity for high-impact actions
Require a second approver from a different function (e.g., finance + ops). Deepfake fraud targets single points of failure.
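The check itself is small; what matters is that it demands two distinct people from two distinct functions. A sketch, with illustrative approver IDs and function labels:

```python
# Sketch of Layer 3: two approvers, two functions. Labels are illustrative.
def two_person_ok(approvals: list[tuple[str, str]]) -> bool:
    """approvals is a list of (approver_id, function) pairs."""
    people = {person for person, _ in approvals}
    functions = {function for _, function in approvals}
    return len(people) >= 2 and len(functions) >= 2

print(two_person_ok([("alice", "finance"), ("bob", "ops")]))      # True
print(two_person_ok([("alice", "finance"), ("alice", "ops")]))    # False: one person
print(two_person_ok([("alice", "finance"), ("bob", "finance")]))  # False: one function
```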
Layer 4 — Time holds for large transfers
Add a mandatory delay for high-value transfers unless verified via multiple channels. Fraud thrives on speed.
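A sketch of the hold logic, with placeholder policy values (the threshold, hold length, and “two channels” bar are assumptions each organisation should set for itself):

```python
from datetime import datetime, timedelta, timezone

# Sketch of Layer 4. HOLD and LARGE are placeholder policy values.
HOLD = timedelta(hours=24)
LARGE = 10_000  # in your currency units

def releasable(amount: float, requested_at: datetime, channels_verified: int) -> bool:
    if amount < LARGE or channels_verified >= 2:
        return True  # small, or already verified on multiple channels
    # Otherwise the transfer waits out the hold: fraud thrives on speed.
    return datetime.now(timezone.utc) - requested_at >= HOLD

now = datetime.now(timezone.utc)
print(releasable(50_000, now, channels_verified=1))  # False: held
print(releasable(50_000, now, channels_verified=2))  # True: multi-channel verified
```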
“Spotting signs” is optional — don’t bet your safety on it
Yes, deepfakes can still show tells: strange blinking, lip-sync mismatch, odd lighting, unnatural head movement. But betting your defense on human detection is fragile.
Modern guidance is consistent: verify identity via trusted channels rather than trusting perception.
What about Content Credentials and provenance tech?
Content provenance standards like C2PA / Content Credentials aim to let platforms and devices attach cryptographic provenance (“history”) to media. Governments and security agencies have discussed it as part of a broader solution for media integrity.
But here’s the reality today:
- Many real-world calls and shared clips won’t carry provenance data.
- Even with provenance, attackers can still social-engineer humans into bypassing process.
Treat provenance as a helpful layer — not the foundation.
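In code, that posture means treating provenance as one input among several. The sketch below uses a hypothetical `read_content_credentials()` as a stand-in for whatever C2PA tooling a platform exposes; the key point is the fallback, because absence of a manifest proves nothing.

```python
# read_content_credentials() is hypothetical: a stand-in for real C2PA
# tooling. The decision logic, not the API, is the point.
def read_content_credentials(path: str):
    """Hypothetical: return a provenance manifest dict, or None."""
    return None  # today, most real-world calls and clips carry no manifest

def assess(path: str) -> str:
    manifest = read_content_credentials(path)
    if manifest is None:
        return "no provenance data: fall back to out-of-band verification"
    if not manifest.get("signature_valid"):
        return "invalid provenance: treat as suspect and verify out-of-band"
    return "valid provenance: a helpful signal, but still verify high-impact requests"

print(assess("meeting_clip.mp4"))
```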
If you suspect a deepfake call: do this immediately
- Stop the action (transfer, reset, access grant).
- Switch channels (call back on known number / verified internal chat).
- Preserve evidence (invite details, email headers, chat logs, timestamps).
- Escalate (security/IT/finance lead).
- Notify your bank/platform quickly if money moved.
Conclusion: Trust less. Verify better.
Deepfake fraud isn’t winning because AI is magic. It’s winning because most people don’t have a protocol — and fraudsters exploit urgency, authority, and habit.
Adopt one principle and you’ll eliminate most of the risk:
If the request is high-impact, verification must be out-of-band and repeatable.
That’s how you keep video calls useful — without letting them become a liability.