How EchoDepth works

The science behind emotional sales coaching.

44 FACS-compliant facial Action Units. VAD emotional state mapping. Real-time feedback. No specialist hardware. Here is exactly how EchoDepth reads and coaches emotional performance.

Step 1

Capture

Standard webcam + microphone

EchoDepth captures live webcam video and microphone audio during the practice session. No VR headset, no depth sensor, no camera upgrade required. A standard enterprise laptop with a 720p or better webcam is sufficient for reliable Action Unit detection.

This matters because accessibility is a prerequisite for scale. If specialist hardware were required, enterprise rollout would be a procurement and logistics problem before it was a coaching problem. EchoDepth removes that barrier entirely.
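
For readers who want to see how little plumbing this takes: in the browser, capture at this level is the standard MediaDevices API and nothing more. A minimal sketch, assuming an ordinary page with a video element (the function name and wiring are illustrative, not EchoDepth's actual code):

```typescript
// Minimal capture sketch using the standard browser MediaDevices API.
// startCapture is an illustrative name; EchoDepth's pipeline is not public.
async function startCapture(preview: HTMLVideoElement): Promise<MediaStream> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { width: { ideal: 1280 }, height: { ideal: 720 } }, // 720p minimum
    audio: true, // any standard microphone or headset
  });
  preview.srcObject = stream; // the same stream feeds preview and analysis
  await preview.play();
  return stream;
}
```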

Hardware requirements
  • Standard webcam (720p minimum)
  • Standard microphone or headset
  • Modern browser (Chrome, Edge, Firefox)
  • VR headset — not required
  • Specialist sensor — not required
Sample Action Units detected
  • AU1 · Inner brow raise
  • AU4 · Brow lowerer
  • AU6 · Cheek raiser
  • AU12 · Lip corner puller
  • AU17 · Chin raiser
  • AU23 · Lip tightener

44 Action Units analysed simultaneously in real time

Step 2

Analyse 44 Action Units

FACS — Facial Action Coding System

The Facial Action Coding System (FACS) was developed by Paul Ekman and Wallace Friesen and is the most rigorously validated framework for describing facial expressions in terms of underlying muscle movements. EchoDepth analyses all 44 FACS Action Units in real time.

Why 44? Because emotional states are not monolithic. Genuine confidence produces a specific combination of Action Units — brow position, lip tension, cheek engagement — that differs measurably from performed confidence. EchoDepth can distinguish between them. Sales reps cannot game the system by exaggerating — and buyers, who are unconsciously reading the same signals, respond accordingly.
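
To make AU combinations concrete, the Duchenne smile is the textbook example: the cheek raiser (AU6) firing alongside the lip corner puller (AU12) marks a felt smile, while AU12 on its own marks a performed one. A minimal sketch of that kind of check, with illustrative types, names, and threshold (not EchoDepth's implementation):

```typescript
// Per-frame AU intensities on a 0-1 scale, keyed by FACS code. The type and
// threshold are illustrative; the AU6 + AU12 pairing is the published
// Duchenne marker, not an EchoDepth-specific rule.
type AUFrame = Record<string, number>; // e.g. { AU6: 0.7, AU12: 0.8, ... }

function isDuchenneSmile(frame: AUFrame, threshold = 0.5): boolean {
  const au6 = frame["AU6"] ?? 0;   // cheek raiser
  const au12 = frame["AU12"] ?? 0; // lip corner puller
  return au6 >= threshold && au12 >= threshold;
}
```

Confidence works the same way in principle: it is the combination of Action Units that is read, not any single cue, which is why exaggerating one cue does not fool the analysis.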

Step 3

Map to VAD emotional space

Valence · Arousal · Dominance

Action Unit patterns are mapped to the VAD (Valence, Arousal, Dominance) model — a three-dimensional framework for representing emotional state, originating in Mehrabian and Russell's foundational work on emotion.

V · Valence
Positive or negative quality of the state. Confidence = high positive. Anxiety = low.

A · Arousal
Activation intensity. Both excitement and panic are high-arousal — they need different coaching.

D · Dominance
Perceived control and authority. The dimension buyers read when deciding whether to trust a rep.

VAD is why EchoDepth can distinguish between a rep who is nervously activated versus confidently activated — two states that look similar at a surface level but produce completely different buyer responses.
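
As a sketch of what the mapping step can look like: one common approach in affective computing is a weighted combination of AU intensities per dimension. The weights below are invented placeholders to show the shape of the computation, not EchoDepth's model:

```typescript
// Illustrative AU-to-VAD mapping: a weighted sum of AU intensities around a
// neutral baseline, clamped to the 0-1 scale used in the examples below.
// Real systems typically learn these weights from annotated data.
interface VAD { v: number; a: number; d: number }

const WEIGHTS: Record<string, VAD> = {
  AU6:  { v: 0.4,  a: 0.1, d: 0.1 },  // cheek raiser
  AU12: { v: 0.5,  a: 0.2, d: 0.2 },  // lip corner puller
  AU4:  { v: -0.3, a: 0.3, d: 0.1 },  // brow lowerer
  AU23: { v: -0.2, a: 0.2, d: 0.3 },  // lip tightener
};

function mapToVAD(frame: Record<string, number>): VAD {
  const clamp = (x: number) => Math.min(1, Math.max(0, x));
  let v = 0.5, a = 0.5, d = 0.5; // neutral baseline
  for (const [au, w] of Object.entries(WEIGHTS)) {
    const intensity = frame[au] ?? 0; // AU intensity, 0-1
    v += w.v * intensity;
    a += w.a * intensity;
    d += w.d * intensity;
  }
  return { v: clamp(v), a: clamp(a), d: clamp(d) };
}
```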

VAD state examples
  • High Confidence (V: 0.82 · A: 0.61 · D: 0.78)
  • Nervous Energy (V: 0.31 · A: 0.79 · D: 0.24)
  • Calm Authority (V: 0.71 · A: 0.44 · D: 0.82)
  • Defensive Flatness (V: 0.29 · A: 0.22 · D: 0.31)
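
Given prototypes like those above, one simple way to label a live reading is nearest-prototype matching in VAD space. This is an assumption made for illustration, not EchoDepth's published method; the prototype values are taken directly from the examples:

```typescript
// Label a live VAD reading by its nearest example state. Euclidean
// distance is an assumption made for this sketch.
type VADPoint = [number, number, number]; // [valence, arousal, dominance]

const PROTOTYPES: Record<string, VADPoint> = {
  "High Confidence":    [0.82, 0.61, 0.78],
  "Nervous Energy":     [0.31, 0.79, 0.24],
  "Calm Authority":     [0.71, 0.44, 0.82],
  "Defensive Flatness": [0.29, 0.22, 0.31],
};

function nearestState([v, a, d]: VADPoint): string {
  let best = "";
  let bestDistance = Infinity;
  for (const [label, [pv, pa, pd]] of Object.entries(PROTOTYPES)) {
    const distance = Math.hypot(v - pv, a - pa, d - pd);
    if (distance < bestDistance) { bestDistance = distance; best = label; }
  }
  return best;
}

// nearestState([0.35, 0.75, 0.30]) labels the reading "Nervous Energy".
```
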
Session feedback output
Confidence: 74 · Warmth: 81 · Authority: 68
Coaching tip: Anchor authority earlier — pause before your first sentence and hold eye contact for 2 seconds before speaking.

Step 4

Deliver coaching feedback

Real-time scores + post-session guidance

VAD data is translated into three coaching metrics — Confidence, Warmth, and Authority — that reps can track and improve in real time. These are not arbitrary scores: each is grounded in buyer decision-making research on which emotional signals drive trust, rapport, and commitment.
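
A sketch of what that translation can look like, assuming a simple weighted projection from VAD onto the three metrics (the weights are invented for illustration, not EchoDepth's published formula):

```typescript
// Illustrative projection from a VAD reading (each axis 0-1) onto the three
// 0-100 coaching metrics. The weights are placeholders for this sketch.
interface Scores { confidence: number; warmth: number; authority: number }

function toScores(v: number, a: number, d: number): Scores {
  const pct = (x: number) => Math.round(Math.min(1, Math.max(0, x)) * 100);
  return {
    confidence: pct(0.5 * v + 0.2 * a + 0.3 * d), // positive, energised, in control
    warmth:     pct(0.7 * v + 0.3 * (1 - d)),     // positive without dominating
    authority:  pct(0.6 * d + 0.4 * v),           // control plus positive tone
  };
}

// Example: the "High Confidence" state above (0.82, 0.61, 0.78) scores
// roughly 77 / 64 / 80 under these placeholder weights.
```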

After each session, EchoDepth generates a specific coaching report: which moments weakened delivery, what emotional state the buyer was likely perceiving, and exactly what to adjust before the next real conversation.

  • Live confidence, warmth, authority scores during practice
  • Post-session report with moment-by-moment analysis
  • Specific, actionable coaching tip per session
  • Team aggregate data for sales leaders
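
Taken together, the outputs listed above suggest a report shape along the following lines. Every field name here is a hypothetical invention for illustration:

```typescript
// Hypothetical shape for the post-session report, matching the outputs
// listed above; none of these field names come from EchoDepth itself.
interface SessionReport {
  liveScores: { t: number; confidence: number; warmth: number; authority: number }[];
  weakMoments: { t: number; note: string }[];  // moment-by-moment analysis
  coachingTip: string;                         // one specific tip per session
  teamAggregate?: { repId: string; meanConfidence: number }[]; // for sales leaders
}
```
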
Cultural calibration

14 cohorts. 6 countries.

Emotional expression is universal in its underlying structure — but cultural norms shape how and when emotions are displayed. EchoDepth's model is calibrated across 14 cultural cohorts in 6 countries, accounting for display rule variation so that feedback is accurate for globally distributed sales teams — not just calibrated to a Western European baseline.
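
A deliberately simplified way to picture display-rule calibration: scale measured expression intensity by a per-cohort factor before mapping to VAD. The cohort identifiers and gain values below are invented placeholders; EchoDepth's actual calibration is not public:

```typescript
// Simplified display-rule calibration sketch: apply a per-cohort gain to
// measured AU intensities so that culturally damped or amplified expression
// reads consistently. All identifiers and values are illustrative.
const DISPLAY_RULE_GAIN: Record<string, number> = {
  "cohort-default": 1.0,
  "cohort-restrained": 1.25, // norms damp expression, so boost measured intensity
  "cohort-expressive": 0.85, // norms amplify expression, so attenuate it
};

function calibrate(frame: Record<string, number>, cohort: string): Record<string, number> {
  const gain = DISPLAY_RULE_GAIN[cohort] ?? 1.0;
  const calibrated: Record<string, number> = {};
  for (const [au, intensity] of Object.entries(frame)) {
    calibrated[au] = Math.min(1, intensity * gain);
  }
  return calibrated;
}
```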

Technology FAQs

What are facial Action Units?
Facial Action Units (AUs) are the individual muscle movements that combine to form facial expressions. The Facial Action Coding System (FACS), developed by Paul Ekman and Wallace Friesen, identifies 44 discrete AUs that can be reliably observed and measured. EchoDepth analyses all 44 in real time.
What is the VAD model?
VAD (Valence, Arousal, Dominance) is a three-dimensional framework for representing emotional state. Valence = positive/negative. Arousal = intensity. Dominance = perceived control. VAD is more nuanced than binary classification — it lets EchoDepth distinguish, for example, between anxious activation and confident activation, which produce completely different buyer responses.
How does EchoDepth differ from facial recognition?
EchoDepth analyses patterns of muscle movement (Action Units) to infer emotional state. It does not identify who someone is and stores no biometric identity data. It is a coaching tool, not a surveillance tool. All analysis is voluntary and in-session only.
What hardware is required?
Standard webcam (720p minimum) and microphone. No VR headset, no depth sensor, no specialist hardware. Works on any enterprise laptop with Chrome, Edge, or Firefox.

See the technology in action.

Request access to the private beta and experience how 44 Action Units translate into coaching that changes how your team sells.

Request Access