🎯 Motion AI Foil Pattern
BP found what Motion AI missed
Motion AI logged 17 action items and named an owner for each. Beyond Physician listened for whose voice actually carried the commitment, and rebuilt the list.
📋 The Action List
Every commitment from the meeting, re-attributed
An action item is a task or decision someone committed to on the call. Motion AI named an owner for each; BP listened for whose voice actually carried it. Click any row to hear that exact moment.
| # | Action item | 🤖 Motion AI named | 🎙️ BP heard (real voice) | ▶ Listen |
🚀 What this becomes
From one recording to every meeting
One 70-minute file did all of this after the fact. Wired into how your team meets, it stops being a recap and becomes the record of what was decided.
📼
Today · one file
Upload a recording → the corrected action list, real owners, and conviction behind each commitment. No human review.
⚡
Live · every meeting
The verified to-do list the second a call ends. Owners and commitments attributed. Not a transcript. Not notes.
📈
Over time · compounds
Who follows through, where conviction trends, what slips. Motion gives notes; BP gives accountability that builds.
👥 Per-Speaker Intelligence
4 voices · 854 spoken moments · 6 emotional dimensions each
ISIA on one side, OSA on the other. Each voice gets a six-dimension acoustic profile, a conviction read, and a quote bank tied to audio playback.
📊 At a glance
Emotional posture, all four voices
Every dimension, every speaker, on one grid. Brighter means stronger. Scroll down for each voice in full.
🔗 Cross-Speaker Dynamics
Group Dynamics
How the four voices interacted: who drove the conversation, where conviction concentrated, and where the voice diverged from the words.
🎤 Airtime & Control
Who drove the conversation
Speaking time as a fraction of the total meeting.
Why this matters
Airtime maps to control. A one-sided pitch shows the seller dominating the room; an even split signals a real negotiation between equals. Here OSA (the pitchers) and ISIA's Dr. Patel split the room almost evenly, the posture of a serious, two-way deal, not a sales presentation. Who controls airtime tells you who is actually steering the outcome.
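A minimal sketch of how an airtime share could be computed from diarized segments. The `(speaker, start_sec, end_sec)` input shape and the function name are assumptions made for illustration, not BP's actual pipeline, and the denominator here is total spoken time; swap in the meeting duration if silence should count.

```python
from collections import defaultdict

def airtime_share(segments):
    """Return speaker -> fraction of total spoken time.

    segments: iterable of (speaker, start_sec, end_sec) tuples (assumed shape).
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        # Ignore malformed segments whose end precedes their start.
        totals[speaker] += max(0.0, end - start)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {speaker: seconds / grand_total for speaker, seconds in totals.items()}

# Made-up segments, for illustration only.
example_segments = [("A", 0.0, 30.0), ("B", 30.0, 50.0), ("A", 50.0, 60.0), ("B", 60.0, 80.0)]
shares = airtime_share(example_segments)  # A and B each hold half the spoken time
```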
⚖️ Posture by Side
Pitcher vs. decider: the emotional asymmetry
Average emotion scores aggregated by side. The OSA side projects maximum-confidence selling mode; the ISIA side carries roughly 2× the concern signal.
Why this matters
A healthy deal looks exactly like this: the seller confident, the buyer measured and careful. If the two sides showed flat parity (both equally confident, or both equally worried), that would be the red flag: either the buyer is already over-eager, or the seller doesn't believe their own pitch. The 2× concern gap is the signature of a decision-maker who is genuinely evaluating, not rubber-stamping.
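One way the per-side aggregation could work, sketched under assumptions: each speaker carries a dictionary of dimension scores, each speaker maps to one side, and the side score is a plain mean. The speaker labels, dimension names, and numbers below are invented for the example and are not taken from this analysis.

```python
from statistics import mean

def average_by_side(speaker_scores, side_of):
    """Average each emotion dimension across the speakers on each side.

    speaker_scores: speaker -> {dimension: score}   (assumed shape)
    side_of:        speaker -> side label           (assumed shape)
    """
    collected = {}
    for speaker, scores in speaker_scores.items():
        side = side_of[speaker]
        for dimension, value in scores.items():
            collected.setdefault(side, {}).setdefault(dimension, []).append(value)
    return {
        side: {dimension: mean(values) for dimension, values in dims.items()}
        for side, dims in collected.items()
    }

# Invented numbers, for illustration only.
scores = {
    "seller_1": {"confidence": 0.75, "concern": 0.25},
    "seller_2": {"confidence": 0.25, "concern": 0.25},
    "buyer_1": {"confidence": 0.5, "concern": 0.5},
}
sides = {"seller_1": "seller", "seller_2": "seller", "buyer_1": "buyer"}
by_side = average_by_side(scores, sides)
concern_ratio = by_side["buyer"]["concern"] / by_side["seller"]["concern"]
```

In this toy data the buyer side carries twice the seller side's concern score, which is the kind of gap the section describes.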
🚨 Divergence Moments
Where the voice contradicts the words
Moments where what was said and how it was said pull apart: the findings no text recap can surface.
Why this matters
These are the tells. When the words say "zero downside risk" but the voice spikes with stress or doubled emphasis, that is the exact spot to probe in the next conversation. A transcript shows the sentence; only the audio shows the hesitation behind it. This is where a negotiator follows up, and where most deals quietly turn.
🗣️ Disfluency Density
How each voice fills space
Fillers per minute per speaker, plus each voice's top three patterns. David's "like" concentrates in the 20:00 equity explainer; Patel's "right" is confirmation-seeking, not hesitation.
Why this matters
Filler-rate spikes mark where a speaker is least certain or improvising on the fly. Jordan runs more than double Patel's rate, so the OSA side's confidence is not uniform, and the seam is visible. Knowing who is rehearsed versus who is winging it tells you exactly where a pitch is softest, and who to press.
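A rough sketch of a fillers-per-minute calculation with a top-three pattern list. The filler vocabulary, the function name, and the sample sentence are assumptions for illustration. Plain word matching like this counts every "like" or "right", including legitimate uses, so a production system would need context to separate the two.

```python
import re
from collections import Counter

# Assumed filler vocabulary; the real list is not given in the source.
FILLERS = ("um", "uh", "like", "right", "you know")

def filler_stats(text, speaking_seconds, fillers=FILLERS, top_n=3):
    """Return (fillers per minute, top_n most common fillers) for one speaker."""
    lowered = text.lower()
    counts = Counter()
    for filler in fillers:
        # Whole-word match so "like" does not fire inside "likely".
        hits = len(re.findall(r"\b" + re.escape(filler) + r"\b", lowered))
        if hits:
            counts[filler] = hits
    minutes = speaking_seconds / 60.0
    rate = sum(counts.values()) / minutes if minutes > 0 else 0.0
    return rate, counts.most_common(top_n)

# Invented sample: five fillers in 30 seconds of speech.
sample = "Um, so like the equity is, like, you know, straightforward, right?"
rate, top_patterns = filler_stats(sample, speaking_seconds=30)
```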
📊 Emphasis Rate
Who drove points home, and who just filled space
Acoustic emphasis per speaker: the moments a voice physically leans in on a word, versus level, unmarked speech.
Why this matters
Talking a lot is not the same as saying a lot. Emphasis separates conviction-weighted speech from occupying the room. High airtime with low emphasis is someone filling space; high emphasis is someone landing a point. It tells you whose words to actually weight when you read the recap.
🚪 Topic Openers
Who navigated the agenda
The first utterance in each topic, an indicator of who steered the meeting from section to section. Patel opened 6 of 13 topics: he wasn't just dominant in speaking time, he set the topic flow.
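Counting topic openers reduces to finding the earliest utterance per topic. The sketch below assumes utterances already carry a topic label and a start time; that input shape and the sample data are hypothetical.

```python
from collections import Counter

def topic_openers(utterances):
    """Return Counter of speaker -> number of topics that speaker opened.

    utterances: iterable of (start_sec, speaker, topic) tuples (assumed shape).
    """
    first_speaker = {}
    # Sorting by start time means the first speaker seen per topic is its opener.
    for start, speaker, topic in sorted(utterances):
        first_speaker.setdefault(topic, speaker)
    return Counter(first_speaker.values())

# Invented utterances, for illustration only.
example_utterances = [
    (60.0, "B", "pricing"),
    (0.0, "A", "intro"),
    (5.0, "B", "intro"),
    (65.0, "A", "pricing"),
    (120.0, "A", "next steps"),
]
openers = topic_openers(example_utterances)  # A opened 2 topics, B opened 1
```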
📋 Topic-by-Speaker Matrix
Two parallel meetings in one call
Share of speaking time per topic, keyword-matched on the verified transcript. OSA drove the deal mechanics; ISIA drove its own operational reality, same call, two agendas.
Why this matters
When two parties run parallel agendas inside the same call (OSA on equity and licensing, ISIA on its own staffing and space), it exposes alignment gaps before they harden into deal problems. A recap lists the topics that came up; this shows you the two sides were, in effect, sitting in different meetings. That is a partnership-readiness signal you want before you sign, not after.
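A simplified sketch of a keyword-matched topic-by-speaker matrix. The keyword lists, the `(speaker, duration_sec, text)` input shape, and the substring matching rule are all assumptions for illustration; the source does not specify how matching is done.

```python
from collections import defaultdict

# Hypothetical keyword lists; the real topic vocabulary is not given.
TOPIC_KEYWORDS = {
    "equity": ("equity", "shares"),
    "staffing": ("staffing", "hire"),
}

def topic_share_matrix(utterances, topic_keywords=TOPIC_KEYWORDS):
    """Return topic -> {speaker: share of that topic's speaking time}.

    utterances: iterable of (speaker, duration_sec, text) tuples (assumed shape).
    """
    seconds = defaultdict(lambda: defaultdict(float))
    for speaker, duration, text in utterances:
        lowered = text.lower()
        for topic, keywords in topic_keywords.items():
            # Simple substring match; an utterance may count toward several topics.
            if any(keyword in lowered for keyword in keywords):
                seconds[topic][speaker] += duration
    matrix = {}
    for topic, per_speaker in seconds.items():
        total = sum(per_speaker.values())
        matrix[topic] = {s: t / total for s, t in per_speaker.items()} if total else {}
    return matrix

# Invented utterances, for illustration only.
example_call = [
    ("A", 30.0, "The equity split is simple"),
    ("B", 10.0, "How do the shares vest?"),
    ("B", 40.0, "Our staffing is tight right now"),
]
matrix = topic_share_matrix(example_call)
```

In this toy call, speaker A holds 75% of the equity discussion while speaker B holds all of the staffing discussion, a small version of the two-agendas pattern described above.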
📜 Patent-Pending Framework
How Beyond Physician reads voice
Verbatim transcription + proprietary acoustic conviction analysis + dimension-mapped quote extraction. Fully self-owned: zero third-party APIs, no data leaves the system.
📊 This analysis, by the numbers
⚖️ Scoring Weights
Proprietary 60 · 25 · 15 weighting
Every score is a weighted blend of three independent layers. The split is the patent-pending product of years of voice research on healthcare interview data.
These are not three data sources. They are three analytical layers, each reading a different representation of the same recording. The split is a fusion model: what was said, how it was meant, and how it physically sounded, blended into one score.
60% Sentiment
Reads the words: the linguistic and semantic content of the transcript
Answers: What was said
25% Emotional Conviction
Reads the six-dimension emotional posture: our proprietary in-house layer
Answers: How it was meant
15% Audio Analysis
Reads the raw signal: pitch, intensity, pauses, speech rate, disfluency timing
Answers: How it physically sounded
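The 60 · 25 · 15 split can be read as a weighted sum of three layer scores. The sketch below is one plausible reading, not the patented method: it assumes each layer produces a score on a common 0 to 1 scale and that the blend is linear. The layer key names and sample values are invented.

```python
# Weights taken from the 60 / 25 / 15 split described above.
WEIGHTS = {
    "sentiment": 0.60,
    "emotional_conviction": 0.25,
    "audio_analysis": 0.15,
}

def blended_score(layer_scores, weights=WEIGHTS):
    """Linear weighted blend of per-layer scores (assumed to share a 0-1 scale)."""
    if set(layer_scores) != set(weights):
        raise ValueError("layer_scores must contain exactly the weighted layers")
    return sum(weights[name] * layer_scores[name] for name in weights)

# Invented layer scores, for illustration only.
example_layers = {"sentiment": 0.5, "emotional_conviction": 1.0, "audio_analysis": 0.0}
score = blended_score(example_layers)  # roughly 0.55

# Share of the score that depends on the audio signal rather than the words.
audio_dependent_weight = WEIGHTS["emotional_conviction"] + WEIGHTS["audio_analysis"]
```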
Why this matters
The bottom 40% (Emotional Conviction plus Audio Analysis) can only be produced from the audio signal itself. A transcript tool throws that signal away, so everything it produces lives in the top 60%. That 40% is the part Motion AI, Otter, Fathom, and Fireflies structurally cannot compute, and it is the product.
🔬 Pipeline
Per-recording pipeline
🧪 Dimensions
Six universal emotion dimensions · proprietary BP layer
| Dimension | Acoustic signature |
🔐 Self-owned · no third parties
📏 Sample size acknowledgment