Feature Request Trends from 2,099 Sales Calls

TL;DR

We built an automated pipeline that exports 2,099 Kaia sales call transcripts from the Outreach API, uses Claude AI (Haiku) to score each transcript against 15 product features, uploads the scored data to a Notion database, and generates an interactive HTML dashboard showing how client feature requests trend over time. The goal: validate whether shipping a feature actually reduces the number of times clients ask for it on demo calls.

The Problem

Phorest's product team ships features on a regular cadence but had no quantitative way to measure whether a release actually reduced client demand for that feature. The only signal was anecdotal: reps saying "people ask about X a lot" in Slack.

Meanwhile, every sales demo is recorded via Outreach Kaia, producing full transcripts with timestamps and speaker labels. Thousands of these were sitting untapped.

2,099 qualifying recordings across 17 months (Nov 2024 – Mar 2026)
Each transcript runs to thousands of lines, far too many to read manually
We needed to distinguish a rep mentioning a feature from a client requesting it
Results needed to be trackable over time with release-date markers
The process needed to be repeatable as new calls come in

The Solution: Four-Stage Pipeline

Stage 1: Outreach API Export

Connects via OAuth2 refresh-token flow, queries /api/v2/kaiaRecordings, and exports all qualifying recordings to a CSV. Filters: startTime Nov 2024–Apr 2026, duration ≥30 min, provider = Zoom or Google Meet.

Output: kaia-transcripts.csv (224 MB, 2,099 rows). export-transcripts.mjs handles pagination, rate limiting, and streaming CSV output that preserves the newlines embedded in each transcript.
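As a rough illustration of the filtering and CSV steps, here is a minimal sketch. It is not the contents of export-transcripts.mjs: the function names, the assumption that `duration` is in seconds, the provider strings, and the caller-supplied date bounds are all hypothetical. The quoting rule itself is the standard RFC 4180 one, which is what lets a multi-line transcript live in a single CSV cell.

```javascript
// Hypothetical constants: the real units and provider values may differ.
const MIN_DURATION_SECONDS = 30 * 60; // assumes `duration` is in seconds
const ALLOWED_PROVIDERS = new Set(["zoom", "google meet"]); // assumed strings

// Keep a recording only if it starts inside [from, to), is long enough,
// and came from one of the allowed meeting providers.
function qualifies(rec, from, to) {
  const start = new Date(rec.startTime);
  return (
    start >= from &&
    start < to &&
    rec.duration >= MIN_DURATION_SECONDS &&
    ALLOWED_PROVIDERS.has(String(rec.provider).toLowerCase())
  );
}

// RFC 4180 quoting: wrap a field in double quotes when it contains a comma,
// a double quote, CR or LF, and double any embedded double quotes.
function csvField(value) {
  const s = value == null ? "" : String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// One CSV record, terminated with CRLF as RFC 4180 specifies.
function csvRow(fields) {
  return fields.map(csvField).join(",") + "\r\n";
}
```

Writing each row to a stream as it is produced, instead of building the whole file in memory, is what keeps a 224 MB export manageable.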

Stage 2: AI Scoring with Claude API

For each of the 2,099 transcripts, sends the full text to Claude Haiku with a structured scoring prompt. The AI returns a 0/1 score for each of 15 tracked features, a call overview, and specific client quotes for any feature scored 1.

Score = 1 only if a client genuinely requests or expresses need for the feature. If only the sales rep mentions it during a demo walkthrough, score = 0.

This distinction is essential: we're measuring unmet client demand, not rep pitch frequency.

The 15 features tracked

Send Payment Links · Memberships 2.0 · Add-ons & Zero-minute services · Send Consultation Form on booking · OB Strict Gap Time · Custom Break Blocks · Print Travelers for stylists · Gift Voucher Commissions · Business Level Commission · Log of notifications in Go · Finishing time for a single service · Additional Gender Settings · Go, restrict 'my numbers' history · Aveda+ Rewards Integration · Sell memberships online
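One way to guard against malformed model output is to validate every response against the feature list before it is stored. The sketch below assumes a JSON response with `overview`, `scores` and `quotes` keys; that schema, the function name `parseScoring`, and the rule that a score of 1 must come with a quote are illustrative assumptions, since the actual prompt and response format are not shown here.

```javascript
// The 15 tracked features, copied from the list above.
const FEATURES = [
  "Send Payment Links",
  "Memberships 2.0",
  "Add-ons & Zero-minute services",
  "Send Consultation Form on booking",
  "OB Strict Gap Time",
  "Custom Break Blocks",
  "Print Travelers for stylists",
  "Gift Voucher Commissions",
  "Business Level Commission",
  "Log of notifications in Go",
  "Finishing time for a single service",
  "Additional Gender Settings",
  "Go, restrict 'my numbers' history",
  "Aveda+ Rewards Integration",
  "Sell memberships online",
];

// Assumed response shape (hypothetical):
// { overview: string, scores: { [feature]: 0 | 1 }, quotes: { [feature]: string } }
function parseScoring(rawText) {
  const data = JSON.parse(rawText); // throws on non-JSON output
  const scores = {};
  const quotes = {};
  for (const feature of FEATURES) {
    const score = data.scores?.[feature];
    if (score !== 0 && score !== 1) {
      throw new Error(`Missing or non-binary score for "${feature}"`);
    }
    const quote = data.quotes?.[feature];
    // A 1 without a supporting client quote cannot be audited later,
    // so this sketch rejects it instead of trusting it.
    if (score === 1 && !(typeof quote === "string" && quote.trim())) {
      throw new Error(`Score 1 without a client quote for "${feature}"`);
    }
    scores[feature] = score;
    if (score === 1) quotes[feature] = quote.trim();
  }
  return { overview: String(data.overview ?? ""), scores, quotes };
}
```

A response that fails validation can simply be retried, which is cheaper than discovering a missing column after upload.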

Model: Claude Haiku (200K context, ~$0.001 per transcript)
Concurrency: 3 parallel API calls with exponential backoff
Checkpoint system: after each successful record the ID is saved to processed-ids.json; re-running automatically skips completed records, so there are no duplicates and no lost work
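The three mechanisms in this list can be sketched together as follows. This is an illustration under assumptions, not the code in score-and-upload.mjs: the names `withBackoff` and `runBatch`, the retry count, the base delay, and the injected `persist` and `sleep` callbacks are all hypothetical.

```javascript
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Retry `fn`, waiting baseMs, 2*baseMs, 4*baseMs, ... between attempts.
// `sleep` is injectable so the delay schedule can be checked without waiting.
async function withBackoff(fn, { retries = 5, baseMs = 1000, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the error
      await sleep(baseMs * 2 ** attempt);
    }
  }
}

// Process `records` with at most `concurrency` in flight. IDs already in
// `processed` are skipped, and the checkpoint is persisted after every success.
async function runBatch(records, scoreOne, { processed, persist, concurrency = 3 }) {
  const queue = records.filter((r) => !processed.has(r.id));
  const skipped = records.length - queue.length;
  let scored = 0;

  async function worker() {
    // Each worker pulls the next record until the shared queue is empty.
    for (let rec = queue.shift(); rec !== undefined; rec = queue.shift()) {
      await scoreOne(rec);
      processed.add(rec.id);
      scored += 1;
      await persist([...processed]); // e.g. write the list to processed-ids.json
    }
  }

  await Promise.all(Array.from({ length: concurrency }, () => worker()));
  return { scored, skipped };
}
```

In a real run, `processed` would be loaded from processed-ids.json at startup, `scoreOne` would wrap the API call in `withBackoff`, and `persist` would write the file synchronously so that two workers never interleave their writes.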

Stage 3: Notion Database Population

For each scored transcript, creates a Notion page with the call title, date, recording ID, 15 number columns (0/1, one per feature), and a 2–4 sentence call overview. For any feature scored 1, the exact client quote and context are appended.
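To make the page layout concrete, here is a sketch of how one scored call might be mapped to a Notion page payload. The property names ("Call", "Date", "Recording ID", "Overview"), the helper names, and the shape of the `call` object are hypothetical; the value shapes follow the Notion API's title, date, rich_text, number and paragraph-block formats, and the 2,000-character truncation reflects Notion's limit on a single text object as I understand it.

```javascript
// Notion rich text is an array of text objects; content is truncated to
// 2,000 characters, the documented cap for one text object.
function richText(content) {
  return [{ type: "text", text: { content: String(content).slice(0, 2000) } }];
}

// Build the `properties` object for one page (property names are assumed).
function toNotionProperties(call, features) {
  const props = {
    Call: { title: richText(call.title) },
    Date: { date: { start: call.date } }, // ISO 8601, e.g. "2025-03-14"
    "Recording ID": { rich_text: richText(call.recordingId) },
    Overview: { rich_text: richText(call.overview) },
  };
  // One number column per feature, holding 0 or 1.
  for (const feature of features) {
    props[feature] = { number: call.scores[feature] === 1 ? 1 : 0 };
  }
  return props;
}

// One paragraph block per quoted feature, to append as the page body.
function toQuoteBlocks(quotes) {
  return Object.entries(quotes).map(([feature, quote]) => ({
    object: "block",
    type: "paragraph",
    paragraph: { rich_text: richText(`${feature}: ${quote}`) },
  }));
}
```

Keeping the scores as number properties rather than checkboxes means Notion can sum or average a column directly, which suits the filter-and-group usage described next.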

This works as both a filterable dashboard (sort/group by feature, filter by month) and a drill-down tool: click any row to read what the client actually said.

Stage 4: Interactive HTML Dashboard

Summary cards: total calls, features tracked, date range, overall mention rate
Per-feature trend charts: % of calls per month, with vertical dashed release-date markers
Monthly breakdown table: 15 features × 17 months, colour-coded (green = low, red = high)
Toggle controls: % vs raw counts, monthly vs quarterly, trend % visibility
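The numbers behind the charts and the breakdown table come down to a per-month aggregation. A minimal sketch, assuming each scored call is an object with an ISO `date` string and a `scores` map (both hypothetical shapes, as is the function name):

```javascript
// Group scored calls by calendar month ("YYYY-MM") and compute, per feature,
// both the raw count and the percentage of that month's calls scoring 1.
function monthlyRates(calls, features) {
  const byMonth = new Map();
  for (const call of calls) {
    const month = call.date.slice(0, 7); // assumes ISO dates like "2025-03-14"
    if (!byMonth.has(month)) {
      const counts = Object.fromEntries(features.map((f) => [f, 0]));
      byMonth.set(month, { total: 0, counts });
    }
    const bucket = byMonth.get(month);
    bucket.total += 1;
    for (const f of features) {
      if (call.scores[f] === 1) bucket.counts[f] += 1;
    }
  }
  // "YYYY-MM" strings sort chronologically under plain string ordering.
  return [...byMonth.keys()].sort().map((month) => {
    const { total, counts } = byMonth.get(month);
    const pct = Object.fromEntries(
      features.map((f) => [f, (100 * counts[f]) / total])
    );
    return { month, total, counts, pct };
  });
}
```

Returning both `counts` and `pct` lets the "% vs raw counts" toggle switch views without recomputing anything.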

Trend calculation: linear regression across all months (not just first vs last). The slope is expressed as a % change relative to the mean, giving an accurate "is demand actually rising or falling" signal that accounts for month-to-month noise.
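A sketch of that calculation, using ordinary least squares over evenly spaced months. The write-up does not say how the slope is normalised, so this example assumes one plausible reading: the per-month slope divided by the series mean, times 100. The function names are illustrative.

```javascript
// Least-squares slope of `values` against x = 0, 1, 2, ... (one point per month).
function slope(values) {
  const n = values.length;
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  values.forEach((y, x) => {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) ** 2;
  });
  return sxy / sxx;
}

// Trend as a percentage of the mean per month (assumed normalisation).
// Returns null when a trend is undefined: fewer than two points, or a
// feature that was never requested (mean of zero).
function trendPercentPerMonth(values) {
  if (values.length < 2) return null;
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  if (mean === 0) return null;
  return (100 * slope(values)) / mean;
}
```

For a series such as 10, 12, 14, 16 the fitted slope is 2 points per month against a mean of 13, so the trend is about +15.4% of the mean per month. Because every month contributes to the fit, a single noisy month moves the result far less than it would in a two-point comparison.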

Results

Before	After
No data on what clients actually ask for on calls	2,099 transcripts scored across 15 features with client quotes
Anecdotal "reps say people want X"	Quantified: exact % of calls per month per feature
No way to measure feature release impact	Before/after trend lines with release-date markers
Would take weeks to read transcripts manually	AI scored all 2,099 in ~4.5 hours for ~$2
One-off analysis	Repeatable pipeline, re-run anytime with new data

What We Learned

Client-requested vs rep-mentioned is the key distinction. Without it, every demo call would score 1 on most features since reps walk through the product. The AI is surprisingly good at telling them apart.
Checkpoint/resume is essential for large batch jobs. The Claude API rate-limited us frequently. Without the checkpoint file, any interruption would mean re-processing (and duplicating) hundreds of records.
Haiku is good enough for structured scoring. We tested Sonnet (10× the cost) and the scores were nearly identical for this use case. Haiku's 200K window handles even the longest transcripts in a single call.
Linear regression beats first-vs-last for trend calculation. The initial approach compared the first 3 months to the last 3, which was misleading for features that spiked mid-period.
Partial months skew the data, so the most recent data points should be read with caution.
The dashboard is a conversation starter, not the final answer. The real value is clicking into individual Notion pages and reading the actual client quotes.
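The partial-month caveat above can be made explicit in the data rather than left to the reader. One possible approach, sketched here with a hypothetical helper name and an assumed UTC month boundary, is to flag any month that had not finished when the export ran so the dashboard can grey it out or exclude it from the trend fit:

```javascript
// True when `exportedAt` falls before the end of the given "YYYY-MM" month,
// i.e. the month's calls were still accumulating when the export ran.
function isPartialMonth(month, exportedAt) {
  const [year, monthNumber] = month.split("-").map(Number);
  // Date.UTC takes a 0-based month, so passing the 1-based month number
  // yields the first instant of the following month (December rolls over).
  const nextMonthStart = Date.UTC(year, monthNumber, 1);
  return exportedAt.getTime() < nextMonthStart;
}
```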

Tech Stack

Outreach API: OAuth2 + REST for Kaia recording export
Node.js: export-transcripts.mjs, score-and-upload.mjs
Claude Haiku API: transcript scoring (~$0.001/transcript)
Notion API: database population via MCP integration
Chart.js: interactive dashboard charts
GitHub Pages: dashboard hosting