Automation · Playbook 02

File call transcripts into the right company folder.

After a recorded call ends, the transcript lands in the right company folder without anyone having to move it manually.

Diligence

Reference build

A working reference build that runs in production. When a recorded call finishes, the orchestrator pulls the full paginated transcript, resolves which company the call is about (CRM link first, LLM fallback on title + participants + transcript), and drops the transcript file into that company's '<Company> Call Transcripts' folder in the document store. This is the companion to the CRM-notes automation: notes go on the company record, the raw transcript goes in the company folder.

Vendors below are our choices. The flow is roles-not-vendors; every layer swaps cleanly.

Flow
01 · Trigger
Meeting recording ready (webhook from CRM/notetaker)
Fires with meeting_id + call_recording_id once the recording is available
02 · Fetch meeting
GET /meetings/{meeting_id}
Pulls title, participant emails, and the linked_records array (companies already attached to this meeting in CRM)
03 · Probe transcript
GET /call_recordings/{id}/transcript (first page)
Quick check that a transcript exists; surfaces the pagination cursor
04 · Fetch full transcript
Paginate every page, concatenate
Cursor-walk all pages, join, insert a newline before each [hh:mm:ss] speaker turn
05 · Look up linked company
Find CRM company by linked_records[].record_id
success_on_miss = true — if the meeting isn't linked to any company yet, the run continues and falls back to the LLM resolver
06 · Resolve company name
LLM — return the single company discussed on the call
GPT-4o with the CRM name as the preferred answer. Rules: one company only; prefer the startup over any co-investor on the call; if multiple are discussed, pick the most heavily discussed.
07 · Sanitize file name
Strip title to [a-zA-Z0-9 ], collapse whitespace, trim
SharePoint rejects special characters in file names. This is the cheapest fix and the highest-ROI line in the whole flow.
08 · Find target folder
Search drive for "<Company> Call Transcripts"
Folder naming convention IS the routing logic. success_on_miss = false — unfiled is better than misfiled.
09 · Upload transcript
Upload full transcript as a file into the matched folder
conflictBehavior = rename so duplicate titles never overwrite each other
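
Steps 03 and 04 can be sketched roughly as follows. This is a minimal Python sketch, not the production orchestrator: the page shape (`text`, `next_cursor`), the `fetch_page` callable, and joining pages with a single space are all assumptions standing in for whatever the notetaker API actually returns.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical page shape: {"text": str, "next_cursor": Optional[str]}.
Page = Dict[str, Optional[str]]
FetchPage = Callable[[Optional[str]], Page]

# Matches a [hh:mm:ss] marker plus any whitespace right before it.
TIMESTAMP = re.compile(r"\s*(\[\d{2}:\d{2}:\d{2}\])")


def fetch_full_transcript(fetch_page: FetchPage) -> str:
    """Walk every page by cursor and join the text of all pages."""
    chunks = []
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        chunks.append(page["text"] or "")
        cursor = page.get("next_cursor")
        if not cursor:  # no cursor means this was the last page
            break
    return " ".join(chunks)


def break_speaker_turns(text: str) -> str:
    """Put each [hh:mm:ss] speaker turn on its own line."""
    return TIMESTAMP.sub(r"\n\1", text).strip()
```

A one-shot fetch corresponds to calling `fetch_page(None)` once and stopping, which is exactly the truncation the flow is built to avoid.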

Fields produced along the way

Stable intermediate shape, so the upload step always knows what to write and where.

Field · What it holds
meeting_title · Title from the meeting record; drives both the company-name resolver and the file name
participants · Attendee emails; extra signal for the company-name LLM when the title is ambiguous
full_transcript · Paginated transcript, concatenated with a newline before each [hh:mm:ss] speaker turn
linked_company · First company already linked to the meeting in CRM, if any; used as the authoritative name
resolved_company · Final company name: CRM name when present, otherwise LLM-extracted from title + transcript + participants
clean_file_name · Title stripped to letters / numbers / single spaces; safe for SharePoint, no special characters
target_folder_id · ID of the company's '<Company> Call Transcripts' folder in SharePoint
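
One way to picture this intermediate shape is as a small record type. The sketch below is illustrative only: the class name `TranscriptFiling`, the helper `resolve_company`, and the field types are assumptions; only the field names and the CRM-first precedence come from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TranscriptFiling:
    """Hypothetical container mirroring the fields listed above."""
    meeting_title: str
    participants: List[str]          # attendee emails
    full_transcript: str
    linked_company: Optional[str]    # None when the meeting has no CRM link
    resolved_company: str
    clean_file_name: str
    target_folder_id: str


def resolve_company(linked_company: Optional[str],
                    llm_guess: Optional[str]) -> Optional[str]:
    """CRM name wins when present; otherwise use the LLM-extracted name."""
    if linked_company and linked_company.strip():
        return linked_company        # kept exactly as the CRM spells it
    if llm_guess and llm_guess.strip():
        return llm_guess.strip()
    return None                      # nothing to route on
```

For example, `resolve_company("Acme, Inc.", "Acme Labs")` keeps the CRM spelling, while `resolve_company(None, "Acme Labs")` falls back to the model's answer.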

Gotchas

The things you only learn by running this in anger across a year of calls.

  • 01 · CRM-name-first, LLM-second: the resolver prompt is told 'if our CRM returned a name, use it exactly.' The model only infers a name when no company is linked to the meeting. This stops the LLM from overriding the partners' canonical naming (e.g. 'Acme' vs 'Acme, Inc.' vs 'Acme Labs').
  • 02 · Always-startup rule: meetings with other investors about a portfolio target are common. The prompt explicitly says to return the startup name, never the co-investor. Otherwise transcripts land under 'Sequoia' instead of the actual company.
  • 03 · Pagination is non-optional: the transcript endpoint cursors through pages. A one-shot fetch silently truncates long diligence calls, the ones you most need archived in full.
  • 04 · Filename sanitization: SharePoint rejects `/ \ : * ? " < > |` and trims trailing dots. Strip to `[a-zA-Z0-9 ]+`, collapse whitespace. Without this, ~5% of uploads silently fail.
  • 05 · Folder search by convention, not ID: looks up '<Company> Call Transcripts' inside the configured SharePoint drive. Folder naming convention IS the routing logic, so there is no hardcoded folder map to maintain.
  • 06 · Conflict behavior = rename: two calls with the same title don't overwrite each other. SharePoint appends a numeric suffix.
  • 07 · No-match = no upload: success_on_miss is false on the folder search. Better to leave a transcript unfiled than to dump it in the wrong company's folder. Pair this with the 'archive unmapped transcripts' automation to catch the orphans.
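
Gotchas 04 through 07 can be sketched in a few lines of Python. Treat this as a rough illustration under stated assumptions: the function names, the `folders` mapping (folder name to folder ID, standing in for a real drive search), the `fallback` title, and the shape of the returned plan are all hypothetical. Only the character whitelist, the folder naming convention, the rename behavior, and the no-match rule come from the text above.

```python
import re
from typing import Dict, Optional


def sanitize_file_name(title: str, fallback: str = "Untitled call") -> str:
    """Strip a title to [a-zA-Z0-9 ], collapse whitespace, trim."""
    spaced = re.sub(r"\s+", " ", title)            # tabs/newlines become spaces
    kept = re.sub(r"[^a-zA-Z0-9 ]", "", spaced)    # drop everything else
    collapsed = re.sub(r" +", " ", kept).strip()
    # Assumption: a title with no usable characters gets a fixed fallback.
    return collapsed or fallback


def find_target_folder(company: str, folders: Dict[str, str]) -> Optional[str]:
    """Route purely by the '<Company> Call Transcripts' naming convention."""
    return folders.get(f"{company} Call Transcripts")


def plan_upload(title: str, company: str,
                folders: Dict[str, str]) -> Optional[dict]:
    """Return an upload plan, or None when no folder matches (no upload)."""
    folder_id = find_target_folder(company, folders)
    if folder_id is None:
        return None  # unfiled is better than misfiled
    return {
        "folder_id": folder_id,
        "file_name": sanitize_file_name(title),
        "conflict_behavior": "rename",  # duplicates get a suffix, never overwrite
    }
```

With `folders = {"Acme Call Transcripts": "folder-123"}`, a call resolved to 'Acme' yields a plan, while a call resolved to any other name yields `None` and is left for the orphan-catching automation.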

Swap matrix

Every layer is replaceable. The orchestrator owns the wiring so the notetaker, the CRM, and the document store all move independently.
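
The roles-not-vendors idea can be illustrated with structural interfaces. This is a hedged sketch, not the reference build's actual wiring: the protocol names, method names, and `file_transcript` function are hypothetical, and the LLM fallback is deliberately left out to keep the example short.

```python
from typing import Optional, Protocol


class Notetaker(Protocol):
    def full_transcript(self, call_recording_id: str) -> str: ...


class Crm(Protocol):
    def linked_company(self, meeting_id: str) -> Optional[str]: ...


class DocumentStore(Protocol):
    def find_folder(self, name: str) -> Optional[str]: ...
    def upload(self, folder_id: str, file_name: str, body: str) -> None: ...


def file_transcript(meeting_id: str, call_recording_id: str, file_name: str,
                    notetaker: Notetaker, crm: Crm, store: DocumentStore) -> bool:
    """Wire the three roles together; returns True only if a file was uploaded."""
    company = crm.linked_company(meeting_id)
    if company is None:
        return False  # the full flow would try the LLM resolver here
    folder_id = store.find_folder(f"{company} Call Transcripts")
    if folder_id is None:
        return False  # no match = no upload
    store.upload(folder_id, file_name, notetaker.full_transcript(call_recording_id))
    return True
```

Because the orchestrator depends only on these roles, swapping a vendor means writing one new adapter class that satisfies the matching protocol.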