File call transcripts into the right company folder.
After a recorded call ends, the transcript lands in the right company folder without anyone having to move it manually.
Reference build
This is a working reference build that runs in production. When a recorded call finishes, the orchestrator pulls the full paginated transcript, resolves which company the call is about (CRM link first, LLM fallback on title + participants + transcript), and drops the transcript file into that company's '<Company> Call Transcripts' folder in the document store. It is the companion to the CRM-notes automation: notes go on the company record, the raw transcript goes in the company folder.
Vendors below are our choices. The flow is roles-not-vendors; every layer swaps cleanly.
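As a rough illustration of the roles-not-vendors idea, the flow can be sketched with each layer passed in as a plain callable. Every name here (`Meeting`, `file_transcript`, and the six role parameters) is hypothetical and not taken from the production build. One simplification to note: this sketch skips the LLM entirely when the CRM returns a name, whereas the build described here passes the CRM name into the resolver prompt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Meeting:
    meeting_id: str
    title: str
    participants: list[str]


def file_transcript(
    meeting: Meeting,
    *,
    fetch_transcript: Callable[[str], str],             # notetaker role
    linked_company: Callable[[str], Optional[str]],     # CRM role
    llm_company: Callable[[str, list[str], str], str],  # LLM fallback role
    clean_name: Callable[[str], str],                   # file-name sanitizer
    find_folder: Callable[[str], Optional[str]],        # document-store search
    upload: Callable[[str, str, str], None],            # document-store write
) -> Optional[str]:
    """Return the folder ID the transcript was filed under, or None if unfiled."""
    transcript = fetch_transcript(meeting.meeting_id)

    # CRM name wins when present; the LLM is only consulted as a fallback.
    company = linked_company(meeting.meeting_id)
    if not company:
        company = llm_company(meeting.title, meeting.participants, transcript)

    # Routing by naming convention: '<Company> Call Transcripts'.
    folder_id = find_folder(f"{company} Call Transcripts")
    if folder_id is None:
        return None  # no match means no upload

    upload(folder_id, clean_name(meeting.title), transcript)
    return folder_id
```

Because each role is only a function signature, swapping the notetaker, CRM, or document store would mean supplying a different callable, with the wiring left unchanged.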
Fields produced along the way
Stable intermediate shape, so the upload step always knows what to write and where.
| Field | What it holds |
|---|---|
| meeting_title | Title from the meeting record — drives both the company-name resolver and the file name |
| participants | Attendee emails — extra signal for the company-name LLM when the title is ambiguous |
| full_transcript | All transcript pages concatenated, with a newline before each [hh:mm:ss] speaker turn |
| linked_company | First company already linked to the meeting in CRM, if any — used as the authoritative name |
| resolved_company | Final company name: CRM name when present, otherwise LLM-extracted from title + transcript + participants |
| clean_file_name | Title stripped to letters / numbers / single spaces — safe for SharePoint, no special characters |
| target_folder_id | ID of the company's '<Company> Call Transcripts' folder in SharePoint |
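One way to pin this shape down in code is a small frozen dataclass. This is only a sketch: the field names come from the table above, but the class name `TranscriptFiling`, the types, and the choice to allow `None` for `linked_company` and `target_folder_id` are assumptions. The example values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TranscriptFiling:
    meeting_title: str
    participants: list[str]
    full_transcript: str
    linked_company: Optional[str]    # None when no company is linked in the CRM
    resolved_company: str            # CRM name if present, else the LLM's answer
    clean_file_name: str
    target_folder_id: Optional[str]  # assumed None when the folder search misses


# Illustrative record for a call whose meeting is already linked to a company.
example = TranscriptFiling(
    meeting_title="Acme / Q3 Diligence",
    participants=["founder@acme.example"],
    full_transcript="[00:00:01] Speaker A: Thanks for joining.",
    linked_company="Acme",
    resolved_company="Acme",
    clean_file_name="Acme Q3 Diligence",
    target_folder_id="folder-123",
)
```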
Gotchas
The things you only learn by running this in anger across a year of calls.
- 01 CRM-name-first, LLM-second: the resolver prompt is told 'if our CRM returned a name, use it exactly.' The model only invents a name when no company is linked to the meeting. This stops the LLM from overriding the partners' canonical naming (e.g. 'Acme' vs 'Acme, Inc.' vs 'Acme Labs').
- 02 Always-startup rule: meetings with other investors about a portfolio target are common. The prompt explicitly says to return the startup name, never the co-investor. Otherwise transcripts land under 'Sequoia' instead of the actual company.
- 03 Pagination is non-optional: the transcript endpoint cursors through pages. A one-shot fetch silently truncates long diligence calls, which are the ones you most need archived in full.
- 04 Filename sanitization: SharePoint rejects `/ \ : * ? " < > |` and trims trailing dots. Strip to `[a-zA-Z0-9 ]+`, collapse whitespace. Without this, ~5% of uploads silently fail.
- 05 Folder search by convention, not ID: the flow looks up '<Company> Call Transcripts' inside the configured SharePoint drive. The folder naming convention IS the routing logic, so there is no hardcoded folder map to maintain.
- 06 Conflict behavior = rename: two calls with the same title don't overwrite each other. SharePoint appends a numeric suffix.
- 07 No-match = no upload: success_on_miss is false on the folder search. Better to leave a transcript unfiled than to dump it in the wrong company's folder. Pair this with the 'archive unmapped transcripts' automation to catch the orphans.
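The pagination and filename gotchas are the two that translate most directly into code. The sketch below is a minimal illustration, not the production implementation: `fetch_page` is a hypothetical callable standing in for the notetaker's transcript endpoint, assumed to take a cursor (`None` for the first page) and return the page's speaker turns plus the next cursor (`None` on the last page).

```python
import re
from typing import Callable, Optional

# (speaker turns on this page, cursor for the next page or None when done)
Page = tuple[list[str], Optional[str]]


def fetch_full_transcript(fetch_page: Callable[[Optional[str]], Page]) -> str:
    """Follow the cursor until the endpoint stops returning one."""
    turns: list[str] = []
    cursor: Optional[str] = None
    while True:
        page_turns, cursor = fetch_page(cursor)
        turns.extend(page_turns)
        if cursor is None:
            break
    # One speaker turn per line.
    return "\n".join(turns)


def clean_file_name(title: str) -> str:
    """Keep ASCII letters, digits, and single spaces only."""
    spaced = re.sub(r"\s+", " ", title)          # tabs and newlines become spaces
    kept = re.sub(r"[^a-zA-Z0-9 ]", "", spaced)  # drop every other character
    return re.sub(r" {2,}", " ", kept).strip()   # collapse runs of spaces


print(clean_file_name('Acme, Inc. / Q3: "Diligence"?'))  # Acme Inc Q3 Diligence
```

Note that this character class also drops accented and other non-ASCII letters, so a title like 'Café' loses its last letter. That is the trade-off of the strict allow-list described in gotcha 04.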
Swap matrix
Every layer is replaceable. The orchestrator owns the wiring so the notetaker, the CRM, and the document store all move independently.