Orchestrate full AI company fact sheet generation.
One click drafts a structured, AI-assembled company fact sheet from the underlying source corpus, with each section's research queries run in parallel. This is not an IC memo or brief; it is the raw, section-by-section research synthesis partners read first.
Prompts used
The exact LLM prompts this automation calls — system instructions and user messages, in execution order, with the model each step uses.
- Company Fact Sheet → 9 prompts
Nine production prompts that compose a structured, AI-assembled company fact sheet from a private RAG engine over the firm's document store. Eight section drafters (Business, Product, Customers / GTM, Market, Competition, Unit Economics, Leadership, Supply Chain) each fan out scoped sub-queries (4–7 for most sections; Leadership uses one) and synthesize one bulleted section. A final 'Questions' prompt then reads the assembled fact sheet and the partners' open questions to produce exactly five sharp diligence questions. This is raw research synthesis, not a partner-vetted IC brief or memo. It feeds the PMB, ICB, and ICM generators but is distinct from them.
Reference build
A working reference build that runs in production. A partner clicks a button on a company's Notion pipeline page; the orchestrator pulls the page metadata, fires off eight research sections in series — each one a parallel fan-out of queries against an internal RAG engine, synthesized by an LLM and appended back to the same Notion page — and finishes with a 'Questions' section generated from the assembled fact sheet itself. ~80 steps, one click, one finished first-pass company fact sheet.
This is not an IC memo, brief, or partner-vetted document. It is raw, AI-assembled research synthesis — the material partners read before writing the PMB, ICB, or ICM. Think of it as a machine-drafted company dossier, not a board-ready investment memo.
Vendors below are our choices. The flow is roles-not-vendors; every layer swaps cleanly.
The eight fact sheet sections
Each section is a fan-out of scoped research queries plus a section-specific synthesizer. Order matters; the final 'Questions' section is generated from the assembled fact sheet, not from research.
| Section | Queries | Topics covered |
|---|---|---|
| Business Description / History | 6 | core products · founding story · financials · differentiation · history · why-invest |
| Product(s) Overview | 6 | why-now · technical architecture · stage of development · IP · roadmap · moats |
| Customers / GTM | 7 | evidence of traction · customer segments · purchase criteria · sales motion · contracts · retention · references |
| Market Size | 6 | TAM · segmentation · growth drivers · adjacent markets · regulatory tailwinds · timing |
| Competitive Landscape | 4 | incumbents vs. disruptors · relative positioning · barriers to entry · likely consolidation |
| Unit Economics / Financial Profile | 7 | business model · unit economics · gross margins · CAC/LTV · capital intensity · path to profitability · benchmarks |
| Leadership | 1 | founders + top 3-4 execs (background, prior wins, gaps) |
| Supply Chain / Manufacturing | 5 | critical suppliers · BOM · scalability · single points of failure · geopolitical exposure |
| Questions (synthesized last) | 0 | 5 bullets generated FROM the assembled fact sheet above — not from research |
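The shape in the table (sections in series, queries within a section in parallel, one synthesizer call per section) can be sketched roughly as follows. This is a minimal illustration, not the production orchestrator: `ask_rag`, `synthesize`, and `append_to_page` are hypothetical callables standing in for the RAG engine, the LLM synthesizer, and the Notion write step, and the thread pool is just one way to get the parallel fan-out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Section order and query counts, copied from the table above.
SECTION_QUERY_COUNTS = [
    ("Business Description / History", 6),
    ("Product(s) Overview", 6),
    ("Customers / GTM", 7),
    ("Market Size", 6),
    ("Competitive Landscape", 4),
    ("Unit Economics / Financial Profile", 7),
    ("Leadership", 1),
    ("Supply Chain / Manufacturing", 5),
]


def draft_section(
    title: str,
    queries: list[str],
    ask_rag: Callable[[str], str],                # hypothetical: query -> answer
    synthesize: Callable[[str, list[str]], str],  # hypothetical: title + answers -> bullets
) -> str:
    """Fan one section's queries out in parallel, then synthesize once."""
    with ThreadPoolExecutor(max_workers=max(1, len(queries))) as pool:
        # pool.map preserves input order, so answers line up with queries.
        answers = list(pool.map(ask_rag, queries))
    return synthesize(title, answers)


def draft_fact_sheet(
    queries_by_section: dict[str, list[str]],
    ask_rag: Callable[[str], str],
    synthesize: Callable[[str, list[str]], str],
    append_to_page: Callable[[str, str], None],   # hypothetical: write one section
) -> None:
    """Run the sections in series; each one is a parallel fan-out."""
    for title, _count in SECTION_QUERY_COUNTS:
        body = draft_section(title, queries_by_section[title], ask_rag, synthesize)
        append_to_page(title, body)  # append as each section lands
```

The 'Questions' row is deliberately absent from `SECTION_QUERY_COUNTS`: it issues no research queries and would run as a separate step once the eight sections are on the page.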
Gotchas
The things you only learn by running an 80-step fact sheet orchestrator in anger across a year of deals.
- 01. Per-section fan-out is the unit of work, not per-query. Each fact sheet section issues its scoped research queries in parallel against the RAG engine (4–7 for most sections; Leadership uses one), then ONE LLM synthesizer turns those into a bulleted section. Trying to one-shot a whole fact sheet from one prompt produces vague, repetitive output every time.
- 02. Section order matters: business → product → customers → market → competition → unit econ → leadership → supply chain → questions. The 'Questions' section is generated LAST, using every prior section as input, so it asks the right unknowns rather than generic founder questions.
- 03. Notion is the surface and the substrate. Each finished section is appended to the same Notion page as it lands, so partners can watch the fact sheet build in real time and start reading the early sections while later ones are still drafting.
- 04. Marker-based section extraction. A small code step pulls the 'Key Questions' block out of the existing Notion page using string markers ('Key Questions' → 'Pros') so the orchestrator can hand the partners' open questions back into the prompt for the final synthesis.
- 05. Two RAG endpoints, not one. The company-specific service ('thebeast') answers questions grounded in the company's own data room + scraped web. A separate thematic service ('thematicbeast') answers sector-level questions for the closing 'why now' synthesis. Mixing them produces garbage citations.
- 06. 3-minute settle delay before the final synthesis. Notion's API has eventual-consistency lag on page content. Reading back the fact sheet too fast misses the last few blocks, and the 'Questions' section gets generated against a half-empty page.
- 07. Sub-zap chaining. PMB (pipeline brief) and ICM (investment committee memo) generators are kicked off from the same trigger. They share the company-resolution step but otherwise run independently; splitting them keeps any one failure from killing the whole pipeline.
- 08. Failure is per-section, not all-or-nothing. A bad RAG call for 'Supply Chain' shouldn't kill the 'Business Description / History' write. Each section's write step is independent, so partners can re-run just the broken section.
- 09. Token budget is the real constraint. Each section's synthesizer eats 30–80k tokens of RAG output. Across 8 sections plus final synthesis, one fact sheet run is ~600k–1M tokens. Cheap by analyst-hour standards, expensive by API-bill standards: worth knowing before you turn it on for the whole pipeline.
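The marker-based extraction in gotcha 04 can be sketched as a plain substring search. This is an assumption about the shape of that code step, not the step itself: `extract_between` and the sample page text are hypothetical, and the real step reads the page content back from Notion first.

```python
def extract_between(text: str, start_marker: str, end_marker: str) -> str:
    """Return the text after start_marker and before the next end_marker.

    Returns "" if the start marker is missing. If the end marker is
    missing, returns everything after the start marker.
    """
    start = text.find(start_marker)
    if start == -1:
        return ""
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        end = len(text)
    return text[start:end].strip()


# Illustrative page text only; real pages will be longer and messier.
page_text = "Summary\nKey Questions\n- Why now?\n- Who buys?\nPros\n- Strong team\n"
open_questions = extract_between(page_text, "Key Questions", "Pros")
```

Because the markers are matched as plain substrings, a question that happens to contain "Pros" (for example inside "Prospects") would cut the block short. Matching the markers only at the start of a line is one way to tighten this if it comes up.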
Swap matrix
Every layer is replaceable. The contract — section list, per-section fan-out + synthesizer shape, append-to-fact-sheet target — is what the rest of the firm depends on.
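One way to picture that contract is as three small interfaces plus a loop that isolates failures per section. The sketch below is illustrative only: `ResearchEngine`, `Synthesizer`, `FactSheetTarget`, `RunReport`, and `run_sections` are hypothetical names, not part of the reference build, and the queries run sequentially here to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Protocol


class ResearchEngine(Protocol):
    """Role: answers one scoped research query (e.g. a RAG service)."""
    def ask(self, query: str) -> str: ...


class Synthesizer(Protocol):
    """Role: turns a section's answers into one bulleted section."""
    def synthesize(self, section: str, answers: list[str]) -> str: ...


class FactSheetTarget(Protocol):
    """Role: appends a finished section to the fact sheet (e.g. a Notion page)."""
    def append(self, section: str, body: str) -> None: ...


@dataclass
class RunReport:
    written: list[str] = field(default_factory=list)
    failed: dict[str, str] = field(default_factory=dict)


def run_sections(
    sections: dict[str, list[str]],  # section name -> scoped queries, in run order
    engine: ResearchEngine,
    synth: Synthesizer,
    target: FactSheetTarget,
) -> RunReport:
    """Run each section independently so one failure does not stop the rest."""
    report = RunReport()
    for name, queries in sections.items():
        try:
            answers = [engine.ask(q) for q in queries]
            target.append(name, synth.synthesize(name, answers))
            report.written.append(name)
        except Exception as exc:  # record the failure and move on
            report.failed[name] = repr(exc)
    return report
```

Any vendor that satisfies one of these roles can be swapped in without touching the loop, and the names left in `report.failed` are the sections a partner would re-run.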