Automation · Playbook 02

Orchestrate full AI company fact sheet generation.

One button click drafts a structured, AI-assembled company fact sheet from the underlying source corpus, with each section's research queries running in parallel. This is not an IC memo or brief; it is the raw, section-by-section research synthesis partners read first.

Diligence

Prompts used

The exact LLM prompts this automation calls — system instructions and user messages, in execution order, with the model each step uses.

  • Company Fact Sheet
    9 prompts

    Nine production prompts that compose a structured, AI-assembled company fact sheet from a private RAG engine over the firm's document store. Eight section drafters (Business, Product, Customers / GTM, Market, Competition, Unit Economics, Leadership, Supply Chain) each fan out scoped sub-queries (4–7 for most sections, 1 for Leadership) and synthesize one bulleted section. A final 'Questions' prompt then reads the assembled fact sheet and the partners' open questions to produce exactly five sharp diligence questions. This is raw research synthesis, not a partner-vetted IC brief or memo. It feeds the PMB, ICB, and ICM generators but is distinct from them.

Reference build

A working reference build that runs in production. A partner clicks a button on a company's Notion pipeline page; the orchestrator pulls the page metadata, fires off eight research sections in series — each one a parallel fan-out of queries against an internal RAG engine, synthesized by an LLM and appended back to the same Notion page — and finishes with a 'Questions' section generated from the assembled fact sheet itself. ~80 steps, one click, one finished first-pass company fact sheet.

This is not an IC memo, brief, or partner-vetted document. It is raw, AI-assembled research synthesis — the material partners read before writing the PMB, ICB, or ICM. Think of it as a machine-drafted company dossier, not a board-ready investment memo.

Vendors below are our choices. The flow is roles-not-vendors; every layer swaps cleanly.

Flow
01 · Trigger
Notion button on the company's pipeline page
Webhook fires with the Notion page_id. Partner-initiated, not automatic — partners decide which deals are worth a full fact sheet.
02 · Resolve company
Look up the pipeline DB row by Name
Pulls company name, one-line description, stage, owner — every section prompt is conditioned on these
03 · Extract open questions
String-marker extraction of the 'Key Questions' block
Pulls the partners' existing open questions out of the Notion page so the final synthesis can address them directly
04 · Section loop — repeat for each of the 8 fact sheet sections, in order
↻ 8× per fact sheet
04a · Research fan-out
4–7 parallel POSTs to the RAG engine (/llamaquery)
Each query is a sharp, scoped sub-question for this section (e.g. 'unit economics', 'gross margins', 'CAC/LTV'). Company name + description injected into every query.
04b · Synthesize
LLM — write this section as a bulleted fact sheet
One synthesizer per section, with section-specific instructions (tone, length, what to include/exclude). Inputs: all RAG responses + company metadata.
04c · Format
Code step — normalize to Notion-friendly markdown
Strip stray markdown, fix bullet style, normalize headings, escape Notion-breaking characters
04d · Append to fact sheet
PATCH Notion page — append section blocks
Partners can read finished sections while later ones are still drafting
05 · Sections produced (in order)
Business · Product · Customers · Market · Competition · Unit Econ · Leadership · Supply Chain
Order is deliberate — each later section's synthesizer can implicitly reference earlier ones via the Notion page state
06 · Settle
Delay 3 minutes
Notion eventual-consistency window — without this, the final synthesis reads a half-empty page
07 · Read assembled fact sheet
Pull the full Notion page back down
The whole fact sheet (all 8 sections, just written) becomes context for the closing synthesis
08 · Final synthesis — 'Questions'
LLM: exactly 5 bullets, derived from the assembled fact sheet + partners' open questions
The whole point of this section is that it CAN'T be generated up front — it depends on what the other 8 sections actually said
09 · Thematic context
POST to thematicbeast/llamaquery for sector-level framing
Separate RAG endpoint, sector-scoped — keeps thematic citations clean from company-specific ones
10 · Write final block
PATCH Notion: append 'Questions' + thematic framing to the fact sheet
Fact sheet is now end-to-end on the page partners already read from
11 · Adjacent artifacts
Call sub-zaps: PMB generator + ICM generator
Same trigger, independent flows — pipeline brief and IC memo build off the same company resolution. Failure in one doesn't kill the others.
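
The section loop (steps 04a–04d) can be sketched roughly as follows. This is a minimal sketch, not the production orchestrator: `draft_section`, `run_fact_sheet`, and the injected callables `query_rag`, `synthesize`, and `append_to_page` are hypothetical names standing in for the RAG POST, the LLM call, and the Notion PATCH. It assumes per-section failures are caught and recorded so later sections still run.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def draft_section(
    section: str,
    queries: list[str],
    company: dict,
    query_rag: Callable[[str], str],            # hypothetical: one POST to the RAG engine
    synthesize: Callable[[str, dict, list[str]], str],  # hypothetical: one LLM call
) -> str:
    """Run one section: parallel research fan-out, then a single synthesis call."""
    # Step 04a: company name + description are injected into every query.
    scoped = [f"{company['name']} ({company['description']}): {q}" for q in queries]
    with ThreadPoolExecutor(max_workers=max(1, len(scoped))) as pool:
        # pool.map preserves query order and re-raises the first failure.
        answers = list(pool.map(query_rag, scoped))
    # Step 04b: one synthesizer per section, fed all RAG responses + metadata.
    return synthesize(section, company, answers)


def run_fact_sheet(sections, company, query_rag, synthesize, append_to_page):
    """Draft sections in series; a failure in one section does not stop the rest."""
    failed = []
    for section, queries in sections:
        try:
            text = draft_section(section, queries, company, query_rag, synthesize)
            append_to_page(section, text)  # step 04d: append as soon as it lands
        except Exception as exc:
            # Record the failure so just this section can be re-run later.
            failed.append((section, str(exc)))
    return failed
```

Passing the three callables in keeps the sketch vendor-neutral: the same loop runs against a stub in tests and against real HTTP clients in production.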

The eight fact sheet sections

Each section is a fan-out of scoped research queries plus a section-specific synthesizer. Order matters; the final 'Questions' section is generated from the assembled fact sheet, not from research.

| Section | Queries | Topics covered |
| --- | --- | --- |
| Business Description / History | 6 | core products · founding story · financials · differentiation · history · why-invest |
| Product(s) Overview | 6 | why-now · technical architecture · stage of development · IP · roadmap · moats |
| Customers / GTM | 7 | evidence of traction · customer segments · purchase criteria · sales motion · contracts · retention · references |
| Market Size | 6 | TAM · segmentation · growth drivers · adjacent markets · regulatory tailwinds · timing |
| Competitive Landscape | 4 | incumbents vs. disruptors · relative positioning · barriers to entry · likely consolidation |
| Unit Economics / Financial Profile | 7 | business model · unit economics · gross margins · CAC/LTV · capital intensity · path to profitability · benchmarks |
| Leadership | 1 | founders + top 3-4 execs (background, prior wins, gaps) |
| Supply Chain / Manufacturing | 5 | critical suppliers · BOM · scalability · single points of failure · geopolitical exposure |
| Questions (synthesized last) | 0 | 5 bullets generated FROM the assembled fact sheet above, not from research |
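
One way to hold the table above as data is a list of section specs that the orchestrator iterates in order. This is a sketch under assumptions: `SectionSpec`, `SECTIONS`, and `build_queries` are hypothetical names, and it assumes one research query per listed topic, which matches the query counts in the table but is not confirmed as how the production prompts are written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionSpec:
    name: str
    topics: tuple[str, ...]  # assumption: one RAG query per topic


# Order mirrors the table; 'Questions' is omitted because it issues no research queries.
SECTIONS = (
    SectionSpec("Business Description / History",
                ("core products", "founding story", "financials",
                 "differentiation", "history", "why-invest")),
    SectionSpec("Product(s) Overview",
                ("why-now", "technical architecture", "stage of development",
                 "IP", "roadmap", "moats")),
    SectionSpec("Customers / GTM",
                ("evidence of traction", "customer segments", "purchase criteria",
                 "sales motion", "contracts", "retention", "references")),
    SectionSpec("Market Size",
                ("TAM", "segmentation", "growth drivers", "adjacent markets",
                 "regulatory tailwinds", "timing")),
    SectionSpec("Competitive Landscape",
                ("incumbents vs. disruptors", "relative positioning",
                 "barriers to entry", "likely consolidation")),
    SectionSpec("Unit Economics / Financial Profile",
                ("business model", "unit economics", "gross margins", "CAC/LTV",
                 "capital intensity", "path to profitability", "benchmarks")),
    SectionSpec("Leadership",
                ("founders + top 3-4 execs (background, prior wins, gaps)",)),
    SectionSpec("Supply Chain / Manufacturing",
                ("critical suppliers", "BOM", "scalability",
                 "single points of failure", "geopolitical exposure")),
)


def build_queries(spec: SectionSpec, company: str, description: str) -> list[str]:
    """Scope each topic to the company, as in step 04a (wording is illustrative)."""
    return [f"{company} ({description}): {topic}" for topic in spec.topics]


# Total research queries per run under the one-query-per-topic assumption.
TOTAL_QUERIES = sum(len(s.topics) for s in SECTIONS)
```

Summing the Queries column this way gives 42 research calls per fact sheet, before counting the synthesis, formatting, and Notion steps.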

Gotchas

The things you only learn by running an 80-step fact sheet orchestrator in anger across a year of deals.

  • 01 · Per-section fan-out is the unit of work, not per-query. Each fact sheet section issues 4–7 parallel research queries against the RAG engine, then ONE LLM synthesizer turns those into a bulleted section. Trying to one-shot a whole fact sheet from one prompt produces vague, repetitive output every time.
  • 02 · Section order matters: business → product → customers → market → competition → unit econ → leadership → supply chain → questions. The 'Questions' section is generated LAST, using every prior section as input, so it asks the right unknowns rather than generic founder questions.
  • 03 · Notion is the surface and the substrate. Each finished section is appended to the same Notion page as it lands, so partners can watch the fact sheet build in real time and start reading the early sections while later ones are still drafting.
  • 04 · Marker-based section extraction. A small code step pulls the 'Key Questions' block out of the existing Notion page using string markers ('Key Questions' → 'Pros') so the orchestrator can hand the partners' open questions back into the prompt for the final synthesis.
  • 05 · Two RAG endpoints, not one. The company-specific service ('thebeast') answers questions grounded in the company's own data room + scraped web. A separate thematic service ('thematicbeast') answers sector-level questions for the closing 'why now' synthesis. Mixing them produces garbage citations.
  • 06 · 3-minute settle delay before the final synthesis: Notion's API has eventual-consistency lag on page content. Reading back the fact sheet too fast misses the last few blocks and the 'Questions' section gets generated against a half-empty page.
  • 07 · Sub-zap chaining: PMB (pipeline brief) and ICM (investment committee memo) generators are kicked off from the same trigger. They share the company-resolution step but otherwise run independently; splitting them keeps any one failure from killing the whole pipeline.
  • 08 · Failure is per-section, not all-or-nothing. A bad RAG call for 'Supply Chain' shouldn't kill the 'Business Description' write. Each section's write step is independent, so partners can re-run just the broken section.
  • 09 · Token budget is the real constraint. Each section's synthesizer eats 30–80k tokens of RAG output. Across 8 sections plus final synthesis, one fact sheet run is ~600k–1M tokens. Cheap by analyst-hour standards, expensive by API-bill standards; worth knowing before you turn it on for the whole pipeline.
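
The marker-based extraction in gotcha 04 could look something like the sketch below. `extract_between` is a hypothetical helper, and the plain-text page format in the usage example is an assumption; only the two markers ('Key Questions' and 'Pros') come from the playbook.

```python
def extract_between(page_text: str,
                    start_marker: str = "Key Questions",
                    end_marker: str = "Pros") -> str:
    """Return the text between two string markers, or '' if the start is missing."""
    start = page_text.find(start_marker)
    if start == -1:
        return ""  # no open questions recorded on this page
    start += len(start_marker)
    # Search for the end marker only after the start marker.
    end = page_text.find(end_marker, start)
    if end == -1:
        end = len(page_text)  # no closing marker: take the rest of the page
    return page_text[start:end].strip()


page = (
    "Summary\n"
    "Key Questions\n"
    "- Is churn real?\n"
    "- Who owns the IP?\n"
    "Pros\n"
    "- Strong team\n"
)
open_questions = extract_between(page)
```

Plain substring matching is brittle: a question containing a word such as "Prospects" would end the block early, so anchoring the end marker to the start of a line may be worth considering.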

Swap matrix

Every layer is replaceable. The contract — section list, per-section fan-out + synthesizer shape, append-to-fact-sheet target — is what the rest of the firm depends on.
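
To make the "roles-not-vendors" contract concrete, here is a hedged sketch using structural typing. The three protocols and their method names (`ResearchEngine.query`, `Synthesizer.write_section`, `FactSheetTarget.append_section`) are illustrative assumptions, not the firm's actual interfaces. The in-memory classes exist only to show that any implementation with the right shape swaps in.

```python
from typing import Protocol


class ResearchEngine(Protocol):
    def query(self, question: str) -> str: ...


class Synthesizer(Protocol):
    def write_section(self, section: str, findings: list[str]) -> str: ...


class FactSheetTarget(Protocol):
    def append_section(self, section: str, body: str) -> None: ...


def build_section(section: str, questions: list[str], engine: ResearchEngine,
                  synthesizer: Synthesizer, target: FactSheetTarget) -> None:
    """Per-section shape: fan out research, synthesize once, append to the target."""
    findings = [engine.query(q) for q in questions]  # sequential here for brevity
    target.append_section(section, synthesizer.write_section(section, findings))


# Stand-in implementations; a real build would wrap the RAG engine, LLM, and Notion.
class CannedEngine:
    def query(self, question: str) -> str:
        return f"finding for {question}"


class BulletSynthesizer:
    def write_section(self, section: str, findings: list[str]) -> str:
        return "\n".join(f"- {f}" for f in findings)


class InMemoryTarget:
    def __init__(self) -> None:
        self.blocks: list[tuple[str, str]] = []

    def append_section(self, section: str, body: str) -> None:
        self.blocks.append((section, body))


target = InMemoryTarget()
build_section("Market Size", ["TAM", "timing"],
              CannedEngine(), BulletSynthesizer(), target)
```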