Automation · Playbook 01

Monitor news for pipeline companies.

Every new article in the Pipeline Feedly category gets its contents extracted, summarized, and logged into Notion — so the team sees what's happening at pipeline companies without scanning headlines.

Sourcing

Reference build

A long-but-linear flow. Every new article in the Feedly "Pipeline" category is pushed through Clay for full-text extraction, summarization, and company-name detection, then logged into a Notion news database and archived as a file in the matching company's SharePoint folder. The team never opens Feedly — they read the Notion log.

Vendors below are our choices. The flow is roles-not-vendors; every layer swaps cleanly.

The landing spot
Notion database — “Pipeline News Feed”
[Image: Pipeline News Feed Notion database, with a list of articles on the left and one selected article (title, company, date, URL, summary, full article) on the right.]
Each row is one article. Title, URL, Date, and Company sit on the row; Summary and Full Article render in the page body. This is the surface the team actually reads — and the corpus a weekly agent summarizes into a Slack or email digest.
Flow
01 · Trigger
New article appears in the Feedly 'Pipeline' category
Feedly fires once per new article URL it sees in any source assigned to the Pipeline category. Dedupe is handled inside Feedly.
02 · Create enrichment row
Create a row in Clay's 'Extract Article Contents and Summary' table
Writes only title + URL. Clay's enrichment columns (article fetch, summary, company-name AI column) start running asynchronously the moment the row is created.
03 · Wait
Delay 2 minutes
Gives Clay's async columns time to finish. Too short = empty cells in the next step. Two minutes covers a single-article enrichment; raise if you ever batch.
04 · Re-read the enriched row
Look up the same row back in Clay by title + URL
Pulls the now-populated cells: full article text, summary, and the AI-extracted company name.
05 · Append to Notion news log
Create a database item in Notion's 'Pipeline News Feed'
Fields: Title, URL, Date (Feedly published_date), Company (AI-extracted). Body: a markdown block with '## Summary' followed by '## Full Article'.
06 · Sanitize title
Strip non-alphanumeric characters from the headline
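A one-line JS code step: title.replace(/[^a-zA-Z0-9 ]+/g, '').trim(). It keeps ASCII letters, digits, and spaces and drops everything else. SharePoint disallows characters such as " * : < > ? / \ | in filenames, and in this flow a rejected upload fails silently, so the title is cleaned first.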
07 · Extract company name (second pass)
GPT-4o reads the headline + summary and returns the startup name only
System prompt is narrow: 'deep tech startup in quantum, space, defense, semis, mining…'. Output is the bare name, no prose. Used as the SharePoint folder-search key.
08 · Find the company's SharePoint folder
Search Deal Team / Documents for '{company} Relevant Articles'
If the folder doesn't exist (new pipeline company), the next step fails on purpose — the signal to scaffold the folder before adding the company in Feedly.
09 · Upload the article file
Upload {sanitized title}.html to the matched folder in SharePoint
Durable copy of the article body, named so it sorts cleanly next to the team's other diligence files.
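
Steps 06 and 09 can be sketched together as follows. sanitizeTitle is the one-liner from step 06 unchanged. articleFilename is a hypothetical helper, and its two extras are assumptions that the flow does not describe: it collapses the double spaces left behind when punctuation between words is removed, and it falls back to a placeholder name when a headline has no ASCII letters or digits at all.

```javascript
// Step 06 as written in the flow: keep ASCII letters, digits, and spaces.
function sanitizeTitle(title) {
  return title.replace(/[^a-zA-Z0-9 ]+/g, "").trim();
}

// Hypothetical helper for step 09: turn a headline into the upload filename.
// The space collapsing and the fallback name are assumptions, not part of the flow.
function articleFilename(title, fallback = "untitled article") {
  const clean = sanitizeTitle(title).replace(/ {2,}/g, " ");
  return `${clean || fallback}.html`;
}

console.log(articleFilename("Acme: raises $40M / Series B!"));
```

Without the fallback, a headline made up entirely of stripped characters would produce a file named just ".html", which is worth guarding against if any source in the category publishes non-English headlines.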

What the flow writes

Two outputs per article. The Notion row is the queryable log; the SharePoint file is the durable archive next to the rest of the company's diligence material.

Output | Shape
Notion database item | Database: Pipeline News Feed. Properties: Title, URL, Date, Company (rich text). Body: markdown with ## Summary and ## Full Article sections.
SharePoint file | Site: Deal Team · Drive: Documents · Folder: {Company} Relevant Articles · File: {sanitized title}.html.
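
To make the two shapes concrete, here is a minimal sketch that derives both from one enriched article. The function name and the field names on the input object (title, url, publishedDate, company, summary, fullText) are hypothetical, and the returned objects are plain descriptions of what gets written, not Notion API or SharePoint request payloads.

```javascript
// Hypothetical helper: describe both outputs for one enriched article.
// Input field names are assumptions; the flow does not name them.
function buildOutputs(article) {
  // One markdown string, with the two sections the Notion page body uses.
  const body =
    `## Summary\n\n${article.summary}\n\n` +
    `## Full Article\n\n${article.fullText}`;

  // Same cleaning rule as step 06 of the flow.
  const safeTitle = article.title.replace(/[^a-zA-Z0-9 ]+/g, "").trim();

  return {
    notion: {
      database: "Pipeline News Feed",
      properties: {
        Title: article.title,
        URL: article.url,
        Date: article.publishedDate,
        Company: article.company,
      },
      body,
    },
    sharePoint: {
      site: "Deal Team",
      drive: "Documents",
      folder: `${article.company} Relevant Articles`,
      file: `${safeTitle}.html`,
    },
  };
}
```

Note that the Notion row keeps the original headline while the SharePoint filename uses the cleaned one, so the two outputs for the same article will not always share an identical title string.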

Gotchas

A nine-step flow has sharp edges. Most of them are about async timing and the join between the AI-extracted company name and the SharePoint folder convention.

  • 01 · Two-minute delay is load-bearing. Clay's enrichment columns (article fetch + summary + company-name AI column) run asynchronously after the row is created. Reading the row back too soon returns empty cells. Two minutes is enough for a single article; raise it if you ever batch.
  • 02 · Create then find; don't trust the create payload. Step 2 writes the row; step 4 re-reads it by title + URL, because Clay's enrichment results aren't in the create response. They appear later on the row itself.
  • 03 · Title is sanitized before SharePoint. Step 6 strips every character that isn't an ASCII letter, digit, or space (including the colons, slashes, em-dashes, and emoji that news headlines love). Without this, SharePoint's filename rules reject the upload silently.
  • 04 · Company-name extraction is the join key. The whole back half of the flow (find folder → upload article) only works if GPT-4o returns the same company string SharePoint folders are named with. Drift here = orphaned uploads. Keep the system prompt narrow and the model deterministic.
  • 05 · SharePoint folder lookup is fuzzy on purpose. The search query is '{company} Relevant Articles' across all folders in the Deal Team site. If the folder doesn't exist yet (new pipeline company), the upload step fails, which is the signal to scaffold the folder before adding the company to the Pipeline category in Feedly.
  • 06 · Notion content is built in markdown, not blocks. Step 5 stuffs Summary + Full Article into one markdown string and lets Notion render it. This is cleaner than emitting a dozen block objects, and it survives long articles without hitting the per-request block limit.
  • 07 · Dedupe is Feedly's job. The trigger is 'new article in category,' which Feedly already deduplicates by URL. The Zap doesn't re-check Notion. If you switch readers, you'll need to add a dedupe step.
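
Two of these edges (empty cells after the delay, and company-name drift against the folder convention) can be caught with small guard functions in a code step. The sketch below is one possible shape, not part of the reference build: the row field names, the folders array with a name property, and both function names are hypothetical. The exact-match check in pickFolder is an optional stricter filter on top of the fuzzy search, for cases where the search returns near-miss folders.

```javascript
// Hypothetical guard for gotchas 01 and 02: list enrichment cells that are
// still empty on the re-read row. Field names are assumptions.
function missingCells(row, required = ["fullText", "summary", "company"]) {
  return required.filter(
    (key) => typeof row[key] !== "string" || row[key].trim() === ""
  );
}

// Hypothetical guard for gotchas 04 and 05: from the search results, accept
// only a folder whose name matches the convention exactly (case-insensitive).
// Returns null when nothing matches, so the caller can stop before uploading.
function pickFolder(company, folders) {
  const wanted = `${company.trim()} Relevant Articles`.toLowerCase();
  return folders.find((f) => f.name.trim().toLowerCase() === wanted) ?? null;
}

// Usage: halt the run instead of writing empty or misfiled output.
const row = { fullText: "Article text", summary: "", company: "Acme" };
console.log(missingCells(row)); // the summary cell is still empty here
```

Failing early on a non-empty missingCells result gives a clearer error than a Notion row with a blank summary, and a null from pickFolder is the same "scaffold the folder first" signal the flow already relies on.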

Swap matrix

Every layer is replaceable. The only hard requirement is that whatever replaces Clay can do article extraction + summarization + entity extraction on the same row, asynchronously.
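
One way to read that requirement is as a small role contract: whatever fills the Clay slot takes a title and URL and asynchronously returns full text, a summary, and a company name for the same record. The sketch below is an illustration under that assumption. makeEnricher and the three injected functions (extract, summarize, detectCompany) are hypothetical names, not any vendor's API.

```javascript
// Hypothetical role contract for the enrichment layer. Any vendor is wrapped
// so the rest of the flow only sees one async method, enrich().
function makeEnricher({ extract, summarize, detectCompany }) {
  return {
    async enrich({ title, url }) {
      // Extraction must finish first; the other two read its output.
      const fullText = await extract(url);
      const [summary, company] = await Promise.all([
        summarize(fullText),
        detectCompany(title, fullText),
      ]);
      // Everything lands on the same record, as the requirement above asks.
      return { title, url, fullText, summary, company };
    },
  };
}

// Usage with stand-in functions; a real build would call the chosen vendor.
const enricher = makeEnricher({
  extract: async (url) => `text fetched from ${url}`,
  summarize: async (text) => text.slice(0, 12),
  detectCompany: async () => "Acme",
});
enricher.enrich({ title: "Acme news", url: "https://example.com/a" })
  .then((row) => console.log(row.company));
```

Swapping vendors then means writing three adapters rather than rewiring the downstream Notion and SharePoint steps.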