Question 1

How is this different from ChatGPT with browsing?

Accepted Answer

ChatGPT with browsing is a generalist, single-turn fetch. A production research agent has a defined task ('produce a 5-page competitive intel brief on company X covering pricing, product roadmap, team, funding, customer reviews'), structured outputs, source deduplication, citation maps, and a scheduled cadence. It runs unattended, produces a consistent format, and lets you audit where every fact came from.
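
To make the source deduplication step concrete, here is a minimal sketch that collapses trivially different links to one canonical URL before they are counted as separate sources. The function names `canonical_url` and `dedupe_sources` and the list of tracking parameters are assumptions made for this example, not a description of the production pipeline.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of query parameters treated as tracking noise.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content", "ref"}


def canonical_url(url: str) -> str:
    """Normalize a URL so trivially different links compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Drop tracking parameters and sort the rest for a stable order.
    query = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS)
    path = parts.path.rstrip("/") or "/"
    # Simplifications: the scheme is forced to https and the fragment is dropped.
    return urlunsplit(("https", host, path, urlencode(query), ""))


def dedupe_sources(urls: list[str]) -> list[str]:
    """Keep the first occurrence of each canonical URL, preserving input order."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in urls:
        key = canonical_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

URL normalization only catches exact re-fetches of the same page; near-duplicate content (syndicated articles, press-release copies) would need a content-level check on top of this.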

Question 2

Can it scrape sites with anti-bot protections?

Accepted Answer

Through Browserbase or similar headless browser services, we can render JS-heavy pages and pass simple bot checks. Hard anti-bot protections (Cloudflare Bot Management, hCaptcha) are intentionally adversarial. As a matter of policy we do not bypass them, both because it is an arms race and because it puts your operation at legal risk. Where we hit hard blocks, we use official APIs or licensed data instead.

Question 3

How does it handle stale or conflicting sources?

Accepted Answer

Through source dating: every fetched chunk carries a published date and a crawled date. The agent prefers recent sources for time-sensitive claims (pricing, team, news) and tolerates older sources for stable claims (history, mission). For conflicts, the agent surfaces the disagreement in the output rather than picking one: 'Source A says X, Source B (older) says Y.'
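
The dating and conflict-surfacing behaviour described above could be sketched roughly as follows. The `SourcedClaim` type, the `summarize` function, and the staleness thresholds in `MAX_AGE_DAYS` are hypothetical values chosen for illustration; real thresholds would be tuned per claim type.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SourcedClaim:
    source: str      # source label or URL
    value: str       # the claimed fact, as extracted
    published: date  # publication date of the source
    crawled: date    # date the agent fetched it


# Hypothetical thresholds: maximum source age before it is flagged as stale.
MAX_AGE_DAYS = {"time_sensitive": 180, "stable": 3650}


def is_stale(claim: SourcedClaim, kind: str, today: date) -> bool:
    return (today - claim.published).days > MAX_AGE_DAYS[kind]


def summarize(claims: list[SourcedClaim], kind: str, today: date) -> str:
    """Report the newest value, or surface the disagreement if sources conflict."""
    if not claims:
        raise ValueError("no claims to summarize")
    ordered = sorted(claims, key=lambda c: c.published, reverse=True)
    if len({c.value for c in ordered}) == 1:
        newest = ordered[0]
        return f"{newest.value} ({newest.source}, {newest.published.isoformat()})"
    # Conflict: list every source, newest first, instead of picking a winner.
    parts = []
    for c in ordered:
        note = ", stale" if is_stale(c, kind, today) else ""
        parts.append(f"{c.source} ({c.published.isoformat()}{note}) says {c.value}")
    return "Conflict: " + "; ".join(parts)
```

The design choice here is that recency only affects ordering and the stale flag; the older value is never silently discarded, which keeps the final judgment with the reader.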

Question 4

Does it cite sources?

Accepted Answer

Yes. Citations are non-negotiable. Every claim in the output is mapped to one or more sources, rendered as inline footnote-style links. The citation map is queryable: 'show me every source that contributed to the pricing section.' Without citations, the output is unauditable and effectively unusable for any serious decision.
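
As an illustration of what a queryable citation map means, the sketch below keeps claims and their sources in memory and answers the "every source behind the pricing section" question. The `CitationMap` class and its method names are assumptions for this example; a production store would more likely live in a database than in a Python dict.

```python
class CitationMap:
    """In-memory stand-in for a citation store (illustrative only)."""

    def __init__(self):
        # claim_id -> (section, claim text, list of source URLs)
        self._claims = {}

    def add_claim(self, claim_id, section, text, sources):
        if not sources:
            # Enforce the "every claim has a source" rule at write time.
            raise ValueError(f"claim {claim_id!r} has no sources")
        self._claims[claim_id] = (section, text, list(sources))

    def sources_for_section(self, section):
        """Return each distinct source cited in a section, in first-seen order."""
        seen = []
        for sec, _text, sources in self._claims.values():
            if sec == section:
                for src in sources:
                    if src not in seen:
                        seen.append(src)
        return seen

    def claims_for_source(self, source):
        """Reverse lookup: which claims depend on this source?"""
        return [cid for cid, (_sec, _text, srcs) in self._claims.items() if source in srcs]
```

The reverse lookup is what makes corrections practical: if a source turns out to be wrong or is retracted, `claims_for_source` lists every claim that needs to be re-checked.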

Question 5

How long does a research run take?

Accepted Answer

Typically 5–30 minutes of agent runtime per target, depending on depth. Cost is €0.50–€3 per dossier in LLM and browsing fees. We design for runs of a few minutes for ad-hoc interactive requests and a batched overnight cadence for recurring scans (e.g. 50 prospect briefs a week).
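
For budgeting, the per-dossier figures above scale linearly with batch size. The sketch below works through the 50-briefs-a-week example; the 8-hour overnight window and the idea of running the whole batch in one night are assumptions made for the arithmetic, not stated constraints.

```python
import math

COST_EUR = (0.50, 3.00)   # per-dossier cost range from the answer above
RUNTIME_MIN = (5, 30)     # per-target runtime range in minutes
WINDOW_MIN = 8 * 60       # assumption: an 8-hour overnight batch window


def weekly_cost(briefs: int) -> tuple[float, float]:
    """Low and high estimate of LLM + browse cost for a batch, in EUR."""
    return (briefs * COST_EUR[0], briefs * COST_EUR[1])


def workers_needed(briefs: int, window_min: int = WINDOW_MIN) -> int:
    """Parallel workers needed to finish in one window, assuming worst-case runtime."""
    worst_case_total = briefs * RUNTIME_MIN[1]
    return math.ceil(worst_case_total / window_min)


low, high = weekly_cost(50)       # 25.0 to 150.0 EUR per week
workers = workers_needed(50)      # 1500 worst-case minutes / 480 -> 4 workers
```

At 50 briefs a week this comes to roughly €25–€150 in run costs, and a worst case of 1,500 agent-minutes, which fits in a single night with four parallel workers under the assumed window.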

Question 6

Can the output land in our CRM or knowledge base?

Accepted Answer

Yes, and that is typically the point. We write structured outputs to HubSpot/Salesforce custom objects, to Notion/Confluence pages, and to internal knowledge stores. For sales-team workflows, we attach the brief to the contact record in the CRM so it shows up before the call.
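
A structured output is easier to route to a CRM or knowledge base when it is validated before it leaves the agent. The sketch below shows one way that could look; the `Brief` type, the section names, and the payload field names are hypothetical and do not correspond to any actual HubSpot, Salesforce, Notion, or Confluence schema.

```python
import json
from dataclasses import dataclass, field

# Hypothetical required sections, mirroring the example brief described earlier.
REQUIRED_SECTIONS = ("pricing", "product_roadmap", "team", "funding", "customer_reviews")


@dataclass
class Brief:
    contact_id: str                                # CRM contact the brief attaches to
    company: str
    sections: dict = field(default_factory=dict)   # section name -> text


def to_crm_payload(brief: Brief) -> str:
    """Validate the brief and serialize it to a JSON string for a downstream writer."""
    missing = [s for s in REQUIRED_SECTIONS if not brief.sections.get(s)]
    if missing:
        # Refuse to ship an incomplete brief rather than write a partial record.
        raise ValueError(f"brief is missing sections: {', '.join(missing)}")
    record = {
        "contact_id": brief.contact_id,
        "company": brief.company,
        "sections": {s: brief.sections[s] for s in REQUIRED_SECTIONS},
    }
    return json.dumps(record, ensure_ascii=False, sort_keys=True)
```

Keeping serialization separate from the destination-specific write means the same validated payload can feed a CRM custom object, a wiki page, or an internal store without re-checking it each time.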

Layer	Default
Web fetch	Browserbase (managed headless browser) / Firecrawl
Search APIs	Google CSE / Bing Search API / Tavily
Internal knowledge	pgvector / Pinecone over your docs
Synthesis model	Claude Sonnet 4.6 (1M+ context = whole-dossier reasoning)
Orchestration	LangGraph for multi-step research
Output schema	Zod
Citation graph	Custom — store as JSONB in Postgres
Cadence	Cloud Scheduler / Vercel Cron
Observability	Langfuse

Scope	Investment
Single research workflow (one task type, one output destination)	€15,000–30,000
Multi-workflow research platform (3–5 task types)	€30,000–70,000
Enterprise research platform with multiple data sources and outputs	€70,000–150,000
Retainer (ongoing tuning, new sources, new schemas)	from €1,500/month

Research & Synthesis Agent

What a research agent actually does

Anatomy of a working pipeline

Use cases we have shipped

Stack we tend to reach for

What makes a research agent "production-grade"

Cost and timeline

Where it pairs

Frequently asked questions
