Now booking May / June 2026 · limited capacity

⊕ CONSULTING · AI agents · financial AI · MCP · XBRL · RAG

Your AI gets financial facts wrong. We fix that.

SEC data consulting for teams shipping AI agents and financial AI products. From prototype to production — in weeks, not months.

LLMs hallucinate revenue numbers. Structured XBRL fixes it. Almost nobody knows how to use it. We do.

Get started or start with a free architecture review →

edgar.tools offers SEC data consulting for teams building AI agents and financial AI products. Created by Dwight Gunning — author of the open-source edgartools Python library (6M+ downloads, nearly 1M monthly, 2,000+ GitHub stars) — our consulting practice helps teams solve the hardest problems in SEC data: MCP and AI agent integration, XBRL extraction, RAG grounding for financial accuracy, and production EDGAR infrastructure. Three fixed-price engagements from $5,000.

⊕ Last updated · May 2026

01 The problem

The financial AI accuracy problem.

LLMs hallucinate on financial data. Structured XBRL fixes it — but almost nobody knows how to use it.

81%

of financial questions answered incorrectly by GPT-4

FinanceBench benchmark

74×

error reduction with XBRL structured data vs. raw HTML

8.16% errors → 0.11%

17%

LLM accuracy on XBRL taxonomy concept linking

18,000 US-GAAP elements

Per the FinanceBench benchmark, GPT-4 answers 81% of financial questions incorrectly. Research shows that using structured XBRL data instead of raw HTML reduces extraction errors by 74×, from 8.16% to 0.11%. Yet LLMs achieve only 17% accuracy on XBRL taxonomy concept linking across the 18,000-element US-GAAP taxonomy. XBRL is the answer to financial AI accuracy, but its extension mechanisms, dimensional structures, and 18,000 elements call for someone who's spent years in the weeds.
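As a quick sanity check, the 74× figure follows directly from the two error rates quoted above. The variable names below are illustrative only.

```python
# Error rates quoted above, in percent.
html_error_pct = 8.16
xbrl_error_pct = 0.11

# How many times fewer extraction errors with structured XBRL.
reduction_factor = html_error_pct / xbrl_error_pct
print(f"{reduction_factor:.1f}x fewer extraction errors")

# The same figures expressed as extraction accuracy.
html_accuracy_pct = 100 - html_error_pct
xbrl_accuracy_pct = 100 - xbrl_error_pct
print(f"{html_accuracy_pct:.2f}% -> {xbrl_accuracy_pct:.2f}% accuracy")
```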

02 Engagements

Three ways to work with us.

Productized engagements with fixed scope, fixed price, and defined deliverables.

⊕ Entry point

SEC Data Sprint

1 – 3 days

From $5,000

Fixed price

Go from zero to a working SEC data pipeline in days. We work with your actual data to build a functioning prototype, not slides or a report.

Working prototype on your data
Architecture recommendations
Data quality assessment
edgartools integration guide

BEST FOR: Teams evaluating SEC data for a new product or feature

Get started

⊕ Diagnostic

Architecture Review

1 – 2 weeks

From $15,000

Fixed price

Find out what your SEC data pipeline is getting wrong — and exactly how to fix it. A deep audit of your current architecture with prioritized recommendations.

Full pipeline audit and gap analysis
XBRL vs. HTML accuracy analysis
Architecture diagrams (current + recommended)
Written assessment with prioritized fixes

BEST FOR: Teams with existing SEC data systems that underperform

Start with free review

★ Most popular

Pipeline Build

2 – 4 weeks

From $35,000

Fixed price

Production-grade SEC data infrastructure, built by the team behind edgartools. You get a working component of your data pipeline or MCP server: code you own and your team can maintain.

Production-ready Python or MCP server
Test suite and documentation
Team handoff session
2 weeks post-delivery support

BEST FOR: Teams who know what they need built and want it done right

Get started

Need ongoing advisory or an embedded expert? We offer retainer engagements for teams with continuous SEC data needs. Let's talk.

03 Capabilities

Built for AI agent & financial AI teams.

We solve the hard problems that sit between raw SEC filings and reliable AI products.

⊕ A · Agents

MCP & AI agent integration

Wire SEC data into your LLM agents via Model Context Protocol. Production-ready tool servers, structured outputs designed for agentic composition, and grounded context for finance workflows.

⊕ B · Grounding

RAG grounding pipelines

Ground your LLM outputs in authoritative, structured SEC data, so answers come from reported facts rather than the model's recall. Extracting from XBRL instead of raw HTML cuts extraction errors from 8.16% to 0.11%.

⊕ C · XBRL

XBRL extraction pipelines

Structured financial data extraction across thousands of companies. Navigate the 18,000-element US-GAAP taxonomy without getting lost. Learn more →

⊕ D · Infra

Production SEC data infrastructure

Pipelines that handle EDGAR rate limits, format inconsistencies, filing amendments, and the edge cases that break naive parsers.
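A minimal sketch of the rate-limit piece, assuming a single process and the SEC's published fair-access ceiling of 10 requests per second. The class name `MinIntervalLimiter` and its parameters are hypothetical, not part of edgartools; the clock and sleep functions are injectable so the behaviour can be tested without waiting.

```python
import time
from typing import Callable


class MinIntervalLimiter:
    """Spaces calls so that at most `max_per_second` start per second."""

    def __init__(
        self,
        max_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the following slot relative to when this call starts.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

In use, a pipeline would call `limiter.wait()` before each EDGAR request, for example with `MinIntervalLimiter(max_per_second=8)` to leave some headroom under the ceiling (the value 8 is an assumption, not a recommendation from the SEC). A multi-worker deployment would need a shared limiter instead, which this sketch does not cover.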

⊕ E · Architecture

Financial data product architecture

Design systems that ingest, normalize, and serve SEC data at scale. From startup MVP to enterprise-grade platform.

⊕ F · Eval

Accuracy benchmarking

Test your financial AI against FinanceBench or custom evaluation sets. Measure accuracy before and after XBRL integration.
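A before-and-after measurement can be as simple as scoring extracted values against a ground-truth set. The sketch below is one way to do it, assuming numeric facts keyed by a label; the function name, the tolerance, and the sample values are all hypothetical and are not FinanceBench data.

```python
from typing import Mapping, Optional


def error_rate(
    predicted: Mapping[str, Optional[float]],
    truth: Mapping[str, float],
    rel_tol: float = 0.001,
) -> float:
    """Fraction of ground-truth items that are missing or outside tolerance."""
    if not truth:
        raise ValueError("truth set is empty")
    wrong = 0
    for key, expected in truth.items():
        got = predicted.get(key)
        if got is None or abs(got - expected) > rel_tol * abs(expected):
            wrong += 1
    return wrong / len(truth)


# Hypothetical evaluation set (values in USD millions).
truth = {"revenue": 1250.0, "net_income": 210.0, "total_assets": 5400.0, "cash": 830.0}
before = {"revenue": 1250.0, "net_income": 240.0, "total_assets": 5400.0, "cash": None}
after = {"revenue": 1250.0, "net_income": 210.0, "total_assets": 5400.0, "cash": 830.0}

print(f"before: {error_rate(before, truth):.0%}, after: {error_rate(after, truth):.0%}")
```

Treating a missing value as an error is a design choice here; a real evaluation might score refusals separately from wrong answers.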

04 The math

You could build this in-house. Here's why teams don't.

⊖ Build in-house

× 6–12 months learning EDGAR's quirks
× $140K–$204K/yr for a RAG engineer who still gets 44% of financial facts wrong
× 15% NLP specialist vacancy rate, so good luck hiring
× Your engineers building plumbing instead of your product

⊕ Work with us

✓ 2–4 weeks from kickoff to production pipeline
✓ 10,000+ filing edge cases already solved
✓ ~99.9% extraction accuracy with XBRL structured data (0.11% error rate)
✓ Your team stays focused on your product

05 Buyer profiles

Who we work with.

Teams building the next generation of financial intelligence products.

AI startups

RAG pipeline that hallucinates revenue.

"Our outputs cite numbers that aren't in the filing."

We ground your LLM outputs in structured XBRL data so your financial AI gets facts right.
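One simple form of grounding check, sketched under assumptions: pull every number out of a generated answer and flag any that match no structured fact. The function name, tolerance, and sample facts are hypothetical. A production check would also handle units, scaling, and numbers that are not financial facts (years, for instance, would be flagged by this naive version).

```python
import re


def ungrounded_numbers(
    answer: str,
    facts: dict[str, float],
    tolerance: float = 0.005,
) -> list[float]:
    """Return numbers cited in `answer` that match no value in `facts`."""
    cited = [
        float(match.replace(",", ""))
        for match in re.findall(r"\d[\d,]*(?:\.\d+)?", answer)
    ]
    unmatched = []
    for number in cited:
        # A cited number is grounded if it is within tolerance of any fact.
        if not any(abs(number - v) <= tolerance * abs(v) for v in facts.values()):
            unmatched.append(number)
    return unmatched


# Hypothetical structured facts (USD millions) and a generated answer.
facts = {"Revenues": 1250.0, "NetIncomeLoss": 210.0}
answer = "Revenue was 1,250 million and net income was 240 million."
print(ungrounded_numbers(answer, facts))
```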

FinTech companies

SEC pipeline that breaks every earnings season.

"Our parser breaks on every novel filing format."

We build infrastructure that handles EDGAR's rate limits, format changes, and 10,000+ filing edge cases.

Funds & data teams

The data point Bloomberg doesn't have.

"Bloomberg doesn't expose this — and we need it now."

We extract custom SEC data — insider transactions, institutional holdings, executive comp — that terminal vendors don't cover.

06 Why us

Not a consulting firm. The author of the library.

I'm Dwight Gunning, creator of edgartools — the most-downloaded open-source Python library for SEC data, with 6M+ lifetime downloads, nearly 1M monthly downloads, and 2,000+ GitHub stars.

When your AI agent gets revenue numbers wrong, when your MCP server returns half-parsed XBRL, when EDGAR's rate limiter blocks your entire pipeline during earnings season — I've already solved those problems. In production.

You're not hiring a generic consulting firm. You're working with the person who wrote the library your developers already use.

6M+

PyPI downloads

~1M

Monthly downloads

20+

Years in FinTech

Where the open-source edgartools library shows up

Public usage signals from the open-source library — these are not consulting customers; they're teams running edgartools in their own products, repos, and pipelines.

230+

Public dependent repositories on GitHub

25+

AI / agent / RAG projects in the dependency graph

Independent MCP server projects built on top

90%

Linux downloads — running in CI/CD & production

Public deployments include hyperscale cloud sample repositories, ML platform tutorials, top-tier consulting firms' GenAI training programs, multi-hundred-million-dollar wealth managers, YC-backed AI startups, and university research groups — alongside the conda-forge feedstock and an active community of MCP server maintainers.

07 Process

First call to production code in weeks.

Discovery call

30 minutes. Tell us your SEC data challenge. We'll tell you exactly how we'd solve it — and whether we're the right fit.

Scope & price

You get a fixed-price proposal with clear deliverables, timeline, and acceptance criteria. No surprises, no scope creep.

Build & deliver

We build on your actual data. You get working code, documentation, and a handoff your team can run with. Full source code ownership.

08 FAQ

Ten questions, honestly answered.

01 Do you build MCP server integrations?

Yes — MCP integration is the most common engagement we run today. We build production-grade MCP servers that expose SEC data as tools your LLM agents can call directly: structured outputs, proper error handling, rate-limit safety, and integration with the edgartools library underneath. Typical scope: 2 weeks, $30,000–$50,000.
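To make "structured outputs" and "proper error handling" concrete, here is a rough sketch of what a single tool handler might look like. It deliberately uses no MCP SDK: the handler would be registered with whichever MCP server framework you use. The names `FactResult`, `get_financial_fact`, and the in-memory `FACTS` table are hypothetical stand-ins, and the sample value is made up.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Hypothetical in-memory stand-in for structured XBRL facts. A real server
# would look these up through its data layer instead.
FACTS = {
    ("EXAMPLE", "Revenues", 2023): 1_250_000_000.0,
}


@dataclass
class FactResult:
    ok: bool
    ticker: str
    concept: str
    fiscal_year: int
    value: Optional[float] = None
    unit: Optional[str] = None
    error: Optional[str] = None


def get_financial_fact(ticker: str, concept: str, fiscal_year: int) -> dict:
    """Tool handler: return one reported fact, or a structured error."""
    key = (ticker.upper(), concept, fiscal_year)
    if key not in FACTS:
        # Return an explicit error rather than raising, so the agent can
        # recover instead of guessing at a number.
        return asdict(
            FactResult(False, key[0], concept, fiscal_year, error="fact_not_found")
        )
    return asdict(
        FactResult(True, key[0], concept, fiscal_year, value=FACTS[key], unit="USD")
    )
```

Returning the same shape for success and failure means the calling agent always gets a machine-readable answer, including a clear signal when a fact is unavailable.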

02 Can you help us productionize an AI agent built on edgartools?

That's a frequent ask. Many teams prototype an agent on edgartools and then hit production challenges: SEC rate limits, XBRL inconsistencies across filings, retry logic, ground-truth evaluation, deployment. We work alongside your engineers to take the prototype to production — typically a 2–4 week Pipeline Build engagement.

03 Do you offer ongoing support after delivery?

Every Pipeline Build includes 2 weeks of post-delivery support. Beyond that, we offer monthly retainer engagements for teams with continuous SEC data needs, covering bug fixes, feature additions, EDGAR API changes, and an SLA on response time. We can discuss the options on the discovery call.

04 We already have an SEC data pipeline. Can you improve it?

Yes — that's exactly what the Architecture Review is for. We audit your existing pipeline, identify accuracy gaps and performance bottlenecks, and deliver a prioritized remediation plan. Most teams find issues they didn't know they had.

05 Do we own the code you build?

100%. All custom code, documentation, and deliverables are yours. We include a team handoff session and 2 weeks of post-delivery support so your engineers can maintain and extend everything independently.

06 What tech stack do you work with?

Python-first. We build on edgartools (our open-source library), with production deployments typically using PostgreSQL, Redis, and cloud infrastructure (AWS, GCP, or Azure). We integrate with your existing stack rather than replacing it.

07 What happens on the discovery call?

30 minutes. You describe your SEC data challenge, we ask clarifying questions, and we tell you exactly how we'd approach it — including which engagement type fits and a rough timeline. No pressure, no pitch deck. If we're not the right fit, we'll tell you.

08 We're early-stage. Is this too expensive?

The SEC Data Sprint starts at $5,000 and delivers a working prototype in 1–3 days. Compare that to hiring a data engineer ($140K+/yr) who'll spend 6 months learning EDGAR's quirks. For early-stage teams, the sprint is often the fastest path to validating whether SEC data solves your problem.

09 How quickly can you start?

Typically within 1–2 weeks of signing. A Sprint can start within days if schedules align. We scope and price within 48 hours of the discovery call, so the only bottleneck is your availability.

10 Do you sign NDAs?

Yes. We sign mutual NDAs before any technical discussion and include confidentiality clauses in all engagement agreements. We work with hedge funds and fintech companies where data sensitivity is paramount.

09 Talk to us