Legal · eDiscovery

10,000-doc review reaches the partner this week.

Attorneys still confirm every privilege call. The system does the surfacing at scale so they only see the documents that need a judgment.

Custodian 2,184 / 12,407

Privileged 412

Responsive 2,184

Confidential 786

Hot doc 18

Stage 01 Document set ingested

A matter lands as 47,000 documents.

Custodian mailboxes, file shares, and collected exports stream into the matter workspace. No paralegal opens a Bates-numbered folder.

Matter #MT-2841 · collection intake

Ingesting

batch 01 · [Custodian A] mailbox · 12,840 docs · 4.1 GB · Loaded
batch 02 · [Custodian B] mailbox · 9,217 docs · 3.2 GB · Loaded
batch 03 · Shared drive export · Contracts, board decks, memos · Loading

47,000 documents · 12 custodians · 16.4 GB · Moving to OCR & parse →

Cost economics · honest framing

Sonnet handles the parse pass across all 47,000 docs. Opus is reserved for the judgment calls. Running Opus on every document would blow the unit economics on a matter this size. The model split is what makes this viable at volume, and it is the only reason a tens-of-thousands-document review can clear in days instead of weeks.
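The shape of that trade-off can be sketched in a few lines of Python. This is an illustration only: the per-document costs are unitless placeholders and the flagged-document count is invented for the example. Only the 47,000 total comes from the matter above.

```python
def review_cost(total_docs: int, flagged_docs: int,
                cheap_cost_per_doc: float, premium_cost_per_doc: float) -> dict:
    """Compare a two-tier pass with running the premium model on every doc."""
    if not 0 <= flagged_docs <= total_docs:
        raise ValueError("flagged_docs must be between 0 and total_docs")
    # Two-tier: every doc gets the cheap pass, only flagged docs get the premium pass.
    split = total_docs * cheap_cost_per_doc + flagged_docs * premium_cost_per_doc
    # Baseline: the premium model reads everything.
    premium_only = total_docs * premium_cost_per_doc
    return {"split": split, "premium_only": premium_only,
            "ratio": split / premium_only}


# Placeholder inputs (assumptions): 5,000 flagged docs, cost units of 1.0 and 5.0.
estimate = review_cost(total_docs=47_000, flagged_docs=5_000,
                       cheap_cost_per_doc=1.0, premium_cost_per_doc=5.0)
print(estimate)
```

The smaller the flagged share, the closer the total gets to the cost of the cheap pass alone, which is why the gate in the next stage matters.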

Stage 02 OCR and parse complete

Every doc becomes structured metadata.

PDFs, scans, email chains, attachments. Sender, recipients, dates, threading, and full text extracted to a per-doc record the rest of the pipeline can reason over.

DOC-018472.pdf (scanned) · 1.8 MB

RE: Q3 forecast (draft)

From: [Custodian A] · To: [Custodian C], [Outside counsel]

2023-08-14 · 14 attachments

Body excerpt

"Looping in [Outside counsel] for guidance on the disclosure language before we circulate to the board. The revenue assumptions in tab 3 are the piece I want their read on..."

Thread

Parent of DOC-018475, DOC-018482

parsed · DOC-018472 · Sonnet · 1.9s

{
  "doc_id": "DOC-018472",
  "doc_type": "email_with_attachments",
  "sent_at": "2023-08-14T16:42:00Z",
  "from": "custodian_a",
  "to": ["custodian_c", "outside_counsel"],
  "thread_id": "TH-04918",
  "attachments": 14,
  "ocr_confidence": 0.97,
  "counsel_addressee": true,
  "requires_judgment": true
}

Why this stage uses Sonnet

Parsing is high-volume, low-judgment. Sonnet runs on all 47,000 docs because the work is extraction, not interpretation. Anything that looks like it might involve counsel or attorney work-product gets a flag (requires_judgment: true) and waits for Opus in the next stage. That gate is what keeps the spend rational.
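A minimal sketch of that gate, assuming the per-doc record looks like the DOC-018472 example above. The queue names, the `ParsedDoc` class, and the second document are hypothetical and exist only for illustration.

```python
import json
from dataclasses import dataclass

# Subset of the DOC-018472 record shown above.
SAMPLE = """
{
  "doc_id": "DOC-018472",
  "doc_type": "email_with_attachments",
  "counsel_addressee": true,
  "requires_judgment": true
}
"""


@dataclass(frozen=True)
class ParsedDoc:
    doc_id: str
    counsel_addressee: bool
    requires_judgment: bool


def load_record(raw: str) -> ParsedDoc:
    """Parse one per-doc JSON record."""
    data = json.loads(raw)
    return ParsedDoc(
        doc_id=data["doc_id"],
        counsel_addressee=bool(data.get("counsel_addressee", False)),
        # Assumption: a record with no flag is treated as needing judgment,
        # so a parsing gap never lets a document skip the review stage.
        requires_judgment=bool(data.get("requires_judgment", True)),
    )


def route(doc: ParsedDoc) -> str:
    """Flagged docs wait for the judgment model; the rest stay in bulk flow."""
    return "judgment_queue" if doc.requires_judgment else "bulk_pipeline"


record = load_record(SAMPLE)
# Hypothetical unflagged document, not taken from the matter.
routine = ParsedDoc("DOC-EXAMPLE", counsel_addressee=False, requires_judgment=False)
print(record.doc_id, route(record))
print(routine.doc_id, route(routine))
```

Defaulting a missing flag to "needs judgment" costs a little more spend but fails in the safe direction.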

Stage 03 Privilege screening

Privilege candidates surfaced for attorney confirmation.

The judgment layer. Opus reviews every flagged document against the firm's prior privilege calls on similar matters. It does not mark anything "privileged." It surfaces candidates for an attorney to confirm. The log is attorney-confirmed or it does not exist.

DOC-018472 · email to outside counsel

Counsel on To-line, request for legal advice on disclosure language. Pattern matches 14 prior privilege calls on similar disclosure-review threads.

Candidate

DOC-021908 · in-house counsel memo

Authored by [In-house counsel] for [Custodian D]. Work-product pattern.

Candidate

DOC-034115 · ambiguous · third party on thread

Counsel on thread, but a third-party vendor is also addressed. Possible waiver. Flagged for attorney review, not auto-classified either way.

Review

DOC-041203 · vendor invoice

No counsel involvement, no legal-advice content. Not a privilege candidate.

Clear

1,184 candidates surfaced · 0 auto-confirmed · Trained on firm's prior privilege calls · Opus
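The property that matters here is that the screening step has no way to emit "privileged." A rough sketch of that constraint follows. The three booleans are a simplification standing in for whatever the model actually reads from a document, and the names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ScreenStatus(Enum):
    # The only outcomes screening can produce. There is deliberately
    # no PRIVILEGED member: that call belongs to an attorney.
    CANDIDATE = "Candidate"
    REVIEW = "Review"
    CLEAR = "Clear"


@dataclass(frozen=True)
class Signals:
    counsel_involved: bool
    legal_advice_or_work_product: bool
    third_party_on_thread: bool


def screen(s: Signals) -> ScreenStatus:
    """Map simplified signals to a screening status."""
    if not s.counsel_involved and not s.legal_advice_or_work_product:
        return ScreenStatus.CLEAR
    if s.third_party_on_thread:
        # Possible waiver: surfaced for attorney review, not classified either way.
        return ScreenStatus.REVIEW
    return ScreenStatus.CANDIDATE


# The four sample documents above, reduced to the simplified signals.
samples = {
    "DOC-018472": Signals(True, True, False),
    "DOC-021908": Signals(True, True, False),
    "DOC-034115": Signals(True, True, True),
    "DOC-041203": Signals(False, False, False),
}
for doc_id, signals in samples.items():
    print(doc_id, screen(signals).value)
```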

Stage 04 Issue tagging

Every doc gets tagged against your matter taxonomy.

Sonnet handles the straightforward tags. When a document sits between two issues or hits a sensitive boundary, Opus is called in to resolve. Tags map to the issues the matter team defined at kickoff.

DOC-018475 · thread reply

Matter #MT-2841 · taxonomy v3

5 Issue tags

Revenue recognition · tab 3 forecast assumptions (Sonnet, high confidence)

Disclosure language · board-facing draft (Sonnet, high confidence)

Internal controls · references control owner sign-off (Sonnet, high confidence)

Counsel involvement · resolved by Opus (ambiguous, third party on thread)

Hot-doc candidate · Opus elevated, partner-relevance score 0.91

Taxonomy captured from the case team at kickoff. Sonnet tags at volume; Opus is the tiebreaker on ambiguous documents.
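One way to picture the tiebreaker rule is a confidence threshold plus a sensitivity flag. The sketch below assumes exactly that. The 0.8 threshold, the confidence numbers, and the `TagProposal` type are illustrative assumptions, not values from the matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagProposal:
    issue: str
    confidence: float        # first-pass confidence, 0.0 to 1.0 (assumed scale)
    sensitive: bool = False  # tag touches a sensitive boundary


def needs_escalation(p: TagProposal, threshold: float = 0.8) -> bool:
    """Escalate when the first pass is unsure or the tag is sensitive."""
    return p.sensitive or p.confidence < threshold


def partition(proposals, threshold: float = 0.8):
    """Split proposals into (accepted as-is, escalated to the tiebreaker)."""
    accepted = [p for p in proposals if not needs_escalation(p, threshold)]
    escalated = [p for p in proposals if needs_escalation(p, threshold)]
    return accepted, escalated


# Tag names follow the DOC-018475 example; the numbers are made up.
proposals = [
    TagProposal("Revenue recognition", 0.95),
    TagProposal("Disclosure language", 0.93),
    TagProposal("Internal controls", 0.90),
    TagProposal("Counsel involvement", 0.55, sensitive=True),
]
accepted, escalated = partition(proposals)
print([p.issue for p in accepted])
print([p.issue for p in escalated])
```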

Stage 05 Privilege log · attorney confirmation queue

Attorneys confirm. The log writes itself.

Each candidate lands in a queue with the system's reasoning, the prior-call pattern it matched, and the surrounding thread. The reviewing attorney confirms, downgrades, or asks a question. The log entry only exists once an attorney has signed off. Every decision is logged under the reviewing attorney's name.

Matter #MT-2841 · privilege log

1,184 candidates · attorney-confirmed entries only

Attorney-signed

Confirmation queue (sample)

DOC-018472 · Email to outside counsel · request for legal advice on disclosure · matches 14 prior calls · Pending
DOC-021908 · In-house counsel memo to [Custodian D] · work-product pattern · Pending
DOC-034115 · Third party on thread · possible waiver, attorney judgment required · Pending
DOC-019003 · Counsel-authored draft response · confirmed by [Reviewing attorney] at 09:41 · Confirmed

Every confirmation, downgrade, and edit is timestamped and attributed to the reviewing attorney. The exported privilege log is a record of attorney decisions, never a model output.
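A minimal sketch of a log with that property, where nothing reaches the export without a named attorney's decision. The class and method names are hypothetical, and a real system would persist the trail rather than hold it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Decision(Enum):
    CONFIRMED = "Confirmed"
    DOWNGRADED = "Downgraded"


@dataclass(frozen=True)
class LogEntry:
    doc_id: str
    decision: Decision
    attorney: str
    decided_at: datetime


@dataclass
class PrivilegeLog:
    pending: set = field(default_factory=set)    # surfaced, not yet decided
    entries: list = field(default_factory=list)  # attorney decisions only

    def surface(self, doc_id: str) -> None:
        """Queue a candidate. This never creates a log entry."""
        self.pending.add(doc_id)

    def decide(self, doc_id: str, decision: Decision, attorney: str,
               decided_at: Optional[datetime] = None) -> LogEntry:
        """Record an attorney decision with a timestamp and attribution."""
        if not attorney.strip():
            raise ValueError("a decision must name the reviewing attorney")
        if doc_id not in self.pending:
            raise KeyError(f"{doc_id} is not in the confirmation queue")
        self.pending.remove(doc_id)
        entry = LogEntry(doc_id, decision, attorney,
                         decided_at or datetime.now(timezone.utc))
        self.entries.append(entry)
        return entry

    def export(self) -> list:
        """Exported log: attorney-confirmed entries only."""
        return [e for e in self.entries if e.decision is Decision.CONFIRMED]


log = PrivilegeLog()
for doc_id in ("DOC-018472", "DOC-021908", "DOC-019003"):
    log.surface(doc_id)
log.decide("DOC-019003", Decision.CONFIRMED, "[Reviewing attorney]")
print([e.doc_id for e in log.export()], sorted(log.pending))
```

Pending candidates never appear in `export()`, and a decision without an attorney name is rejected outright.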

Stage 06 Partner brief assembled

A one-pager, plus the 34 docs that matter.

Of 47,000 documents, the partner gets a single page on what the record actually shows, plus links to the documents that drive it. Everything else stays available, but nobody has to read it to be briefed.

Matter #MT-2841 · partner brief

Prepared from 47,000 docs · 34 hot-doc citations

For partner

What the record shows

[Custodian A] looped in counsel on disclosure language two weeks before the board meeting (DOC-018472, DOC-018475).
Internal control owner raised the revenue-recognition issue in writing on 2023-07-22 (DOC-016401), 23 days before disclosure was finalized.
Forecast assumptions in tab 3 were edited by three custodians in a 48-hour window (DOC-019200 through DOC-019218).
1,184 privilege candidates surfaced; 1,096 confirmed by reviewing attorney, 62 downgraded, 26 still in queue.

Hot documents (top 6 of 34)

DOC-016401 · Control owner raises revenue recognition concern in writing · pre-disclosure · Hot
DOC-018472 · [Custodian A] requests counsel guidance on disclosure language · Hot
DOC-019215 · Tab 3 forecast revision · assumption swap, no documented rationale · Hot
DOC-022104 · [Custodian C] acknowledges the open question to a peer the morning of the board meeting · Hot
DOC-027880 · Post-meeting summary references the unresolved item, then drops it · Hot
DOC-031044 · Investor-relations Q&A prep avoids the same item by name · Hot

Questions for the partner

Do we treat the control owner's 2023-07-22 memo as the start of the relevant period for [Plaintiff]'s theory?
How aggressive do we want to be on the tab 3 revision thread in deposition prep?
Two of the 26 still-open privilege candidates may be the strongest documents for the other side. Do we want a second attorney pass before producing?

Stage 07 Outcome

Same attorneys. The doc pile disappears.

Attorneys still confirm every privilege call. They just stop being the bottleneck on getting to the documents that matter.

Review time per 10k docs: 6 to 8 weeks before · under 1 week after

Associate review hours per matter: 400 to 600 hrs before · 80 to 120 hrs after

Privilege candidates (attorney decides): missed in prior reviews before · caught at confirmation after

This is one of ours

Want one for your firm?

Record a 5-minute voice memo about a typical matter. We'll show up to our call with a system already designed for it.

Tell us about your practice · Backed by our 6-week guarantee