Status
Extraction
- Source: FY-2025-26-Proposed-Budget.pdf (408 pages, sha256 87856ff9c25856c0...)
Counts
- Total sections: 229
- Index nodes: 34
- Leaf sections: 195
- Inlined (too small for own file): 0
Known gaps
- 11 pages had no extractable text (likely image-only; may need OCR): 2, 4, 32, 78, 100, 134, 246, 316, 322, 364, 408
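A list like the one above could be derived from per-page extraction output. The sketch below assumes the extractor yields one string per page, in page order. The function name `findEmptyPages` and the input shape are hypothetical and are not taken from the actual scripts.

```javascript
// Hypothetical helper: `pageTexts` holds one extracted string per PDF page, in page order.
function findEmptyPages(pageTexts) {
  const empty = [];
  pageTexts.forEach((text, i) => {
    // Treat missing or whitespace-only output as "no extractable text".
    if ((text ?? "").trim().length === 0) {
      empty.push(i + 1); // report 1-based page numbers, as in the list above
    }
  });
  return empty;
}

// Tiny made-up input: pages 2 and 4 produce no text.
const sample = ["Cover", "", "Budget summary", "   \n"];
console.log(findEmptyPages(sample)); // [ 2, 4 ]
```

Pages flagged this way are only candidates for OCR. A page can be intentionally blank rather than image-only.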
Enrichment
- Enriched: 2026-04-26
- Leaves enriched: 195 / 195 (100%)
- Failures: 0
- Model: claude-haiku-4-5
- Runner: scripts/enrich_2526_proposed.mjs (custom pdf-engine enricher)
- Format: <!-- enrich:begin --> / <!-- enrich:end --> TL;DR blocks with PDF page citations, injected after the H1 title in each leaf.
- One leaf (summary-of-resources-by-fund-fy-2025-26) required 3 attempts due to JSON parse errors on the first two Haiku responses; recovered successfully on attempt 3.
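The marker-delimited injection described above can be sketched roughly as follows. This is not the code in scripts/enrich_2526_proposed.mjs. The function name `injectTldr` and the sample text are hypothetical. Two behaviors are assumptions: a re-run replaces an existing block instead of adding a second one, and the first line starting with "# " is treated as the H1.

```javascript
const BEGIN = "<!-- enrich:begin -->";
const END = "<!-- enrich:end -->";

// Hypothetical helper: returns `markdown` with a TL;DR block placed right after the first H1.
function injectTldr(markdown, tldr) {
  const block = `${BEGIN}\n${tldr.trim()}\n${END}`;

  // Assumption: a re-run should replace an existing block rather than stack a second one.
  const start = markdown.indexOf(BEGIN);
  const end = markdown.indexOf(END);
  if (start !== -1 && end > start) {
    return markdown.slice(0, start) + block + markdown.slice(end + END.length);
  }

  const lines = markdown.split("\n");
  // Assumption: the first line starting with "# " is the leaf's H1 title.
  const h1 = lines.findIndex((line) => line.startsWith("# "));
  if (h1 === -1) {
    // No H1 found: put the block at the top instead of dropping it.
    return `${block}\n\n${markdown}`;
  }
  lines.splice(h1 + 1, 0, "", block);
  return lines.join("\n");
}

// Placeholder leaf and summaries, for illustration only.
const leaf = "# Summary of Resources by Fund\n\nBody text.";
const once = injectTldr(leaf, "**TL;DR:** example summary (PDF p. N).");
const twice = injectTldr(once, "**TL;DR:** revised summary (PDF p. N).");
console.log(twice); // still contains exactly one enrich block
```

Replacing between the markers keeps enrichment safe to re-run, which matters when a leaf needs more than one attempt.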
Rebuild
Last rebuilt: 2026-04-26 (hand-crafted section plan via scripts/build_2526_proposed_plan.mjs).
Build approach: page-type split (cover/decisions/budget-summary/CIP/FTE per bureau), not program-level.
This PDF has no per-program narrative pages; program names appear only as dollar rows in bureau tables.
# From portland-gov/ repo root:
cd ppd && node ../scripts/build_2526_proposed_plan.mjs && node scripts/emit_2526_proposed.mjs
# Then enrich:
node scripts/enrich_2526_proposed.mjs