Status

Extraction

Source: FY-2025-26-Budget-Amendments.pdf (88 pages, sha256 21927d6405a47805...)

No empty pages detected during extraction.
Hand-crafted section plan. The source PDF has no bookmarks, so pdf-doctree's outline-driven planner produced an empty plan. The plan at .extracted/section-plan.json is instead generated by scripts/build_amendments_plan.mjs, which parses the page-1 TOC and the amendment headings (lines matching ^<councilor> NN – <title>) across all 88 pages. The result is 12 councilor indexes and 149 amendment leaves.
Page granularity. Multiple amendments often share a single PDF page (e.g. Avalos 01/02/03 all sit on p. 2). Each amendment has its own leaf file, but the TL;DRs of amendments on the same page may overlap because the enrichment input is the full page text. Each leaf body shows the complete page, so nothing is hidden.
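
For orientation, the heading scan described above could look roughly like the sketch below. It is not the contents of scripts/build_amendments_plan.mjs: the function name `buildPlan`, the output shape, the regular expression, and the sample titles are all assumptions made for illustration. It also assumes the councilor names have already been read from the page-1 TOC and that each page's text is available as a string.

```javascript
// Hypothetical sketch only; the real logic lives in scripts/build_amendments_plan.mjs.
// Assumed heading shape: "<councilor> NN – <title>" at the start of a line.
const HEADING = /^(.+?) (\d{2}) [–-] (.+)$/u;

// pages: array of page texts (index 0 is page 1).
// councilors: names taken from the page-1 TOC (assumed to be known here).
function buildPlan(pages, councilors) {
  const known = new Set(councilors);
  const byCouncilor = new Map(councilors.map((name) => [name, []]));

  pages.forEach((text, index) => {
    for (const line of text.split(/\r?\n/)) {
      const match = HEADING.exec(line.trim());
      // Skip body lines and any heading-like line whose name is not in the TOC.
      if (!match || !known.has(match[1])) continue;
      const [, councilor, number, title] = match;
      byCouncilor.get(councilor).push({ number, title: title.trim(), page: index + 1 });
    }
  });

  // One index per councilor, one leaf per amendment heading.
  return [...byCouncilor].map(([councilor, amendments]) => ({ councilor, amendments }));
}

// Invented sample input: the titles and the second councilor name are placeholders.
const samplePages = [
  "Table of contents",
  "Avalos 01 – Sample title A\nbody text\nAvalos 02 – Sample title B\nAvalos 03 – Sample title C",
  "Example 01 – Sample title D\nSee item 12 – not a heading",
];
const plan = buildPlan(samplePages, ["Avalos", "Example"]);
console.log(JSON.stringify(plan, null, 2));
```

Because several headings can match on one page, several leaves end up with the same `page` value, which is why their bodies and TL;DR inputs coincide.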

Regenerate the plan (needed only if the source PDF changes, or if you want to refine the parser):

node ../scripts/build_amendments_plan.mjs

Then run the normal pipeline:

./scripts/run_budget_2025_26_amendments.sh