tarmac (0.1.0)

Published 2026-05-10 23:34:41 +00:00 by peterswimm

Installation

pip install --index-url  tarmac

About this package

SBI-Aileron Phase 1: PDF extraction pilot + manual data entry facility

tarmac

Codename for the SBI–Aileron Phase 1 engagement (Toilville subcontractor scope).

Forge domain: sbi-aileron (manifest: ~/.spel/domains/sbi-aileron/forge-manifest.spel)
Status: scoping, pre-kickoff
Total Toilville fee: $13,000 (Phase 1)

Scope (Toilville-owned)

| SOW § | Workstream | Fee | Wish |
|-------|------------|-----|------|
| 1.2 | PDF Extraction Pilot: 41 payout memos, three pluggable extraction tracks | $11,500 | extraction-pilot |
| 1.3 | AI Compute Profiling: workload-driver inputs (Aileron leads) | | compute-profiling-inputs |
| 1.4 | Manual Data Entry Facility: PowerApp, ≤2 forms | $1,500 | data-entry-facility |
| 4.2 | Phase 1 technical documentation + production extraction recommendation | | tech-documentation |

Layout

extraction/             # PDF pipeline (Python)
  intake/               # PDF intake, format detection
  classify/             # document classification
  ocr/                  # Azure Document Intelligence wrappers (Track A, hybrid)
  extract/              # field extraction — three pluggable tracks behind common interface
  validate/             # Pydantic schema validation, confidence scoring
  quarantine/           # low-confidence routing, exception reporting
  output/               # Data Vault sink + JSON/CSV with provenance
schema/                 # Pydantic models — payout memo schema, Data Vault contracts
powerapp/               # exported PowerApp solution + integration glue
compute_profiling/      # workload drivers, calibration data, deliverable to Aileron
infra/bicep/            # Azure resource templates (RG, Cosmos, Storage, DI, OpenAI)
docs/                   # architecture, exception handling, production recommendation
tests/                  # pytest
_legacy/                # spec/plan/design-questions imported from azure-learning-dojo (delete after week 2)
.forgejo/workflows/     # CI (Forgejo Actions)
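
The validate/ and quarantine/ stages in the layout above imply a confidence-gated routing step. The following is a minimal sketch of that idea using only the standard library. It assumes a per-field confidence score in [0, 1] and a threshold of 0.85. The threshold, the `ExtractedField` type, the `route` function, and the sample field names are all hypothetical and not taken from the tarmac codebase, where Pydantic models would presumably stand in for the dataclass.

```python
from dataclasses import dataclass

# Hypothetical threshold; the real cut-off is not stated in this README.
CONFIDENCE_THRESHOLD = 0.85


@dataclass(frozen=True)
class ExtractedField:
    """One extracted value plus its confidence (assumed to lie in [0.0, 1.0])."""
    name: str
    value: str
    confidence: float


def route(fields: list[ExtractedField],
          threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return "output" if every field meets the threshold, else "quarantine"."""
    if not fields:
        # An empty extraction is treated as an exception case.
        return "quarantine"
    lowest = min(f.confidence for f in fields)
    return "output" if lowest >= threshold else "quarantine"
```

Routing on the lowest field confidence is one possible policy: a single weak field sends the whole memo to review. A per-field policy would be an equally reasonable alternative.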

Extraction tracks

The pilot evaluates three approaches in parallel; the production recommendation deliverable picks one.

  • Track A — Azure Document Intelligence (prebuilt + custom model)
  • Track B — Azure OpenAI vision + JSON mode over PDF page images
  • Track C — Hybrid: DI for OCR + OpenAI for field extraction

All three sit behind a common Extractor interface in extraction/extract/.
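
A common interface of this kind could be expressed as a `typing.Protocol`. The sketch below is illustrative only: the method name `extract`, its signature, the `track` attribute, the stub classes, and `run_all` are assumptions, not the actual contents of extraction/extract/. The stubs make no service calls.

```python
from typing import Protocol


class Extractor(Protocol):
    """Assumed shape of the shared interface: one method, one track label."""

    track: str

    def extract(self, pdf_bytes: bytes) -> dict[str, str]:
        ...


class StubTrackA:
    """Placeholder standing in for the Document Intelligence track."""

    track = "A"

    def extract(self, pdf_bytes: bytes) -> dict[str, str]:
        # A real implementation would call the service; this only reports input size.
        return {"track": self.track, "bytes_seen": str(len(pdf_bytes))}


class StubTrackB:
    """Placeholder standing in for the OpenAI vision track."""

    track = "B"

    def extract(self, pdf_bytes: bytes) -> dict[str, str]:
        return {"track": self.track, "bytes_seen": str(len(pdf_bytes))}


def run_all(extractors: list[Extractor],
            pdf_bytes: bytes) -> dict[str, dict[str, str]]:
    """Run every track on the same input so results can be compared side by side."""
    return {e.track: e.extract(pdf_bytes) for e in extractors}
```

Because `Protocol` relies on structural typing, each track only needs to provide the matching attribute and method. No shared base class is required, which keeps the tracks swappable.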

Domain isolation

sbi-aileron is a Z3 client-confidential forge domain. Strict rules:

  • No cross-domain reads or writes (DB schema, Vault path, Azure tenant all scoped)
  • No client documents, extracted fields, or embeddings leave the domain
  • Secrets only via Vault path secret/forge/sbi-aileron/* — never .env.local
  • DefaultAzureCredential / managed identity preferred; no long-lived keys in code
  • SOW source: _legacy/sow_phase_1.md (when added)
  • Domain manifest: ~/.spel/domains/sbi-aileron/forge-manifest.spel
  • Forgejo remote: git.toilville.dev/sbi-aileron/tarmac (private; org is the partnership workspace, transferable at engagement close)
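
The Vault path rule above can be checked mechanically before any secret read. The following is a minimal sketch, assuming secret paths are plain slash-separated strings. The helper `is_in_domain` and the example key names are hypothetical, and no Vault client is involved.

```python
from pathlib import PurePosixPath

# Prefix taken from the rule above; everything else here is illustrative.
DOMAIN_SECRET_ROOT = PurePosixPath("secret/forge/sbi-aileron")


def is_in_domain(secret_path: str) -> bool:
    """True only if secret_path lies strictly below the domain's Vault prefix."""
    path = PurePosixPath(secret_path)
    if ".." in path.parts:
        # Reject traversal segments rather than trying to resolve them.
        return False
    return path != DOMAIN_SECRET_ROOT and path.is_relative_to(DOMAIN_SECRET_ROOT)
```

Comparing path components rather than raw string prefixes means a sibling such as secret/forge/sbi-aileron-2/key is rejected even though it shares the leading characters.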

Requirements

Requires Python: >=3.13
Details
PyPI
Proprietary — Toilville LLC
133 KiB