Metadata-Version: 2.4
Name: tarmac
Version: 0.1.0
Summary: SBI-Aileron Phase 1: PDF extraction pilot + manual data entry facility
Author-email: Peter Swimm <peterswimm@gmail.com>
License: Proprietary — Toilville LLC
Requires-Python: >=3.13
Requires-Dist: azure-ai-documentintelligence>=1.0.2
Requires-Dist: azure-cosmos>=4.15
Requires-Dist: azure-identity>=1.25
Requires-Dist: azure-storage-blob>=12.28
Requires-Dist: openai>=2.36
Requires-Dist: pdfminer-six>=20260107
Requires-Dist: pillow>=12.2
Requires-Dist: pydantic>=2.13
Requires-Dist: pymupdf>=1.27
Requires-Dist: rich>=15.0
Requires-Dist: structlog>=25.5
Requires-Dist: tenacity>=9.1
Requires-Dist: typer>=0.25
Provides-Extra: dev
Requires-Dist: fastapi>=0.136; extra == 'dev'
Requires-Dist: httpx>=0.28; extra == 'dev'
Requires-Dist: jinja2>=3.1.6; extra == 'dev'
Requires-Dist: mypy>=2.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=1.3; extra == 'dev'
Requires-Dist: pytest-cov>=7.1; extra == 'dev'
Requires-Dist: pytest>=9.0; extra == 'dev'
Requires-Dist: python-multipart>=0.0.28; extra == 'dev'
Requires-Dist: ruff>=0.15; extra == 'dev'
Requires-Dist: types-pyyaml; extra == 'dev'
Requires-Dist: uvicorn>=0.46; extra == 'dev'
Provides-Extra: fallback
Requires-Dist: fastapi>=0.136; extra == 'fallback'
Requires-Dist: jinja2>=3.1.6; extra == 'fallback'
Requires-Dist: python-multipart>=0.0.28; extra == 'fallback'
Requires-Dist: uvicorn>=0.46; extra == 'fallback'
Description-Content-Type: text/markdown

# tarmac

`tarmac` is the codename for the SBI-Aileron Phase 1 engagement (Toilville subcontractor scope).

- **Forge domain**: `sbi-aileron` (manifest: `~/.spel/domains/sbi-aileron/forge-manifest.spel`)
- **Status**: scoping, pre-kickoff
- **Total Toilville fee**: $13,000 (Phase 1)

## Scope (Toilville-owned)

| SOW § | Workstream | Fee | Wish |
|-------|------------|-----|------|
| 1.2 | PDF Extraction Pilot — 41 payout memos, three pluggable extraction tracks | $11,500 | extraction-pilot |
| 1.3 | AI Compute Profiling — workload-driver inputs (Aileron leads) | — | compute-profiling-inputs |
| 1.4 | Manual Data Entry Facility — PowerApp, ≤2 forms | $1,500 | data-entry-facility |
| 4.2 | Phase 1 technical documentation + production extraction recommendation | — | tech-documentation |

## Layout

```
extraction/             # PDF pipeline (Python)
  intake/               # PDF intake, format detection
  classify/             # document classification
  ocr/                  # Azure Document Intelligence wrappers (Track A, hybrid)
  extract/              # field extraction — three pluggable tracks behind common interface
  validate/             # Pydantic schema validation, confidence scoring
  quarantine/           # low-confidence routing, exception reporting
  output/               # Data Vault sink + JSON/CSV with provenance
schema/                 # Pydantic models — payout memo schema, Data Vault contracts
powerapp/               # exported PowerApp solution + integration glue
compute_profiling/      # workload drivers, calibration data, deliverable to Aileron
infra/bicep/            # Azure resource templates (RG, Cosmos, Storage, DI, OpenAI)
docs/                   # architecture, exception handling, production recommendation
tests/                  # pytest
_legacy/                # spec/plan/design-questions imported from azure-learning-dojo (delete after week 2)
.forgejo/workflows/     # CI (Forgejo Actions)
```
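
The hand-off from `validate/` (confidence scoring) to `quarantine/` (low-confidence routing) can be pictured roughly as below. This is a minimal sketch, not the pipeline's actual code: the `ScoredField` type, the `route` function, the field names, and the `0.85` threshold are all hypothetical, and plain dataclasses stand in for the Pydantic models so the example has no dependencies.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the pilot's real threshold is not stated in this README.
CONFIDENCE_THRESHOLD = 0.85


@dataclass(frozen=True)
class ScoredField:
    """One extracted field with the confidence reported by the track that produced it."""

    name: str
    value: str
    confidence: float  # 0.0 to 1.0


def route(
    fields: list[ScoredField], threshold: float = CONFIDENCE_THRESHOLD
) -> tuple[str, list[str]]:
    """Return the destination ("output" or "quarantine") and the low-confidence field names."""
    low = [f.name for f in fields if f.confidence < threshold]
    return ("quarantine" if low else "output", low)
```

A document goes to quarantine as soon as any single field falls below the threshold; the returned list of field names is what an exception report would need to show a reviewer.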

## Extraction tracks

The pilot evaluates three approaches in parallel; the production extraction recommendation (SOW § 4.2) selects one of them.

- **Track A** — Azure Document Intelligence (prebuilt + custom model)
- **Track B** — Azure OpenAI vision + JSON mode over PDF page images
- **Track C** — Hybrid: DI for OCR + OpenAI for field extraction

All three sit behind a common `Extractor` interface in `extraction/extract/`.
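
One way such a common interface could look is a structural `Protocol` that each track satisfies. The sketch below is an assumption about its shape, not the code in `extraction/extract/`: the `extract` method signature, the `track` attribute, `ExtractionResult`, `StubExtractor`, and `run_tracks` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class ExtractionResult:
    """Hypothetical result type: extracted fields plus per-field confidence."""

    track: str
    fields: dict[str, str]
    confidence: dict[str, float] = field(default_factory=dict)


@runtime_checkable
class Extractor(Protocol):
    """Assumed shape of the common interface; the real one may differ."""

    track: str

    def extract(self, pdf_bytes: bytes) -> ExtractionResult: ...


class StubExtractor:
    """Placeholder track showing that any class with the right shape plugs in."""

    track = "stub"

    def extract(self, pdf_bytes: bytes) -> ExtractionResult:
        # A real track would call Document Intelligence and/or Azure OpenAI here.
        return ExtractionResult(
            track=self.track,
            fields={"byte_count": str(len(pdf_bytes))},
            confidence={"byte_count": 1.0},
        )


def run_tracks(
    extractors: list[Extractor], pdf_bytes: bytes
) -> dict[str, ExtractionResult]:
    """Run every track over the same document so results can be compared side by side."""
    return {e.track: e.extract(pdf_bytes) for e in extractors}
```

Because the protocol is structural, Tracks A, B, and C would not need a shared base class; each only has to expose `track` and `extract`, which keeps the comparison harness independent of any one Azure SDK.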

## Domain isolation

`sbi-aileron` is a **Z3 client-confidential** forge domain. Strict rules:

- No cross-domain reads or writes (DB schema, Vault path, and Azure tenant are all scoped to the domain)
- No client documents, extracted fields, or embeddings leave the domain
- Secrets are read only from Vault path `secret/forge/sbi-aileron/*`, never from `.env.local`
- `DefaultAzureCredential` / managed identity preferred; no long-lived keys in code
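
The Vault path rule lends itself to a small guard that code can call before any secret lookup. The following is a sketch under stated assumptions: only the `secret/forge/sbi-aileron` prefix comes from the rules above, while `is_domain_secret` and the example paths are hypothetical, and no real Vault client is involved.

```python
from pathlib import PurePosixPath

# Prefix taken from the rule above; everything else here is illustrative.
DOMAIN_SECRET_ROOT = PurePosixPath("secret/forge/sbi-aileron")


def is_domain_secret(path: str) -> bool:
    """Return True only if `path` sits strictly below the domain's Vault prefix."""
    candidate = PurePosixPath(path)
    if ".." in candidate.parts:
        return False  # reject traversal out of the domain prefix
    return DOMAIN_SECRET_ROOT in candidate.parents


# Example: a path inside the domain passes, a path in another domain does not.
print(is_domain_secret("secret/forge/sbi-aileron/cosmos/key"))
print(is_domain_secret("secret/forge/other-client/cosmos/key"))
```

Comparing path components rather than raw string prefixes avoids accepting a sibling such as `secret/forge/sbi-aileron-old/...`, which a plain `startswith` check would let through.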

## Key links

- SOW source: `_legacy/sow_phase_1.md` (when added)
- Domain manifest: `~/.spel/domains/sbi-aileron/forge-manifest.spel`
- Forgejo remote: `git.toilville.dev/sbi-aileron/tarmac` (private; org is the partnership workspace, transferable at engagement close)
