Object-Keyed, Hash-Pinned Ontology Vendoring Index
ADR-0002: Object-Keyed, Hash-Pinned Ontology Vendoring Index
Status
accepted
Context
Background and Problem Statement
The ontology corpus is served at https://mif-spec.dev/ontologies/ from a static
mirror generated by modeled-information-format/MIF’s
scripts/snapshot-ontology-version.py. Its machine-readable catalog,
index.json, is the entry point a consumer reads to discover and fetch an
ontology. The flagship consumer is the research harness
(modeled-information-format/research-harness-template), whose ADR-0012 vendors
domain ontologies on demand: scripts/fetch-ontology.sh reads the index,
resolves an ontology’s extends closure, downloads each layer, verifies its
sha256 against the index fail-closed, materializes it under
packs/ontologies/<id>/, and pins it in ontologies.lock.json;
scripts/check-ontology-lock.sh then proves no drift.
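The closure step in that pipeline can be sketched as a depth-first walk over the index. This is a minimal illustration in Python, not the logic of fetch-ontology.sh itself: the function name resolve_closure, the sample ids, the cycle check, and the bases-first ordering are all assumptions made for the example, and it presumes an index keyed by ontology id.

```python
def resolve_closure(index: dict, root: str) -> list[str]:
    """Return root plus everything it transitively extends, bases first."""
    ontologies = index["ontologies"]
    ordered: list[str] = []
    visiting: set[str] = set()

    def visit(oid: str) -> None:
        if oid in ordered:
            return  # already resolved through another path
        if oid in visiting:
            raise ValueError(f"extends cycle at {oid!r}")
        if oid not in ontologies:
            raise KeyError(f"ontology {oid!r} is not in the index")
        visiting.add(oid)
        for parent in ontologies[oid].get("extends", []):
            visit(parent)
        visiting.discard(oid)
        ordered.append(oid)

    visit(root)
    return ordered


# Hypothetical ids, for illustration only.
sample_index = {
    "ontologies": {
        "base": {"extends": []},
        "domain": {"extends": ["base"]},
        "specialty": {"extends": ["domain", "base"]},
    }
}
closure = resolve_closure(sample_index, "specialty")
```

Each id in the returned list would then be downloaded, verified, and pinned in turn.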
For that contract to hold, the index must answer three questions per ontology, by
key: what file to fetch, what its sha256 must be, and what it extends. The
corpus authoring repo’s own scripts/gen-ontology-index.sh already emits exactly
that shape — an object keyed by id, each entry {version, file, sha256, extends[]}.
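A minimal sketch of that shape and the three key lookups, in Python. The ontology ids, file names, version strings, and digests below are placeholders invented for illustration; the top-level ontologies key is assumed from the consumer's index.ontologies[id] access.

```python
import json

# Placeholder data: ids, versions, file names, and digests are illustrative.
index_text = json.dumps({
    "ontologies": {
        "example-base": {
            "version": "1.0.0",
            "file": "example-base.ontology.yaml",
            "sha256": "0" * 64,
            "extends": [],
        },
        "example-domain": {
            "version": "0.2.0",
            "file": "example-domain.ontology.yaml",
            "sha256": "1" * 64,
            "extends": ["example-base"],
        },
    }
})

index = json.loads(index_text)
entry = index["ontologies"]["example-domain"]  # key lookup, no scan

fetch_file = entry["file"]         # what file to fetch
expected_sha256 = entry["sha256"]  # what its sha256 must be
parents = entry["extends"]         # what it extends
```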
Current Limitations
The served index.json does not match that shape. The MIF snapshot generator emits a
discovery-oriented array of {id, version, canonical, yaml, versioned} with
no integrity hash and no extends. A consumer that does index.ontologies[id]
indexes an array with a string and fails; even corrected to a scan, there is no
sha256 to verify against, so the fail-closed fetch cannot complete at all. The
two index designs — the array the mirror serves and the object the harness and the
authoring generator expect — were built separately and never reconciled. The
result: on-demand vendoring cannot be adopted, which blocks the harness epic’s
children #222 (a present, gate-clean lock) and #224 (flipping bundled packs to an
on-demand cache).
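The failure mode is easy to reproduce. The sketch below models the served array in Python with one placeholder record (the id and URLs are invented) and shows both problems: the string lookup raises, and the scan fallback finds a record with nothing to verify against. The harness fetcher is a shell script, so this is an analogue of its behavior rather than its code.

```python
# Placeholder record in the served array shape; values are illustrative.
served_index = {
    "ontologies": [
        {
            "id": "example-domain",
            "version": "0.2.0",
            "canonical": "https://example.invalid/example-domain",
            "yaml": "https://example.invalid/example-domain.ontology.yaml",
            "versioned": "https://example.invalid/example-domain/0.2.0",
        }
    ]
}

# 1. Key lookup: a list cannot be indexed with a string.
try:
    served_index["ontologies"]["example-domain"]
    lookup_error = None
except TypeError as exc:
    lookup_error = exc

# 2. Scan fallback: the record is found, but carries no integrity fields.
record = next(r for r in served_index["ontologies"] if r["id"] == "example-domain")
missing = [field for field in ("sha256", "extends") if field not in record]
```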
Decision Drivers
Primary Decision Drivers
- When a consumer reads the index for an ontology id, the consumer shall obtain that ontology’s fetch file, its sha256, and its extends list by key lookup, with no scan and no external table.
- When a consumer fetches an ontology layer, the consumer shall verify the downloaded bytes against an index-supplied sha256 and refuse on mismatch (fail-closed); the index shall therefore carry a per-entry integrity hash.
- The served index and the authoring repo’s gen-ontology-index.sh output shall be the same shape, so the corpus has one index contract, not two.
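The fail-closed requirement in the second driver can be sketched as follows. This is an illustrative Python helper, not the harness's implementation: the names IntegrityError and verify_fail_closed are hypothetical, and treating a missing hash as a refusal is an assumption about what fail-closed should mean here.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when downloaded bytes cannot be verified against the index."""


def verify_fail_closed(data: bytes, expected_sha256: str) -> bytes:
    """Return data only if its sha256 matches the index; otherwise refuse."""
    if not expected_sha256:
        # No hash in the index means nothing to verify against: refuse.
        raise IntegrityError("index entry has no sha256")
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.lower():
        raise IntegrityError(
            f"sha256 mismatch: expected {expected_sha256}, got {actual}"
        )
    return data


payload = b"id: example\n"  # stand-in for downloaded ontology bytes
good_hash = hashlib.sha256(payload).hexdigest()
verified = verify_fail_closed(payload, good_hash)
```

Any caller that only writes the returned bytes to disk cannot materialize an unverified layer, which is the property the driver asks for.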
Secondary Decision Drivers
- When a person or a discovery tool reads the index, the entry shall still expose the canonical, yaml, and versioned URLs it exposes today.
- The change shall keep the human index.html catalog and the snapshot --check gate working.
Considered Options
Option 1: Keep the served array index, no integrity hash
Leave snapshot-ontology-version.py emitting the array of {id, version, canonical, yaml, versioned} records.
- Pro: No generator change; the discovery site and --check gate are untouched.
- Con: The fail-closed fetch contract is unsatisfiable: no key lookup, no sha256, no extends. On-demand vendoring stays blocked indefinitely.
Risk Assessment
- Technical: High. The published contract cannot serve its only machine consumer; the harness fetcher cannot run against it.
- Schedule: Blocks the dependent harness epic (#222, #224) with no path forward.
- Ecosystem: Two divergent index shapes persist across three repos.
Option 2: Object-keyed, hash-pinned index, discovery fields preserved (chosen)
Change snapshot-ontology-version.py so index.ontologies is an object keyed
by id, each entry {version, file, sha256, extends[], canonical, yaml, versioned}. The file/sha256/extends fields satisfy the fetch contract (and
match gen-ontology-index.sh); the canonical/yaml/versioned fields preserve
discovery. The index.html builder iterates the object’s values instead of the
array.
- Pro: One index contract across the mirror, the authoring generator, and the harness fetcher. Fail-closed integrity is satisfiable by key lookup. Discovery URLs are retained, so no consumer loses information.
- Con: It is a breaking change to a published catalog contract: any external reader of the array shape must update, and the MIF --check gate plus index.html builder must change in lockstep.
Risk Assessment
- Technical: Low-medium. The shape is already proven by gen-ontology-index.sh; the work is porting it into the MIF snapshot generator and updating the two in-repo consumers (HTML builder, --check).
- Schedule: Medium. Requires a coordinated MIF change + redeploy before the harness can flip; the redeploy is async (Pages CI).
- Ecosystem: Medium. Breaking the published array shape affects any unknown external reader; mitigated by the corpus being pre-1.0 and the array index having no known integrity-bearing consumer.
Option 3: Keep the array index; weaken the consumer to trust-on-first-use
Leave the served index as an array and change the harness fetcher to scan it and pin the sha256 it computes on first download (TOFU), dropping the index cross-check.
- Pro: No MIF generator change; smallest surface.
- Con: Downgrades the supply-chain posture from fail-closed index-cross-checked integrity to trust-on-first-use, removing the defense against a compromised mirror/CDN. The harness constitution marks fail-closed supply chain non-negotiable.
Risk Assessment
- Technical: Low to build.
- Schedule: Fast.
- Ecosystem: Unacceptable posture downgrade on a published registry; pushes the integrity burden onto every consumer and weakens the guarantee for all of them.
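The posture difference can be shown in a few lines. In this Python sketch (function names and byte strings are invented for illustration) a mirror serves tampered bytes on the very first download: the trust-on-first-use path pins the tampered hash, while the index cross-check rejects it. The sketch assumes the index hash was produced from the genuine bytes and reaches the consumer intact.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def pin_tofu(downloaded: bytes) -> str:
    """Trust-on-first-use: pin whatever hash the first download produces."""
    return sha256_hex(downloaded)


def pin_cross_checked(downloaded: bytes, index_sha256: str) -> str:
    """Fail-closed: pin only if the download matches the index hash."""
    actual = sha256_hex(downloaded)
    if actual != index_sha256:
        raise ValueError("sha256 mismatch against index; refusing to pin")
    return actual


genuine = b"id: example\n"
tampered = b"id: example\nmalicious: true\n"
index_sha256 = sha256_hex(genuine)  # assumed to come from an intact index

tofu_pin = pin_tofu(tampered)  # silently pins the tampered bytes
try:
    pin_cross_checked(tampered, index_sha256)
    rejected = False
except ValueError:
    rejected = True
```

Under TOFU, later drift checks would then defend the tampered pin rather than detect it.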
Decision
Adopt Option 2. The served https://mif-spec.dev/ontologies/index.json
becomes an object keyed by ontology id, each entry carrying the fetch fields
{version, file, sha256, extends[]} alongside the existing discovery fields
{canonical, yaml, versioned}. snapshot-ontology-version.py is changed to emit
this shape and to compute each entry’s sha256 over the served *.ontology.yaml
(the bytes the fetcher downloads and pins); its index.html builder iterates the
object’s values; the snapshot --check gate validates the new shape. The shape
matches the authoring repo’s gen-ontology-index.sh, so the corpus carries one
index contract end to end, and the harness’s fail-closed fetch-ontology.sh /
check-ontology-lock.sh work unchanged against it.
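As a rough sketch of the generator change (not the actual snapshot-ontology-version.py code): given per-ontology records and the served YAML bytes, build the keyed entries and hash exactly the bytes that will be served. The function name build_index, the records and served_bytes inputs, and the sample values are assumptions for illustration.

```python
import hashlib


def build_index(records: list[dict], served_bytes: dict[str, bytes]) -> dict:
    """Build the object-keyed index from per-ontology records.

    Each record is assumed to carry id, version, file, extends, and the
    existing discovery URLs; served_bytes maps id -> served YAML bytes.
    """
    ontologies = {}
    for rec in records:
        oid = rec["id"]
        if oid in ontologies:
            raise ValueError(f"duplicate ontology id {oid!r}")
        ontologies[oid] = {
            "version": rec["version"],
            "file": rec["file"],
            # Hash the exact bytes the fetcher will download and pin.
            "sha256": hashlib.sha256(served_bytes[oid]).hexdigest(),
            "extends": list(rec.get("extends", [])),
            "canonical": rec["canonical"],
            "yaml": rec["yaml"],
            "versioned": rec["versioned"],
        }
    return {"ontologies": ontologies}


# Placeholder record and bytes, for illustration only.
records = [{
    "id": "example-domain",
    "version": "0.2.0",
    "file": "example-domain.ontology.yaml",
    "extends": ["example-base"],
    "canonical": "https://example.invalid/example-domain",
    "yaml": "https://example.invalid/example-domain.ontology.yaml",
    "versioned": "https://example.invalid/example-domain/0.2.0",
}]
served = {"example-domain": b"id: example-domain\n"}
index = build_index(records, served)

# An HTML builder would iterate the object's values instead of an array.
catalog_rows = [(oid, e["version"], e["canonical"]) for oid, e in index["ontologies"].items()]
```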
Consequences
Positive
- The fail-closed on-demand vendoring contract becomes satisfiable; the harness epic’s #222 (present, gate-clean lock) and #224 (on-demand cache flip) unblock.
- One index shape across the mirror, the authoring generator, and every consumer, with no second design to keep in sync.
- Per-entry sha256 gives mirror/CDN-tamper detection at fetch time, not just post-vendor drift detection.
Negative
- It breaks the published array contract: any external reader of the current shape must update, and the change must land with the MIF --check gate and index.html builder in one move or the build fails.
- Completion depends on an async public redeploy of mif-spec.dev; the harness flip cannot be verified end to end until that deploy is live.
Neutral
- The discovery fields (canonical, yaml, versioned) are unchanged; only the container shape changes and the fetch fields are added.
- Per-ontology version semantics are unchanged: each entry still reports the ontology’s own version, independent of the corpus release versions.
Decision Outcome
The decision meets its drivers. A consumer resolves fetch file, sha256, and
extends by key lookup (primary driver one); the per-entry sha256 lets the
fetcher verify downloaded bytes fail-closed (primary driver two); and the shape is
the one gen-ontology-index.sh already emits, so the corpus has a single index
contract (primary driver three). Discovery URLs are retained (secondary driver
one), and the generator change updates the index.html builder and --check
gate together (secondary driver two).
The residual cost — a breaking change to a published contract gated on an async
redeploy — is mitigated by the corpus being pre-1.0, by the array index having no
known integrity-bearing consumer, and by sequencing the rollout: land and deploy
the MIF index change first, verify the live index.json, and only then flip the
harness to fetch on demand.
Related Decisions
- ADR-0001: Underscore-Prefixed Base-Type Namespaces (the prior corpus decision).
- research-harness-template ADR-0012 (on-demand vendoring): the consumer-side decision this index contract serves.
- modeled-information-format/MIF scripts/snapshot-ontology-version.py: the served-mirror + index.json generator changed by this decision.
- scripts/gen-ontology-index.sh: this repo’s generator, already emitting the object+sha256+extends shape this decision standardizes on.
- research-harness-template scripts/fetch-ontology.sh / scripts/check-ontology-lock.sh: the fail-closed consumer.
More Information
The contract is auditable in three places that must agree: the served
mif-spec.dev/ontologies/index.json shape, this repo’s gen-ontology-index.sh
output, and the harness fetcher’s reader. Any one diverging from the others is the
signal that this decision has been violated.
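One way to make that audit mechanical is a small shape check run against any of the three artifacts. The Python sketch below is illustrative only: the function name shape_problems is hypothetical, and two of its rules (a sha256 of 64 lowercase hex characters, and every extends target present in the same index) are assumptions rather than stated requirements.

```python
import re

REQUIRED_FIELDS = ("version", "file", "sha256", "extends", "canonical", "yaml", "versioned")
SHA256_RE = re.compile(r"[0-9a-f]{64}")  # assumed encoding: lowercase hex


def shape_problems(index: dict) -> list[str]:
    """Return human-readable shape violations; an empty list means OK."""
    ontologies = index.get("ontologies")
    if not isinstance(ontologies, dict):
        return ["ontologies must be an object keyed by id"]
    problems = []
    for oid, entry in ontologies.items():
        if not isinstance(entry, dict):
            problems.append(f"{oid}: entry must be an object")
            continue
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"{oid}: missing {field}")
        if not SHA256_RE.fullmatch(str(entry.get("sha256", ""))):
            problems.append(f"{oid}: sha256 is not 64 lowercase hex characters")
        for parent in entry.get("extends", []):
            if parent not in ontologies:  # assumed rule: closure stays in-index
                problems.append(f"{oid}: extends unknown id {parent}")
    return problems


array_shape = {"ontologies": [{"id": "example"}]}
array_report = shape_problems(array_shape)
```

Run against the current served array, a check like this reports a single container-level violation; run against a conforming object index, it reports none.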
- 2026-06-30: Pending. Proposed. The served index.json is still the array shape; the MIF generator change and redeploy, and the harness flip, have not yet landed. Flip to Compliant once mif-spec.dev/ontologies/index.json serves the object+sha256 shape and the harness vendors against it fail-closed.