Dual-Format Design (Markdown + JSON-LD)
ADR-002: Dual-Format Design (Markdown + JSON-LD)
Section titled “ADR-002: Dual-Format Design (Markdown + JSON-LD)”Status
Section titled “Status”Accepted
Refined by ADR-011 (Markdown is canonical; JSON-LD is a derived projection).
The dual-format decision below stands: MIF retains two representations of a
memory, Markdown and JSON-LD. What ADR-011 refined is the relationship
between them. The original ADR framed the two formats as “semantically
equivalent” and “bidirectionally convertible” co-equals. ADR-011 makes the
Markdown .md concept file the canonical source of truth and the JSON-LD
.jsonld document a derived projection reproducible from it (SPECIFICATION.md
Invariant 2). The original co-equality reasoning is preserved below and
annotated inline where it appears; see the ## Amendment section for the
refinement.
Context
Section titled “Context”Background and Problem Statement
Section titled “Background and Problem Statement”MIF needs to serve audiences with opposed ergonomic needs from one memory model:
- Human authoring and reading of memories.
- Machine processing and validation.
- Semantic-web integration for linked data.
- Existing knowledge-management workflows (notably Obsidian vaults).
No single serialization satisfies all four well. A human-friendly format (Markdown) lacks the structure machines want; a machine-friendly format (JSON-LD, RDF) is hostile to hand-authoring. The architectural question is which representation(s) MIF should adopt as first-class, and how they relate.
Current Limitations
Section titled “Current Limitations”- A single human-readable format gives no validatable, machine-addressable structure and no semantic-web identity.
- A single machine format makes authoring and review painful and breaks the Obsidian-compatibility goal (ADR-003).
- Supporting two formats without a defined relationship risks silent divergence: two representations of “the same” memory that disagree, with no rule for which one wins. (This is exactly the gap ADR-011 later closed.)
Decision Drivers
Section titled “Decision Drivers”Primary Decision Drivers
Section titled “Primary Decision Drivers”- Human authorability: Memories must be writable and readable by people in a familiar, low-friction format.
- Machine processability: Memories must be structured, validatable, and addressable for tooling and semantic-web consumers.
- Ecosystem fit: The human format must work in existing knowledge-management tools (Obsidian, VS Code, plain text editors).
Secondary Decision Drivers
Section titled “Secondary Decision Drivers”- Progressive disclosure: Simple for basic use, rich for advanced use — one shouldn’t pay the cost of the other.
- Convertibility: The two representations must be mechanically derivable from each other so they don’t drift by hand.
Considered Options
Section titled “Considered Options”Option 1: JSON-only
Section titled “Option 1: JSON-only”Description: Represent every memory solely as JSON (later JSON-LD).
Advantages:
- Fully structured and machine-processable.
- Trivial to validate and address programmatically.
Disadvantages:
- Poor human readability; awkward to edit by hand.
- Breaks the Obsidian / plain-text authoring workflow entirely.
Risk Assessment:
- Technical Risk: Low.
- Ecosystem Risk: High. Abandons the human-authoring and Obsidian goals that motivate MIF.
Option 2: Markdown-only
Section titled “Option 2: Markdown-only”Description: Represent every memory solely as Markdown with YAML frontmatter.
Advantages:
- Excellent human authoring and reading; native Obsidian fit.
Disadvantages:
- Limited semantic structure; no JSON-Schema validation surface.
- No semantic-web / linked-data identity.
Risk Assessment:
- Technical Risk: Low.
- Ecosystem Risk: Medium. Closes the door on machine/semantic-web consumers.
Option 3: YAML
Section titled “Option 3: YAML”Description: Use YAML documents as the single memory format.
Advantages:
- Reasonable balance of human and machine legibility.
Disadvantages:
- No semantic-web support (no
@context, no linked-data identity). - Body content (prose) sits awkwardly inside a data-oriented format.
Risk Assessment:
- Technical Risk: Low.
- Ecosystem Risk: Medium. Lacks the linked-data integration MIF targets.
Option 4: RDF / Turtle
Section titled “Option 4: RDF / Turtle”Description: Represent memories in RDF, serialized as Turtle.
Advantages:
- Excellent semantics; first-class linked-data graph.
Disadvantages:
- Poor human authoring; steep learning curve.
- Incompatible with the Obsidian / plain-text workflow.
Risk Assessment:
- Technical Risk: Medium.
- Ecosystem Risk: High. Sacrifices human authorability, a primary driver.
Option 5: Dual format — Markdown + JSON-LD (chosen)
Section titled “Option 5: Dual format — Markdown + JSON-LD (chosen)”Description: Support both Markdown (.md) and JSON-LD (.jsonld) as
first-class representations of a memory. Markdown serves human authoring and
Obsidian compatibility; JSON-LD serves machine processing and semantic-web
integration. The two are mechanically convertible.
Technical Characteristics:
- YAML frontmatter in the Markdown file maps directly to JSON-LD properties.
- The JSON-LD
@contextprovides the RDF vocabulary mapping. - The Markdown body is carried in the JSON-LD
contentfield.
Advantages:
- Human authors write Markdown; machines consume JSON-LD — both first-class.
- Semantic-web compatibility via the JSON-LD
@context. - Works with existing tools (Obsidian, VS Code).
- Progressive disclosure: simple for basic use, rich for advanced.
Disadvantages:
- Dual-representation complexity.
- A relationship between the two must be defined and enforced or they drift.
(The original ADR assumed symmetric “format parity”; ADR-011 replaced that
with an asymmetric canonical-source / derived-projection rule — see
## Amendment.) - Conversion tooling is required (
scripts/mif_convert.py).
Risk Assessment:
- Technical Risk: Low to Medium. Two serializations to keep in sync.
- Schedule Risk: Low.
- Ecosystem Risk: Low. Satisfies both human and machine consumers.
Decision
Section titled “Decision”Support both Markdown and JSON-LD as first-class formats for a MIF memory.
Markdown Format
Section titled “Markdown Format”---id: uuidtype: semanticnamespace: _semantic/knowledgecreated: 2026-01-27T10:00:00Z---
# Memory Title
Memory content in readable Markdown...JSON-LD Format
Section titled “JSON-LD Format”Historical snapshot (original 2026-01-27 decision). The JSON-LD object below is the example exactly as the original ADR recorded it. It is not the current emitted schema: the v1.0 converter emits
conceptType(notmemoryType) and an@idof the formurn:mif:{id}(nomemory:segment), and the canonical source is now the Markdown file (ADR-011). It is retained here as a record of the original decision, not as a current projection example.
{ "@context": "https://mif-spec.dev/schema/context.jsonld", "@type": "Memory", "@id": "urn:mif:memory:uuid", "memoryType": "semantic", "namespace": "_semantic/knowledge", "created": "2026-01-27T10:00:00Z", "content": "Memory content..."}Both formats are semantically equivalent and bidirectionally convertible.
(Original co-equality framing. Refined by ADR-011: Markdown is the
canonical source of truth and JSON-LD is a derived projection reproducible from
it — the conversion is lossless on a markdown → json-ld → markdown round trip
for conformance-level data, but the two are no longer treated as co-equal peers.
See ## Amendment.)
Consequences
Section titled “Consequences”Positive
Section titled “Positive”- Human authoring: Authors write Markdown — familiar, readable, reviewable.
- Machine processing: Machines consume structured, validatable JSON-LD.
- Semantic-web compatibility: The JSON-LD
@contextgives memories a linked-data identity. - Tool support: Existing tools (Obsidian, VS Code) work unchanged.
- Progressive disclosure: Simple for basic use, rich for advanced.
Negative
Section titled “Negative”- Dual-representation complexity: Two serializations exist for one memory.
- Must maintain format parity: The two representations must stay in sync. (Original framing. Refined by ADR-011: the obligation is not symmetric “parity” but a one-way derivation — JSON-LD is regenerated from canonical Markdown, so there is one source to maintain and a reproducible projection, not two peers to reconcile.)
- Conversion tooling required:
scripts/mif_convert.pyperforms and round-trip-verifies the conversion.
Neutral
Section titled “Neutral”- Two artifact classes by design:
.mdconcept files and.jsonldprojections coexist; after ADR-011 their roles are asymmetric (source vs. derived) rather than equivalent.
Decision Outcome
Section titled “Decision Outcome”The dual-format approach achieves the primary drivers: human authorability (Markdown / Obsidian), machine processability (JSON-LD), and ecosystem fit. The one gap in the original decision — an undefined relationship between the two formats — was closed by ADR-011, which designates Markdown canonical and JSON-LD derived. Mitigations:
- The conversion is implemented and round-trip-verified by
scripts/mif_convert.py(roundtripandemit-jsonldsubcommands). - The canonical/derived rule (ADR-011, SPECIFICATION.md Invariant 2) removes the silent-divergence risk inherent in treating the two formats as co-equal.
Related Decisions
Section titled “Related Decisions”- ADR-003: Obsidian Compatibility — the Markdown format is designed to be valid Obsidian notes, which is the constraint that makes Markdown the natural human-authoring surface here.
- ADR-007: GitHub Raw URLs for Schema IDs — the JSON-LD projection requires resolvable schema
$id/@contextURIs, the scheme that ADR establishes. - ADR-011: Markdown-Canonical with Derived JSON-LD — refines this decision: Markdown is the canonical source of truth and JSON-LD is a derived projection, replacing the original co-equality framing.
- JSON-LD 1.1 — the linked-data serialization MIF projects to.
- Obsidian — the knowledge-management tool the Markdown format targets (ADR-003).
More Information
Section titled “More Information”- Date: 2026-01-27 (original); refined by ADR-011
- Source: SPECIFICATION.md §2.1 “Dual Representation”;
scripts/mif_convert.py(markdown ↔ JSON-LD converter). - Related ADRs: ADR-003, ADR-007, ADR-011
Amendment
Section titled “Amendment”Refined by ADR-011 — Markdown canonical, JSON-LD derived
Section titled “Refined by ADR-011 — Markdown canonical, JSON-LD derived”The original decision (2026-01-27) framed Markdown and JSON-LD as co-equal representations: “semantically equivalent,” with “format parity” to be maintained between two peers. ADR-011 refined this without discarding the dual-format decision itself.
Under the refined model:
- The Markdown
.mdconcept file is the canonical source of truth (SPECIFICATION.md Invariant 2). - The JSON-LD
.jsonlddocument is a derived projection, reproducible by running the converter over the canonical Markdown. - The relationship is a one-way derivation that is lossless on a
markdown → json-ld → markdownround trip for conformance-level data, not a symmetric parity obligation between two independently-authored artifacts.
Rationale for the refinement: Treating two serializations as co-equal leaves
no rule for which wins when they disagree, inviting silent divergence. Naming
Markdown canonical and JSON-LD derived gives a single editable source, makes the
projection reproducible and CI-verifiable, and keeps the .jsonld artifact
outside OKF’s *.md legibility surface (ADR-009) without sacrificing the
machine/semantic-web consumers that motivated the dual-format decision. The
dual-format decision is therefore intact; only the formats’ relationship
changed from peer-equivalence to source-and-projection.
2026-06-18
Section titled “2026-06-18”Status: Compliant
Findings:
| Finding | Files | Lines | Assessment |
|---|---|---|---|
Dual representation (.md human-readable, .jsonld machine-processable) defined as first-class | SPECIFICATION.md | L103-L108 | compliant |
| ”Both representations MUST be losslessly convertible to each other” (the convertibility claim this ADR makes) | SPECIFICATION.md | L110 | compliant |
| Refinement: Markdown is the source of truth (Invariant 2) and JSON-LD is a derived projection — basis for the ADR-011 amendment | scripts/mif_convert.py | L2-L9 | compliant |
Converter implements to-jsonld / to-markdown and lossless roundtrip over bundles | scripts/mif_convert.py | L19-L22, L286-L295 | compliant |
Derived-projection emitter (emit-jsonld) regenerates .jsonld from canonical .md | scripts/mif_convert.py | L267-L279 | compliant |
Summary: The dual-representation design and the lossless-convertibility requirement are both stated normatively in SPECIFICATION.md §2.1. The canonical-source / derived-projection refinement attributed to ADR-011 is directly evidenced by the converter’s own module docstring (“.md concept file is the source of truth (Invariant 2)”; “JSON-LD is a derived projection”), and the converter implements both the round-trip verification and the derived-JSON-LD emitter described in this ADR.
Action Required: None.