When the AI replies "based on internal policy section X," can you later prove that the citation came from the exact version of the policy that was live at the time? Are vector DBs being rebuilt, embeddings drifting, and citation-to-source mappings quietly slipping?
Cited ≠ verified. Do you have a path that cryptographically traces the basis behind an AI's answer back to the source?
This page is for
- Product leads and legal heads at legal-tech firms (case research, contract analysis AI)
- Teams running enterprise knowledge platforms where citation integrity in AI answers has become an operational issue
- Financial compliance functions generating regulatory reports and audit trails through AI
- AI governance leads preparing for the EU AI Act / ISO 42001 by setting up citation-source verification
- Engineering leads who see "same question, different citation" and "post-rebuild citation drift" as an operational risk
How Lemma approaches it
For every source the AI cites in an answer, Lemma generates a ZK proof binding the citation to the docHash of the precise document version it references. A citation stops being a label and becomes a cryptographically bound reference. Source documents themselves never enter the index or the answer; what crosses to the verifier is only the cryptographic fact "this citation traces to paragraph B of policy v3, which was live at answer time."
When the vector DB is rebuilt and the policy is revised, the citation proofs attached to past AI answers remain intact. There's no reconstructing "what was that answer based on" after the fact — you reference the citation proof directly.
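The binding described above can be illustrated with a minimal sketch. This is not Lemma's actual scheme (which uses ZK proofs rather than bare hashes, and whose field names are not public here); it only shows the core idea that a citation carries a hash of the exact cited content, so any later revision is detectable without storing the source text in the record. All names below are hypothetical.

```python
import hashlib

def doc_hash(section_text: str) -> str:
    """Hash the exact bytes of the cited section. Illustrative stand-in
    for a docHash; the real system would bind this inside a ZK proof."""
    return hashlib.sha256(section_text.encode("utf-8")).hexdigest()

def bind_citation(citation_label: str, section_text: str) -> dict:
    """Attach a content hash to a citation at answer time.
    Note: the source text itself is NOT stored in the record."""
    return {"citation": citation_label, "docHash": doc_hash(section_text)}

def verify_citation(record: dict, candidate_text: str) -> bool:
    """Check whether a candidate section matches the version that was cited."""
    return record["docHash"] == doc_hash(candidate_text)

# At answer time: the AI cites policy v3, paragraph B.
v3_text = "Employees must escalate conflicts of interest within 5 days."
record = bind_citation("policy v3 / paragraph B", v3_text)

# Later: the policy is revised and the index rebuilt.
v4_text = "Employees must escalate conflicts of interest within 3 days."

print(verify_citation(record, v3_text))  # True: matches the cited version
print(verify_citation(record, v4_text))  # False: revision detected
```

The key property carries over to the full design: the record attached to a past answer stays meaningful even after the vector DB is rebuilt, because verification depends only on content bytes, not on index state.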
Exactly where the citation-proof layer sits between your answer generation and your citation emission is what we map out in a first conversation.
Lemma Discovery Call — Start with a 30-minute conversation
Tell us how your AI workflow is wired today, what citations get attached to answers, and where citation integrity is hurting most. We'll explore together whether Lemma's citation-proof layer could fit. No answer logs or source documents required.
If we see a fit, we move to NDA and then into sector-specific citation requirement design, reference architecture, and PoC design.
A real-world example: a legal AI citing a pre-revision policy
An internal AI at a legal-tech firm reviews contracts and replies, "Clause 4.2 aligns with internal guideline GL-2025-08, section C." Six months later, after a regulator's reinterpretation, the internal guideline is revised to GL-2026-02. The RAG index has been rebuilt since.
A problem surfaces on one of those contracts, and the team needs to re-examine the past reviews. Which version of which guideline did the AI rely on, in which paragraph? The answer log says "GL-2025-08, section C," but the vector DB has been rebuilt and embeddings have shifted. There is no cryptographic evidence that the paragraph quoted at the time and the same-named paragraph in the current index are the same content.
With Lemma in place, every citation in an AI answer ships with a docHash proof. Any difference between section C of the guideline as it was cited and section C in the current revision is structurally detectable. There is no need to dig through answer logs; you reference the citation proof to cryptographically establish that the answer relied on this exact docHash of section C in GL-2025-08.
Sector-specific citation requirement design, integration patterns with RAG and retrieval frameworks (LangChain, LlamaIndex, etc.), and evidence-trail design for EU AI Act / ISO 42001 are shared in the sector-specific kit we send after the consultation call.
Architecture in concept
Lemma does not replace your AI answer generation (LLM, prompts, RAG retrieval). We add a citation-attestation step between answer generation and citation emission.
When the LLM produces an answer carrying citations, Lemma generates a ZK proof that bundles each citation's docHash, the answer timestamp, and the RAG index version in use. The source documents themselves do not appear in the citation or in the proof. Downstream audit and compliance checks simply reference the citation proof chain to cryptographically reproduce the original mapping between citation and source content.
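To make the bundling step concrete, here is a hedged sketch of what an attestation record could look like. The envelope hash stands in for the ZK proof, and every field name (`citations`, `indexVersion`, `llm`, `envelopeHash`, etc.) is an assumption for illustration, not Lemma's schema:

```python
import hashlib
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def attest_answer(citations, index_version, llm_id, timestamp):
    """Bundle each citation's docHash with the answer timestamp, the RAG
    index version in use, and the LLM identifier into one record.
    Only hashes of the cited sections appear; the source text does not."""
    body = {
        "citations": [
            {"label": label, "docHash": sha256_hex(text.encode("utf-8"))}
            for label, text in citations
        ],
        "indexVersion": index_version,
        "llm": llm_id,
        "timestamp": timestamp,
    }
    # Canonical serialization so any verifier recomputes the same hash.
    envelope = sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))
    return {"body": body, "envelopeHash": envelope}

record = attest_answer(
    citations=[("GL-2025-08 / section C", "Liability caps apply to ...")],
    index_version="rag-index-2025-08-01",
    llm_id="model-x",
    timestamp="2025-08-14T10:22:00Z",
)

# Downstream audit: recompute the envelope hash from the body alone.
recomputed = sha256_hex(json.dumps(record["body"], sort_keys=True).encode("utf-8"))
assert recomputed == record["envelopeHash"]
```

A real ZK proof adds what this sketch cannot: a verifier can check the binding without ever seeing the section text or its plain hash preimage. The sketch only shows which facts get bundled together at attestation time.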
Integration patterns with LLM and RAG frameworks (Anthropic Claude, OpenAI, LangChain, LlamaIndex, etc.), citation-format design (footnotes, sidenotes, JSON metadata), and evidence-trail design for EU AI Act / ISO 42001 are detailed in the whitepaper and the post-call technical kit.
What Lemma cryptographically guarantees
- A cryptographic binding between every citation and the docHash of the referenced document version
- A tamper-evident record of the answer timestamp, the RAG index version in use, and the LLM identifier
- No exposure of source data, with the citation proof chain verifiable by third parties
- The cryptographic identity of past citations, unchanged across document revisions and index rebuilds
Ready to prove?
Talk to us about your use case. We respond within one business day.