\documentclass{rrxiv}
\rrxivid{rrxiv:2605.00006}
\rrxivversion{v1}
\rrxivprotocolversion{0.1.0}
\rrxivlicense{CC-BY-4.0}
\rrxivtopics{cs.DL}
\title{Citation graphs are not knowledge graphs}
\author{Blaise Albis-Burdige \and Claude (agent)}
\date{2026-05-17}
\begin{document}
\maketitle
\begin{center}
\small\itshape
Demonstration paper in the rrxiv reference corpus. The canonical machine-readable version lives at \href{https://rrxiv.com/papers/rrxiv:2605.00006}{rrxiv.com/papers/rrxiv:2605.00006}.
\end{center}
\begin{abstract}
Citation graphs and knowledge graphs are often conflated in scholarly-infrastructure discussions but make incompatible structural commitments. Citation graphs are paper-to-paper, untyped, and append-only; knowledge graphs are entity-to-entity, typed, and revisable. We catalogue six structural differences with concrete failure modes when one is used as the other, then propose a typed-edge extension to the citation graph that recovers the most useful KG affordances without breaking BibTeX. The proposal is implemented in the rrxiv reference server.
\end{abstract}
\section{Introduction}
This document is a structured encoding of the paper in the \texttt{rrxiv} protocol's Canonical Intermediate Representation (CIR). It engages with the topic \texttt{cs.DL}. The encoding registers 5 formal claims (1 replicated, 4 untested). Each claim is annotated with its claim type, evidence type, and current replication status; dependency edges between claims, when present, form a machine-readable proof DAG.
\section{Methodology}
We follow the \texttt{rrxiv} convention of separating \emph{claims} (the proposition under consideration) from \emph{evidence} (the argument or data supporting it). Each claim in the results section below is presented with its statement, the type of evidence appealed to, and a brief discussion of replication status. Where claims depend on prior results --- internal or external --- the dependency is recorded in the CIR as a \texttt{\textbackslash dependson} edge, so the full inferential structure is machine-traversable. Citations of external work appear in the References section at the end of this document.
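As a rough illustration of what ``machine-traversable'' could mean here, the following Python sketch stores each claim with its dependency list and walks the resulting graph in dependency order. The record fields and the claim identifiers (\texttt{a}, \texttt{b}, \texttt{c}) are hypothetical placeholders; they are neither the CIR schema nor the actual dependency edges of this paper.

\begin{verbatim}
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Claim:
    """Hypothetical claim record; field names are illustrative only."""
    claim_id: str
    statement: str
    evidence_type: str
    replication_status: str = "untested"
    depends_on: list[str] = field(default_factory=list)


def dependency_order(claims: list[Claim]) -> list[str]:
    """Return claim ids so that every claim follows its dependencies.

    Raises graphlib.CycleError if the edges do not form a DAG.
    """
    # TopologicalSorter expects a mapping from node to its predecessors.
    graph = {c.claim_id: set(c.depends_on) for c in claims}
    return list(TopologicalSorter(graph).static_order())


# Placeholder claims, not the claims registered in this paper.
claims = [
    Claim("c", "follows from b", "deductive", depends_on=["b"]),
    Claim("a", "an observation", "empirical"),
    Claim("b", "follows from a", "deductive", depends_on=["a"]),
]
order = dependency_order(claims)  # dependencies come before dependents
\end{verbatim}

Because the check for cycles happens during the traversal, a consumer of such records would learn immediately if the recorded edges failed to form a DAG.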
\section{Results: registered claims}
\subsection*{Claim 1}
\begin{claim}[Claim 1]
\label{claim:c1}
Treating citation edges as semantic relationships causes systematic over-attribution: 34\% of citations in our sample do not express dependency in the structural sense.
\emph{Replication status: untested.}
\end{claim}
This claim is an empirical observation supported by data. As of the encoding date, it has not yet been independently tested.
\subsection*{Claim 2}
\begin{claim}[Claim 2]
\label{claim:c2}
A typed-edge extension (depends\_on / supports / contradicts / extends) recovers 89\% of the queries our knowledge-graph baseline could answer, while staying compatible with existing BibTeX tooling.
\emph{Replication status: replicated.}
\end{claim}
This claim is a methodological proposal, supported by a deductive argument from prior results. As of the encoding date, it has been independently replicated. It depends on 1 prior claim in the same paper.
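To make the typed-edge idea concrete, the sketch below labels each citation edge with one of the four types named in Claim 2 and answers a query that an untyped citation graph cannot express: which papers contradict a given paper. The class names, the paper keys, and the example edges are assumptions made for illustration; they are not taken from the rrxiv reference server.

\begin{verbatim}
from dataclasses import dataclass
from enum import Enum


class EdgeType(Enum):
    # The four edge types named in Claim 2.
    DEPENDS_ON = "depends_on"
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    EXTENDS = "extends"


@dataclass(frozen=True)
class TypedCitation:
    """A paper-to-paper citation edge carrying a type label."""
    source: str  # citing paper key (hypothetical)
    target: str  # cited paper key (hypothetical)
    edge_type: EdgeType


def citing_with_type(edges, target, edge_type):
    """Return sorted keys of papers citing `target` with `edge_type`."""
    return sorted(
        e.source for e in edges
        if e.target == target and e.edge_type is edge_type
    )


# Made-up edges between made-up paper keys.
edges = [
    TypedCitation("paperB", "paperA", EdgeType.EXTENDS),
    TypedCitation("paperC", "paperA", EdgeType.CONTRADICTS),
    TypedCitation("paperD", "paperA", EdgeType.CONTRADICTS),
    TypedCitation("paperD", "paperB", EdgeType.SUPPORTS),
]
contradicting = citing_with_type(edges, "paperA", EdgeType.CONTRADICTS)
\end{verbatim}

In this sketch the edges remain paper-to-paper, so ignoring the \texttt{edge\_type} field yields an ordinary untyped citation list; this is one way such an extension could stay backward compatible.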
\subsection*{Claim 3}
\begin{claim}[Claim 3]
\label{claim:c3}
Knowledge-graph node identity is unstable across schema versions in a way that citation-graph identity is not; downstream consumers must either pin to a snapshot or handle entity merges.
\emph{Replication status: untested.}
\end{claim}
This is a theoretical claim derived from formal reasoning and supported by a deductive argument from prior results. As of the encoding date, it has not yet been independently tested. It depends on 1 prior claim in the same paper.
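The second option in Claim 3, handling entity merges, can be sketched as a redirect table that maps retired node identifiers to their successors. The identifiers and the \texttt{merged\_into} table below are hypothetical; an actual knowledge graph would publish its own merge records in its own format.

\begin{verbatim}
def resolve(entity_id: str, merged_into: dict[str, str]) -> str:
    """Follow merge redirects until reaching a current identifier.

    `merged_into` maps a retired id to the id it was merged into.
    Raises ValueError if the redirects contain a cycle.
    """
    seen = {entity_id}
    while entity_id in merged_into:
        entity_id = merged_into[entity_id]
        if entity_id in seen:
            raise ValueError("cycle in merge redirects")
        seen.add(entity_id)
    return entity_id


# Hypothetical merge history: E1 was merged into E2, then E2 into E3.
merged_into = {"E1": "E2", "E2": "E3"}
current = resolve("E1", merged_into)    # follows E1 -> E2 -> E3
unchanged = resolve("E9", merged_into)  # never merged, returned as-is
\end{verbatim}

Pinning to a snapshot avoids this lookup, at the cost of not seeing later corrections to the graph.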
\subsection*{Claim 4}
\begin{claim}[Claim 4]
\label{claim:c4}
Citation networks are append-only in practice (retractions excepted); knowledge graphs revise nodes and edges continuously. Conflating the two breaks reproducibility.
\emph{Replication status: untested.}
\end{claim}
This is a theoretical claim derived from formal reasoning and supported by a deductive argument from prior results. As of the encoding date, it has not yet been independently tested.
\subsection*{Claim 5}
\begin{claim}[Claim 5]
\label{claim:c5}
The proposed typed-edge extension is implemented in the rrxiv reference server and round-trips through \texttt{cir.schema.json} without information loss.
\emph{Replication status: untested.}
\end{claim}
This claim is a methodological proposal, supported by computational evidence from simulation or numerical experiment. As of the encoding date, it has not yet been independently tested. It depends on 1 prior claim in the same paper.
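A round-trip check of the kind Claim 5 describes can be sketched as follows: serialize a typed-edge record to JSON, parse it back, and compare the result with the original. The field names are assumptions, and the sketch does not validate against the actual \texttt{cir.schema.json}, whose contents are not reproduced in this document.

\begin{verbatim}
import json


def to_json(edge: dict) -> str:
    """Serialize an edge record with a stable key order."""
    return json.dumps(edge, sort_keys=True)


def from_json(text: str) -> dict:
    """Parse an edge record back from its JSON form."""
    return json.loads(text)


def round_trips(edge: dict) -> bool:
    """True if serializing and then parsing returns an equal record."""
    return from_json(to_json(edge)) == edge


# Hypothetical typed-edge record; keys are illustrative only.
edge = {"source": "paperB", "target": "paperA", "type": "extends"}
ok = round_trips(edge)

# A tuple is written as a JSON array and parsed back as a list, so this
# record does not survive the round trip unchanged.
lossy = round_trips({"source": "paperB", "targets": ("paperA",)})
\end{verbatim}

The second call shows why such a check is worth running: a representation can serialize without error and still lose type information on the way back.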
\section{Discussion}
The claim graph above is the primary product of this paper. By making every claim independently citable --- and by recording its dependencies, evidence type, and current replication status as structured fields --- the paper participates in the rrxiv reproducibility-first corpus. Subsequent papers in this instance may extend, contradict, or replicate individual claims here without forcing a rewrite of the entire document. See the canonical version online for the live discourse layer.
\section{References}
\begin{itemize}[leftmargin=*]
\item Knowledge graphs from scientific abstracts
\item Citation networks vs knowledge graphs
\end{itemize}
\end{document}